Similar authors to follow
Manage your follows
Customers Also Bought Items By
The design patterns in this book capture best practices and solutions to recurring problems in machine learning. The authors, three Google engineers, catalog proven methods to help data scientists tackle common problems throughout the ML process. These design patterns codify the experience of hundreds of experts into straightforward, approachable advice.
In this book, you will find detailed explanations of 30 patterns for data and problem representation, operationalization, repeatability, reproducibility, flexibility, explainability, and fairness. Each pattern includes a description of the problem, a variety of potential solutions, and recommendations for choosing the best technique for your situation.
You'll learn how to:
- Identify and mitigate common challenges when training, evaluating, and deploying ML models
- Represent data for different ML model types, including embeddings, feature crosses, and more
- Choose the right model type for specific problems
- Build a robust training loop that uses checkpoints, distribution strategy, and hyperparameter tuning
- Deploy scalable ML systems that you can retrain and update to reflect new data
- Interpret model predictions for stakeholders and ensure models are treating users fairly
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on top of the Google Cloud Platform (GCP). This hands-on guide shows developers entering the data science field how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP. Through the course of the book, you’ll work through a sample business decision by employing a variety of data science approaches.
Follow along by implementing these statistical and machine learning solutions in your own project on GCP, and discover how this platform provides a transformative and more collaborative way of doing data science.
You’ll learn how to:
- Automate and schedule data ingest, using an App Engine application
- Create and populate a dashboard in Google Data Studio
- Build a real-time analysis pipeline to carry out streaming analytics
- Conduct interactive data exploration with Google BigQuery
- Create a Bayesian model on a Cloud Dataproc cluster
- Build a logistic regression machine-learning model with Spark
- Compute time-aggregate features with a Cloud Dataflow pipeline
- Create a high-performing prediction model with TensorFlow
- Use your deployed model as a microservice you can access from both batch and real-time pipelines
Work with petabyte-scale datasets while building a collaborative, agile workplace in the process. This practical book is the canonical reference to Google BigQuery, the query engine that lets you conduct interactive analysis of large datasets. BigQuery enables enterprises to efficiently store, query, ingest, and learn from their data in a convenient framework. With this book, you’ll examine how to analyze data at scale to derive insights from large datasets efficiently.
Valliappa Lakshmanan, tech lead for Google Cloud Platform, and Jordan Tigani, engineering director for the BigQuery team, provide best practices for modern data warehousing within an autoscaled, serverless public cloud. Whether you want to explore parts of BigQuery you’re not familiar with or prefer to focus on specific tasks, this reference is indispensable.