English | 2024 | EPUB, Converted PDF | 74 MB
Trâm Ngọc Phạm, Gonzalo Herreros González, Viquar Khan, Huda Nofal, 1805127284, 9781805126850, 9781805127284, B0DFCS5CJZ
Master AWS data engineering services and techniques for orchestrating pipelines, building layers, and managing migrations
Key Features
- Get up to speed with the different AWS technologies for data engineering
- Learn the different aspects and considerations of building data lakes, such as security, storage, and operations
- Get hands-on with key AWS services such as Glue, EMR, Redshift, QuickSight, and Athena for practical learning
- Purchase of the print or Kindle book includes a free PDF eBook
Book Description
Performing data engineering with Amazon Web Services (AWS) combines AWS's scalable infrastructure with robust data processing tools, enabling efficient data pipelines and analytics workflows. This comprehensive guide to AWS data engineering will teach you all you need to know about data lake management, pipeline orchestration, and serving layer construction.
Through clear explanations and hands-on exercises, you’ll master essential AWS services such as Glue, EMR, Redshift, QuickSight, and Athena. Additionally, you’ll explore various data platform topics such as data governance, data quality, DevOps, CI/CD, planning and performing data migration, and creating Infrastructure as Code. As you progress, you will gain insights into how to enrich your platform and use AWS cloud services such as Amazon EventBridge, Amazon DataZone, AWS Schema Conversion Tool (SCT), and AWS Database Migration Service (DMS) to solve data platform challenges.
Each recipe in this book is tailored to a daily challenge that a data engineering team faces while building a cloud platform. By the end of this book, you will be well-versed in AWS data engineering and have gained proficiency in key AWS services and data processing techniques. You will develop the necessary skills to tackle large-scale data challenges with confidence.
What you will learn
- Define your centralized data lake solution, and secure and operate it at scale
- Identify the most suitable AWS solution for your specific needs
- Build data pipelines using multiple ETL technologies
- Discover how to handle data orchestration and governance
- Explore how to build a high-performing data serving layer
- Delve into DevOps and data quality best practices
- Migrate your data from on-premises to AWS
Who this book is for
If you're involved in designing, building, or overseeing data solutions on AWS, this book provides proven strategies for addressing challenges in large-scale data environments. Data engineers and big data professionals looking to deepen their understanding of AWS features and optimize their workflows will find value here, even if they're new to the platform. Basic familiarity with AWS security (users and roles) and the command shell is recommended.
Table of Contents
- Managing Data Lake Storage
- Sharing Your Data Across Environments and Accounts
- Ingesting and Transforming Your Data with AWS Glue
- A Deep Dive into AWS Orchestration Frameworks
- Running Big Data Workloads with Amazon EMR
- Governing Your Platform
- Data Quality Management
- DevOps – Defining IaC and Building CI/CD Pipelines
- Monitoring Data Lake Cloud Infrastructure
- Building a Serving Layer with AWS Analytics Services
- Migrating to AWS – Steps, Strategies, and Best Practices for Modernizing Your Analytics and Big Data Workloads
- Harnessing the Power of AWS for Seamless Data Warehouse Migration
- Strategizing Hadoop Migrations – Cost, Data, and Workflow Modernization with AWS