Practical Data Engineering in GCP: Beginner to Advanced
In this course, we will create a data lake using Google Cloud Storage and bring data warehouse capabilities to it using Google BigQuery, forming a lakehouse architecture. We will build four no-code data pipelines using services such as DataStream, Dataflow, DataPrep, Pub/Sub, Data Fusion, Cloud Storage, and BigQuery.
The course follows the logical progression of a real-world project implementation, with hands-on experience setting up a data lake, creating data pipelines for ingestion, and transforming your data in preparation for analytics and reporting.
Chapter 1
We will set up a project in Google Cloud
Introduction to Google Cloud Storage
Introduction to Google BigQuery
Chapter 2 - Data Pipeline 1
We will create a Cloud SQL database and populate it with data before we start performing complex ETL jobs.
Use DataStream Change Data Capture to stream data from our Cloud SQL database into our data lake built with Cloud Storage
Add a Pub/Sub notification to our bucket
Create a Dataflow pipeline for streaming the data into BigQuery
Chapter 3 - Data Pipeline 2
Introduce Google Data Fusion
Author and monitor ETL jobs for transforming our data and moving it between the different zones of our data lake
We will explore the use of Wrangler in Data Fusion for profiling and understanding our data before we start performing complex ETL jobs.
Clean and normalise data
Discover and govern data using metadata in Data Fusion
Chapter 4 - Data Pipeline 3
Introduction to Google Pub/Sub
Building a .NET application for publishing data to a Pub/Sub topic
Building a real-time data pipeline for streaming messages to BigQuery
Chapter 5 - Data Pipeline 4
Introduction to Cloud DataPrep
Profile, author, and monitor ETL jobs for transforming our data using DataPrep
A step-by-step guide to building four data pipelines in Google Cloud using DataStream, Data Fusion, DataPrep, Dataflow, and related services
What you will learn
- How to build no-code (codeless) data pipelines in Google Cloud
- You will learn to build real-world data pipelines using tools like Data Fusion, DataPrep and Dataflow
- You will learn to transform data using Data Fusion
Rating: 4.1
Level: All Levels
Duration: 4.5 hours
Instructor: Yomi Owoyemi