- 112
- 1 179 974
Johnny Chivers
United Kingdom
Joined 2 May 2020
Brought to you by Johnny Chivers, this channel provides a platform to gain the skills of an AWS data engineer through lessons and vlogs.
I have spent the last decade immersed in the world of big data, working as a consultant for some of the globe's biggest companies.
My journey into the world of data was not the most conventional. I started my career working as a performance analyst in professional sport at the top levels of both rugby and football. I then transitioned into a career in data and computing. This journey culminated in the study of a Master's degree in Software Development, alongside many professional certifications in AWS and MS SQL Server.
AWS, GCP and MS SQL Server have become my areas of expertise over the years. I have had the privilege of traveling the world to help companies develop innovative solutions to their business problems, so they can derive maximum market value.
I am available for both physical and virtual consulting, as well as tech talks.
The Top AWS Services A Data Engineer Should Know In 2024
In this video we take a look at the top AWS services you should know as a data engineer. We cover a use case from ingestion through to analytics, looking at the best ways to orchestrate our pipelines.
SUPPORT THE CHANNEL:
ℹ️ Udemy Practice Exams: www.udemy.com/course/practice-exams-aws-certified-data-analytics-specialty-o/?referralCode=484C33C8FCA5C93803A5
☕ Buy Me A Coffee: www.buymeacoffee.com/johnnychivers
🖥️ My VPN: go.nordvpn.net/aff_c?offer_id=612&aff_id=74288&url_id=14830
▬▬▬▬▬▬ T I M E S T A M P S ⏰ ▬▬▬▬▬▬
00:43 - Ingest
02:22 - Storage
03:28 - Analytics
04:28 - Orchestration
05:29 - Monitoring & Discoverability
06:08 - AI/ML
06:49 - Outro
The video covers real-time ingestion using Amazon Kinesis as well as batch ingestion with AWS Lambda, AWS Glue and Amazon EMR. We look at how we can store this data in Amazon S3 and Amazon DynamoDB before using Amazon QuickSight to build dashboards.
😎 About me
I have spent the last decade immersed in the world of big data, working as a consultant for some of the globe's biggest companies. My journey into the world of data was not the most conventional. I started my career working as a performance analyst in professional sport at the top levels of both rugby and football. I then transitioned into a career in data and computing. This journey culminated in the study of a Master's degree in Software
Views: 2,656
Video
Amazon Bedrock on AWS [AWS TUTORIAL IN 10MINS]
2.8K views · 8 months ago
LINKS ℹ️ aws.amazon.com/bedrock/ ℹ️ proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf In this video we take a look at Amazon Bedrock on AWS. We cover the basics of GenAI and how you can get started on Amazon Bedrock, using the AWS console to send text prompts to foundation models that are exposed via the Amazon Bedrock API. SUPPORT THE CHANNEL: ℹ️ Ude...
My Top 5 Tips For Passing The AWS Certified Data Analytics - Specialty Exam (DAS-C01)
3.3K views · 10 months ago
LINKS ℹ️ Udemy Practice Exams: www.udemy.com/course/practice-exams-aws-certified-data-analytics-specialty-o/?referralCode=484C33C8FCA5C93803A5 ℹ️ AWS Exam Guide and Questions Downloads: aws.amazon.com/certification/certified-data-analytics-specialty/ ℹ️ AWS Documentation: docs.aws.amazon.com/ In this video I share my top 5 tips for studying and passing the AWS Certified Data Analytics - Special...
What is Amazon DataZone? [AWS TUTORIAL in 12MINS]
3.4K views · 10 months ago
LINKS ℹ️ docs.aws.amazon.com/datazone/latest/userguide/produce-data-gs.html In this video we take a look at the Amazon DataZone Service available in AWS. Amazon DataZone is a data management service that enables you to catalog, discover, govern, share, and analyze your data. With Amazon DataZone, you can share and access your data across accounts and supported regions. Amazon DataZone simplifie...
AWS Glue Crawler [AWS Console 2023 Full Demo]
3.3K views · 11 months ago
LINKS ℹ️ GitHub: github.com/johnny-chivers/aws-glue-crawlers ℹ️ AWS Docs: docs.aws.amazon.com/glue/latest/dg/crawler-running.html In this video we cover what an AWS Glue Crawler is and how you can use it to populate the AWS Glue Data Catalog. We cover the basics of the AWS Glue Crawler before diving into a full demo where we register data in S3 with the AWS Glue Data Catalog using a crawler we defin...
What Table Format Should I Choose For My Data Lake? Hudi | Iceberg | Delta Lake
7K views · 11 months ago
LINKS TO FULL BLOG: ℹ️ AWS Blog: aws.amazon.com/blogs/big-data/choosing-an-open-table-format-for-your-transactional-data-lake-on-aws/ Using a blog recently posted on AWS, I break down and discuss the key considerations when deciding on an open table format for your transactional data lake tables in AWS. We look at the general considerations you should factor into your decision-making process be...
Run Spark Jobs On Amazon Athena [FULL TUTORIAL IN 12MINS]
3.5K views · 1 year ago
Have you ever been in a situation where you want to run Spark code to analyse data, but don't want to manage the underlying resources? Then using Amazon Athena's Spark engine could be the solution for you. Amazon Athena allows you to submit Spark code via a fully managed Spark engine in the form of a notebook. This allows you to carry out data analytics and exploration using Apache Spark without th...
Build Your Own Search Using Amazon OpenSearch Service [FULL COURSE in 15MIN]
27K views · 1 year ago
Want to build your own search solution? The Amazon OpenSearch Service on AWS could be the solution for you. OpenSearch is a distributed, community-driven, Apache 2.0-licensed, 100% open-source search and analytics suite used for a broad set of use cases like real-time application monitoring, log analytics, and website search. OpenSearch provides a highly scalable system for providing fast acces...
Apache Iceberg on AWS with S3 and Athena [FULL COURSE IN 30MIN]
18K views · 1 year ago
Do you face the situation on a daily basis where your data lake queries are slow, updates to the data are nearly impossible, and end users face issues reading or updating data? Then Apache Iceberg could be the solution you are looking for. Iceberg is an open source table format, originally created by Netflix and later handed over to the Apache Software Foundation, that allows for fast querying reg...
SQL For AWS Athena [FULL COURSE IN 40mins]
16K views · 1 year ago
In this video I cover how to use SQL with AWS Athena. Using the resources I have uploaded to GitHub, we carry out a full tutorial on how to manipulate data and carry out data analytics tasks within the AWS Athena ecosystem. Don't worry if you are new to SQL, AWS, or Athena; I guide you through everything step by step. LINK TO GITHUB TUTORIAL RESOURCES: 💾 Code Repo: github.com/johnny-chivers/sql-for...
PySpark For AWS Glue Tutorial [FULL COURSE in 100min]
79K views · 1 year ago
In this video I cover how to use PySpark with AWS Glue. Using the resources I have uploaded to GitHub, we carry out a full tutorial on how to manipulate data and carry out ETL tasks within the AWS Glue ecosystem. Don't worry if you are new to PySpark, AWS, or Glue; I guide you through everything step by step. LINK TO GITHUB TUTORIAL RESOURCES: 💾 Code Repo: github.com/johnny-chivers/pyspark-glue-tu...
AWS EMR Serverless - What is it? [FULL TUTORIAL in 25mins]
14K views · 1 year ago
ℹ️ johnnychivers.co.uk 📁 github.com/johnny-chivers/emr-serverless ☕ www.buymeacoffee.com/johnnychivers 📹czcams.com/video/ygccJS_58jE/video.html (AWS CZcams Video EMR Serverless) 00:37 - What is EMR Serverless? Part 1 00:58 - What is EMR? 01:34 - What is EMR Serverless? Part 2 02:30 - EMR Vs EMR Serverless 03:21 - Glue Vs EMR Serverless 04:40 - Tutorial: Setup Work 13:52 - Tutorial: Create EMR S...
Build An AWS Streaming Fraud Detection App [Full Tutorial using MSK and Kinesis]
3.1K views · 1 year ago
ℹ️ johnnychivers.co.uk 📁 fraud-detection.workshop.aws/en/intro.html 📁 github.com/johnny-chivers/tutorial-kafka-flink-dynamodb ☕ www.buymeacoffee.com/johnnychivers 00:00 - Intro 01:15 - What is the data context 02:43 - Flow of data 04:43 - Main services we are using 04:58 - What are we building 06:41 - Tutorial In this video we build a real time fraud detection app using AWS MSK and AWS Kinesis ...
AWS EMR Tutorial [FULL COURSE in 60mins]
57K views · 2 years ago
ℹ️ johnnychivers.co.uk 📁 emr-etl.workshop.aws/setup.html ☕ www.buymeacoffee.com/johnnychivers/e/70388 📁 github.com/johnny-chivers/emrZeroToHero ☕ www.buymeacoffee.com/johnnychivers 01:11 - Set Up Work 07:21 - What Is EMR? 10:29 - Spin Up A Cluster 15:00 - Spark ETL 32:21 - Hive 41:15 - PIG 45:43 - AWS Step Functions 52:09 - EMR Auto Scaling In this video we take a look at AWS EMR and work throu...
AWS Kinesis Tutorial for Beginners [FULL COURSE in 65 mins]
59K views · 2 years ago
ℹ️ johnnychivers.co.uk ☕www.buymeacoffee.com/johnnychivers/e/56915 📁 github.com/johnny-chivers/kinesisZeroToHero ☕ www.buymeacoffee.com/johnnychivers 00:09 - What the course will cover 00:54 - Set Up Work 05:43 - Kinesis Streams Theory 09:01 - SDK Vs KPL Theory 10:31 - Kinesis Data Streams Practical 12:03 - Kinesis SDK 15:54 - KPL Practical 22:26 - Lambda Consumer Theory 23:19 - Lambda Consumer...
AWS Glue Tutorial for Beginners [FULL COURSE in 45 mins]
244K views · 2 years ago
AWS MySQL Aurora Vs RDS - What one should I chose?
16K views · 2 years ago
Top 5 Trends For Data Engineering In 2022
3.8K views · 2 years ago
AWS EMR vs AWS SageMaker - What One Should I use?
2.1K views · 2 years ago
AWS Glue ETL Vs EMR - Which one should I use?
36K views · 2 years ago
Realtime Streaming With AWS Glue Studio
4.4K views · 2 years ago
Using AWS Aurora For Full Text Search - Complete Tutorial
1.5K views · 2 years ago
AWS Postgres Aurora Vs RDS - What one should I chose?
15K views · 2 years ago
My Top 5 Linux Commands On AWS For Data Engineering - Using Cloud9!
801 views · 2 years ago
What Do Cloud Data Engineers Do In AWS?
660 views · 2 years ago
AWS Data Engineering Tutorial for Beginners [FULL COURSE in 90 mins]
88K views · 2 years ago
How I Architected A Start Up WebApp Using AWS Amplify
522 views · 2 years ago
Hi, I've been trying to do an exercise which consists of ingesting data from a website (currencies), storing it, and then showing the collected data in a graph. That's very simple to say but very difficult for me to do. If you have any information, I would really appreciate it. I have the API key for the data source.
Johnny, does the speed come from the partition-by column we use when creating the table? If I used a different column instead of date and ran date-related queries, would it still be faster or not?
Is this legit
Thanks! You helped me a lot! 😁
Thank you for this, and you have a most delightful accent.
Is the Gold folder redundant? It seems like it is not needed. Or will it only be used if data in silver still requires further transformation?
Sometimes you hide your owner account ID, other times you can't be bothered 😆. Thank you very much for your tutorials. You are the best!
Hi all, when creating an Iceberg table in Athena, I get "Exception encountered when executing query, this query ran against ...... database, unless qualified by the query . please post the error message on our forum .....". Does anyone know the solution?
just amazing 🥳
!!!
For a very large dataset (around 15 billion rows overall), is it going to give good performance if we use Iceberg to select/delete/update?
Is there a way to overwrite an already existing table? I cannot find this option anywhere at all.
line 3:5: mismatched input 'SYSTEM_TIME'. Expecting: 'TIMESTAMP', 'VERSION' I'm getting this error while running the timestamp query. Can you please tell me why?
thanks man
Can we create an Iceberg table in S3 using a Multi-Region Access Point?
7:06
Today, Aurora is costlier and Aurora serverless is even costlier!!
Liked and subscribed 🤟
The Lambda function is not accepting the Python code, as it is written for a previous version of Python. What should I do?
Hi Johnny, I really appreciate your video. But when I created a crawler with free trial access I got the error below. Is there anything you can do to help me with this? "One crawler failed to create The following crawler failed to create: "crawler_customer_csv" Here is the most recent error message: Account *************** is denied access."
Another good video from the Chiverse.. :)
amazing
You are amazing and a natural teacher !!
Is this guy Scottish or Jamaican? Never heard an accent like this before it’s wild
amazing work
You are amazing❤
best video
Thanks for the easy-to-follow video... looking forward to more such content on Azure
You are awesome. Watching in 2024... The ETL steps need minor updating but I was still able to follow! Keep up the great work!
thank you so much
Very good video, thank you! God bless - Matthew 11:28
Very thoroughly described. Thank you
Hey Johnny, there were only 2 rows in the bronze/ingest object which you pulled using Firehose. How come there are so many rows after the Glue job to the silver layer?
Amazing.
Hi, thanks for your content. I got the following error while creating the CFN stack: "Please check the role provided or validity of S3 location you provided. We are unable to get the specified fileKey: modules/599e7c685a254c2b892cdbf58a7b3b4f/v1/flink-sql-connector-elasticsearch7_2.11-1.13.2.jar in the specified bucket: ee-assets-prod-us-east-1" Do we need to download the .jar file and upload it manually to S3 to make it work?
any news?
This was an amazing tutorial. I understood every bit of it because of the way it was explained with hands-on. Loved hand typing of all commands which seemed very real world scenario. Thank you so much Johnny!
happy new year
Thanks @Johnny Chivers. This video unlocked a lot of confusions I had with RDS and Aurora. But doesn't Aurora global databases provide fault tolerance against Region outage?
Great tutorial - but a pro tip. You *totally* need a keyboard. Ideally one that really fits your finger size, pressing force and such. I mean it as advice, not as a rude comment though. Give some a try. :)
I have not heard PIG in forever, really enjoyed that language.
I feel like now I have gone from zero to a noob. It will take some time to be a hero :)
Around 26 minutes after you queried the deleted data it said it scanned 5.76MB. That seems like a lot for just metadata!
thank you sir
@25:00 "I'll talk about connections quickly", LOL! That's what AWS Glue, Azure Data Factory, SSIS, Informatica, are all about: CONNECTIONS! You are moving data from a source to a target, and to do that, you need to be connected to both, the source and the target. Basically, you are an S3 guy, LOL!
Thanks for the video! What do you think about using ECS/EKS to run your python ETLs inside docker containers? so you can execute your tasks/pods from MWAA after. In case you don't need spark, could be an alternative to EMR and cheaper than Glue.
So clear! Thank you!
Awesome !!! Johnny...
Love your vids. Could you maybe do a vid on airflow hosting on fargate + simple pipeline? Something practical
Dude, your videos are so helpful, I got a Data Engineer job after practicing with your videos and they are still helpful.. More power to you man, I hope you get more success.
Thanks!