DuckDB Tutorial - DuckDB course for beginners

  • Published Jun 1, 2024
  • DuckDB Tutorial - DuckDB course for beginners
    Discover the power of DuckDB and why it will be so important to you!
    Materials: robust-dinosaur-2ef.notion.si...
    Connect with me on LinkedIn 👇
    / marclamberti
    TABLE OF CONTENTS
    0:00 Introduction
    0:32 Who am I
    0:53 Why DuckDB?
    1:52 What is DuckDB?
    4:34 What are the limits of DuckDB?
    5:35 Practice time
    5:41 The setup
    7:22 Install DuckDB and Imports
    8:11 Create a DuckDB connection
    9:10 Print the top 10 rows with Pandas
    10:37 Print the top 10 rows with DuckDB
    11:15 Retrieving data from a request
    12:13 Use read_csv to load CSVs
    13:03 Using DESCRIBE
    13:15 Registering a Dataframe as a Table
    14:15 Counting rows with DuckDB
    14:18 Data cleansing
    14:52 Querying Pandas Dataframes with DuckDB
    15:55 Filtering on columns
    16:46 What's a view?
    17:06 Creating a typed table that skips invalid rows
    18:40 The FROM-First clause
    19:07 Excluding columns with DuckDB
    19:40 The COLUMNS expression
    21:52 Create a View
    22:34 GROUP BY ALL
    22:48 Querying the View
    23:05 Revenue by City
    23:24 Writing Parquet files with DuckDB
    24:05 Querying Parquet files with DuckDB
    24:35 DuckDB with S3
    25:08 DuckDB Extensions
    25:15 Conclusion

Comments • 43

  • @MHelmers77
    @MHelmers77 10 months ago +12

    Would love a full Udemy Course on DuckDB & dbt from you!

  • @gatono25
    @gatono25 1 year ago +1

    That's game changing, very nice video, Marc, Thank you.

  • @AliMasri91
    @AliMasri91 1 year ago

    Thank you! Great content as always!

  • @adiliophi
    @adiliophi 10 months ago

    Awesome content! Thanks a lot!

  • @robertbutscher6824
    @robertbutscher6824 4 months ago

    great tutorial, thanks for the inspiration and your interesting lessons

  • @dataslinger6379
    @dataslinger6379 1 year ago

    Thanks as always for the great content, Marc! Just as a sanity check for myself: OLAP databases don't necessarily need to be columnar-based, correct? Recent big-data databases use columnar storage, but any database that processes analytical workloads is considered an OLAP database even if it is row-based, correct?

  • @tech-n-data
    @tech-n-data 8 months ago +1

    This is a great intro video, thank you.

  • @jerrynwaeze9269
    @jerrynwaeze9269 1 year ago

    Hi, this is the best video on DuckDB atm. Where do you find resources on it? I heard you can connect to a Postgres DB using the Postgres scanner. How does that work?
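For context on the question above: DuckDB ships a postgres extension (the "Postgres scanner") that can attach a running PostgreSQL database and query it with DuckDB SQL. A rough sketch, where the connection string, alias, and table name are placeholders for your own setup:

```sql
-- One-time install, then load the extension in each session.
INSTALL postgres;
LOAD postgres;

-- Attach a running PostgreSQL database under the alias pg
-- (host/dbname/user below are placeholders).
ATTACH 'host=localhost dbname=mydb user=me' AS pg (TYPE postgres);

-- Query Postgres tables as if they were local DuckDB tables.
SELECT * FROM pg.public.some_table LIMIT 10;
```

This requires a reachable Postgres server, so treat it as a template rather than something to paste verbatim.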

  • @interestingamerican3100
    @interestingamerican3100 1 year ago +1

    Dude.....I like your style! Subscribed and hit the bell....

  • @stnico
    @stnico 1 year ago

    Great, thanks!

  • @sigkalbar
    @sigkalbar 1 year ago

    Awesome ❤

  • @mathieujasinski2383
    @mathieujasinski2383 9 months ago

    I did smash the like "booton", thanks brother, very helpful content!

  • @olga6163
    @olga6163 9 months ago

    Thanks, great lesson! Could you help fix an error? It occurs when I try to run the sales table that we've created. The error is "Conversion Error: Could not convert DATE (0019-04-19) to nanoseconds". Don't understand why it's trying to get nanoseconds with the date format. Thanks!

  • @davidjackson7675
    @davidjackson7675 1 year ago

    Thanks.

  • @ragibshahriar7959
    @ragibshahriar7959 5 months ago +1

    Please make more videos

  • @leonvictorrichard3959
    @leonvictorrichard3959 7 months ago

    Would love a dbt + DuckDB course from you on Udemy. Big fan of yours 🎉

  • @LeSaboteur3981
    @LeSaboteur3981 4 months ago

    like always... way too few likes for such a great video!🎉

  • @aruniyer1833
    @aruniyer1833 9 months ago

    I am getting this error when I create the sales table and run conn.execute("from sales").df():
    Conversion Error: Could not convert DATE (0019-04-19) to nanoseconds
    Not sure if you have seen it?

    • @MarcLamberti
      @MarcLamberti  9 months ago +1

      Let me double check that

    • @emilychase4610
      @emilychase4610 9 months ago

      I'm getting this error too!

    • @aruniyer1833
      @aruniyer1833 9 months ago

      I was able to solve it by dropping the non-date values:
      df = df.loc[df['Order Date'] != 'Order Date']
      df['Order Date'] = pd.to_datetime(df['Order Date'], format='%m/%d/%y %H:%M')
      In the SQL block I made this change to match the df format: strptime("Order Date", '%Y-%m-%d %H:%M:%S')
      Looks like TRY_CAST is not working, or it's some bug
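For anyone hitting the same Conversion Error: the thread above suggests the CSV repeats its header row inside the data and stores two-digit years, so a literal year like 0019 sneaks in and breaks DuckDB's date-to-nanoseconds conversion. A minimal sketch of the cleanup, assuming the column name from the thread ('Order Date') and made-up sample values:

```python
import pandas as pd

# Sample data mimicking the issue: a stray header row mixed into
# the data, and two-digit years ('19 meaning 2019).
df = pd.DataFrame({"Order Date": ["04/19/19 08:46", "Order Date", "06/01/19 14:30"]})

# Drop rows where the header leaked into the data.
df = df.loc[df["Order Date"] != "Order Date"].copy()

# Parse with a two-digit-year format; %y maps 19 -> 2019,
# avoiding the literal year 0019 that triggers the conversion error.
df["Order Date"] = pd.to_datetime(df["Order Date"], format="%m/%d/%y %H:%M")
print(df["Order Date"].dt.year.tolist())  # [2019, 2019]
```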

  • @CaribouDataScience
    @CaribouDataScience 1 year ago

    Why not use the pandas SQL functions?

    • @LeSaboteur3981
      @LeSaboteur3981 4 months ago

      for example I don't think it supports the nice extended syntax like EXCLUDE. But the main reason is speed: DuckDB will execute way faster; pandas SQL is still pandas in the end. (correct me if I'm wrong)

  • @durgeshkshirsagar5160
    @durgeshkshirsagar5160 6 months ago

    4:28 it says "redhshit" :D Please don't mind. Thanks for the video.

  • @rashadnelson1873
    @rashadnelson1873 1 year ago

    Oh man, oooofff! I tried following your video here and your blog with the stock analysis data. I've run into too many errors unfortunately. That's the thing about these dependencies - there's too many moving parts where errors are persistent. I can sort through them, but it would take up too much time. We all got work to do :-).
    For data analysis work, my go-to at this point is still using an AWS S3 bucket to load the raw data, AWS Glue to create a database, and an AWS Glue Crawler to upload the data while creating a table within that database. I can also use AWS Glue Studio to convert the data from .csv format to .parquet format. From there, I can use AWS Athena to query the data.

  • @durgeshkshirsagar5160
    @durgeshkshirsagar5160 6 months ago

    Except that baziz, everything is perfectly fine.😀

  • @ththththththththth
    @ththththththththth 8 months ago

    The files are not available anymore :(

    • @MarcLamberti
      @MarcLamberti  8 months ago

      What files?

    • @ththththththththth
      @ththththththththth 8 months ago

      @MarcLamberti In order to work with this tutorial, you provided the sales dataset in the materials. There is a link in the description. On this page, there is a link to the Kaggle dataset that you used in the video. However, this one got removed. So it is not possible anymore to follow this tutorial :(

    • @MarcLamberti
      @MarcLamberti  8 months ago

      oh oh, let me check if I can fix that

    • @ththththththththth
      @ththththththththth 8 months ago

      @MarcLamberti Thank you! Let us know if it works again

    • @MarcLamberti
      @MarcLamberti  8 months ago

      Here we go: www.kaggle.com/datasets/kushagra1211/usa-sales-product-datasetcleaned

  • @davidjackson7675
    @davidjackson7675 1 year ago

    When I ran this Python/DuckDB code it only returned shape (186862, 6):
    # with duckdb
    import time
    import duckdb

    conn = duckdb.connect()
    cur_time = time.time()
    df = conn.execute("""
    SELECT *
    FROM '/kaggle/input/sales-product-data/*.csv'
    """).df()
    print(f"time: {(time.time() - cur_time)}")
    print(df.shape)