S2024 #22 - Amazon Redshift Data Warehouse System (CMU Advanced Database Systems)

Andy Pavlo (www.cs.cmu.edu/~pavlo/)
Slides: 15721.courses.cs.cmu.edu/spring2024/slides/22-redshift.pdf
Notes: 15721.courses.cs.cmu.edu/spring2024/notes/22-redshift.pdf
15-721 Advanced Database Systems (Spring 2024)
Carnegie Mellon University
15721.courses.cs.cmu.edu/spring2024/

Views: 3,685

Video

S2024 #21 - Yellowbrick Data Warehouse System (CMU Advanced Database Systems)

1:21:10

2.4K views · 3 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/21-yellowbrick.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/21-yellowbrick.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #20 - DuckDB Embedded Database System (CMU Advanced Database Systems)

1:02:13

5K views · 3 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/20-duckdb.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/20-duckdb.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #19 - Snowflake Data Warehouse Internals (CMU Advanced Database Systems)

1:20:55

4.9K views · 3 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/19-snowflake.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/19-snowflake.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #18 - Databricks Photon / Spark SQL (CMU Advanced Database Systems)

1:07:30

3.3K views · 3 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/18-databricks.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/18-databricks.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #17 - Google BigQuery / Dremel (CMU Advanced Database Systems)

1:08:59

3.1K views · 3 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/17-bigquery.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/17-bigquery.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #15 - Query Optimizer Implementation 3 (CMU Advanced Database Systems)

43:13

1.3K views · 4 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/15-optimizer3.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/15-optimizer3.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #14 - Query Optimizer Implementation 2 (CMU Advanced Database Systems)

1:20:31

1.5K views · 4 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/14-optimizer2.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/14-optimizer2.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #13 - Query Optimizer Implementation 1 (CMU Advanced Database Systems)

1:23:31

2.6K views · 4 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/13-optimizer1.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/13-optimizer1.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #12 - Database Networking Protocols (CMU Advanced Database Systems)

1:22:56

2.2K views · 4 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/12-networking.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/12-networking.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #11 - User-Defined Function Optimizations (CMU Advanced Database Systems)

1:20:25

1.4K views · 4 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/11-udfs.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/11-udfs.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #10 - Multi-Way Join Algorithms / Worst-Case Optimal Joins (CMU Advanced Database Systems)

1:09:43

1.9K views · 5 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/10-multiwayjoins.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/10-multiwayjoins.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #09 - Parallel Hash Join Algorithms (CMU Advanced Database Systems)

1:25:37

2.3K views · 5 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/09-hashjoins.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/09-hashjoins.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #08 - Query Scheduling & Coordination (CMU Advanced Database Systems)

1:24:31

2.1K views · 5 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/08-scheduling.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/08-scheduling.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #07 - JIT Query Compilation & Code Generation (CMU Advanced Database Systems)

1:21:57

2.6K views · 5 months ago

Andy Pavlo (www.cs.cmu.edu/~pavlo/) Slides: 15721.courses.cs.cmu.edu/spring2024/slides/07-compilation.pdf Notes: 15721.courses.cs.cmu.edu/spring2024/notes/07-compilation.pdf 15-721 Advanced Database Systems (Spring 2024) Carnegie Mellon University 15721.courses.cs.cmu.edu/spring2024/

S2024 #06 - Vectorized Query Execution Using SIMD (CMU Advanced Database Systems)

1:18:40

3K views · 5 months ago

S2024 #05 - Query Execution & Processing Part 2 (CMU Advanced Database Systems)

1:24:55

2.8K views · 5 months ago

S2024 #04 - Query Execution & Processing Part 1 (CMU Advanced Database Systems)

1:23:44

4.1K views · 6 months ago

S2024 #03 - Data Formats & Encoding Part 2 (CMU Advanced Database Systems)

1:21:48

4.2K views · 6 months ago

S2024 #02 - Data Formats & Encoding Part 1 (CMU Advanced Database Systems)

1:23:08

7K views · 6 months ago

S2024 #01 - Modern OLAP Database Systems (CMU Advanced Database Systems)

1:09:05

12K views · 6 months ago

S2024 #00 - Course Overview & Logistics (CMU Advanced Database Systems)

33:18

13K views · 6 months ago

F2023 #25 - Potpourri: Redis, CockroachDB, Snowflake, MangoDB, TabDB (CMU Intro to Database Systems)

1:21:48

6K views · 8 months ago

F2023 #24 - SingleStore Database Overview (CMU Intro to Database Systems)

1:15:35

3.1K views · 8 months ago

F2023 #23 - Distributed Data Warehouse OLAP Databases (CMU Intro to Database Systems)

1:23:01

4.4K views · 8 months ago

Chroma Vector Database: Retrieval for LLMs (Hammad Bashir + Liquan Pei)

1:00:43

2.8K views · 8 months ago

F2023 #22 - Distributed Transaction Processing Databases (CMU Intro to Database Systems)

1:24:36

3.9K views · 8 months ago

pgvector: Stylish Hierarchical Navigable Small World Indexes (Jonathan Katz)

1:10:37

3.5K views · 8 months ago

F2023 #21 - Intro to Distributed Databases (CMU Intro to Database Systems)

1:21:21

6K views · 8 months ago

F2023 #20 - Database Recovery (CMU Intro to Database Systems)

1:24:05

3.1K views · 8 months ago

Comments

@greielts75331 19 hours ago
The audio is shit. DJ for shit?
@ibrahimrabbani94 1 day ago
Is there a Discord channel for CMU 15-721?
@ibrahimrabbani94 1 day ago
Thank you for the lecture! In the degenerate worst case where every tuple in relations R and S has the same value for the join key, Sort-Merge Join's merge cost is M + N where M and N are the number of pages in relations R and S respectively. Since this looks like a Block Nested-Loop Join, why can't we optimize this to M + ceil(M / (B - 2)) × N where B is the number of available pages in the buffer pool?
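For reference, the block nested-loop cost expression quoted in the comment above can be evaluated with a small sketch. The function name and the page counts are hypothetical, chosen only to illustrate the arithmetic; this is not taken from the lecture materials.

```python
import math

def bnlj_cost(m_pages: int, n_pages: int, buffer_pages: int) -> int:
    """I/O cost of a block nested-loop join: M + ceil(M / (B - 2)) * N."""
    if buffer_pages < 3:
        # B - 2 pages hold the outer block; one page each for inner input and output.
        raise ValueError("need at least 3 buffer pages")
    outer_blocks = math.ceil(m_pages / (buffer_pages - 2))
    return m_pages + outer_blocks * n_pages

# Hypothetical sizes: M = 1000 pages, N = 500 pages, B = 102 buffer pages.
# ceil(1000 / 100) = 10 outer blocks, so the cost is 1000 + 10 * 500 = 6000 I/Os.
print(bnlj_cost(m_pages=1000, n_pages=500, buffer_pages=102))
```

With these numbers the inner relation is scanned 10 times rather than once per outer page, which is the saving the formula expresses.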
@himurakno 2 days ago
Isn't graph DBMS research what keeps this topic hot? While I agree that neo4j is not good, there are multiple graph DBMSs integrating these techniques. In particular, the DuckDB PGQ paper uses exactly the same techniques Kuzu is using.
@rongtang4385 4 days ago
Well explained, thanks
@rachelryan5231 5 days ago
Legend 🤣🤣🤣🤣
@guilaidai7596 5 days ago
Indian English is hard to listen to 😂
@vaibhaves2111 8 days ago
toilet paperz?
@AbhishekRaj-do8kk 8 days ago
Great lecture! Is there any advantage of having a sorted dict over an unsorted one in Dictionary compression?
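One commonly cited advantage of a sorted (order-preserving) dictionary is that range predicates can be evaluated on the integer codes without decoding each value. The following is a minimal sketch of that idea; the helper names and the sample column are hypothetical and do not represent any particular system's implementation.

```python
from bisect import bisect_left, bisect_right

def build_sorted_dictionary(values):
    """Return (dictionary, codes) where codes follow the sort order of the values."""
    dictionary = sorted(set(values))
    code_of = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [code_of[v] for v in values]

def range_filter(dictionary, codes, low, high):
    """Row positions whose value lies in [low, high], comparing integer codes only."""
    lo_code = bisect_left(dictionary, low)        # first code with value >= low
    hi_code = bisect_right(dictionary, high) - 1  # last code with value <= high
    return [i for i, c in enumerate(codes) if lo_code <= c <= hi_code]

column = ["pear", "apple", "fig", "apple", "kiwi"]
dictionary, codes = build_sorted_dictionary(column)
print(dictionary)  # ['apple', 'fig', 'kiwi', 'pear']
print(codes)       # [3, 0, 1, 0, 2]
print(range_filter(dictionary, codes, "b", "l"))  # [2, 4] -> 'fig' and 'kiwi'
```

With an unsorted dictionary the code order carries no meaning, so the same range predicate would have to look up the original value behind each code. The usual trade-off is that keeping the dictionary sorted makes inserting new values more expensive.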
@LtdJorge 10 days ago
AMD doesn't downclock on AVX-512; they took their time to support it but did it right (it was a double-pump design at first with dual 256-bit registers, and a 512-bit one now). I think Intel doesn't downclock now, or at least not as badly (it was really bad; read the Cloudflare blog about when it was used for terminating TLS), but their support for it is somewhat worse than AMD's. Also AVX-512 is super fragmented :(
@hugolatendresse7617 12 days ago
So why does the same query run faster in DuckDB than in SQLite? Is it any of those answers given by ChatGPT?
Columnar Storage Format: DuckDB uses a columnar storage format, which is more efficient for analytical queries that require scanning large amounts of data. This format allows for better data compression and faster data retrieval, especially for operations like aggregations and joins. SQLite uses a row-oriented storage format, which can be less efficient for these types of queries as it retrieves entire rows even if only a few columns are needed.
Vectorized Execution: DuckDB employs a vectorized execution engine, which processes data in chunks (vectors) rather than row-by-row. This approach takes advantage of modern CPU architectures and allows for better CPU cache utilization and SIMD (Single Instruction, Multiple Data) optimizations. SQLite processes data row-by-row, which can be slower for large datasets.
Parallel Processing: DuckDB supports parallel query execution, allowing it to utilize multiple CPU cores to perform operations concurrently. SQLite is designed for simplicity and portability, and while it can handle concurrent reads, its support for parallel query execution is limited.
Optimized Query Planning: DuckDB includes advanced query optimization techniques that can generate more efficient execution plans for complex queries. SQLite has a simpler query optimizer, which might not produce as efficient plans for certain types of queries.
Built-In Indexing and Compression: DuckDB automatically applies various indexing and compression techniques to improve query performance without requiring explicit indexing from the user. SQLite requires manual indexing, and its compression techniques are not as advanced as those in DuckDB.
@sehajpreetsingh6266 14 days ago
Never knew Alex Honnold taught computer science.
@LetianRuan 15 days ago
Great course! Thanks for your open source.
@quang.luu.179 16 days ago
👍👍👍
@indavarapuaneesh2871 20 days ago
It seems like Postgres is due for a big architectural overhaul: 1. Moving away from the per-process architecture. 2. Using direct I/O instead of the OS page cache.
@lesleydowney6688 20 days ago
YouTube University lol
@mystmuffin3600 22 days ago
22:25 "We don't need to have a latch for the whole page table. Assuming it's fixed size, we can have latches for individual pages/locations of page table" Okay, if the latter is possible, why concern ourselves with multiple buffer pools? If these fine-grained latches for individual pages are still a bottleneck, then no matter how many buffer pools we segment our memory into, we will still suffer from contention...
@Avinashk-gq3pl 23 days ago
Being from India, hearing about a professor who carries a knife on the bus when travelling is a fascinating story.
@rayudua.l.p1905 26 days ago
Thank you for making this awesome course public 🫡
@mystmuffin3600 26 days ago
28:30 Why would there be contention over data structures which are internal to the OS?
@JohnSundberg 26 days ago
@49:12 I think a set MUST have unique values, as stated in the video "CANNOT have duplicate values", however - I have seen many tables of data with duplicate data.
@fakh99 27 days ago
Nice rap.
@jauhararifin10 28 days ago
At 25:55, it should be "WAL: before a page is written, pageLSN <= flushLSN"
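The rule quoted in the comment above can be illustrated with a short sketch. The `Page` class, `can_write_page`, `write_page`, and the `flush_log_up_to` callback are hypothetical names introduced only for this example; `flushed_lsn` stands for the comment's flushLSN.

```python
from dataclasses import dataclass

@dataclass
class Page:
    page_id: int
    page_lsn: int  # LSN of the newest log record that modified this page

def can_write_page(page: Page, flushed_lsn: int) -> bool:
    """WAL rule: a page may go to disk only if pageLSN <= flushedLSN."""
    return page.page_lsn <= flushed_lsn

def write_page(page: Page, flushed_lsn: int, flush_log_up_to) -> int:
    """Flush the log first if needed, then return the (possibly advanced) flushed LSN."""
    if not can_write_page(page, flushed_lsn):
        # Hypothetical callback: makes the log durable up to the given LSN.
        flushed_lsn = flush_log_up_to(page.page_lsn)
    assert can_write_page(page, flushed_lsn)  # invariant holds before the page write
    return flushed_lsn

page = Page(page_id=7, page_lsn=120)
print(can_write_page(page, flushed_lsn=100))  # False: log is durable only up to LSN 100
print(write_page(page, 100, flush_log_up_to=lambda lsn: lsn))  # 120
```

The sketch only models the ordering constraint; it performs no actual disk or log I/O.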
@aakarshanraj1176 1 month ago
could not find a way to get record id of page and slot id of tuple in MySQL.
@yashthakkar4499 1 month ago
B+ tree animation link for the curious www.cs.usfca.edu/~galles/visualization/BPlusTree.html
@yashthakkar4499 1 month ago
i am going to go with bushy tree lol.
@break1145 1 month ago
I thought my earphone or network was broken before viewing comments LOL
@user-vg7os9hf6u 1 month ago
what is it that they are smashing at the end?
@kevinkristensen8939 1 month ago
Thanks for this! I've always found the semistructured stuff hard to understand. I just want to point out, though, that the example in the referenced paper for shredding has different values in the columnar decomposition. In particular, for value 'en' in Name.Language.Code, the repetition level is 2, because it is a repetition of the 2nd repeated field (according to the paper).
@energy-tunes 1 month ago
best db courses ever
@user-lv2ht3qv2l 1 month ago
1:00:36
@user-lv2ht3qv2l 1 month ago
thanks a lot
@llight1635 1 month ago
great course
@NostraDavid2 1 month ago
Motherfuckers. So calculus ended up in SQL anyway, eh? See 1:00:00. And those SQL guys said that Codd's ALPHA was too complicated??? "Subqueries are powerful" my ass. Only because they couldn't be arsed to implement something better (which they did anyway, but their version ended up being corrupt anyway).
@NostraDavid2 1 month ago
Hah, the query planner turned it into a regular join, as it should.
@NostraDavid2 1 month ago
To answer the RANK query question: you can't do a GROUP BY instead, because SQL is inconsistent doodoo.
@NostraDavid2 1 month ago
Oh man, when I thought I couldn't dislike the inconsistencies about SQL any more, I find a new example why it's a shitty language.
@NostraDavid2 1 month ago
Oh gods, SQL's natural join compares the NAMES of the columns? That's awful and another point of evidence why SQL <> Relational Model. It's why Codd hammered on the idea of using shared Domains to join on, not shared column names. SQL, what a joke! 😂
@NostraDavid2 1 month ago
Yes, in the Relational Model there are no duplicates within any single relation, and if you join two relations the result is a new relation which, like any other relation, does not contain duplicate rows. That's why there is no popular RDBMS in existence, since Postgres, DB2, Oracle, etc. all allow duplicate rows and thus are not truly relational.
@NostraDavid2 1 month ago
Fun fact: E. F. "Ted" Codd, aka the Coddfather, invented the Relational Model (relations, tuples, domains; primary key, foreign key), but also the first query language for his model (ALPHA), the term "data model" and the term OLAP. He was also highly critical of SQL (calling it Fatally Flawed back in 1985) because it broke a bunch of consistency, which STILL hasn't really been fixed (like allowing duplicate rows, and returning anything that's not a relation, like a single row, a column or a single scalar/cell value). I've read all the publicly available letters he wrote BTW. Good stuff. Even his criticisms on the Entity-Relation Model from the 1976 (?) by Peter Chen, IIRC.
@7th_CAV_Trooper 1 month ago
Concurrency control, fk yeah! Lol
@digitulized459 1 month ago
Either that blockchain guy is a troll or that was one of the most entitled douches I've ever seen at a lecture.
@m.imranzaheer1368 1 month ago
superb bro. Loved ur lecture
@chenqiang19860101 1 month ago
For the log structure, if we still need an index for lookup, how do we save the index? How does updating that index not end up as random I/O?
@aliasonline1493 1 month ago
really well explained! thank you!
@akashkulkarni832 1 month ago
what is the outro song??
@njgarg 1 month ago
Why is the "lost updates" anomaly missing in the discussion of isolation levels?
@tylerrongione6696 1 month ago
this is f*cking awesome
@millouwmills367 3 days ago
BEEP
@njgarg 1 month ago
Great lecture.. but for this specific lecture, camera is moving too much and also quality is not HD.
@indavarapuaneesh2871 2 months ago
insightful lecture
@jauhararifin10 2 months ago
At 1:20:32, Oracle/MySQL and Postgres don't use memory as the primary storage, do they? And with that, Oracle/MySQL still beat most in-memory DBMSs? Is it because their WAL was disabled for this benchmark?

CMU Database Group