A Deep Dive in How Slow SELECT * is

  • Added on 16 May 2024
  • Fundamentals of Database Engineering udemy course (link redirects to udemy with coupon)
    database.husseinnasser.com
    In a row-store database engine, rows are stored in units called pages. Each page has a fixed header and contains multiple rows; each row has a record header followed by its columns. When the database fetches a page and places it in the shared buffer pool, we gain access to all rows and columns within that page. So the question arises: if all the columns are readily available in memory, why would SELECT * be slow and costly? Is it really as slow as people claim? And if so, why? In this post, we will explore these questions and more.
    0:00 Intro
    1:49 Database Page Layout
    5:00 How SELECT Works
    10:49 No Index-Only Scans
    18:00 Deserialization Cost
    21:00 Not All Columns are Inline
    28:00 Network Cost
    36:00 Client Deserialization
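The page layout and deserialization cost mentioned above can be illustrated with a toy sketch. This is not how any real engine lays out pages: the fixed-width row format, the two-byte header, and the function names (`build_page`, `select_all`, `select_grade`) are all assumptions made up for illustration. It only shows that once a page is in memory, reading one column touches far fewer bytes than decoding every column of every row.

```python
import struct

# Hypothetical fixed-width row: id (int32), grade (int32), name (16 bytes, null-padded).
ROW = struct.Struct("<ii16s")
# Hypothetical page header holding only the row count; real headers hold much more.
PAGE_HEADER = struct.Struct("<H")


def build_page(rows):
    """Pack (id, grade, name) tuples into one bytes object: header, then rows."""
    body = b"".join(ROW.pack(i, g, n.encode()) for i, g, n in rows)
    return PAGE_HEADER.pack(len(rows)) + body


def select_all(page):
    """Decode every column of every row (the SELECT * case)."""
    (count,) = PAGE_HEADER.unpack_from(page, 0)
    out = []
    for r in range(count):
        i, g, n = ROW.unpack_from(page, PAGE_HEADER.size + r * ROW.size)
        out.append((i, g, n.rstrip(b"\0").decode()))
    return out


def select_grade(page):
    """Decode only the grade column; the name bytes are never decoded."""
    (count,) = PAGE_HEADER.unpack_from(page, 0)
    grades = []
    for r in range(count):
        # grade sits 4 bytes into each row, right after the int32 id
        offset = PAGE_HEADER.size + r * ROW.size + 4
        grades.append(struct.unpack_from("<i", page, offset)[0])
    return grades


page = build_page([(1, 95, "ann"), (2, 70, "bob")])
all_rows = select_all(page)
grades = select_grade(page)
```

In a real engine the gap is usually wider than this sketch suggests, since variable-length and out-of-line columns need extra work to locate and decode.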
    / how-slow-is-select
    Stay Awesome,
    Hussein
  • Science & Technology

Comments • 47

  • @hnasr
    @hnasr 1 year ago +4

    Fundamentals of Database Engineering udemy course (link redirects to udemy with coupon)
    database.husseinnasser.com

  • @Rettou74
    @Rettou74 1 year ago +21

    It would be nice to have some numbers to see the real impact on performance and to know which of these factors matter most.

  • @Epistemer
    @Epistemer 1 year ago +20

    Hussein I truly look up to you ❤

  • @jswlprtk
    @jswlprtk 1 year ago +16

    You Sir, are an inspiration for me ❤

  • @shiewhun1772
    @shiewhun1772 1 year ago +1

    35 seconds in. Noticed the background is less noisy - love that it's just three books. And the lighting is softer. This is nice. I thought the sword was gone till I googled the word Musashi and realized the spirit of the Samurai is still very much here. This is nice too.
    Back to SELECT * . Hopefully this is the video where it finally sinks in for me what your *fetish* for "SELECT *" is. I'm a big fan, Hussein, I can say you played a big part in my having the career and approach to learning that I have now. But as anybody who's watched numerous videos you've made on databases would notice, you have a thing for SELECT * :)

  • @bashardlaleh2110
    @bashardlaleh2110 1 year ago +1

    Thank you for sharing this valuable info.

  • @dyto2287
    @dyto2287 1 year ago +5

    A bigger problem than it being slow is that it can break your code when you roll back a failed deployment after a db migration: the migration may have added new columns that the older version of the code does not recognise, so it fails to scan the rows.
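The failure mode described in this comment can be sketched with Python's built-in `sqlite3` module. The `users` table, the `email` column, and the two loader functions are hypothetical names chosen for illustration; the point is only that code written against `SELECT *` depends on the column count, while an explicit column list does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('ann')")


def load_user_star(conn):
    # "Old" application code: assumes the table has exactly two columns.
    user_id, name = conn.execute("SELECT * FROM users").fetchone()
    return user_id, name


def load_user_explicit(conn):
    # Same code with an explicit column list: unaffected by added columns.
    user_id, name = conn.execute("SELECT id, name FROM users").fetchone()
    return user_id, name


before = load_user_star(conn)

# A migration adds a column; the old code is still running (or was rolled back to).
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")

try:
    load_user_star(conn)
    star_broke = False
except ValueError:
    # Unpacking a three-column row into two names fails.
    star_broke = True

after = load_user_explicit(conn)
```

Whether a given driver or ORM fails loudly like this, or silently ignores the extra column, depends on how it maps rows to objects.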

  • @user-bg5vp5hn2w
    @user-bg5vp5hn2w 11 months ago

    Thank you for providing such high quality vids 🙏

  • @damjandjordjevic1994
    @damjandjordjevic1994 1 year ago +1

    I enjoy the details you get into.

  • @farzadmf
    @farzadmf 1 year ago +2

    Thanks for another great video!

  • @stevez5134
    @stevez5134 1 year ago +5

    The man’s a hero

  • @HarshKapadia
    @HarshKapadia 1 year ago +1

    Kids these days will not call you cringe. They will call you 'an amazing person who teaches us so much and gets us interested in learning the fundamentals'.

  • @codinggavin
    @codinggavin 11 months ago

    Hussein you are such an inspiration ❤️

  • @pnworks9186
    @pnworks9186 1 year ago

    Thank you very much sir. This is a very detailed explanation.

  • @dariusdoku4320
    @dariusdoku4320 1 year ago

    Great video sir, you teach the good stuff

  • @eulerpi7042
    @eulerpi7042 1 year ago +7

    Hi Hussein, what do you think about ORMs? Are they worth using? What are the pros and cons? Any video about it? I would love to watch it, thanks :))

  • @Pranav-pl7eg
    @Pranav-pl7eg 9 months ago +1

    Best CHANNEL

  • @user-ok4fx3kl6f
    @user-ok4fx3kl6f 9 months ago +1

    Hey Hussein, I have one question:
    How is select * different from select col1, col2, col3, col4 where id = 1? Say I have 100 columns in my table, or some very high number of columns.
    In the end it still has to find the page where col1, col2, col3, col4 live, right?
    The only savings would be the deserialisation cost and the network cost. The main work of fetching the row from the heap is still there even if we select a few columns instead of *.
    Is my understanding correct?
    Can you shed some light?

  • @ayushpandey1148
    @ayushpandey1148 1 year ago +1

    Would love to have a video on files, blocks and bytes: how a read operation is done and its underlying logic.

  • @hunter_-ur5mn
    @hunter_-ur5mn 1 year ago

    great and informative watch..

  • @dasten123
    @dasten123 1 year ago +8

    But if you actually need everything, SELECT * is not slower than selecting every column explicitly, right?
    So SELECT * isn't slow. Selecting in general is slow.

    • @davincis1
      @davincis1 1 year ago +2

      You never use *, unless you want to get hacked and expose all that nice customer data that should be hidden :). Also, in many cases you do not need all the columns, and fetching them increases the time it takes to get the data.

    • @Mohamedrasvi
      @Mohamedrasvi 1 year ago +5

      This is exactly my thinking. Select * is not slower than selecting explicitly by specifying all columns. The video is talking about selecting unnecessary columns.

  • @RahulAhire
    @RahulAhire 11 months ago

    I'd love it if you could review the Citus Postgres distribution.

  • @hemant_pande
    @hemant_pande 1 year ago +4

    Hi Hussein, what would be the impact of using a GUID as the primary key column vs using an autoincrement as the primary key column?

    • @ayaanqui
      @ayaanqui 1 year ago

      I'd like to know this too

    • @drpstar
      @drpstar 1 year ago +2

      Read his blog on postgresql vs mysql. There he has mentioned some details on this topic.

  • @apoorvgupta2039
    @apoorvgupta2039 1 year ago +1

    I listen to your shows when going for a walk, and damn, I gain so much info during those 40 mins.

  • @mariumbegum7325
    @mariumbegum7325 11 months ago

    Great video 😀

  • @sriteja2510
    @sriteja2510 1 year ago +1

    Hi Hussein, even queries selecting a few columns, like select a,b,c from t1 where grade>90, still need to fetch the pages from disk randomly.
    How is that greatly different from select *?

    • @drpstar
      @drpstar 1 year ago

      But there is still the unnecessary IO cost of large columns (text, blob), the serialization and deserialization cost, the CPU cost of compression, not to mention the networking cost.

    • @saikatduttaece50
      @saikatduttaece50 11 months ago

      @drpstar What if we actually need that large column too?

  • @mtnrabi
    @mtnrabi 11 months ago +1

    Question: if I do “select *” and put an indexed column in the where clause, what’s even the benefit of using the index if the db will eventually have to do a table scan?

    • @pixaim69
      @pixaim69 8 months ago +2

      The DB will do a seek, not a scan.
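The seek-versus-scan point in this thread, and the "No Index-Only Scans" chapter of the video, can be checked with a small sketch. It uses SQLite through Python's built-in `sqlite3` module as a stand-in, which is an assumption: the video does not discuss SQLite, and other engines word and cost their plans differently. The `students` table and `idx_grade` index are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, grade INTEGER)")
conn.execute("CREATE INDEX idx_grade ON students (grade)")
conn.executemany(
    "INSERT INTO students (name, grade) VALUES (?, ?)",
    [("ann", 95), ("bob", 70), ("cat", 90)],
)


def plan(sql):
    """Return the human-readable detail strings of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 is the detail text


# SELECT *: the index finds matching rows, but the table must still be
# visited to fetch the name column, which the index does not contain.
star_plan = plan("SELECT * FROM students WHERE grade = 90")

# Only the indexed column: the index alone answers the query
# (SQLite reports this as a covering index).
narrow_plan = plan("SELECT grade FROM students WHERE grade = 90")

# No usable index on name: this is the full table scan case.
scan_plan = plan("SELECT * FROM students WHERE name = 'ann'")
```

Printing the three plans shows a search through the index for the first two queries and a scan for the third, with only the second query avoiding the trip to the table itself.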

  • @_dnL
    @_dnL 1 year ago +3

    "A page is.. and a block.. is the most overloaded term in SE." 😅

  • @satwikburman6841
    @satwikburman6841 9 months ago

    The guilt trip whenever I am gonna do a select * from now on 😂

  • @haythamasalama0
    @haythamasalama0 1 year ago +2

    Great video ✨

  • @abdirahmann
    @abdirahmann 1 year ago +2

    at this point hussein is a DBA or even better cause he can build scalable backends too, change my mind 😆

  • @BabakKeyvani0
    @BabakKeyvani0 1 year ago

    "Next time you do a 'select * ...' think about the suffering you're causing to all this equipments" 😂😅 26:06

  • @gokukakarot6323
    @gokukakarot6323 8 months ago

    The videos are kind of good, but man I feel so bad for not being able to sit through this dramatic explanation of things.
    It’s either my ADHD or just that my grandfather just gets to the point much faster than this.
    It’s like if buffering had a modern look

  • @mhcbon4606
    @mhcbon4606 1 year ago

    title is confusing. Star operator is not the suspect here.. but the deep dive is interesting nevertheless, although, a bit chatty imho.

  • @peppybocan
    @peppybocan 1 year ago +1

    Deserialisation is not a problem.

  • @squirrel1620
    @squirrel1620 1 year ago +1

    It's slowww... Looking at you, Entity Framework, or pretty much any ORM 😅

    • @NathanHedglin
      @NathanHedglin 1 year ago +1

      Haha, Entity Framework Core is MUCH faster.

    • @wolfVFXmc
      @wolfVFXmc 11 months ago +1

      You can select what data you wanna return from the database in EF core

  • @vmarzein
    @vmarzein 1 year ago

    test