when indexes are useless | The Backend Engineering Show

Sdílet
Vložit
  • čas přidán 6. 09. 2024

Komentáře • 45

  • @hnasr
    @hnasr  Před 2 lety +5

    head to database.husseinnasser.com to get a discount coupon for my Introduction to Database Engineering course

  • @carlosdelgado5632
    @carlosdelgado5632 Před 2 lety +27

    I happen to realize about this about 2 weeks ago when I created an index on a column with "status" and saw no increase of performance at all, as the matter fact it actually decreased so I guess I learned it the hard way 😂

  • @squirrel1620
    @squirrel1620 Před 2 lety +17

    I was just troubleshooting a long running query and found that one of the tables being joined was missing an index. The SQL EXPLAIN showed exactly what index should be added. Added it and bam, 40s query down to 1s

  • @talhaabdurrahman9407
    @talhaabdurrahman9407 Před 2 lety +11

    the way you said bahrain made me so happy

  • @marsilinouzaky2748
    @marsilinouzaky2748 Před 2 lety +7

    Great video Hussein, hope u have a great vacation.
    I think two more ways indexes can be useless if it's on a column that changes value very very often where maintaining a rebalanced tree will become challenging and introduce a lot of overhead. Also if the rows are not that many then index might not provide a lot of benefits.

    • @mohamedmohamedy3085
      @mohamedmohamedy3085 Před 2 lety

      exactly.
      one of the reasons we generate random test data, is to force the DB engine to use the index.

  • @emjaytripleo
    @emjaytripleo Před 2 lety +2

    Just bought your course Yesterday, looking forward to it

  • @Gabriel-sc3mg
    @Gabriel-sc3mg Před 7 měsíci

    and how does indexes works when we make a join by a PK it will search the results by Index and then search in the other table? great video btw

  • @lakhveerchahal
    @lakhveerchahal Před 2 lety +3

    I'm not sure of this but this might be the case (it can be wrong) -
    When u only do
    select * from employees where condition;
    This * can cause the db to do heap scan instead of index scan because it anyways need to fetch every column from heap, so to skip two hops it goes to table scan in the first place.
    Please correct me if it's wrong.

    • @ili101
      @ili101 Před 2 lety +2

      Doesn't sound logical to me. If you have million rows and you do "SELECT * WHERE id = 42069" I don't see way it will scan million rows instead of using the index then getting this 1 row from the heap. From what I know select * is only bad for network traffic and CPU/RAM/IO for fetching/formatting the output. Except probably if you select *only* from the indexes then you don't need the heap at all and skip this extra step?

    • @marsilinouzaky2748
      @marsilinouzaky2748 Před 2 lety +1

      The index doesn't store all columns values so it doesn't really matter what columns you're looking for, instead what matters in this case is the where clause column, the DB will scan this index based on the where column, get the primary key, then it has access to all columns related to this primary key, so specifying one column or * won't make much of a difference if you have an index that will be used to filter your rows.
      Someone can correct me if I'm wrong :)

    • @lakhveerchahal
      @lakhveerchahal Před 2 lety

      Yes I guess, it'll still use the index scan 99% of the time, it's just bad for the I/O.

    • @mohamedmohamedy3085
      @mohamedmohamedy3085 Před 2 lety

      selecting only happens after finding the rows.

  • @VivekYadav-ds8oz
    @VivekYadav-ds8oz Před 2 lety +3

    I had questions regarding the propagation of query from backend to database - why do we introduce so much overhead in this part?
    We use textual commands and pass it to the database for lexing *for every single query* (queries get cached but this only helps if query is exactly the same), then parsing, then actually doing its actual job. But since most of the query is programmatically generated, why don't we have a binary format for this, where we just send a blob C-style structure over the network. It would have all the information database needs without needing to filter through the human junk.
    Also, I am not an engineer (yet!) so no real world experience, but do databases run on different machines than the server? For all examples and usecases encountered by me, they've always been on the same machine, so using the network (especially TCP for localhost networking!) seems unnecessarily expensive. There should be an alternative method communication that doesn't require network but uses some Unix-y message passing of some sorts.

    • @rishabhanand4270
      @rishabhanand4270 Před 2 lety +8

      yes, databases are best kept away from your server. This way it's decoupled and you can scale each other independently. A server should do exactly what it's name suggests: serve your clients. Whatever requests your client has, it should note it down and send it to the "kitchen" where "food" is prepped, cooked and sent to the clients. Sometimes "food" is also cached :).
      Coming to the network problem: it's not always a full TCP connection establishment, there is a connection pool from your server to the database which is almost always active and your server can just use these existing connections to quickly send over the raw query. Yes, there is still a little overhead but it's still better than keeping your database on the server itself when we talk about scale.
      Finally, we don't send "parsed" queries over the network because then you are parsing it on your Server. As explained by Hussein in this video, the parsing is dynamic, i.e, the database looks at a query and comes up with the most efficient plan based on statistics which it stores in the database itself. So you're loosing all the good database optimizations if you just send the parsed query directly to the database. Also, if tomorrow the database changes it's parser (maybe an upgrade), you will have to upgrade all your server instances to have the same version of the parser. It's a headache nobody wants. Let the database do it's job of caching, looking at your query to plan and executing your query.

    • @bepamungkas
      @bepamungkas Před 2 lety

      To beat the server query planner, you'll need actual knowledge of the data. AFAIK, only SQLite allows you to do that by embedding the entire logic into its driver; making it a classic serverless RDBMS.
      note: Of course there is cloud serverless db like cosmos or aurora, but those are a bit more obscure in implementation due to several abstraction layer. SQLite use simplified model where db interface is treated as part of the application itself.
      Regarding your last question: Yes. It is advisable to separate db with backend due to predictability factor. Roundtrip latency is generally an acceptable cost compared to resource exhaustion during peak load (since on single-machine scenario, every additional load will be multiplied by at least two: backend processing and db access). And latency is usually much more predictable compared to load.
      Having said that, for localhost you could use UNIX socket- or named pipes on Windows- to avoid TCP "overhead". However, on obscure edge cases where you need do bug hunting, TCP-based connection generally easier to debug. And using TCP from the get go means you're not locked to single scenario, or worse, made wrong assumptions due to difference in system behavior.

  • @RustinRoberts
    @RustinRoberts Před 10 měsíci

    Great video. This helped my understand indexing a bit deeper. Thank you

  • @noorulamin2652
    @noorulamin2652 Před 2 lety +1

    explained very well thank you Hussein Nasir for such awesome content we really appreciate your content & hard work keep it up 👌

  • @sirajul-anik
    @sirajul-anik Před 2 lety

    on point 2, i had figured out this one a few months ago. I was totally shocked.

  • @bashardlaleh2110
    @bashardlaleh2110 Před 2 lety +1

    The composite index thing is really interesting

  • @rishiagrawal4260
    @rishiagrawal4260 Před 2 lety +1

    can you please make a video on HLS ? and more importantly how you explore/read untouched technology ?

    • @hnasr
      @hnasr  Před 2 lety +1

      I think this will help czcams.com/video/1-KmLc0c2sk/video.html

  • @shiskeyoffles
    @shiskeyoffles Před 2 lety +1

    Similar situation!
    Saving in watch later

  • @charliesta.abc123
    @charliesta.abc123 Před 2 lety +1

    Enjoy your vacation bro, we will all be here when you return.

  • @johnnychang3456
    @johnnychang3456 Před 2 lety +1

    Another one is when you do a LIKE query with percent sign on the left of your where clause. E.g, select sth from xx where xxx LIKE %q%
    Since % means all possibilities, the database has no way to apply any index scan.

    • @mohamedmohamedy3085
      @mohamedmohamedy3085 Před 2 lety

      also, iLike with/without the leading % will cause a full table scan.

  • @marawanlotfy1477
    @marawanlotfy1477 Před 2 lety +1

    hi hussein
    love from egypt man ❤️

  • @mihaidorobantu8574
    @mihaidorobantu8574 Před 2 lety

    Regarding the first name case - can we not just create an index on a normalized version of the first name table?

  • @HSBTechYT
    @HSBTechYT Před 2 lety

    I had a question about composite indexes in nosql/mongo
    How does mongo know what's left and what's right ?
    And let's say I have a index for userid property but _id (inbuilt) is also indexed right ?
    So will this affect performance?

  • @jasdeepsinghgrover2470

    When the index is implied but not mentioned.. Like object type(having index) is car and we are searching for BMW but we never mentioned object type to be car in the query.

  • @dixztube
    @dixztube Před 2 lety

    Trying to learn kotlin by just reading the documentation. Ugggggh. I like JavaScript but need to leave my comfort zone.
    Hope your family is doing well. Thanks for a great channel !

  • @henrydesousa8264
    @henrydesousa8264 Před 2 lety

    Regards from Costa Rica.

  • @wilsonovasea
    @wilsonovasea Před 2 lety

    good video.. even better to display with 1.5 speed

  • @emptymeta5849
    @emptymeta5849 Před 2 lety

    Yo, big fan of your outage content.
    Could we get some content on the recent roblox haloween outage?
    Have you heard anything about that?
    Edit: Didn't see the part about vacation, hopefully we get some good outage content when hes back

  • @ahmedtawil7705
    @ahmedtawil7705 Před 2 lety

    How you make it ?!!
    really when i face a problem , i just find that you talking about in newly videos!! 😂😂
    Thank you Hussein

    • @hnasr
      @hnasr  Před 2 lety

      I read minds 😍

  • @kooshasangari2555
    @kooshasangari2555 Před 2 lety

    ⭐ Thanks Man, You are #1 ⭐

  • @rishavagarwal6531
    @rishavagarwal6531 Před 2 lety

    Thank you sir for the video :)

  • @hitmusicworldwide
    @hitmusicworldwide Před 2 lety

    Hey! Let us know how life is back home ! That sounds fantastic!

  • @erlangparasu6339
    @erlangparasu6339 Před 2 lety

    please give sql code or benchmark result when it useless

  • @rahulbera454
    @rahulbera454 Před 2 lety

    500th like ❤️

  • @Vijay-Yarramsetty
    @Vijay-Yarramsetty Před 2 lety

    commenting for the sake of CZcams algorithm, to help him monetize from the video.

  • @mughees52
    @mughees52 Před 2 lety

    Welcome back to bahrain in Advance

  • @electronlibre4163
    @electronlibre4163 Před 2 lety

    Don't we say indices instead of indexes? 😁