Why Isn't My Query Using an Index? Databases for Developers: Performance #5

Sdílet
Vložit
  • čas přidán 17. 06. 2020
  • You've created your index. All it's columns appear in your query's where clause. There are no functions on them. But the database has still decided to go for a full table scan!
    Why?!
    This video explores some of the reasons why using an index may be slower than a full table scan. Including one of the most important reason: the clustering factor.
    Need help with SQL?
    Ask us over on AskTOM: asktom.oracle.com
    Twitter: / chrisrsaxon
    Daily SQL Twitter tips: / sqldaily
    All Things SQL blog: blogs.oracle.com/sql/
    Test your SQL Skills on the Oracle Dev Gym: devgym.oracle.com/
    ============================
    The Magic of SQL with Chris Saxon
    Copyright © 2020 Oracle and/or its affiliates. Oracle is a registered trademark of Oracle and/or its affiliates. All rights reserved. Other names may be registered trademarks of their respective owners. Oracle disclaims any warranties or representations as to the accuracy or completeness of this recording, demonstration, and/or written materials (the “Materials”). The Materials are provided “as is” without any warranty of any kind, either express or implied, including without limitation warranties or merchantability, fitness for a particular purpose, and non-infringement.
  • Věda a technologie

Komentáře • 30

  • @tanujgupta143
    @tanujgupta143 Před 2 lety +3

    I am happy that I found this channel

  • @mdo2480
    @mdo2480 Před 4 lety +5

    Very very useful. Big thanks to you Chris. I love the way you explain things.

  • @VEERSINGH-er9so
    @VEERSINGH-er9so Před 3 lety +3

    If I understood correctly then , on a very basic level, choosing between index and table scan depends on how many blocks both the access methods are likely to traverse for a particular query

  • @mexicanmomo
    @mexicanmomo Před rokem

    Very helpful. One question, when you say avoid using hints in production query, do you specifically say about index hints or other hints like direct path insert etc.

    • @TheMagicofSQL
      @TheMagicofSQL  Před rokem

      While you should aim to avoids hints in general, as you say there are hints that affect SQL behaviour such as append (direct path). These types of hints can affect transaction processing (i.e. the code you write), so you should explicitly code them. You should still use these rarely though.
      You should avoid hints that force a particular access method (e.g. index vs full table scan), join method/order or query transformations. If you're trying to get a specific plan, it's better to use SQL plan management (baselines).

  • @pankajjha1550
    @pankajjha1550 Před 3 lety

    Clustering Factor was a riddle to me, not any more. Thank you.

  • @ppaolucc
    @ppaolucc Před 6 měsíci

    Hi Chris, very useful. Thank you. Just one consideration, if it is worth do add it: another factor that could affect the optimizer choice on whether to use FTs or the Index (I'm excluding PK access here) could also be the DB_BLOCK_SIZE in use? I mean, suppose you have a DB Block Size greater than 8K: say 16K or even 32K (even though this last one is more used in DWH), the Optimizer could opt, as a matter of cost, in using Full Table Scan as in few IOs you read few blocks each containing potentially more rows (since the DB Block Size is enough large ans supposing rows to search are as much as contiguous in the blocks potentially red). Could it be? I'm also excluding the handling of DB_FILE_MULTIBLOCK_READ_COUNT (in earlier releases was used to modify it). Does it make sense?

    • @TheMagicofSQL
      @TheMagicofSQL  Před 6 měsíci

      Not really. The maximum I/O size depends on the platform - by default Oracle Database determines this. This maximum is the same whatever block size you use. If you have bigger blocks, the database will fetch fewer each time.
      Think of it like this: say you've got buckets holding coins. Each bucket only stores coins of one value. So there's 1p coins in one bucket and £1 coins in another. Each time you get money from a bucket, the maximum value is £10. So when taking from the penny bucket, you could get up to 1,000 coins each time. From the £1 bucket you could get a maximum of 10 coins each time.

  • @MeerAthil
    @MeerAthil Před 4 lety

    Good presentation, to reduce high clustering factor#, do we need to reorg the table and rebuild index?

    • @TheMagicofSQL
      @TheMagicofSQL  Před 4 lety +2

      Rebuilding indexes has NO effect on the clustering factor; to change it you need to move rows in the table

    • @MeerAthil
      @MeerAthil Před 4 lety

      @@TheMagicofSQL thanks for your reply,👍

  • @Hellscream1830
    @Hellscream1830 Před 2 lety

    Hi Chris! Excellent video on this topic! How about the importance of single block I/O vs multiblock I/O? Is it ever a good idea to play with DB_FILE_MULTIBLOCK_READ_COUNT parameter?

    • @TheMagicofSQL
      @TheMagicofSQL  Před 2 lety

      Relatively low - I wouldn't start fiddling with that parameter

    • @Hellscream1830
      @Hellscream1830 Před 2 lety

      @@TheMagicofSQL Thank you for the reply!

  • @nidhikhaitan8114
    @nidhikhaitan8114 Před 8 měsíci

    Thanks for such a wonderful explanation with example. Is it possible to calculate the number of block accesses with and without
    index. Does last_cr_buffer_gets represent the number of blocks read?
    I have a table with B-Tree index with height 2, total blocks - 305, leaf blocks - 154. The clustering_factor is equal to the number of blocks. I need to compare the number of blocks reading for fetching 90% of the records. Eg. select * from table_name where pk_column > 100, in table containing 80000 records.

    • @TheMagicofSQL
      @TheMagicofSQL  Před 8 měsíci

      Thanks.
      In theory you can calculate the accesses needed with an index. In practice it's difficult because you need to know exactly how many index entries you'll read (and thus index blocks) and how many table block reads this requires. The easiest way is to run the query & get the plan with the buffer information.
      Does last_cr_buffer_gets represent the number of blocks read? Yes
      I need to compare the number of blocks reading for fetching 90% of the records.
      As you have perfect clustering this will be roughly 2 (the height) + 80% of the index leaf blocks + 80% of the table blocks. That said, a full table scan will almost always be more efficient than an index when fetching that large a fraction of the table. I wouldn't even bother trying to make the calculation.

  • @manjushatonape2099
    @manjushatonape2099 Před rokem

    Hello thanks for these videos,
    How Bitmap index works ? If column has ‘yes or no’ values (50%) each.

    • @TheMagicofSQL
      @TheMagicofSQL  Před rokem

      I discuss the differences between BTree and bitmap indexes at blogs.oracle.com/sql/post/how-to-create-and-use-indexes-in-oracle-database#choose

  • @moazzamansari6941
    @moazzamansari6941 Před 2 lety

    Have you ever taken the Hotsos Course?

  • @KenBellows
    @KenBellows Před 3 lety +1

    So to make sure I understand, it sounds like you're saying that if three matching rows reside in the same block, an index-based query wil get the same block three times, thus performing triple the amount of work as a table scan (within the context of that one block). Is that right? If so, I'm kind of surprised that no hybrid solution is used, something like "when you get a block based on an index hit, also scan it for other matches avoid repeated gets". I'm sure I'm not the first genius to think of this, so there are probably reasons why this is a bad idea... any insight?

    • @TheMagicofSQL
      @TheMagicofSQL  Před 3 lety +4

      Only if the database has to fetch other rows in between these three.
      Say you have 9 rows with values 1-9, spread across 3 data blocks with three rows in each.
      Say the rows are stored in the data blocks "in order" - block 1 = 1,2,3, block 2 = 4,5,6, and block 3 = 7,8,9. Consecutive index entries match are also consecutive in the table. So to get the rows 1-3, you only need to access block one. And it only does this once, not three times.
      Now imagine the rows are in the blocks in this sequence: block 1 = 1,4,7 block 2 = 2,5,8, and block 3 = 3,6,9. To get the values 1-3, you need to visit all three data blocks. To get row 4, the database goes back to block 1 and fetches it again.
      Does this clarify it for you?

    • @KenBellows
      @KenBellows Před 3 lety

      @@TheMagicofSQL Ahh I see. So the point is that the db performs gets in the order that it retrieves pointers from the index, so if several successive matches are stored in the same block, it should be able to gather them all at once, correct? How about the case where block 1=1,2,3 but your query is for, e.g., odd numbers, such that you match 1 and 3, but not the 2. Does it matter that 1 and 3 are not directly adjacent within the block? Or as long as the index finds 1 and 3 in succession, will it still be able to use block 1 all at once without needing a repeated get?

    • @TheMagicofSQL
      @TheMagicofSQL  Před 3 lety +2

      Yes, it's the order of retrieving index pointers that affects table block fetches.
      As long as consecutive pointers from the index reference the same data block, the database only fetches the table block once.
      So if you have the condition col in ( 1, 3 ), skipping over 2 as your example then the database only fetches the table data block once.

    • @KenBellows
      @KenBellows Před 3 lety

      @@TheMagicofSQL That makes a ton of sense. Thanks!

  • @IMdAbdulquadirKhan
    @IMdAbdulquadirKhan Před 3 lety +1

    How to upload CSV in Oracle live sql

    • @TheMagicofSQL
      @TheMagicofSQL  Před 3 lety +1

      There's no CSV upload option available on Live SQL. If you have Oracle Database questions unrelated to a video, please ask them on asktom.oracle.com