Why Isn't My Query Using an Index? Databases for Developers: Performance #5

The Magic of SQL

zhlédnutí 12 271

Přidat do
- Můj playlist
- Přehrát později
Sdílet

Sdílet

Vložit

Velikost videa:

Zobrazit ovladače přehrávání

Automatické přehrávání

Přehrát

čas přidán 17. 06. 2020
You've created your index. All it's columns appear in your query's where clause. There are no functions on them. But the database has still decided to go for a full table scan!
Why?!
This video explores some of the reasons why using an index may be slower than a full table scan. Including one of the most important reason: the clustering factor.
Need help with SQL?
Ask us over on AskTOM: asktom.oracle.com
Twitter: / chrisrsaxon
Daily SQL Twitter tips: / sqldaily
All Things SQL blog: blogs.oracle.com/sql/
Test your SQL Skills on the Oracle Dev Gym: devgym.oracle.com/
============================
The Magic of SQL with Chris Saxon
Copyright © 2020 Oracle and/or its affiliates. Oracle is a registered trademark of Oracle and/or its affiliates. All rights reserved. Other names may be registered trademarks of their respective owners. Oracle disclaims any warranties or representations as to the accuracy or completeness of this recording, demonstration, and/or written materials (the “Materials”). The Materials are provided “as is” without any warranty of any kind, either express or implied, including without limitation warranties or merchantability, fitness for a particular purpose, and non-infringement.
Věda a technologie

Komentáře • 30

@tanujgupta143 Před 2 lety ⁺³
I am happy that I found this channel
@mdo2480 Před 4 lety ⁺⁵
Very very useful. Big thanks to you Chris. I love the way you explain things.
@TheMagicofSQL Před 4 lety
Thanks, glad you found this useful :)
@VEERSINGH-er9so Před 3 lety ⁺³
If I understood correctly then , on a very basic level, choosing between index and table scan depends on how many blocks both the access methods are likely to traverse for a particular query
@TheMagicofSQL Před 3 lety ⁺¹
Yep, you got it!
@mexicanmomo Před rokem
Very helpful. One question, when you say avoid using hints in production query, do you specifically say about index hints or other hints like direct path insert etc.
@TheMagicofSQL Před rokem
While you should aim to avoids hints in general, as you say there are hints that affect SQL behaviour such as append (direct path). These types of hints can affect transaction processing (i.e. the code you write), so you should explicitly code them. You should still use these rarely though.
You should avoid hints that force a particular access method (e.g. index vs full table scan), join method/order or query transformations. If you're trying to get a specific plan, it's better to use SQL plan management (baselines).
@pankajjha1550 Před 3 lety
Clustering Factor was a riddle to me, not any more. Thank you.
@TheMagicofSQL Před 3 lety
Great :)
@ppaolucc Před 6 měsíci
Hi Chris, very useful. Thank you. Just one consideration, if it is worth do add it: another factor that could affect the optimizer choice on whether to use FTs or the Index (I'm excluding PK access here) could also be the DB_BLOCK_SIZE in use? I mean, suppose you have a DB Block Size greater than 8K: say 16K or even 32K (even though this last one is more used in DWH), the Optimizer could opt, as a matter of cost, in using Full Table Scan as in few IOs you read few blocks each containing potentially more rows (since the DB Block Size is enough large ans supposing rows to search are as much as contiguous in the blocks potentially red). Could it be? I'm also excluding the handling of DB_FILE_MULTIBLOCK_READ_COUNT (in earlier releases was used to modify it). Does it make sense?
@TheMagicofSQL Před 6 měsíci
Not really. The maximum I/O size depends on the platform - by default Oracle Database determines this. This maximum is the same whatever block size you use. If you have bigger blocks, the database will fetch fewer each time.
Think of it like this: say you've got buckets holding coins. Each bucket only stores coins of one value. So there's 1p coins in one bucket and £1 coins in another. Each time you get money from a bucket, the maximum value is £10. So when taking from the penny bucket, you could get up to 1,000 coins each time. From the £1 bucket you could get a maximum of 10 coins each time.
@MeerAthil Před 4 lety
Good presentation, to reduce high clustering factor#, do we need to reorg the table and rebuild index?
@TheMagicofSQL Před 4 lety ⁺²
Rebuilding indexes has NO effect on the clustering factor; to change it you need to move rows in the table
@MeerAthil Před 4 lety
@@TheMagicofSQL thanks for your reply,👍
@Hellscream1830 Před 2 lety
Hi Chris! Excellent video on this topic! How about the importance of single block I/O vs multiblock I/O? Is it ever a good idea to play with DB_FILE_MULTIBLOCK_READ_COUNT parameter?
@TheMagicofSQL Před 2 lety
Relatively low - I wouldn't start fiddling with that parameter
@Hellscream1830 Před 2 lety
@@TheMagicofSQL Thank you for the reply!
@nidhikhaitan8114 Před 8 měsíci
Thanks for such a wonderful explanation with example. Is it possible to calculate the number of block accesses with and without
index. Does last_cr_buffer_gets represent the number of blocks read?
I have a table with B-Tree index with height 2, total blocks - 305, leaf blocks - 154. The clustering_factor is equal to the number of blocks. I need to compare the number of blocks reading for fetching 90% of the records. Eg. select * from table_name where pk_column > 100, in table containing 80000 records.
@TheMagicofSQL Před 8 měsíci
Thanks.
In theory you can calculate the accesses needed with an index. In practice it's difficult because you need to know exactly how many index entries you'll read (and thus index blocks) and how many table block reads this requires. The easiest way is to run the query & get the plan with the buffer information.
Does last_cr_buffer_gets represent the number of blocks read? Yes
I need to compare the number of blocks reading for fetching 90% of the records.
As you have perfect clustering this will be roughly 2 (the height) + 80% of the index leaf blocks + 80% of the table blocks. That said, a full table scan will almost always be more efficient than an index when fetching that large a fraction of the table. I wouldn't even bother trying to make the calculation.
@manjushatonape2099 Před rokem
Hello thanks for these videos,
How Bitmap index works ? If column has ‘yes or no’ values (50%) each.
@TheMagicofSQL Před rokem
I discuss the differences between BTree and bitmap indexes at blogs.oracle.com/sql/post/how-to-create-and-use-indexes-in-oracle-database#choose
@moazzamansari6941 Před 2 lety
Have you ever taken the Hotsos Course?
@TheMagicofSQL Před 2 lety
No
@KenBellows Před 3 lety ⁺¹
So to make sure I understand, it sounds like you're saying that if three matching rows reside in the same block, an index-based query wil get the same block three times, thus performing triple the amount of work as a table scan (within the context of that one block). Is that right? If so, I'm kind of surprised that no hybrid solution is used, something like "when you get a block based on an index hit, also scan it for other matches avoid repeated gets". I'm sure I'm not the first genius to think of this, so there are probably reasons why this is a bad idea... any insight?
@TheMagicofSQL Před 3 lety ⁺⁴
Only if the database has to fetch other rows in between these three.
Say you have 9 rows with values 1-9, spread across 3 data blocks with three rows in each.
Say the rows are stored in the data blocks "in order" - block 1 = 1,2,3, block 2 = 4,5,6, and block 3 = 7,8,9. Consecutive index entries match are also consecutive in the table. So to get the rows 1-3, you only need to access block one. And it only does this once, not three times.
Now imagine the rows are in the blocks in this sequence: block 1 = 1,4,7 block 2 = 2,5,8, and block 3 = 3,6,9. To get the values 1-3, you need to visit all three data blocks. To get row 4, the database goes back to block 1 and fetches it again.
Does this clarify it for you?
@KenBellows Před 3 lety
@@TheMagicofSQL Ahh I see. So the point is that the db performs gets in the order that it retrieves pointers from the index, so if several successive matches are stored in the same block, it should be able to gather them all at once, correct? How about the case where block 1=1,2,3 but your query is for, e.g., odd numbers, such that you match 1 and 3, but not the 2. Does it matter that 1 and 3 are not directly adjacent within the block? Or as long as the index finds 1 and 3 in succession, will it still be able to use block 1 all at once without needing a repeated get?
@TheMagicofSQL Před 3 lety ⁺²
Yes, it's the order of retrieving index pointers that affects table block fetches.
As long as consecutive pointers from the index reference the same data block, the database only fetches the table block once.
So if you have the condition col in ( 1, 3 ), skipping over 2 as your example then the database only fetches the table data block once.
@KenBellows Před 3 lety
@@TheMagicofSQL That makes a ton of sense. Thanks!
@IMdAbdulquadirKhan Před 3 lety ⁺¹
How to upload CSV in Oracle live sql
@TheMagicofSQL Před 3 lety ⁺¹
There's no CSV upload option available on Live SQL. If you have Oracle Database questions unrelated to a video, please ask them on asktom.oracle.com

Další v pořadí

Automatické přehrávání

Count Millions of Rows Fast with Materialized Views: Databases for Developers: Performance #6