Hash Match Join Internals in SQL Server
Vložit
- čas přidán 8. 07. 2024
- Hash Match Joins can join almost any data thrown at them. In this video learn about how the hash match join algorithm works and what seeing hash match joins in your execution plans means for performance of your queries.
Subscribe and turn on notifications to never miss a weekly video: czcams.com/users/DataWithBer...
Related blog post about Hash Match Joins:
bertwagner.com/2019/01/02/vis...
Be sure to check out part 1 on nested loops joins: • The Importance of Nest...
And part 2 on merge joins: • Merge Join Internals i...
Going further in-depth with hash match joins:
sqlserverfast.com/epr/hash-ma...
Follow me on Twitter:
/ bertwagner - Věda a technologie
Bert this is the easiest and shortest explanation of Hash Match I've ever seen. Thank you very much!
Man, thank you for this series, videos are fantastic! Thank you for saying it as it is, not being vague or keeping it abstract. Your videos have high learning value! Keep up the good work!
Wow, commitment, doing a video on the 2nd!! Hope you had a great festive period. Have a great 2019. Thanks for this series. Internals are always VERY interesting and exciting. You've set the bar pretty high; can't wait to see what comes next!!
This video is so good. I was confused in my college database class. But you explain everything clearly! Thank you so much for making this video!
I'm taking a database course. Your video is pretty helpful to help me understand this stuff.
Thank you, this made it much easier to understand.
PERFECT ANIMATION!!!
Great video!
great video !
Amazing thank you
Also a hash match join can run in batch which can make it quite performant...
Thank you!!
Excellent
ok so it seems the advantage of a hash join over a nested loop is that once you've identified a match you don't have to keep iterating since all tuples of that join coindition are in that one bucket is that correct?
great thank you!
وك
3:18 though
May somebody explain to me why in SQL Server, hash join doesn't preserve order. It clearly can but I don't know why it always needs to re-sort after a hash join.
What's the time complexity of this? Wikipedia actually doesn't have one on its entry and I couldn't find an authoritative answer elsewhere either.
It seems to me that we have to hash everything in Table1, which is Theta(n). Then, for each record in Table2, we need to do a lookup of the hash structure. So, in the worst case, we need to look at every key in the hash structure. So, assuming both tables have the same number of elements, would it be O(N^2)?
Good question. I believe it is O(m + n) because of the following logic: the process to create the build phase's in-memory hash map is O(m) since all m records must be traversed, the process to iterate over the the probe phase's records and compute each hash is O(n) since all n records must be traversed, and the probe phase's searches against the in-memory hash map are O(1) because searching the in-memory hash map is analogous to searching an array (O(1)) and not a linked list (O(n)). Since we throw out O(1), we're left with O(m + n).
k j n