Interview Question: Merge K Sorted Arrays
Vložit
- čas přidán 13. 09. 2024
- Coding interview question from www.byte-by-byt...
In this video, I show how merge k sorted arrays in to a single sorted array.
Do you have a big interview coming up with Google or Facebook? Do you want to ace your coding interviews once and for all? If so, Byte by Byte has everything that you need to go to get your dream job. We've helped thousands of students improve their interviewing and we can help you too.
Stuck on Dynamic Programming? Check out our free ebook: www.dynamicprogrammingbook.com
50 Practice Questions: www.byte-by-by...
You can also find me on
Twitter: / bytebybyteblog
Facebook: / bytebybyteblog
Email: sam@byte-by-byte.com
This one explanation helped me change my approach to studying algorithms and data structures. A must do after this problem is - Merge K sorted linked lists.
I couldn't stop myself from appreciating your work. Working through all possible approaches really helps. !!! Thank you
thank you! I'm so glad you find the videos helpful :)
You seem to have a knack of explaining the solution really well. All the best for your Byte By Byte endeavor.
anurag dani thanks!
Your teaching skills are out of this world. People tend to forget, it is harder to teach a concept than be good at. Well done.
On a side note, how do you do this in Javascript ? Where you don't have a priority queue.
Thank you!
And unfortunately in some cases you have to implement it yourself. You can also achieve the same effect by maintaining a list in sorted order. It takes O(log n) to insert a new item and O(1) to find the max
You can use heap to implement a priority queue
Awesome solution. crisp and up to the point. I couldn't find a better explanation anywhere.
Pramod Sivaraju thank you! I'm glad you found it helpful :)
Short answer: Keep track of the index with some sort of heap or priorityqueue so we know where to find the next item to insert.
you are the best man ....you start from brute force ...thats the best thing in all ur videos !! :) Thnx man keep up the good work
Great job! just please note that at 8:57 you stated that removing an item from a priority is O(1), which is in reality O(logk)
Yep, thanks
@@ByteByByte than what would be the really time complexity of this algo , plz tell ?
This video is super helpful. Thank U!
kn logk can also be achieved by simply merging these arrays in pair wise fashion just as we do in merge sort
This was my initial solution. So, not bad for me at all. ^_^
@@tyler_ua6593 This is incorrect. If you essentially use the "merge" part of the merge sort algorithm the first time around you will have a two lists each of size `n`, so you will do `2n` work. The second time around, the first list is `2n`, the second list is `n`, so you do `3n` work. In other words, `2n + 3n + 4n ... (k - 1)n`; you end up with exactly O(k^2n).
nice for loop used as a while loop.......there are so many things to learn from you :D
PriorityQueue do have a remove method, the only difference between poll and remove is, remove throws a exception if queue is empty.
Ah yeah good point. It's a little unclear looking at the documentation because remove() is overloaded (docs.oracle.com/javase/7/docs/api/java/util/PriorityQueue.html), so in the main documentation it shows remove(Object o), which removes o if it exists in the PQ and I didn't look any further into the inherited methods, but you're right.
In python:
import heapq
from functools import total_ordering
ls_of_ls = [[1, 4, 7], [2, 5, 8], [3, 62, 98], [2, 3, 99], [-1, 2, 10]]
@total_ordering
class Item(object):
def __init__(self, data, k):
self.data = data
self.k = k
def __eq__(self, item):
return self.data == item.data
def __gt__(self, item):
return self.data > item.data
def __repr__(self):
return str(self.k)
def merge_arrays(ls_of_ls):
heap = []
for k, ls in enumerate(ls_of_ls):
heapq.heappush(heap, Item(ls[0], k))
pointers = [1 for _ in range(len(ls_of_ls))]
final_ls = []
while len(heap) > 0:
item_to_add = heapq.heappop(heap)
final_ls.append(item_to_add.data)
if pointers[item_to_add.k]
Would love for you to add a PR to the git repo when you have time! github.com/samgh/Byte-by-Byte-Solutions
@8:56 I suppose both adding and removing elements from a min heap takes log(n) time. Because you fill the place of removed element from the rightmost leaf node and then heapify. Am I missing something?
removing the minimum element from a heap is a constant operation O(1), because the min element is always at the head of the heap, but if you're looking for a specific value and removing it, then yes, it will take a O(log(n))
A brilliant explanation!! Thank you
Removing from the priority takes O(log n) time, not constant time. Instead of pushing and popping from priority queue, you should replace the element instead.
Excellent :)
Isn't it true that creation of the Queue alone is kn * log(k), then as well as pulling the items off is also kn * log(k), which makes the total cost 2kn * log(k). So this makes it more expensive than kn * k (I used 3 for k and 3 for n as the example and compared the results) ? Or am I missing something ?
So just using the original arrays, and keeping a position per array would be faster.
Communicated an idea is one thing, teach and idea is a complete different, You talk to yourself.
Really well explained! Thanks. *Thumbs Up*
Thank you!
Thanks for the excellent explanation.. great work 😊
you're welcome!
There's also a way to merge the (n) sorted arrays in O(nk) time. Simple merge two sorted arrays (O(k + k) time). Then merge the result with the third array (O(k + k + k) time).... and so on. Result is O(nk) time.
That certainly works, but it's a much less efficient solution.
And your computation is not correct. One of the arrays you're merging gets longer each time, so you end up with a time complexity of O((k+k) + (k+2k) + (k+3k) + ... + (k+nk)), which equals O(k*n^2) (www.wolframalpha.com/input/?i=sum+n%3D1+to+n+(k+%2B+n*k))
Please make a video for sql interview
Thanks much - your explanation is very helpful. Please add some videos on Dynamic Programming :)
You're welcome. Have you checked out my ebook? www.dynamicprogrammingbook.com
I just downloaded it, thanks much!
good job. could you do a video explain why the time complexity of constructing a PQ, is O(n)?
The time complexity of constructing it should be O(n log n). Each insertion takes O(log n) and there are n insertions
Awesome !!! , your videos really helped a lot , keep uploading :)
nikithachandrasena great I'm glad you're finding them helpful!
Toss all the elements in a min heap O(n) time and then pop off all the elements from the heap for O(n), making it O(n) time and O(n) space.. would this be a better approach?
No popping and adding to the min heap for every element would take nlogn, n being every element, since you are not removing anything and scaling with n. Here he optimizes it to nlogk because he is keeping the size of the heap to size k at max, where k is the amount of arrays. You might as well just use a sort at that point which would be n+ nlogn = nlogn for a less optimized version.
I'm sorry if this might sound stupid, but why do we need comparable again? Arent we just picking elements in zig-zag pattern.?
No we need to know which element is the smallest, so we're putting them into a priority queue
Great, thanks! But why store index & value if you can get a value by index?
You don't have to I just think it makes it a little simpler
What keyboard are you using?
very well
Hi.. awesome explanation. Just a bit confused here, how did you create a 2-D input array?
The input is given. However, it would be pretty easy to generate with some sort of loop
Byte By Byte the loop will be again n^2. So won’t the complexity increase?
Depends on how you define n. If n is the total size of the resulting array, then no. If n is the size of the individual component arrays, then yes
This works perfectly
int[] arr5 = {1,2,3};
int[] arr6 = {2,5};
int[][] arr8 = {arr5, arr6};
awesome
Also, could you also add a video explaining merge k sorted lists using a similar approach
It's exactly the same whether they're lists or arrays
Yes worked with some tweaks
Here is my solution
class Solution {
public ListNode mergeKLists(ListNode[] lists) {
ListNode res=null;
System.out.println("lists.length "+ lists.length);
if(lists == null || lists.length== 0)
return res;
PriorityQueue pq = new PriorityQueue();
for(int i=0; i obj.value) return 1;
return 0;
}
}
so k is the count of sorted lists and n is the length of each sub list within lists ? Am I right
n should be the total number of elements when we're talking about the complexity
Sam, how is this solution different as opposed to adding all the elements of all arrays to a priority queue, removing them and populating the results array.
What would be the time complexity of the solution you're proposing?
Byte By Byte it should be kn * log kn which should be the total number of elements in all arrays
Byte By Byte Got it thanks Sam :)
actually this would have a much higher space complexity( O(kn) ), because we are basically storing all the elements in the heap, while with the solution given in the video, we are only storing k nodes in the heap at most so O(k).
the python code of this task:
#
def mergeofsortedarrays(lis):#lis - the list of lists
def addtoqueue(q,x): #add a new item to queue to keep the queue sorted
j,m=x #m is the item index in the list lis[j]
if m>len(lis[j])-1:#the item index m out of bound
return
value=lis[j][m]#the value of added item
if len(q)==0:#the queue is empty
q.append((j,m,value))
else:#find the position where to insert a new item
i=0
while ivalue:#the value of inserted item less than value in position i
i+=1
q.insert(i,(j,m,value))
result=[] #resulted merged list
queue=[] #create the empty queue
for j in range(len(lis)):#start of queue - filling with the first items of all the lists
if len(lis[j])>0:
addtoqueue(queue,(j,0))
while queue: # popping the (last) item of queue and adding it to result
j,m,value=queue.pop()
result.append(value)
addtoqueue(queue,(j,m+1)) #refilling the queue with the next element of the corresponding list
return result
Good one.. Thanks..
you're welcome!
can u pls make one video on heap sort also? keep it up
Just posted one recently on implementing Priority Queues, which is basically heap sort www.byte-by-byte.com/priorityqueue/
could you also demo a recursive solution?
I don't really see why you'd want to do this recursively
With countinting sorting it can be done in O(k*n) complexity.
Nope!
@@subham-raj Proof ?
how compareTo() is called here?
The built-in PriorityQueue class will use the compareTo() method to compare its containing objects and reorder them as needed (obj1.compareTo(obj2) == 1 would mean that obj1 > obj2, and so on). PriorityQueue is generic. It does not know how 2 objects of QueueNode relate to each other, unless we implement that method.
mentioning HEAP would make it more relatable
how?
Byte By Byte I think generally when practicing, beginners use terms such as min-heap and max-heap
class QueueNode implements Comparable {
int array;
int index;
int value;
public QueueNode(int array, int index, int value) {
this.array = array;
this.index = index;
this.value = value;
}
@Override
public int compareTo(QueueNode n) {
if (this.value > n.value)
return 1;
if (this.value < n.value)
return -1;
return 0;
}
}
public class MergeKSortedArrays {
public int[] merge(int[][] arrays) {
PriorityQueue pq = new PriorityQueue();
int size = 0;
for (int i = 0; i < arrays.length; i++) {
size += arrays[i].length;
if (arrays[i].length > 0) {
pq.add(new QueueNode(i, 0, arrays[i][0]));
}
}
int[] result = new int[size];
for (int i = 0; !pq.isEmpty(); i++) {
QueueNode n = pq.poll();
result[i] = n.value;
int newIndex = n.index + 1;
if (newIndex < arrays[n.array].length) {
pq.add(new QueueNode(n.array, newIndex, arrays[n.array][newIndex]));
}
}
return result;
}
public static void main(String[] args) {
MergeKSortedArrays m = new MergeKSortedArrays();
int[][] arrays = { { 1, 1, 2, 9 }, { 3, 9, 10 }, { 1, 3, 9 } };
int[] a = m.merge(arrays);
for (int i : m.merge(arrays)) {
System.out.println(i);
}
}
}
too much talk