Top K Frequent Elements (LeetCode 347) | Full solution with examples | Interview | Study Algorithms
- Added on 24 Jul 2024
- Finding the top k frequent elements in an array is not the same as finding the top k students in a class. We need to understand the problem statement clearly, including what is expected. In this video we look at example test cases to see how to determine the top k frequent elements efficiently using the bucket sort algorithm, along with a dry-run of the code in Java.
Actual problem on LeetCode: leetcode.com/problems/top-k-f...
Chapters:
00:00 - Intro
01:06 - Problem statement and description
03:29 - How to approach the problem?
06:49 - Solving for efficiency
11:32 - Dry-run of Code
14:13 - Final Thoughts
📚 Links to topics I talk about in the video:
Brute Force Method: • Brute Force algorithms...
Bucket Sort Algorithm: • Brute Force algorithms...
Other problems on LeetCode: • Leetcode Solutions
📘 A text based explanation is available at: studyalgorithms.com
Code on Github: github.com/nikoo28/java-solut...
Test-cases on Github: github.com/nikoo28/java-solut...
📖 Reference Books:
Start Learning to Code: amzn.to/36pU0JO
Favorite book to understand algorithms: amzn.to/39w3YLS
Favorite book for data structures: amzn.to/3oAVBTk
Get started for interview preparation: amzn.to/39ysbkJ
🔗 To see more videos like this, you can show your support on: www.buymeacoffee.com/studyalg...
🎥 My Recording Gear:
Recording Light: amzn.to/3pAqh8O
Microphone: amzn.to/2MCX7qU
Recording Camera: amzn.to/3alg9Ky
Tablet to sketch and draw: amzn.to/3pM6Bi4
Surface Pen: amzn.to/3pv6tTs
Laptop to edit videos: amzn.to/2LYpMqn
💻 Get Social 💻
Follow on Facebook at: / studyalgos
Follow on Twitter at: / studyalgorithms
Follow on Tumblr at: / studyalgos
Subscribe to RSS feeds: studyalgorithms.com/feed/
Join fan mail: eepurl.com/g9Dadv
#interview #leetcode #algorithms
@10:10 in bucket 1, shouldn't it be a 4?
you are absolutely correct. Sorry for the error
Didn't understand Neetcode so came here. This is very well explained. Instantly subscribed.
thanks for the sub
I just found your channel. Both you & neetcode do amazing work. Thank you so much for these!
Out of all the videos I watched over this problem, yours is the one I was able to truly understand. Thank you!
So happy you feel that.
Please don't stop teaching. Crystal-clear explanation, brother.
What a clear explanation. Thank you, Nikhil!!!
You are such a hard worker! I appreciate your content.
So nice of you
The problem is finding the top 2, so according to LeetCode, if we have two values with the same frequency we should return only the first one.
I was also thinking that😂😂😂
Exactly. According to LeetCode, "It is guaranteed that the answer is unique" means that there is no ambiguity in identifying the k most frequent elements in the array.
That's why his solution was accepted, I think. His res array is of length k, and if two elements have the same frequency then there will be more than k results and it will give an error.
It's certainly clever. But when k is small and n is large, it's wasteful of both time and space (or at least of time and memory allocations). When k == n, it's at least wasteful of space.
Essentially, it suffers from the same class of problem as bucket sort does. It's great when the data is evenly distributed. But can have some real drawbacks when the data is not. Unfortunately, for this problem, the data cannot be evenly distributed. Here's why:
Consider the case where n == k (or is very close). One of two "edge cases" is possible:
a) each element occurs once, so you have a bucket for every possible frequency between 0 and n, but the only frequency that gets used is 1. Because k cannot exceed n, if all elements are to be included then all must be of equal frequency, hence, only 1 of the buckets will get used.
b) there is only one element, and it occurs n times. Again, you will use only 1 bucket. The bucket for the "n" frequency. And again, the extra buckets are pointless.
Knowing this, we can see that it will never be possible to actually use all of the buckets, because there simply aren't enough locations in n for all the frequencies this approach accounts for.
I believe (although I don't feel like doing the math right this second), that the absolute best you could hope for would be that sqrt(n) buckets get used.
So even if k == 1 and n is extremely large, we won't be able to use all the frequency buckets; k has no impact on that. In fact, the larger n becomes, the more "wasted" buckets there will be, since the ratio of a value v to its square decreases as v grows. The progression from 2 is 1/2, 1/3, 1/4, 1/5, 1/6, etc. So as n reaches max the number of buckets that will not be used is 1 - (1 / 63^n) assuming a 64 bit machine. And that's a lot of buckets.
As I said, same issues as bucket sort. Great if the data actually fills all the buckets. Unfortunately, given the constraints of this problem, you'll never fill all the buckets and I suspect that's why it wasn't included in the editorial.
Just want to say: we're splitting hairs here. Quite frankly, the most readable solution is a count with a sort, then taking a k-sized slice, and that's only barely slower than a heap in the worst case and about the same in the average case. Quicksort and quickselect have always been complex. I've been doing this for 25 years, and I know no one who could implement either without a quick refresher and a little debugging. It was included in the editorial because it's useful to know and understand. But in real life, you'd use an existing implementation.
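For readers who want to see the "count, sort, take a k-sized slice" idea from the comment above, here is a minimal sketch in Java. The class name `CountSortSlice` is made up for illustration and is not taken from the video or its GitHub code; ties are broken in whatever order the sort leaves them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name, used only for this illustration.
class CountSortSlice {
    // Counts each value, sorts the distinct values by descending count,
    // then keeps the first k of them.
    static int[] topKFrequent(int[] nums, int k) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int num : nums) {
            counts.merge(num, 1, Integer::sum);
        }

        // Sort the distinct values, most frequent first.
        List<Integer> distinct = new ArrayList<>(counts.keySet());
        distinct.sort((a, b) -> Integer.compare(counts.get(b), counts.get(a)));

        // Take a slice of at most k values.
        int[] result = new int[Math.min(k, distinct.size())];
        for (int i = 0; i < result.length; i++) {
            result[i] = distinct.get(i);
        }
        return result;
    }
}
```

With d distinct values this costs roughly O(n + d log d), which is the "barely slower" trade-off the comment describes.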
Great explanation u r amazing dude ❤😊keep it up
Wonderful explanation!
Understood, thanks for the content!
Brother, what an explanation, absolutely goated!
🤘🏻
ur just so underrated dude
Nice and simple thank you Sir
Great explanation of the logic. I work purely in Python, not Java, but the way you explained this, I won't have difficulty implementing it in Python, since the logic is clear. By the way, you've explained the logic better than NeetCode.
good job well explained :)
superb explanation, thank you, I hope you have the leetcode blind 75 solutions
How is this not O(n+k) because of the nested for loop?
Thank you so much
12:45 I think you are referring to the frequency values as keys, when they should ideally be values. The frequencies include 2 twice, which should not happen if they are keys, since keys are meant to be unique.
I don't think the solution is going to be [1,2,3], since the loop stops iterating as soon as counter becomes greater than or equal to k. Since k is equal to 2 and you start counter from 0, it adds 1 at res[0] and then 2 at res[1]; as soon as counter = 2 it stops, and the output will be [1,2].
Thanks a ton!
You're welcome!
Thank you for your extremely clear and concise video. Please rest assured that the CZcams algorithm will take notice of your quality, and your channel will gain very quick and upward traction.
Can you please make a video on 658. Find K Closest Elements too ?
Sure..gradually though :)
Just curious, doesn't bucket sort have n^2 at the worst case and only n at the average case, while a heap would have n log k at the worst case? Shouldn't a heap be more efficient?
It depends on your input constraints…with a smaller range, you can expect better time complexity.
Hi, what if nums = [-1,-1]? In that case hashMap = {-1 : 2}, but the bucket array starts from 0. How do we handle this test case? Thanks.
can you please elaborate?
You find the index using the frequency, not the key. In your case the frequency of -1 is 2, so -1 is inserted at index 2.
In the final example, the result array is defined as int[] res = new int[k] where k = 2. So only 2 elements can be added. However, the answer is [1, 2, 3]. Won't this throw an index out of bounds for this example?
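A small sketch of the point made in the reply above: the bucket index is the frequency, so a negative value such as -1 never becomes an array index. The class and method names (`NegativeKeysDemo`, `valuesWithFrequency`) are hypothetical helpers for illustration, not part of the video's code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name, used only for this illustration.
class NegativeKeysDemo {
    // Returns the values that occur exactly `frequency` times in nums.
    // Assumes 0 <= frequency <= nums.length.
    static List<Integer> valuesWithFrequency(int[] nums, int frequency) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int num : nums) {
            counts.merge(num, 1, Integer::sum);
        }

        // One bucket per possible frequency 0..n.
        List<List<Integer>> bucket = new ArrayList<>();
        for (int i = 0; i <= nums.length; i++) {
            bucket.add(new ArrayList<>());
        }

        // The index is the frequency (always between 1 and n);
        // the value itself, negative or not, is only stored inside the bucket.
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            bucket.get(entry.getValue()).add(entry.getKey());
        }
        return bucket.get(frequency);
    }
}
```

For nums = [-1, -1], the map is {-1 : 2}, so -1 lands in the bucket at index 2 and the bucket at index 1 stays empty.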
can you give me a sample test case?
@nikoo28 the same one in the video. [1,1,1,1,2,2,3,3,4] gave an index out of bounds. Try it.
thanks, I fixed the code in the github link now. Basically add all elements to a list, and then return it as an array.
This particular test case is kinda unique, the value of k=2 but we have 3 elements. Hence, needed to handle it separately. Sorry for the confusion.
Great explanation but I have never seen List initialized like an array. Is there any alternative to do that? I understand now how it works and why it is needed but it's just not that intuitive to me. Probably I am dumb. Probably Map would be more intuitive to me
no approach is dumb, just a preference...as long as you work within the expected time limits...
brother the way u solve the problem is like ABCD. How to create that thinking in DSA.
It is so wonderful once you start piecing things together :)
How do you handle negative numbers?
That will be a different problem
Amazing
Thank you! Cheers!
First view, first like, first comment
Awesome explanation, Nikhil. Thank you so much for your time and effort and for sharing your knowledge. I tested your code with the input int[] arr = new int[]{1, 1, 1, 1, 2, 2, 3, 3, 4, 4}; the output should be [1] [2,3,4], but I got an error since you have int[] res = new int[k];. So we need to change this line to int[] res = new int[nums.length];
What is your value of k in your test case?
@nikoo28 Hi Nikhil. The value of k is 2. Please correct me if my understanding is wrong.
@mamu11111 that is a very good catch, and I verified it myself. Thanks for pointing that out, I will correct it. :) and I think even LeetCode does not have that test case 👍
Your changes are wrong. The question asks for the top k frequent elements; that's why your test case is not valid for the question.
❤
Why is the solution O(n) and yet there was a nested loop at the end? i don't understand
Just because there is a nested loop does not mean a time complexity of O(n ^ 2).
You need to think how many iterations will happen. In the last loop, you can have a maximum of n iterations when all elements of array are different and the value of k=n
Hence the time complexity will be O(n)
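To make the counting argument above concrete, here is a hedged sketch that tallies every step of the nested loop over the buckets. `NestedLoopCost` and `countScanSteps` are hypothetical names, and the loop scans all buckets without stopping at k, which is the worst case.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name, used only for this illustration.
class NestedLoopCost {
    // Counts every step taken by the outer and inner loops together
    // while scanning all buckets from the highest frequency down.
    static int countScanSteps(int[] nums) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int num : nums) {
            counts.merge(num, 1, Integer::sum);
        }
        List<List<Integer>> bucket = new ArrayList<>();
        for (int i = 0; i <= nums.length; i++) {
            bucket.add(new ArrayList<>());
        }
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            bucket.get(entry.getValue()).add(entry.getKey());
        }

        int steps = 0;
        for (int pos = nums.length; pos >= 0; pos--) {
            steps++; // one step per bucket visited: n + 1 in total
            for (int value : bucket.get(pos)) {
                steps++; // one step per distinct element, summed over all buckets
            }
        }
        return steps;
    }
}
```

For n elements with d distinct values the total is (n + 1) + d, which is at most 2n + 1, so the nested loop stays linear rather than quadratic.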
If we get three numbers in the result it is throwing index out of bounds as the size of the array has been limited to K.
Is your testcase within the problem constraints?
14:03 you are creating an array of size k then how can you add 3 elements if k is 2 as stated in example 6:09
where am i adding 3 elements?
why create bucket of length nums.length+1? why not just nums.length?
Because of 0 based indexing.
Because every number in the given array appears at least once, the frequencies run from 1 to nums.length. A bucket of length nums.length only has indices 0 to nums.length - 1. For a single-element array (e.g. array = [1], k = 1) that would create a bucket with only index 0, which can only hold elements with frequency 0, which we don't need, and there would be no slot for frequency 1.
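The sizing argument can be checked with a tiny sketch. `BucketSizeDemo` and its methods are hypothetical names for illustration; the check simply asks whether the highest frequency is a valid index for a bucket array of a given length.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name, used only for this illustration.
class BucketSizeDemo {
    // Returns the largest number of times any single value occurs in nums.
    static int highestFrequency(int[] nums) {
        Map<Integer, Integer> counts = new HashMap<>();
        int highest = 0;
        for (int num : nums) {
            // merge returns the updated count for num
            highest = Math.max(highest, counts.merge(num, 1, Integer::sum));
        }
        return highest;
    }

    // True when an array of the given length has an index for every frequency.
    static boolean hasSlotForEveryFrequency(int[] nums, int bucketLength) {
        return highestFrequency(nums) < bucketLength;
    }
}
```

For [7, 7, 7] the frequency 3 needs index 3, so a length of nums.length (3) is one slot short while nums.length + 1 (4) fits.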
your line of code in the dry run, when populating the result array :
res[counter++] = integer;
^ Shouldn't the above line just be res[counter] = integer, without incrementing counter first? When you do counter++, res[1] will be populated.
i think populating the result array should be:
for (Integer integer : bucket[pos]) {
res[counter] = integer;
counter++;
}
please let me know what you think
counter++ is post-increment, so it doesn't matter whether it's
res[counter++] = integer;
or
res[counter] = integer;
counter++;
Both are the same. In the first iteration counter is 0 when the element is stored, and only after that is it incremented by 1.
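A quick way to confirm the post-increment behaviour is to run both spellings side by side. The sketch below uses hypothetical names (`PostIncrementDemo`, `withPostIncrement`, `withSeparateIncrement`) and copies an input array both ways.

```java
// Hypothetical class name, used only for this illustration.
class PostIncrementDemo {
    // Stores each value using res[counter++]: the old value of counter
    // is used as the index, then counter is increased.
    static int[] withPostIncrement(int[] values) {
        int[] res = new int[values.length];
        int counter = 0;
        for (int value : values) {
            res[counter++] = value;
        }
        return res;
    }

    // Same loop with the store and the increment written as two statements.
    static int[] withSeparateIncrement(int[] values) {
        int[] res = new int[values.length];
        int counter = 0;
        for (int value : values) {
            res[counter] = value;
            counter++;
        }
        return res;
    }
}
```

Both methods fill index 0 first and return identical arrays.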
Brother, the test case [1,1,1,1,2,2,3,3,4] with k = 2 is itself wrong, because the answer is not unique. It is clearly mentioned in the constraints: "It is guaranteed that the answer is unique." So 2 and 3 cannot have the same frequency, and if they do, the value of k would be 3.
Nikhil, your code would not work for the test case you mentioned:
[1,1,1,1,2,2,3,3,4] & k=2
This code gets accepted on LeetCode because the problem states that the answer is guaranteed to be unique.
But in the above test case:
We should get [1,2,3] as the answer for k=2.
You cannot assume the res array has size k, since several elements might share the same frequency.
Otherwise the solution works fine for the LeetCode problem.
Here is the code, which handles such ties as well.
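import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class Solution {
    public static int[] topKFrequent(int[] nums, int k) {
        int n = nums.length;
        // bucket[f] holds every element that occurs exactly f times
        List<Integer>[] bucket = new ArrayList[n + 1];
        HashMap<Integer, Integer> frequencyMap = new HashMap<>();
        ArrayList<Integer> resultList = new ArrayList<>();
        for (int num : nums) {
            frequencyMap.put(num, frequencyMap.getOrDefault(num, 0) + 1);
        }
        for (int i = 0; i <= n; i++) {
            bucket[i] = new ArrayList<>();
        }
        frequencyMap.forEach((element, frequency) -> {
            bucket[frequency].add(element);
        });
        // Walk from the highest frequency down; a tied bucket is added whole,
        // so the result may hold more than k elements
        for (int i = n; i >= 0; i--) {
            if (bucket[i] != null) {
                resultList.addAll(bucket[i]);
                if (resultList.size() >= k) {
                    break;
                }
            }
        }
        int[] result = new int[resultList.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = resultList.get(i);
        }
        return result;
    }
}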
I have a silly doubt here. You are saying that for your test case the answer is 1, 2, 3. That is three elements, but the size of the res array is 2 because k is 2. That confuses me. It might be a stupid question to ask!
There are 3 types of elements -> 1, 2 and 3
We need only top k (2) frequent elements. So I only give answer as 1 and 2
You are returning 2 elements.
this can be solved by PriorityQueue also
yes
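As a rough illustration of the PriorityQueue approach mentioned above, here is a sketch that keeps a min-heap of at most k candidates ordered by frequency. The class name `HeapTopK` is hypothetical and this is not the code from the video; it runs in about O(n log k).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical class name, used only for this illustration.
class HeapTopK {
    static int[] topKFrequent(int[] nums, int k) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int num : nums) {
            counts.merge(num, 1, Integer::sum);
        }

        // Min-heap by frequency: the least frequent candidate sits on top.
        PriorityQueue<Integer> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(counts.get(a), counts.get(b)));
        for (int value : counts.keySet()) {
            heap.offer(value);
            if (heap.size() > k) {
                heap.poll(); // evict the least frequent, keeping at most k
            }
        }

        // Poll from least to most frequent, filling the array back to front
        // so the most frequent element ends up at index 0.
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll();
        }
        return result;
    }
}
```

The heap never grows beyond k + 1 entries, which is what keeps each insert at O(log k).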
this solution will certainly not work for the input nums= [-1,-1] & k =1
thanks for the test case. I had missed these cases while making the video. However, if you check the code on Github, I have updated it to handle such cases. :)
Hope it helps
6:37 the test case, you have taken to demonstrate the problem is not correct because according to the problem statement the answer should be unique
yes, I realized it a while ago. Have fixed the code in github link to handle that particular case. Thanks for pointing that out :)
But hope you get the idea, how to solve the problem.
it is better if u use a mic
Am I missing something here? The same code gives an ArrayIndexOutOfBoundsException for the input {1,1,1,1,2,2,3,3,4}, 2 in my IDE, in the last for loop, but it is accepted on LeetCode.
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 2 out of bounds for length 2
at TopKFrequentElements.topKFrequent(TopKFrequentElements.java:30)
at TopKFrequentElements.main(TopKFrequentElements.java:39)
try having a look again, maybe you are missing something
@sakishakkari You are correct. For that test case, this code does throw an exception since int[] res = new int[k]
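One way to avoid the exception while still returning exactly k elements is to guard the inner loop, so that a tied bucket cannot write past the end of res. The sketch below is an assumption about how such a fix could look, not the updated code from the GitHub link; `BoundedBucketTopK` is a hypothetical name, and when frequencies tie, which of the tied elements is kept is arbitrary.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name, used only for this illustration.
class BoundedBucketTopK {
    static int[] topKFrequent(int[] nums, int k) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int num : nums) {
            counts.merge(num, 1, Integer::sum);
        }

        // bucket.get(f) holds every value that occurs exactly f times.
        List<List<Integer>> bucket = new ArrayList<>();
        for (int i = 0; i <= nums.length; i++) {
            bucket.add(new ArrayList<>());
        }
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            bucket.get(entry.getValue()).add(entry.getKey());
        }

        int[] res = new int[Math.min(k, counts.size())];
        int counter = 0;
        for (int pos = nums.length; pos >= 1 && counter < res.length; pos--) {
            for (int value : bucket.get(pos)) {
                if (counter == res.length) {
                    break; // res is full: stop even if this bucket has more ties
                }
                res[counter++] = value;
            }
        }
        return res;
    }
}
```

For {1,1,1,1,2,2,3,3,4} with k = 2 this returns two elements, 1 followed by either 2 or 3, instead of throwing.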