Partition to K Equal Sum Subsets - Backtracking - Leetcode 698 - Python
VloĆŸit
- Äas pĆidĂĄn 24. 07. 2024
- đ neetcode.io/ - A better way to prepare for Coding Interviews
đŠ Twitter: / neetcode1
đ„· Discord: / discord
đź Support the channel: / neetcode
â BLIND-75 PLAYLIST: âą Two Sum - Leetcode 1 -...
đĄ CODING SOLUTIONS: âą Coding Interview Solut...
đĄ DYNAMIC PROGRAMMING PLAYLIST: âą House Robber - Leetco...
đČ TREE PLAYLIST: âą Invert Binary Tree - D...
đĄ GRAPH PLAYLIST: âą Course Schedule - Grap...
đĄ BACKTRACKING PLAYLIST: âą Word Search - Backtrac...
đĄ LINKED LIST PLAYLIST: âą Reverse Linked List - ...
đĄ BINARY SEARCH PLAYLIST: âą Binary Search
đ STACK PLAYLIST: âą Stack Problems
Problem Link: Partition to K Equal Sum Subsets
0:00 - Read the problem
3:09 - Explain Solution #1
5:42 - Explain Solution #2
9:32 - Coding Explanation
leetcode 698
This question was identified as an interview question from here: github.com/xizhengszhang/Leet...
#sorted #array #python
Disclosure: Some of the links above may be affiliate links, from which I may earn a small commission. - VÄda a technologie
Correction: The time complexity of the 2nd solution is actually O(2^ (k*n)), because if we have K trees stacked on top of each other, the new height of the tree is K * n.
Yes more python
Remember big o is asymptotic, meaning the variables k and n approach infinity. In that regime, O(2^(kn))
Wait isn't k^n better than 2^(kn) ?
I do not think this is correct.. It should be O(K.N . 2^N) as the max depth for one tree is N. 2^N and then if we will do K subsets then it will take K.N. 2^N. The trees being stacked doesn't mean they are on top of each other as they are technicall seperated as we find the second subset after we find the first subset so they wont overlap
Time limit exceeded for this code in Python3
You're explanation made it seem so clear and simple. Thank you so much!
The additional optimizations in the end are a nice bonus and probably also important. I imagine they would help a lot with follow-up questions if they're not part of your initial solution.
Thank God we have CZcams and you!
Finally found someone who could explain me this problem.
It's giving timeout for the above code, by adding minor conditions we can pass through this
since we already sorted the array, the duplicated numbers are also stuck together. We can tweak the for a loop a little bit to avoid duplicated backtracking
# Avoid duplicated backtracking
if i > 0 and not visited[i -1] and nums[i] == nums[i - 1]:
continue
Hope this helps
Full Code:
def canPartitionKSubsets(self, nums: List[int], k: int) -> bool:
n = len(nums)
nums.sort(reverse=True)
total = sum(nums)
if total % k != 0:
return False
target = int(total / k)
used = [False] * n
def dfs(index, total, k):
if k == 0:
return True
if total == 0:
return dfs(0, target, k - 1)
for i in range(index, n):
if i > 0 and not used[i - 1] and nums[i] == nums[i - 1]:
continue
if used[i] or total - nums[i] < 0:
continue
used[i] = True
if dfs(i + 1, total - nums[i], k):
return True
used[i] = False
return False
return dfs(0, target, k)
nice point was getting tle after this it passed
I think leetcode add some extra test case so that the video code doesn't pass anymore. Alternative solution may consider memoization
Why does that line help? How does it work?
if i-1 not used, means the same value for current i won't be choosed for kth partition @@zzzsun1272
I saw your post yesterday can you please make video on how to calculate time complexity of recursive problems, i am really having a hard time to calculate it
can bit masking be applied their, as i saw in some solution but didn't get it
Keep doing more videos bro...I always loved your explanation and I am preparing for job interviews...I will let you know if I get a job...Please keep making such good content for free.
job mila
10/10 explanation! Thanks!
How can we also store the solution in a 2d list instead of just returning true or false ? Please help.
I think this question is the same as "matchsticks to square" question with a twist that instead of just 4 we can have k partitions. And that is the O(N ^ K) solution that you mention at the beginning. It passes 151/159 test cases before running into TLE
From LeetCode discussions/submissions, I found a brilliant solution to prune the cases causing TLE
Just add 1 line of code in the matchstick approach:
def canPartitionKSubsets(self, nums: List[int], k: int) -> bool:
nums.sort(reverse=True)
ts = sum(nums)
each_partition = sum(nums) // k
if len(nums) < k or ts % k != 0 or list(filter(lambda x: x > each_partition, nums)):
return False
partition = [0] * k
def _make_partition(i):
if i == len(nums):
return True
for j in range(k):
if partition[j] + nums[i] > each_partition:
continue
partition[j] += nums[i]
if _make_partition(i + 1):
return True
partition[j] -= nums[i]
"""
The following piece of code holds the key to avoid TLE by pruning
What this basically means is that since we have sorted the nums, if a partition totally empty
with current nums[i], there is no point in trying the further elements which anyways will be >= nums[i]
"""
if partition[j] == 0:
break
return _make_partition(0)
@@VarunMittal-viralmutant Great! I was trying to think on how to optimize! but looks like your code is mising a partition[j] -= nums[i] inside the for loop. Works perfect otherwise!
@@adityats259 Yup !! Thanks for pointing that out. Fixed now :)
Could you clarify why we need to do that cleanup? In this case, arenât we iterating over the used array only in one direction? Iâm struggling to visualize a case in which setting back to False would mean the difference between the function breaking or not
3 4 8 k = 3
if you consider this case you can see that the sum is equal to 15 i.e we have to devide it into 3 parts of 5 sum each...but we can't so after checking each condition and running loop it will trigger return false;
Implemented same algorithm in Java. only 53 test cases passed - Rest are failing with time limit exceeded. Looks like TC is not k * 2^n. Its 2 ^ (k*n)
I tried this code in python, still time limit exceeded, passed 152/159
true.
I got a Time Limit Exceed by this method.... exactly the same code and I don't know why
In neetCode Pro can we get the bitmask solution ?
To fix the new TLE error change the first conditional statement in the for loop to this:
if used[j] or subsetSum + nums[j] > target or (j > 0 and nums[j] == nums[j-1] and not used[j-1]):
continue
Aaaaaaaaaaahh, nice.
how and why the second condition?
@@sumishajmani705 the subsetSum + nums[j] > target ? He explains it in the video -> we want to make sure that if we add nums[j[ to our current subsetSum it doesn't go over the target. If it goes over our target, it won't be valid
@@Laura-ke4hr I think he's asking about the other condition (j > 0 and nums[j] == nums[j-1] and not used[j-1]):
can you please explain this condition
@@sanjanar9198 its about the repetitive case where the array contains some duplicate values.
I think you completed giving solutions to blind curated 75 rt now can you please start sean prasad leetcode patterns
Great explanation! This one is very similar to LC#473 but more complicated.
it is as complicated as the LC#473 match sticks one, there is no different
it's the same exact solution except k = 4
Loving your content :D
Problem link is missing from the description.
Great explanations
brilliant explanation
Hey great solution.
But the tree that u drew doesn't correspond to the solution u wrote.
There are 2 recursive approaches (that is know of), for subsets using recursion
1. take/don't take approach (classical)
2. brute force dfs approach (dfs)
the tree u drew is of "take/don't take approach" but u coded it in "dfs approach"
This tbh. Took me a bit to comprehensive. Glad I'm not the only one
@@MinhNguyen-lz1pg he usually does this when we have the take/don't take approach which always confused tf out of me. This should just be the take/no take approach
Where is ur base case for i?
Those who want backtracking solution with memoization.. they can refer this code....
class Solution:
def canPartitionKSubsets(self, nums: List[int], k: int) -> bool:
used = [False]*len(nums)
total_sum = sum(nums)
if total_sum % k != 0:
return False
target = total_sum // k
nums.sort(reverse = True)
#sorting the array in descending order
#so if first value is greater than target, it will not be included in any subset
#so we cant partition the entire array into k equal sum subsets
if nums[0] > target:
return False
dp = {}
def backtrack(i,k,rem):
#since subproblem depends on used indices of array
#if same subproblem occurs again just return dp value
if tuple(used) in dp:
return dp[tuple(used)]
if k == 0:
return True
if rem == 0:
partition = backtrack(0,k-1,target)
dp[tuple(used)] = partition
return partition
for j in range(i,len(nums)):
if not used[j] and rem-nums[j] >= 0:
used[j] = True
if backtrack(j+1,k,rem-nums[j]):
return True
used[j] = False
dp[tuple(used)] = False
return False
return backtrack(0,k,target)
according to the constrains, the length of the nums can be 10^4. Which means our used array will be of same size. So is it ideal to convert the used array to tuple again and again (which is an O(N) operation in itself) and caching such a huge tuple which will take up a lot of memory?
2 points -- a problem, a suggestion.
Problem
=======
You suggested to sort numbers in desc order to speed up the algo. I notice, if not desc sorted, algo may sometimes give correct result based on incorrect data (non-disjoint subsets), or may give incorrect result.
For instance, if sorted -- 1 2 2 3 3 4 5 -- subsets would be (1 2 2), (5), (1 4), (2 3). Numbers 1 and 2 are part of multiple subsets. Yet algo returns true. But data is not as per expectations.
So, desc sort is a mandatory requirement -- 5 4 3 3 2 2 1. Subsets would be (5) (4 1) (3 2) (3 2)
If not done, 2nd best solution could be :
Let the recursive algo run until all combinations of numbers are examined -- i.e. remove the (k == 0) check. Collect all subsets of numbers, possibly non-disjoint. For numbers 1 2 2 3 3 4 5 we would get non-disjoint subsets (1 2 2) (5) (1 4) (2 3) (2 3) (2 3) (2 3) -- 7 subsets. Now, repeat the same recursive algo on the subsets. In 1st run, input is the list of numbers, and we sum the numbers, and add to a subset when they add up to the subset size. In 2nd run, input is the list of subsets, and we reject non-disjoint subsets. When algo ends, we would've subset size (subset size = sumOfNumbers/k)
Please comment
Getting TLE, need to add one more check of skipping duplicates
Why those people put Medium tag in hard problems
tis is amazing...
Wow, just wow. What an explanation. How can I become a better coder like you?
Most of the solutions are already in leetcode comments or in leetcode solution tab if available. He just dumbs it down so a lot more people understand them :)
@@obiwankenobi07 lol
Cannot pass all the test cases with this method. Is there way to improve it further
yes there is
I think this solution needs 'proving'; i.e. how do we know that when finding the first subset for our target sum (S/k), we are on the right track? I mean it's possible to find a subset with sum == S/k, but still going down to find others, we hit a dead end (which means we need to try all the possible combinations of the candidate subsets). Unless we prove that the approach is valid (finding the first subset with the target sum is ok), this solution may be wrong.
@@GoogleAccount-wy2hc Not necessary. Lets say we have nums as [9,10,1,7,2,7,1,1,1,3] and k as 3. The target becomes 42/3 = 14. One way to get 14 is with [7,2,1,1,3]. However this only leaves [9,10, 1, 7, 1] from where you can't get 14. However, if you split it as [10,3,1] [7,7] [9,1,1,1,2] you can get three splits of target.
It isn't clear in the explanation, but the code doesn't assume that the first set of numbers that sum to target will be the ones used in the final solution. By marking used as false after doing the recursive call, the code leaves the door open to alternative combinations. You are right in assuming that a greedy solution would not work for this problem.
Example: nums=[1,1,1,1,2,2,2,2] k=4
Note, seems they've added new test cases so this backtracking solution without memoization will cause TLE.
Yeah
class Solution {
public boolean canPartitionKSubsets(int[] nums, int k) {
int sum=0;
for(int i=0; i=0; i--){
if(nums[i] != -1 && sum + nums[i]
Sorting the array and returning false if array value exceeds the target is also accepted in leetcode.
@@rashidixit1168 what is array value here?
@@suryanshsingh2559
When array elements are sorted if at certain point a particular element can not be included because it is greater than target then solution will be false because no other element in further array can not be included due to the fact that further elements are even more larger.
Here is my solution for the same:
bool solve(vector& m,int t,int op,int ind,int pno,int k)
{
if(ind==m.size())
{
if(pno==k)
return true;
if(op==t)
return solve(m,t,0,0,pno+1,k);
return false;
}
// vector tmp(m.begin(),m.end());
// tmp.push_back(op);
// tmp.push_back(ind);
// tmp.push_back(pno);
// if(mp.find(tmp)!=mp.end())
// {
// //cout
why can't we count the number of subsets whose sum is equal to the sum(nums)/ k? and if that count is greater than or equal to K we return true else false.
Thanks!
Thank you so much!!!!
U a God
Backtracking without memoization doesn't work in leetcode anymore
One doubt - Since we have 2 choices at every backtracking step - Include or dont iclude the current element - Why haven't we done this:
``` if backtrack(j+1, k, subsetSum + nums[j]) or backtrack(j+1, k, subsetSum): ```
Do we never look into this path - backtrack(j+1, k, subsetSum)?
@neetCode
That path is taken care by for loop
The thing about backtracking is that the decision is not explicit in the code. Basically you have something like
// include the element
if backtrack(...) return True // process next nodes for this given path
// if we land here, some step in our decision tree didn't work out, so reverse our decision (do not include current element) and try to include the next element.
We're applying this logic recursively to try out every possibility at any point in the path and then "roll back" if there was an error in some path.
coming up with the solution directly is difficult
Backtracking without memoization, I think it will be TLE for this kind of knapsack problem.
But I think we can't do memoization, every call results in a different state.
Sorry, I got my solution passed after adding memoization.
Basically [visited, start_index] is the key, true/false is the value.
To save 'visited' as int in key, I used bit-masking, since the constraints already say length is max 16 digits.
@@dataman4503 could you please actually share the code? I am not sure I see the memo here. Since each iteration over k is different, where is the duplicate work
@@orellavie6233 I tried to post my solution link here , but it keeps getting deleted.
The key is [visited, start_index] and not [visited, start_index, currSum], because for a (visited,startIndex), Currsum will always be same. So that is actually not required.
@@dataman4503 I understood what you wrote from the top comment, but I still can't see how the start idx is shared between different k.. we skip each number we already checked, so it already memoized. Could you please provide a few steps of your algo with an example, even with the one in the video
what if the test case is:- [1,2,3,4,5,6,7,8,9,10] and k=5
going by ur method the first subset with 11 sum will be [1,2,3,5] and we will have subsets less than 5 whose sum is 11.
But there can be 5 subsets with 11 sum as [1,10],[2,9],[3,8],[4,7],[5,6].Hope u get my point.So this method will give false but it should be true.
Yeah same thing i am also getting for this tc
yes I think this method will work for few test cases
I got this method working
Just you need to randomly shuffle the array few times and check for all shuffled array
@@codewithamir21 man that will lead to a large time complexity
Can anyone help me to write a code to print the K subsets with equal sum..
Condition is that we can only a element once in a subset..
Input- 1 2 2 3 3 4 5
K=4
def canPartitionKSubsets(nums, k):
total = sum(nums)
if total%k!=0:
return False
m = total//k
nums.sort(reverse = True)
visited = [False]*len(nums)
def dfs(ind, s, k, res, output):
if k == 0:
return output
if s == m:
return dfs(0, 0, k-1, [], output+[res])
for i in range(ind, len(nums)):
if i > 0 and not visited[i - 1] and nums[i] == nums[i - 1]:
continue
if visited[i] or s+nums[i]>m:
continue
visited[i] = True
output = dfs(i+1, s+nums[i], k, res+[nums[i]], output)
if output:
return output
visited[i] = False
return False
return dfs(0, 0, k, [], [])
Another optimization.
We dont need to look for k==0. we can simply look for k==1.
came up with method 1. TLE
Maybe there could be 1 more optimization in the beginning:
"""
If any of the individual elements is greater than the target sum,
we can never generate equal sum partition as nums are all +ve
"""
if list(filter(lambda x: x > target, nums)):
return False
Does anyone have the solution for this problem with memoization?
Here is a crude memoization added and code passed on LeetCode.
I am maintaining `used` array of 0s/1s and using that and subSetSum as memoization key:
nums.sort(reverse=True)
ts = sum(nums)
each_partition = sum(nums) // k
if ts % k != 0 or list(filter(lambda x: x > each_partition, nums)):
return False
# To keep track of which elements we have already chosen
used = [0] * len(nums)
memoize = {}
def dfs(i, k, subSetSum):
if k == 0:
return True
if ("".join([str(d) for d in used]), subSetSum) in memoize:
return memoize[("".join([str(d) for d in used]),subSetSum)]
if subSetSum == each_partition:
partition = dfs(0, k-1, 0)
memoize[("".join([str(d) for d in used]), subSetSum)] = partition
return partition
for j in range(i, len(nums)):
if used[j] or subSetSum+nums[j] > each_partition:
continue
used[j] = 1
partition = dfs(j+1, k, subSetSum+nums[j])
if partition:
memoize[("".join([str(d) for d in used]), subSetSum+nums[j])] = True
return True
used[j] = 0
memoize[("".join([str(d) for d in used]), subSetSum)] = False
return False
return dfs(0, k, 0)
But instead check my other comment(in this question) about this question being exactly same as 'czcams.com/video/hUe0cUKV-YY/video.html' with some optimization to prune the decision tree. That code runs with 92% efficiency
Here is a more efficient solution -> this creates 4 subset buckets and then places the item in each as long as the target sum is met or less than it. Sorting is important here
class Solution:
def canPartitionKSubsets(self, nums: List[int], k: int) -> bool:
total_sum = sum(nums)
target_sum = total_sum // k
if total_sum % k != 0 or max(nums) > target_sum:
return False
nums.sort(reverse=True)
subset_sums = [0] * k
def backtrack(index):
if index == len(nums):
return len(set(subset_sums)) == 1
print(subset_sums)
for i in range(k):
if subset_sums[i] + nums[index]
if you're getting TLE its means you're calling duplicates backtrack we need to skip duplicate backtrack
sort the array
add if(visited[i] || i > 0 && nums[i] == nums[i - 1] && !visited[i - 1]) continue; inside for loop or you can use dp with bitmask because max length is 16
thanks, this helped.
Use a visited set in DFS, hum...
It doesn't even pass the test cases
Delete the black set by checking if the len(cur) == 0 after backtracking, then break.
what about a testcase like 1 1 1 1 3 3 3 3, and k = 4.
This backtracking solution works fine in 2021. However, in 2022, the website adds a new test case, with which this solution exceed the time limit.
The new test case is:
nums = [2,9,4,7,3,2,10,5,3,6,6,2,7,5,2,4]
k = 7
To make it not TLE we can use memoization on the "used" state. Don't need to memoize k because the target is always the same for all ks, so just memoizing "used" will make it not TLE. You can use a hashmap for memoizing "used". For efficiency you can use a bitmask for used since the max values is 16 so it fits in an int.
@@alexm1930 Hey, can you explain this, or can you share your code please
@@sanjanar9198 Hey my code is in C++ but here it is:
class Solution {
unordered_map cache;
public:
bool canPartitionKSubsets(vector& nums, int k) {
int sum = 0;
for(int num : nums)
{
sum += num;
}
if(sum % k)
{
return false;
}
int target = sum / k;
uint16_t used = 0; // use a mask so it's easy to cache, max is 16 from constraints
// should be able to cache just on the used state since the target is always the
// same no matter what k we are computing for
return canPartitionKSubsets(nums, used, target, 0, k, 0);
}
bool canPartitionKSubsets(vector& nums, uint16_t used, int target, int i, int k, int sum) {
// base case we don't have to check final k since it should be valid due to preprocessing
if(k == 1)
{
return true;
}
if(cache.count(used))
{
return cache[used];
}
if(sum == target) // reached target for this k, try k - 1
{
return canPartitionKSubsets(nums, used, target, 0, k - 1, 0);
}
for(int j = i; j < nums.size(); j++)
{
uint16_t mask = (1 target)
{
continue;
}
// 2 recursive choices:
// use this number
used |= mask;
if(canPartitionKSubsets(nums, used, target, j + 1, k, sum + nums[j]))
{
return true;
}
// clear bit from mask
used &= ~mask;
// skip this number; no need to call recursive function for skiping
}
cache[used] = false;
return false;
}
};
very nice solution, TO prevent TLE we need to use Memoization,
class Solution:
def canPartitionKSubsets(self, nums: List[int], k: int) -> bool:
if sum(nums)%k:
return False
target_sum=sum(nums)//k
visited_indexes=[False for k in range(len(nums))]
nums.sort(reverse=True)
cache={}
def find_partition(left_index, k, sums):
if k==0:
return True
if tuple(visited_indexes) in cache:
return cache[tuple(visited_indexes)]
if sums==target_sum:
cache[tuple(visited_indexes)]=find_partition(0, k-1, 0)
return cache[tuple(visited_indexes)]
if sums>target_sum:
cache[tuple(visited_indexes)]=False
return cache[tuple(visited_indexes)]
for i in range(left_index, len(nums)):
if not visited_indexes[i]:
visited_indexes[i]=True
if find_partition(i+1, k, sums+nums[i]):
return True
visited_indexes[i]=False
cache[tuple(visited_indexes)]=False
return cache[tuple(visited_indexes)]
return find_partition(0, k, 0)
java solution
```java
class Solution {
public boolean canPartitionKSubsets(int[] nums, int k) {
int n= nums.length;
int target = 0;
for(int i = 0 ; i < n ; i++)
{
target += nums[i];
}
// if sum of the array cannot be divided into k buckets then return false.
if(target%k != 0) return false;
else target = target/k ;
//Array to keep track of what elements we have used
boolean[] arr = new boolean[n];
Arrays.fill(arr , false);
return solve(arr , nums , 0 , 0 , target , k);
}
//k = number of subsets to be made
//target is the desired sum of each subset (sum for each bucket)
public boolean solve(boolean[] arr , int[] nums , int subsetSum , int i , int target ,int k)
{
//base condition 1: No more subsets to be made
if(k == 0)
return true;
// base condition 2: 1 subset made then do k-1 and set subSet sum back to zero.
if(subsetSum == target)
{
return solve(arr ,nums , 0 , 0 , target , k -1);
}
for(int j = i ; j < nums.length ; j++)
{
if(arr[j] == true || (subsetSum + nums[j]) > target)
continue;
arr[j] = true; //mark the value we have used
//recursive call
if (solve(arr ,nums , subsetSum + nums[j] , j+1 ,target, k))
return true;
//backtracking
//unmarking the value we have just used
arr[j] = false;
}
// our loop through every possible decision and doesn't find decision so return false
return false;
}
}
// NeetCode -- czcams.com/video/mBk4I0X46oI/video.html
```
My only issue with this solution is that what if k == 0 and not all the numbers have been used? Why is this not a possibility?
A lot of people said this solution TLE. Today is March 2022, I used the same code in Javascript, passed all cases. Runtimes are varying for the same solution (3-7seconds), I guess it depends on the speed of the provider. Neetcode - thank you so much for all these videos! You make it simpler to understand for a ton of people!
TLE here even with equivalent C++ code... I guess we need to use the dynamic programming solution instead
who else are getting TLE for this exact same code
This approach is not longer accepted. It is too slow.
For reference, this + memoization and a crazy DP solution are the only acceptable solutions atm.
def canPartitionKSubsets(self, nums: List[int], k: int) -> bool:
"""
dfs(subsetsum, k)
dfs(0,4)
/
dfs(4,4)
/
dfs(0,3)
- we need visisted array to avoid to take used num
-
O(subsetsum*k*n)
"""
n = len(nums)
total_sum = sum(nums)
target_sum = total_sum // k
# edge case
if total_sum % k != 0:
return False
self.memo = {}
def dfs(subset_sum, k):
"""
use s
"""
if k == 0:
return True
if subset_sum == target_sum:
return dfs(0, k-1)
key = tuple(used)
# key = (subset_sum, k)
if key in self.memo:
return self.memo[key]
for i in range(n):
if used[i]:
continue
if subset_sum + nums[i] > target_sum:
break
used[i] = True
if dfs(subset_sum+nums[i], k):
key = tuple(used)
# key = (subset_sum, k)
self.memo[key] = True
return True
used[i]= False
key = tuple(used)
# key = (subset_sum, k)
self.memo[key] = False
used = [False for _ in range(n)]
nums.sort(reverse=True)
return dfs(0, k)
It works for all the test cases/TLE shit.
Does anyone can explain why we used used array as key of cache instead of (subset_sum, k)
First
9th
739th
As a few others have pointed out, if you finalise one subset that has the target sum, it still might not be valid subset in the final scheme of things. The given solution might work but this important nuanced point is not explained properly at all which basically makes this explainer useless.
You should never tell time complexity before explaining the solution.
Hi @NeetCode
[2,2,2,2,3,4,5]
4
Your code is giving "True" for this testcase but expected is "false" can you check it once?.