Problem Statement
Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.
Examples
- For nums = [1,1,1,2,2,3], top 2 frequent elements are 1 and 2
- For nums = [4,2,4,2,4,2,3,4,3,2], top 2 frequent elements are 2 and 4
Solution
Points of Interest
This is clearly a heap problem. We can utilize a modified version of min-heap of size k to solve this problem.
- First we iterate over array to build a counts map, where key is array element and value is its frequency in the array
- With an empty heap initially, we keep pushing the elements into heap - (frequency, value). Note that, in python a tuple can be compared with each other. A tuple with first largest value is greater than another tuple.
- We make sure if heap size grows greater than k, we pop a value from it.
- Popping a tuple from a min heap means that it is the smallest tuple of the bunch.
- In the end we will have a heap with k largest tuples from the array. We consider values from them as answer
Algo
- Iterate over the nums array and capture frequency of elements in a dict named counts
- Create an empty heap array
- Iterate over the keys of counts
- Push (count, val) in heap
- If size of heap is greater than k, we pop an element from the heap
- In the end, return all values from the heap
Code
import heapq
from collections import defaultdict
def topKFrequent(nums: list[int], k: int) -> list[int]:
counts = defaultdict(int)
for n in nums:
counts[n] += 1
min_heap = []
for key in counts.keys():
c = counts[key]
heapq.heappush(min_heap, (c, key))
if len(min_heap) > k:
heapq.heappop(min_heap)
return [v for c, v in min_heap]
Complexity
- We iterate over array of size to build the frequency dict
- For each element in the frequency dict we push element into heap, taking time
- For most of the elements in the frequency dict, we pop element from heap, taking time
- We are maintaining frequency dict whose worst case size can be
- We are maintaining heap array whose size never goes above