Find top k (or most frequent) numbers in a stream

You are given an Integer array ‘ARR’ and an Integer ‘K’. Your task is to find the ‘K’ most frequent elements in ‘ARR’. Return the elements sorted in ascending order.

For Example:

You are given ‘ARR’ = {1, 2, 2, 3, 3} and ‘K’ = 2. Then the answer will {2, 3} as 2 and 3 are the elements occurring most number of times.
Input Format :
The first line of the input contains a single integer 'T', representing the number of test cases.

The first line of each test case contains two space-separated integers, ‘N’ and ‘K’, representing the size of ‘ARR’ and given integer ‘K’, respectively.

The Second line contains ‘N’ space-separated integers representing the elements of ‘ARR’.
Output Format:
For each test case, print the ‘K’ most frequent elements in ‘ARR’.

Print the output of each test case in a separate line.
Note:
It is guaranteed that a unique answer exists.

You do not need to print anything. It has already been taken care of. Just implement the given function.
Constraints:
1 <= T <= 10
1 <= N <= 5000
1 <= K <= Number of unique elements in ‘ARR’
1 <= ARR[i] <= 10^6

Time Limit: 1 sec
CodingNinjas
author
2y

The idea is to store the top k elements with maximum frequency. To store them a vector or an array can be used. To keep the track of frequencies of elements creates a HashMap to store element-frequenc...read more

CodingNinjas
author
2y
QuickSelect

Prerequisite: Quick Sort

 

In this approach, we will try to divide the problem at every step into a smaller problem. We know that the kth top frequent element is (n - k)th less frequent. So, we will do a partial sort from the less frequent element to the most frequent one, till (n - k)th less frequent element takes place (n - k) in a sorted array. To do so, we will use quickselect. Quickselect is an algorithm used to find the kth smallest element in an array. In this problem, we will modify the general quickselect algorithm to find (n - k)th less frequent element. 

 

We will perform quickselect on an array containing all the keys of the hash map where the key is an element of ‘ARR’ and value is how often this element appears in 'ARR'. We will use a partition scheme to put all the less frequent elements to the left of the pivot and the more frequent elements to the right of the pivot. Then, we will put the pivot element at its correct position in the final sorted array(sorted according to the frequencies). When the pivot reaches the target position, i.e.,(n - k)th position, then all the elements to its right will be k most frequent elements.

 

Algorithm :

  • Initialize a hash map ‘MAP’ where the key is the element and value is how often this element appears in ‘ARR’.
  • Iterate every element in ‘ARR’:
    • Add the current element to ‘MAP’ and increase its value by 1.
  • Set ‘SIZE’ as ‘MAP.SIZE’.
  • Initialize an integer array ‘UNIQUE’ and add all the keys of ‘MAP’ to it.
  • Call ‘QUICKSELECT’(0, ‘SIZE’ - 1, ‘SIZE’ -’K’).
  • Return elements of ‘UNIQUE’ from index‘(SIZE’ - ’K’) to ‘SIZE’.

 

Maintain a function ‘QUICKSELECT’(‘LEFT’, ‘RIGHT’, ‘KSMALL’):

  • If ‘LEFT’ is equal to ‘RIGHT’, return.
  • Select a random pivot between ‘LEFT’ to ‘RIGHT’.
  • Set ‘PIVOT’ as ‘PARTITION’(‘LEFT’, ‘RIGHT’, ‘PIVOT’).
  • If ‘KSMALL’ is equal to ‘PIVOT’, return.
  • Otherwise, if ‘KSMALL’ is less than ‘PIVOT’, call quickselect on the left partition.
  • Otherwise, call quickselect on the right partition.

 

Maintain a function ‘PARTITION’(‘LEFT’, ‘RIGHT’, ‘PIVOT’):

  • Set ‘PIVOTFREQUENCY’ as the frequency of ‘UNIQUE’[‘PIVOT’].
  • Swap ‘UNIQUE’[‘PIVOT’] and ‘UNIQUE’[‘RIGHT’].
  • Set ‘IDX’ as ‘LEFT’.
  • Iterate ‘I’ in ‘LEFT’ to ‘RIGHT’
    • If the frequency of ‘UNIQUE’[‘I’] is less than the frequency of the pivot element, move ‘UNIQUE’[‘I’] to the left of the pivot element and increment ‘IDX’ by 1.
  • Swap ‘UNIQUE’[‘IDX’] and ‘UNIQUE’[‘RIGHT’].
  • Return ‘IDX’.
Space Complexity: O(n)Explanation:

O(N).

 

The Hash map and the array of unique elements can have at most N elements. Hence, the total space complexity is O(N).

Time Complexity: O(n^2)Explanation:

O(N ^ 2), where 'N' is the size of the input array.

 

The worst-case time complexity of quickselect is O(N ^ 2) when bad pivots are which doesn’t divide the problem in half are chosen constantly. But when the pivots are chosen randomly, the probability of such a worst-case is small. The average case time complexity of quickselect is O(N).


Python (3.5)
'''

Time Complexity: O(N ^ 2)
Space Complexity: O(N)

where 'N' is the size of the input array.
'''

from math import ceil, floor
import random
from typing import List

uniq = []
mp = {}


def swapUnique(a: int, b: int):
global uniq
temp = uniq[a]
uniq[a] = uniq[b]
uniq[b] = temp


def partition(left: int, right: int, pivot: int):
global uniq
global mp

pivotFrequency = mp[uniq[pivot]]

# Move pivot to end.
swapUnique(pivot, right)
idx = left

# Move all less frequent elements to the left of the pivot.
for i in range(left, right + 1):

if (mp[uniq[i]] < pivotFrequency):

swapUnique(idx, i)
idx += 1

# Move pivot to its final place.
swapUnique(idx, right)

return idx


def quickselect(left: int, right: int, kSmall: int):

# If the list contains only one element.
if (left == right):
return

# Select a random index as pivot.
pivot = left + floor(random.uniform(0, 100000)) % (right-left)

# Find the position of pivot in the sorted list.
pivot = partition(left, right, pivot)

# If the pivot is in its final sorted position.
if (kSmall == pivot):
return

elif (kSmall < pivot):
# Move in the left direction.
quickselect(left, pivot - 1, kSmall)

else:
# Move in the right direction.
quickselect(pivot + 1, right, kSmall)


def KMostFrequent(n: int, k: int, arr: List[int]):

global mp
global uniq
mp.clear()

# Build map where the key is element
# and value is how often this element appears in 'ARR'.

for ele in arr:
if ele not in mp:
mp[ele] = 0
mp[ele] += 1

size = len(mp.items())
uniq = [0 for i in range(size)]
i = 0

for x in mp:
uniq[i] = x
i += 1

# Perform quickselect.
quickselect(0, size - 1, size - k)

# Return top 'K' frequent elements
topK = []

for i in range(size - k, size):
topK.append(uniq[i])

topK.sort()
return topK

C++ (g++ 5.4)
/*
Time Complexity: O(N ^ 2)
Space Complexity: O(N)

where 'N' is the size of the input array.
*/

#include
#include

vector uniq;
map mp;

void swapUnique(int a, int b) {
int temp = uniq[a];
uniq[a] = uniq[b];
uniq[b] = temp;
}

int partition(int left, int right, int pivot) {

int pivotFrequency = mp[uniq[pivot]];

// Move pivot to end.
swapUnique(pivot, right);
int idx = left;

// Move all less frequent elements to the left of the pivot.
for (int i = left; i <= right; i++) {

if (mp[uniq[i]] < pivotFrequency) {

swapUnique(idx, i);
idx++;
}
}

// Move pivot to its final place.
swapUnique(idx, right);

return idx;
}

void quickselect(int left, int right, int kSmall) {

// If the list contains only one element.
if (left == right) {
return;
}

// Select a random index as pivot.
int pivot = left + rand()%(right - left);

// Find the position of pivot in the sorted list.
pivot = partition(left, right, pivot);

// If the pivot is in its final sorted position.
if (kSmall == pivot) {

return;
}
else if (kSmall < pivot) {

// Move in the left direction.
quickselect(left, pivot - 1, kSmall);
}
else {

// Move in the right direction.
quickselect(pivot + 1, right, kSmall);
}
}

vector KMostFrequent(int n, int k, vector &arr)
{
mp.clear();

// Build map where the key is element
// and value is how often this element appears in 'ARR'.
for (int ele : arr) {
mp[ele]++;
}

int size = mp.size();
uniq.assign(size, 0);
int i = 0;

// Build array of uniq elements.
for (auto x: mp) {

uniq[i] = x.first;
i++;
}

// Perform quickselect.
quickselect(0, size - 1, size - k);

// Return top 'K' frequent elements
vector topK;
for(int i = size-k; i < size; i++){
topK.push_back(uniq[i]);
}

sort(topK.begin(), topK.end());
return topK;
}

Java (SE 1.8)
/*
Time Complexity: O(N ^ 2)
Space Complexity: O(N)

where 'N' is the size of the input array.
*/

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class Solution {

static int[] unique;
static Map map;

public static int[] KMostFrequent(int n, int k, int[] arr) {

map = new HashMap();

// Build map where the key is element
// and value is how often this element appears in 'ARR'.
for (int ele : arr) {

map.put(ele, map.getOrDefault(ele, 0) + 1);
}

int size = map.size();
unique = new int[size];
int i = 0;

// Build array of unique elements.
for (int num: map.keySet()) {

unique[i] = num;
i++;
}

// Perform quickselect.
quickselect(0, size - 1, size - k);

// Return top 'K' frequent elements
return Arrays.copyOfRange(unique, size - k, size);

}

public static void quickselect(int left, int right, int kSmall) {

// If the list contains only one element.
if (left == right) {
return;
}

Random r = new Random();

// Select a random index as pivot.
int pivot = left + r.nextInt(right - left);

// Find the position of pivot in the sorted list.
pivot = partition(left, right, pivot);

// If the pivot is in its final sorted position.
if (kSmall == pivot) {

return;
}
else if (kSmall < pivot) {

// Move in the left direction.
quickselect(left, pivot - 1, kSmall);
}
else {

// Move in the right direction.
quickselect(pivot + 1, right, kSmall);
}
}

public static int partition(int left, int right, int pivot) {

int pivotFrequency = map.get(unique[pivot]);

// Move pivot to end.
swap(pivot, right);
int idx = left;

// Move all less frequent elements to the left of the pivot.
for (int i = left; i <= right; i++) {

if (map.get(unique[i]) < pivotFrequency) {

swap(idx, i);
idx++;
}
}

// Move pivot to its final place.
swap(idx, right);

return idx;
}

public static void swap(int a, int b) {
int temp = unique[a];
unique[a] = unique[b];
unique[b] = temp;
}
}
CodingNinjas
author
2y
Heap

In this approach, we will sort the unique elements of the input array according to how often this element appears in 'ARR'. We will use a hasp map where the key is element and value is how often t...read more

CodingNinjas
author
2y
Bucket Sort

We can observe that frequency of any element in ‘ARR’ can be a minimum of one and maximum ‘N’. So, we can create N buckets and add each unique element in the respective bucket according to ...read more

Add answer anonymously...
Snapdeal Software Engineer Interview Questions
Stay ahead in your career. Get AmbitionBox app
qr-code
Helping over 1 Crore job seekers every month in choosing their right fit company
65 L+

Reviews

4 L+

Interviews

4 Cr+

Salaries

1 Cr+

Users/Month

Contribute to help millions
Get AmbitionBox app

Made with ❤️ in India. Trademarks belong to their respective owners. All rights reserved © 2024 Info Edge (India) Ltd.

Follow us
  • Youtube
  • Instagram
  • LinkedIn
  • Facebook
  • Twitter