Heavy Hitters (Count-Min Sketch) - Problem

Implement a Count-Min Sketch data structure for approximate frequency counting in a data stream. A Count-Min Sketch is a probabilistic data structure that can answer frequency queries about elements in a stream using sublinear space.

Your task is to implement a class CountMinSketch with the following methods:

  • __init__(width, depth): Initialize the sketch with given width and depth
  • update(item, count): Add count occurrences of item to the sketch
  • query(item): Return the estimated frequency of item
  • heavy_hitters(threshold): Return all items with estimated frequency ≥ threshold

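The Count-Min Sketch keeps depth rows of width counters, with one hash function per row. Each update adds count to one counter in every row, at the position chosen by that row's hash function. A query returns the minimum of the item's counters across all rows. Because collisions can only add to a counter, the estimate never falls below the true frequency, but it may exceed it.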
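The update and query steps described above can be sketched roughly as follows. This is a minimal illustration, not a reference solution: the problem does not specify the hash functions, so the salted BLAKE2b hash in _index is an assumption, and so is the seen set. A sketch cannot enumerate the items it has counted, so heavy_hitters needs some record of candidates; storing every item, as done here, gives up the sublinear-space property and is only meant to make the examples runnable.

```python
import hashlib


class CountMinSketch:
    """Minimal Count-Min Sketch; hashing and item tracking are assumptions."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        # depth rows of width counters each, all starting at zero
        self.table = [[0] * width for _ in range(depth)]
        # Assumption: remember seen items so heavy_hitters can enumerate them.
        self.seen = set()

    def _index(self, item, row):
        # Assumed hash family: BLAKE2b salted with the row number.
        digest = hashlib.blake2b(
            str(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, item, count=1):
        # Add count to one counter per row.
        self.seen.add(item)
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def query(self, item):
        # The minimum over rows is the least-inflated counter for this item.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )

    def heavy_hitters(self, threshold):
        # Sorted only to make the result deterministic.
        return sorted(i for i in self.seen if self.query(i) >= threshold)
```

With small widths such as those in the examples, two items can share a counter in every row, in which case query returns more than the true count; the exact outputs shown in the examples therefore depend on the hash functions used.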

Input & Output

Example 1 — Basic Operations
Input: operations = [["init", 4, 3], ["update", "a", 5], ["update", "b", 3], ["query", "a"], ["query", "b"], ["heavy_hitters", 4]]
Output: [null, null, null, 5, 3, ["a"]]
💡 Note: Initialize sketch with width=4, depth=3. Update 'a' with count 5, 'b' with count 3. Query returns estimated frequencies. Heavy hitters with threshold 4 returns ['a'] since only 'a' has frequency ≥ 4.
Example 2 — Multiple Updates
Input: operations = [["init", 3, 2], ["update", "x", 2], ["update", "x", 3], ["query", "x"], ["heavy_hitters", 5]]
Output: [null, null, null, 5, ["x"]]
💡 Note: Initialize sketch. Update 'x' with 2, then update 'x' with 3 more (total 5). Query 'x' returns 5. Heavy hitters with threshold 5 returns ['x'].
Example 3 — Query Unseen Item
Input: operations = [["init", 5, 2], ["update", "item1", 10], ["query", "item2"]]
Output: [null, null, 0]
💡 Note: Initialize sketch, update 'item1' with count 10. Query for 'item2' (never seen) returns 0, provided 'item2' does not collide with 'item1' in every row.

Constraints

  • 1 ≤ width, depth ≤ 1000
  • 1 ≤ count ≤ 10^6
  • Item strings have length ≤ 100
  • At most 10^4 operations total

Visualization

[Figure: TutorialsPoint - Heavy Hitters Count-Min Sketch | Probabilistic Data Structure. Three panels: "Data Stream" (items a, b, c with counts 5, 3, 1), "Hash & Count" (three hash functions map items into a 2D counter array), and "Query Results" (minimum estimates across the hash functions: a = 5, b = 3, c = 1).]

Key Insight: Using multiple hash functions with minimum estimation provides space-efficient approximate frequency counting with bounded error guarantees.
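The "bounded error guarantees" mentioned in the key insight can be made concrete with the standard Count-Min Sketch analysis, which the problem statement itself does not spell out. Assuming pairwise-independent hash functions, a sketch with width ⌈e/ε⌉ and depth ⌈ln(1/δ)⌉ returns an estimate that exceeds the true frequency by at most ε·N with probability at least 1 − δ, where N is the total count of all updates. The helper below (sketch_dimensions is a hypothetical name, not part of the required interface) simply evaluates those two formulas:

```python
import math


def sketch_dimensions(epsilon, delta):
    """Return (width, depth) for the standard Count-Min Sketch bound.

    epsilon: allowed overestimate as a fraction of the total count N
    delta:   allowed probability of exceeding that overestimate
    """
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    width = math.ceil(math.e / epsilon)       # counters per row
    depth = math.ceil(math.log(1 / delta))    # number of rows / hash functions
    return width, depth


print(sketch_dimensions(0.01, 0.01))  # (272, 5)
```

For example, ε = 0.01 and δ = 0.01 give a 272 × 5 sketch, which fits within the width and depth limits listed under Constraints.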