Find the kth most frequent element in a data stream | Hello World
In the word of internet, where everything is in a very large scale. For example you want to find out that among all the ads, which ones are clicked most frequently in a day?
This problem comes down to find the kth most frequent element in a data stream. (or find the element whose frequency is greater than a user defined value).
In order to find the kth most frequent element in a data stream, we need a space of the length of the data stream N. which is not desirable. So we have to use approximation
Two kind of method are proposed to solve the problem
1. counter-based: only monitor a subset of the incoming stream, each element being monitored was assigned a counter, if the incoming element is monitored, the counter is updated, otherwise, do something depending on algorithm.
2. Estimate frequency for all elements using bit-maps of counters
Each element is hashed into the counters' space using a family of hash functions. Hashed-to counters are queried for the frequencies
Read full article from Find the kth most frequent element in a data stream | Hello World
No comments:
Post a Comment