Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I can't resist mentioning a favorite algorithm of mine for solving a related problem. It generalizes the algorithm for the classic problem of finding the majority element (one which occurs at least 50% of the time) with a single pass over a stream, using a constant amount of space, and a second sieving pass. It's one of those simple gems you wonder why it took so long to discover.

http://www.cs.yale.edu/homes/el327/datamining2011aFiles/ASim...



Very nice, thanks! It might be too similar for a post all of its own, but I think it'd be a worthy followup to this post.


Direct link to the pre-generalized: https://www.cs.utexas.edu/~moore/best-ideas/mjrty/index.html

Note that this algorithms fails silently if there is not majority element, which limits the scope of its utility.

I assume that the reason it took so long to discover is that no one needed the solution before. Are there applications that can benefit form this algorithm?


You can do a second pass over the list to check that your solution is actually a majority element. This maintains the linear time and constant space properties.


Unfortunately single pass is usually the binding constraint eg real time data processing.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: