Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I'm working on a paper to show that you can 'simulate' a heap in linear time.

What I mean is: in the comparison model, assume start with an empty heap, a sequence of instructions like "insert element x" and "remove minimum element" of length n.

There's an algorithm that can tell you in O(n) time, which items are still left in the heap at the end (and thus also which items have been removed at some point in time).

The algorithm can't tell you exactly _when_ items have been removed, because that would be equivalent to sorting, and you need at least O(n log n) to sort in the comparison model.

I've worked on this problem on and off for about a decade now as a little hobby project. I would have been happy to prove it impossible or to find a solution. Interestingly, it is possible and the solution is relatively simple, too.

(I have no clue whether this is useful for anything. In practice you probably just run the obvious O(n log n) algorithm which probably has much better constant factors. I'm doing this purely for the theory.)

The data structure / algorithm I found is expressible in a purely functional setting, and (spoiler alert) builds on top of Chazelle's Soft Heap.



Are you at all worried that by posting an open problem on HN and claiming you've found a simple solution, someone might scoop you to the paper?


I'm not a professional academic.

If I find someone else who's interested enough in my pet problem to go to the hassle of scooping me, I'd be elated.


Idea to paper is non trivial effort. Lots of iterations need to be done.


Doesn't that imply a linear "n largest" algorithm? I guess that might already exist, but I wasn't aware of that!


Yes, that does imply a linear 'n largest algorithm'. Specifically also linear median finding.

As the sibling comment points out, that's not news. But it's very good for you to pick up that implication from first principles alone. It's part of figuring out how to solve this.


Quickselect and non-randomized variations should get you that; no?


Yes, exactly.

But this reduction from 'select n largest' to my problem is still instructive: it tells us that in some sense any (deterministic) solution to my pet problem has to be at least as complicated as (deterministic) median finding.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: