nimbusega's comments

nimbusega · on Nov 6, 2024

That's fair. How about a toggle to not show images?

genewitch · on Nov 7, 2024

I think this is more an indictment of how poorly some publications pick images than any sort of layout issue (or design decision). So probably a toggle throws the baby out with the bathwater. Saw a little cockroach and there was an article about a cockroach - okay, fair. Picture of what looks like a forest fire on an article about tuples - probably net negative.

dredmorbius · on Nov 7, 2024

The fact that some of the images are animated (presently: the "passport photos" associated with this story: <https://maxsiedentopf.com/passport-photos/>) is an absolute turn-off.

I'm often reading via an e-ink tablet. Whilst I can drop text quality to better support animations, the effect is a gross degradation of everything else, and of course, why the fuck would I want to see animations randomly?

"Animate on hover" is a setting I've long advocated for sites, and coded into CSS both for my own sites and as restylings of third-party sites. It's a compromise between constant distraction and being able to benefit from the very rarely actually useful animation. In the case of the passport photos story, the same effect could be achieved by a grid (2x2, 3x3) showing the variety of photos simultaneously. Detail isn't relevant, variety apparently is, and animation is a cheap eyeball-grabbing trick.

genewitch · on Nov 7, 2024

I saw the animation, but I was looking at the PDF I made, to offer a solution to another comment. No motion there; just meaningless images mixed with contextual images.

nimbusega · on Nov 6, 2024

I made this to experiment with embeddings and explore how different ways of displaying information affect your perception.

It gets the top 100 stories, sends their html to GPT-4 to extract the main content (this was not producing good enough results with html parsing) and then gets an embedding using the title and content.

Likes/dislikes are stored in local storage and compared against all stories using cosine similarity to find the most relevant stories.
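
A minimal sketch of that ranking step, written in Python for illustration only (the site itself presumably does this in the browser). The function names and the scoring rule, mean similarity to liked stories minus mean similarity to disliked ones, are assumptions and not taken from the actual implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_stories(stories, liked, disliked):
    """Return story titles ordered from most to least relevant.

    stories:  dict mapping title -> embedding vector
    liked:    list of embeddings of stories the reader liked
    disliked: list of embeddings of stories the reader disliked

    Assumed scoring rule: mean similarity to the liked set
    minus mean similarity to the disliked set.
    """
    def mean_similarity(vec, refs):
        if not refs:
            return 0.0
        return sum(cosine_similarity(vec, r) for r in refs) / len(refs)

    scored = [
        (mean_similarity(vec, liked) - mean_similarity(vec, disliked), title)
        for title, vec in stories.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored]
```

With real embeddings the vectors would have hundreds or thousands of dimensions, but the comparison is the same.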

It costs about $10/day to run. I was thinking of offering additional value for a small subscription. Maybe more pages of the newspaper, full story content/comments, a weekly digest or ePub export or something?

ketzo · on Nov 6, 2024

I think some of the highest value from HN comes from the comments, and it's much harder to find the "best" ones, since they might be in threads you might not have otherwise read.

Not sure if it's a "premium feature" so to speak, but would be very cool to extend this to comments generally.

jkestner · on Nov 7, 2024

Render comments in the style of the Onion's man-on-the-street American Voices section.

nimbusega · on Nov 6, 2024

Definitely, comments are usually better than the article. I thought of a 'Letters to the Editors' section that shows top comments (https://news.ycombinator.com/bestcomments) and references the parent story, but it might not be as useful without the context.

Maybe 'See Comments' here could load the comments on the same page? In a newspaper-like style.

genewitch · on Nov 7, 2024

AI should be able to do "good enough" sentiment analysis, and combined with the "votes" it should be able to quickly find agree/disagree and the quality of the comment, which should not be based merely on the number of complex words or the length.

I certainly suspect you could take the 4chan and reddit datasets, combine them with HN's, and build a LoRA that ranks the 4chan and reddit stuff lower and the good HN stuff higher. Essentially, subtract all reddit- and 4chan-style comments from the weights of the set of HN comments. Training SD LoRAs was pretty quick, but I haven't looked into LLM LoRAs. Regardless, the LLM with the HN-minus-4chan-and-reddit LoRA can do sentiment analysis and use the votes; just feed it CSV or JSON: votes, user, comment. I guess you could do votes/age as a cleanup, too.

All this to say I still wouldn't read or use it. I'm not a fan of robots entertaining me.

gsky · on Nov 7, 2024

Why would it cost $10 a day?

It should not cost more than a dollar a day.

Take AWS and Azure credits and run it for free for years.

jzombie · on Nov 6, 2024

> Likes/dislikes are stored in local storage and compared against all stories using cosine similarity to find the most relevant stories.

You're referring to using the embeddings for cosine similarity?

I am doing something similar with stocks: taking several decades' worth of 10-Q statements for a majority of stocks, plus weighted ETF holdings, and using an autoencoder to generate embeddings that I run cosine and Euclidean algorithms on via Rust WASM.

mahin · on Nov 14, 2024

Yes. Your project sounds cool, post it!

jzombie · on Nov 15, 2024

I just responded to an adjacent query with the info.

https://news.ycombinator.com/threads?id=jzombie#42072665

tiborsaas · on Nov 7, 2024

> I am doing something similar with stocks.

How well does it work?

jzombie · on Nov 15, 2024

It seems to do well for a lot of searches, though some are questionable, but I believe that I know why. I'm training some different autoencoders to give it some different perspectives.

The code lives here: https://github.com/jzombie/etf-matcher

The ad-hoc vector DB I've created lives here: https://github.com/jzombie/etf-matcher/blob/main/rust/src/da...

tagawa · on Nov 7, 2024

Nice – I like this a lot. I feel like I'd use this for slow-lane reading and the original HN site when I'm in a rush.

Regarding HTML to GPT-4, I seem to remember commenters here saying they got better results by converting the HTML to Markdown first, then sending to an LLM. Might save a bit of money too.
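
A rough sketch of that pre-processing idea, using only Python's standard library. It is not a full HTML-to-Markdown converter (a dedicated library would handle far more cases); the class name, the set of skipped tags, and the handful of supported block tags are all assumptions chosen for illustration. The point is just that dropping scripts, styles, and navigation before the LLM call shrinks the input.

```python
from html.parser import HTMLParser


class MarkdownishExtractor(HTMLParser):
    """Collects headings, paragraphs, and list items as Markdown-like lines."""

    SKIP = {"script", "style", "nav", "header", "footer", "aside"}  # assumed noise tags
    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
    BLOCKS = {"p", "li", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.lines = []       # finished output lines
        self.current = []     # text fragments of the block being read
        self.prefix = ""      # Markdown prefix for the current block
        self.skip_depth = 0   # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCKS:
            self._flush()
            self.prefix = self.HEADINGS.get(tag, "- " if tag == "li" else "")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in self.BLOCKS:
            self._flush()

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.current.append(data)

    def _flush(self):
        # Collapse runs of whitespace and emit the block if it has any text.
        text = " ".join("".join(self.current).split())
        if text:
            self.lines.append(self.prefix + text)
        self.current = []
        self.prefix = ""


def html_to_markdownish(html):
    """Return a compact Markdown-like rendering of the main text in `html`."""
    parser = MarkdownishExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit any trailing text outside a block tag
    return "\n\n".join(parser.lines)
```

The string returned by `html_to_markdownish` is what would be sent to the model in place of the raw page source.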

mahin · on Nov 14, 2024

That's a good idea. I've been experimenting and markdown seems to produce better results.

nimbusega · on Nov 6, 2024

Thanks for the feedback! Print newspapers have curation, which this lacks. I guess the main thing it takes from newspapers is the image and blurb that help give you a preview of the story.

dangoor · on Nov 6, 2024

There is a form of curation and "editorial judgment" on HN, and that's in the points a post has. A closer approximation of a newspaper would be possible by looking at the points of a post, comparing that to other posts, and then sizing headlines based on how "important" the HN community sees a given story.
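
One way that sizing could work, as a hedged sketch: map each story's points onto a font-size range relative to the other stories currently on the page. The function name, the pixel bounds, and the choice of a log scale (so one runaway story doesn't flatten everything else) are assumptions for illustration.

```python
import math


def headline_sizes(points_by_title, min_px=16, max_px=48):
    """Map each story's points to a headline font size in pixels.

    Sizes are relative to the other stories passed in: the lowest-scoring
    story gets min_px, the highest gets max_px, and the rest fall in
    between on a logarithmic scale.
    """
    if not points_by_title:
        return {}

    # log1p compresses large point counts; negative input is clamped to 0.
    logs = {t: math.log1p(max(p, 0)) for t, p in points_by_title.items()}
    lo, hi = min(logs.values()), max(logs.values())
    span = hi - lo

    sizes = {}
    for title, value in logs.items():
        # If every story has the same points, use the middle of the range.
        frac = (value - lo) / span if span > 0 else 0.5
        sizes[title] = round(min_px + frac * (max_px - min_px))
    return sizes
```

The same idea would work with front-page rank instead of points as the input signal.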

TheSpiceIsLife · on Nov 6, 2024

The other form of curation is place on the front page.

That's probably closer to the editor's choice in the context of HN.

nimbusega · on Nov 6, 2024

Yes, I agree. I think I will change the design to have a hierarchy.

tessierashpool · on Nov 6, 2024

this is exactly how my 2009 version (in my previous comment) chose to size and space its headlines

djmips · on Nov 7, 2024

Currently the 131M Buildings story shows the blog author's picture and bio instead of a summary of the actual story. Is this easily fixable, or is it a tough problem?

mahin · on Nov 14, 2024

It should work better now. I improved the prompt a bit to extract the main content.

nimbusega · on Nov 6, 2024

Thank you! I missed that in my sleepiness. Should be fixed now.

nimbusega · on Nov 6, 2024

It updates every hour. This post is on it now!

nimbusega · on Nov 6, 2024

I thought it would fit the grayscale of newspapers. I can add an option to show them in color.