Hacker News | vetler's comments

But LLMs themselves are literally not going away, I think that's the point. Once a model is trained and let out into the open for free download, it's there, and can be used by anyone - and it's only going to get cheaper and easier.


Yeah like Kimi is good enough, if there was some kind of LLM fire and all the closed source models suddenly burnt down and could never be remade, Kimi 2.5 is already good enough forever.

Good enough is probably redundant, it's amazing compared to last year's models


My instinct was also to use LLMs for this, but it was way too slow and still expensive if you want to scrape millions of pages.


To put things in perspective: Gemini 2.5 Flash is $0.30 per 1M tokens. Assuming each page is 700 tokens and the output is small, you are looking at $210 for 1M pages.
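The arithmetic behind that estimate, as a quick sketch. The price and the 700 tokens per page come from the comment above; the function name is made up for illustration, and output tokens are ignored, as assumed there.

```python
def input_cost_usd(pages: int, tokens_per_page: int, usd_per_million_tokens: float) -> float:
    """Estimate the input-token cost only; output tokens are ignored."""
    total_tokens = pages * tokens_per_page
    return total_tokens / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    cost = input_cost_usd(pages=1_000_000, tokens_per_page=700, usd_per_million_tokens=0.30)
    print(f"${cost:,.2f}")  # prints $210.00
```

The cost scales linearly with tokens per page, so a page that needs 7,000 tokens instead of 700 costs ten times as much.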


You will absolutely struggle to get all the info you need into 700 tokens per page.

Edit: There's also the added complexity of running a browser against 1M pages, or more.


I agree that when pages have a similar structure, for one-time extraction of the content as is (not reasoning from context), scraping with selectors is the way to go.

This library also supports HTML as input so running a browser is not required.
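For the selector approach on raw HTML, here is a minimal sketch using only Python's standard library, so no browser is involved. This is not the library discussed above: `ClassTextExtractor`, `extract_by_class`, and the sample markup are made up for illustration, and matching on a class name is a crude stand-in for a real CSS selector engine.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element that carries a given class name."""

    def __init__(self, class_name: str) -> None:
        super().__init__()
        self.class_name = class_name
        self.depth = 0  # greater than 0 while inside a matching element
        self.results: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1  # nested element inside a match
        elif self.class_name in classes:
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.results[-1] += data


def extract_by_class(html: str, class_name: str) -> list[str]:
    """Return the stripped text of each element with the given class."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.results]


if __name__ == "__main__":
    page = '<ul><li><span class="price">$10</span></li><li><span class="price sale">$<b>7</b></span></li></ul>'
    print(extract_by_class(page, "price"))
```

This only works while the pages share a structure; once the markup varies from site to site, the selectors have to be rewritten per site.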


Came back here to say I was wrong! It is doable. I have been experimenting with setting up a scraping pipeline with LLM enrichment since I wrote the comment above, and have had very positive results so far. :)


As a Norwegian, yeah you're not wrong. We have a lot of challenges to overcome as the oil age is coming to an inevitable end, although there are many here who refuse to see the writing on the wall.


Their YouTube videos are just amazing to watch: https://www.youtube.com/channel/UCh5q-FtihPqzTbgEkZQRy3g


I used to be "anti Tesla", then I tried one, and now I own one.

Teslas are disruptive - they're not what you expect of cars. I will never look at a car the same way again. Tell this to anyone who hasn't tried one, and they simply won't understand what you're going on about.

Over a year in; driving my Model X still makes me smile.


Would be even scarier with eSIM, but I suppose it's just a matter of time before we get that.


I bet it's been there, someplace, for some time now.


So no one opened up one of these and looked at the components?


Just like many physical and non-physical (software) products, very often, literally nobody reviews/verifies/checks them.

Google will learn from this mistake, and next time they'll use a fancy MEMS microphone or a similar technology and place it inside a semiconductor package.

When do you think such a feat will be discovered by independent researchers? Probably never.


There's also the reality that they can simply change the components/layout without ever telling anybody about it.

People just keep buying the same boxes without even being aware that the hardware inside these boxes might be completely different revisions.

Who's to say that future batches of Nest won't have cameras added for "future use"? Who's gonna go through the effort of checking every fresh batch of Nests for revisions like that? And what are the chances of actually catching it when it's only rolled out in small batches?


Good point. I'd like to also mention that it is possible for companies like Amazon to "personalise" orders before they are shipped.


On a related note, there's already a Bitcoin ETF on the Swedish stock exchange: http://xbtprovider.com/


That's actually an ETN (exchange-traded note). But don't ask me what the difference is.


ETNs carry credit risk as they generally don't actually own the underlying asset. Instead, the ETN's issuer promises to stand by the value of the underlying asset.

Historically this was not an issue but the collapse of Lehman Brothers, AIG-FP and others made it clear that owning assets is preferable to someone making a promise to give you those assets.


You still don't directly own the assets in the case of an ETF as far as I am aware? You own shares in the ETF and the ETF owns the assets so I imagine in the case of liquidation you're below creditors? Someone more familiar with the concept would need to elucidate.

In addition, with a synthetic ETF the company doesn't own the underlying exactly (but some portfolio which perfectly replicates it) - which is again slightly different from a Note I think, but still has counterparty risk in there.


If the fund owns the assets, and you own shares in the fund, then you own a part of the assets. Who else would? The management?


You are right, in the case of ETFs. The parent is describing ETFs but my comment (the grandparent) was describing ETNs in response to the great-grandparent which is also speaking about ETNs. Phew!

OK -- to clarify my original comment -- ETNs DO NOT own the asset. The issuer of the ETN promises value equal to the asset, but that promise is worth only as much as the issuer actually is good for. In cases like Lehman's bankruptcy, the promise would need to go through bankruptcy like any other promise. Thus, ETNs are like debt linked to an asset index rather than an interest rate/index. So not only are you exposed to the risk that the underlying/linked asset loses value ("market risk"), but you are also exposed to the risk that the underlying/linked asset does fine yet the issuer cannot make good on their promise ("counterparty risk").

With ETFs, on the other hand, the fund (and hence you) generally own the asset, but even then, some ETFs are slightly different in that they actually own derivatives on the assets such as futures. They are almost the same since futures are daily-settled. ETNs are not daily-settled and thus carry longer-term counterparty risk.
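The two risks can be put into a toy calculation. This is a sketch, not a pricing model: the function names, the `recovery_rate` parameter (the fraction of the promise paid out after an issuer default), and all numbers are made-up assumptions, and the ETF is simplified to holding the asset outright.

```python
def etf_value(asset_value: float) -> float:
    """Simplified ETF: the fund holds the asset, so only market risk applies."""
    return asset_value


def etn_value(asset_value: float, issuer_defaults: bool, recovery_rate: float) -> float:
    """Simplified ETN: an unsecured promise to pay the asset's value."""
    if issuer_defaults:
        # Counterparty risk: holders recover only part of the promised value.
        return asset_value * recovery_rate
    return asset_value


if __name__ == "__main__":
    # Hypothetical scenario: the asset rises from 100 to 120, but the issuer fails.
    asset_now = 120.0
    print(etf_value(asset_now))                                           # 120.0
    print(etn_value(asset_now, issuer_defaults=True, recovery_rate=0.25))  # 30.0
```

In this scenario the asset did fine, yet the ETN holder still loses most of the position, which is the counterparty risk described above.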


How to create readable code that explains itself should have more focus. It's not just about naming classes, methods and variables, but also about structuring your code. It's quite possible to create a method with the perfect name that is over nine thousand lines long and difficult to understand.

This can be very subjective, but code complexity metrics can help identify problem areas. I find that areas with high complexity often benefit from refactoring into smaller pieces, making the code more readable and self-explanatory.

If you can't structure your code to explain itself, perhaps commenting isn't such a bad idea. At least it would be a record of your thoughts at the time, until someone else comes along and hopefully cleans it up.


Just to echo what has already been brought up here, sometimes it's not the what that is important, but the why. Reading the code is unlikely to explain the why for quite a few complex things.


> Not sure why one would want to attack that.

Well, that would make it less obvious.

1. Take a short position on Delta on the stock exchange

2. Sabotage a piece of infrastructure that's important to operations

3. ???

4. Profit!

:)

