Hacker News | new | past | comments | ask | show | jobs | submit | not2b's comments

I understand why OpenAI is trying to reduce its costs, but the claim that AI crawlers aren't creating very significant load is simply false, especially for crawlers that ignore robots.txt and hide their identities. This is direct financial damage, and it's particularly hard on nonprofit sites that have been around a long time.

Are we talking about ChatGPT and Claude Desktop crawlers here, or what exactly? Are these really creating significant load while not honoring robots.txt?

Genuinely interested.


I bet dollars to doughnuts that 95% of the traffic is from Claude and ChatGPT desktop / mobile and not literal content scraping for training.

That wouldn't explain the 1000x increase in traffic for extremely obscure content, or seeing it download every single page on a classic web forum.

What does "verified" mean here? You can verify that a real company posted the job, but you can't verify whether it's fake in the sense that the company already has an H1B candidate it wants for the position and is only advertising it to meet legal requirements.


Friend of mine mentioned someone made a site to find those hidden jobs so people desperate for work already in the US can widen their own net. Not really sure how effective they are at it.

https://www.jobs.now/jobs


Conservapedia needed a person to create each article, and it lacked both the labor and the interest. Grok can spew out any number of pages on any subject, and the topics that aren't ideologically important to Musk will just be the usual LLM verbiage that might be right or might not.


People were literally buying horse dewormer when their doctors wouldn't prescribe it for them. "Influencers" were selling it. So the media were being accurate. To the extent that this made people look dumb, the intent was mostly to shame them into trying something more effective.


> the intent was mostly to shame them

Yeah I don’t like news that does that, as opposed to giving the best information.


You dropped my words "into trying something more effective". Steering people into treatments found to be more effective is, precisely, giving people the best available information. Ivermectin is great if you have a parasitic infection. It doesn't help against viral infections.


Telling them which treatment is more effective is different than shaming, and also different than making up a misleading name like “horse dewormer”.


This is going to be a huge problem for conferences. While journals have a longer time to get things right, as a conference reviewer (for IEEE conferences) I was often asked to review 20+ papers in a short time to determine who gets a full paper, who gets to present just a poster, etc. There was normally a second round, but it would often just look at submissions near the cutoff in the rankings. Obvious slop can be quickly rejected, but it will be easier to sneak things in.


AI conferences are already fucked. Students who are doing their Master's degrees are reviewing those top-tier papers, since there are just too many submissions for existing reviewers.


I think on HN, people waste too much time arguing about the phrasing of the headline, whether it is clickbait, etc. and not enough discussing the actual substance of the article.


You're right, mostly, but the fact remains that the behavior we see is produced by training, and the training is driven by companies run by execs who like this kind of sycophancy. So it's certainly a factor. Humans are producing them, humans are deciding when the new model is good enough for release.


Do you honestly think an executive wanted a chat bot that confidently lies?


Do the lies look really good in a demo when you're pitching it to investors? Are they obscure enough that they aren't going to stand out? If so no problem.


In practice, yes, though they wouldn't think of it that way because that's the kind of people they surround themselves with, so it's what they think human interaction is actually like.


"I want a chat bot that's just as reliable as Steve! Sure, he doesn't get it right all the time and he cost us the Black+Decker contract, but he's so confident!"

You're right! This is exactly what an executive wants to base the future of their business off of!


You say that like it’s untrue, but they measurably prefer a lying but confident salesman over one who doesn’t act with that kind of confidence.

This is very slightly more rational than it seems because repeating or acting on a lie gives you cover.


Yes, that is in fact their revealed preference.

Did you have a point?


You use unfalsifiable logic. And you seem to argue that, given the choice, CEOs would prefer not to maximize revenue in favor of... what, affection for an imaginary intern?


Cute straw man.

You must be a CEO.

I'm not arguing anything. I'm observing reality. You're the one who is desperate to rationalize it.


You are declaring your imagined logic as fact. Since I do not agree with the basis on which you pin your argument, there is no further point in discussion.


You're hallucinating things I did not say.


Given the matrix 'competent/incompetent' / 'sycophant/critic' I would not take it as read that the 'incompetent/sycophant' quadrant would have no adherents, and I would not be surprised if it was the dominant one.


People with immense wealth, connections, influence, and power demonstrably struggle to not surround themselves with people who only say what the powerful person already wants to hear regardless of reality.

Putin didn't conclude that Russia could take Ukraine in 3 days, to literal celebration by the populace, because he surrounds himself with honest folks, for example.

Rich people get disconnected from reality because people who insist on speaking truth and reality around them tend to stop getting invited to the influence peddling sessions.


They may say they don't want to be lied to, but the incentives they put in place often inevitably result in them being surrounded by lying yes-men. We've all worked for someone where we were warned to never give them bad news, or you're done for. So everyone just lies to them and tells them everything is on track. The Emperor's New Clothes[1].

1: https://en.wikipedia.org/wiki/The_Emperor%27s_New_Clothes


No, but they like the sycophancy.


Agreed. I used to review lots of submissions for IEEE and similar conferences, and didn't consider it my job to verify every reference. No one did, unless the use of the reference triggered an "I can't believe it said that" reaction. Of course, back then, there wasn't a giant plagiarism machine known to fabricate references, so if tools can find fake references easily the tools should be used.


Cute. But please don't use this, because in addition to making your text useless for LLMs it makes it useless for blind and vision impaired people who depend on screen readers.


And, conversely, it (presumably) has no effect on VLMs using captive browsers and screenshotting to read webpages.


> making your text useless for LLMs

It arguably doesn't even do this. If this is adopted widely, it would only be for current LLMs; newer models could (and would) be trained to detect and ignore zero-width/non-printable characters.
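
As a rough illustration of how easy that stripping would be, here is a minimal Python sketch. The function name and the particular list of code points are my own illustrative assumptions, not anything a model vendor actually ships:

```python
# Hypothetical sketch: strip common zero-width / invisible characters
# from text before it reaches a model. The set below is an assumed,
# non-exhaustive list of the usual offenders.
ZERO_WIDTH = {
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE (BOM)
}

def strip_zero_width(text: str) -> str:
    """Return text with zero-width characters removed."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)

print(strip_zero_width("he\u200bl\u200dlo"))  # -> hello
```

A preprocessing pass like this (or simply training on obfuscated text) is why the protection would likely only work against today's pipelines.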


6,000+, and those machines served many others (back then there were tens of thousands of machines on the Internet, but probably 10x as many that were connected to these by relays that handled email or Usenet traffic).


Also worth remembering that, especially with Internet-connected computers, almost everything was multiuser. You did work on the Internet from a shell on a shared Unix server, not from a laptop.


Serverless remote workspaces as you might call them now.

