Hacker News | gooob's comments

> “Our opportunity now is to take those 900 million users and turn them into high-compute users,” Simo said, according to a partial transcript of the meeting reviewed by CNBC. “We’ll do that by transforming ChatGPT into a productivity tool.”

Moloch strikes again!

(for those who don't yet get the reference https://slatestarcodex.com/2014/07/30/meditations-on-moloch/)

but seriously, we (or they) need to define what "productivity" actually is these days and in these contexts. is it to generate revenue for a company? is it to become a self-sufficient human? is it to educate the younger generations about what's true? is it to do science? is it to increase the efficiency of whatever task you're already doing (note that increasing efficiency would result in using less compute for the same operation lol)? if we keep moving forward without sitting down and reorienting ourselves about a shared vision, given all the new things we now know since establishing our frameworks of social/economic organization, i fear that there will be a very bad outcome.


Well, if ChatGPT observes some kid writing decent code, analysing info well, asking useful questions, uncovering issues, or showing interest even when there is no reward at hand, and someone like that is needed at a bank, telco, power plant, hospital, govt dept, etc., use your imagination of how the story can go.

can you tell us about this "ansible filesystem swiss army knife"?


I'd be happy to! I find in my playbooks that it is fairly cumbersome to set up files and related objects because of the module distinction between copying files, rendering templates, creating directories... There's a lot of boilerplate that has to be repeated.

For 3-4 years I've been toying with this in various forms. The idea is a "fsbuilder" module that makes a task that logically groups filesystem setup (as opposed to grouping by operation as the ansible.builtin modules do).

You set up in the main part of the task the defaults (mode, owner/group, etc), then in your "loop" you list the fs components and any necessary overrides for the defaults. The simplest could for example be:

    - name: Set up app config
      linsomniac.fsbuilder.fsbuilder:
        dest: /etc/myapp.conf

Which defaults to a template with the source of "myapp.conf.j2". But you can also do more complex things like:

    - name: Deploy myapp - comprehensive example with loop
      linsomniac.fsbuilder.fsbuilder:
        owner: root
        group: myapp
        mode: a=rX,u+w
      loop:
        - dest: /etc/myapp/conf.d
          state: directory
        - dest: /etc/myapp/config.ini
          validate: "myapp --check-config %s"
          backup: true
          notify: Restart myapp
        - dest: /etc/myapp/version.txt
          content: "version={{ app_version }}"
        - dest: "/etc/myapp/passwd"
          group: secrets

I am using this extensively in our infrastructure and run ~20 runs a day, so it's fairly well tested.

More information at: https://galaxy.ansible.com/ui/repo/published/linsomniac/fsbu...
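
For anyone trying to picture the "defaults plus per-item overrides" idea described above, here is a rough Python sketch. It is not the fsbuilder implementation: `resolve_item` and the rule for deriving the template name are assumptions based only on the description in this comment.

```python
import os


def resolve_item(defaults, item):
    """Merge task-level defaults with one loop item; item keys win."""
    merged = {**defaults, **item}
    # Assumption: with no explicit state, content, or src, the entry is
    # treated as a template named after the destination file plus ".j2".
    if not any(k in merged for k in ("state", "content", "src")):
        merged["src"] = os.path.basename(merged["dest"]) + ".j2"
    return merged


defaults = {"owner": "root", "group": "myapp", "mode": "a=rX,u+w"}
items = [
    {"dest": "/etc/myapp/conf.d", "state": "directory"},
    {"dest": "/etc/myapp/config.ini", "backup": True},
    {"dest": "/etc/myapp/passwd", "group": "secrets"},
]
resolved = [resolve_item(defaults, item) for item in items]
for entry in resolved:
    print(entry)
```

The point of the design is that the shared settings are written once and each entry only states what differs.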


wait why not robots.txt?


Good question, at least OAI-SearchBot is hitting robots.txt.

I assume the real issue is that the things overloading servers (security bots, SEO crawlers, and data companies) are the ones that don't respect robots.txt in full, but they wouldn't respect LLMs.txt either.
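
To make the "respecting robots.txt" part concrete, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `SomeSeoBot` agent, and the example.com URLs are made up for illustration. The check is entirely voluntary on the crawler's side, which is why a bot that skips it would skip any other text file too.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot gets limited access, the rest none.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Disallow: /private/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(agent, url):
    """What a well-behaved crawler would ask before requesting url."""
    return parser.can_fetch(agent, url)


print(may_fetch("OAI-SearchBot", "https://example.com/index.html"))
print(may_fetch("OAI-SearchBot", "https://example.com/private/data"))
print(may_fetch("SomeSeoBot", "https://example.com/index.html"))
```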


it's malware in the mind. it was happening before deepfakes were possible. news outlets and journalists have always had incentive to present extreme takes to get people angry, cause that sells. now we have tools that pretty much just accelerate and automate that process. it's interesting. it would be helpful to figure out how to prevent people (especially upcoming generations) from getting swept away by all this.


I think fatigue will set in and the next generation will 'tock' back from this 'tick.' Getting outraged by things is already feeling antiquated to me, and I'm in my 30's.


There's a massive industry built around this on YT, exemplified by the OP's post about his parents. To a first-order approximation, every story with a theme of "X does sexist/racist/ageist/abusive thing to Y and then gets their comeuppance" on YouTube is AI-generated clickbait. The majority of the "X does nice thing for Y and gets a reward or surprise" dating from the last year or two are also AI-generated clickbait, but far more of the former. Outrage gets a lot more clicks than compassion.


> news outlets and journalists have always had incentive to present extreme takes to get people angry, cause that sells.

As someone who’s read a newspaper daily for 30+ years, that is definitely not true. The news has always tried to capture your attention but doing so using anger and outrage, and using those exclusively, is a newer development. Newspapers and broadcast news used to use humor, suspense, and other things to provoke curiosity. When the news went online, it became focused on provoking anger and outrage. Even print edition headlines tend to be tamer than what’s in the online edition.


not to mention the high resource-usage of a local LLM that most PCs wouldn't be able to handle, or would just drain a laptop's battery.


All for searching something trivial, where for 99% of cases the already indexed wikipedia summary is good enough and way faster


what i hate most about this (and the discussion happening in the comments) is that nobody is even defining "AI". "artificial intelligence" is not a technical term. what is Mozilla doing exactly? what does it mean to put AI in the browser?


There is a kind of software that is created using techniques very different from the techniques used to create the vast majority of socioeconomically-important software until about 4 years ago. We need a name for this thing that definitely exists in reality and definitely differs from software created the traditional way. That name is AI. We're probably going to keep on calling it that even if lots of people protest that having "intelligence" in the name is misleading or erroneous.


wait what do you mean? what's wrong with kafka?


wait what's wrong with kafka?


I was in the midst of writing a snarky reply and then realized my actual issue with Kafka is that people reach for it way too often and use it in ways that don't really make sense.

Kind of like how people use docker for everything, when what you really should be doing is learning how to package software.


Ops here, Docker is packaging software.

Agree on the Kafka thing though. I've seen so many devs trip over Kafka topics, partitions and offsets when their throughput is low enough that RabbitMQ would do fine.
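
For anyone who hasn't hit these concepts yet, a toy in-memory model (not Kafka's client API; `ToyTopic` and `ToyConsumer` are hypothetical names, and crc32 stands in for the real partitioner hash) shows what there is to trip over: a key picks a partition, each partition is its own append-only log, and a consumer has to track an offset per partition.

```python
import zlib


class ToyTopic:
    """In-memory stand-in for a topic: a fixed set of append-only partitions."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        # The same key always lands in the same partition, so ordering is
        # only guaranteed per partition. crc32 is used for determinism;
        # it is not the hash Kafka's clients use.
        p = zlib.crc32(key.encode()) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)


class ToyConsumer:
    """Tracks one committed offset per partition."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = [0] * len(topic.partitions)

    def poll(self, partition):
        log = self.topic.partitions[partition]
        records = log[self.offsets[partition]:]
        self.offsets[partition] = len(log)  # "commit" after reading
        return records


topic = ToyTopic(num_partitions=3)
p1, o1 = topic.produce("user-1", "login")
p2, o2 = topic.produce("user-1", "logout")
consumer = ToyConsumer(topic)
first = consumer.poll(p1)
second = consumer.poll(p1)
print(p1, o1, o2, first, second)
```

A single queue has none of this bookkeeping, which is the whole argument for a simpler broker at low throughput.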


No, docker is a software for packaging systems.

The people distributing software should shut them damn up about how the rest of the system it runs in is configured. (But not you, your job is packaging full systems.)

That said, it seems to me that this is becoming less of a problem.


Nothing inherently wrong with the core product IMHO. The issue is more with Confluent, who have been constantly swinging from hot buzzword to hot buzzword for the last few years in search of growth. Confluent cloud is very expensive, and you still have to deal with a surprising amount of scaling headaches. I have people I consider friends that work there, so I don't want to go too deep into their various missteps, but the Kafka ecosystem has been largely stagnant outside of getting rid of Zookeeper and simplifying operations/deployment. There have been some decent quality of life fixes, but the platform is very expensive, yet if you are really all-in on Kafka, you would be insane to not get support from Confluent- it can break in surprising ways.

So you are stuck with some really terrible tradeoffs- Go with Confluent Cloud, pay a fortune, and still likely have some issues to deal with. Or you could go with Confluent Platform, still have to pay people to operate it, while Confluent the company focuses most of their attention on Cloud and still charges you a fortune. Or you could just go completely OS and forgo anything Confluent and risk being really up the river when something inevitably breaks, or you have to learn the hard way that librdkafka has poor support for a lot of the shiny features discussed in the release notes.

Redpanda has surpassed them from a technical quality perspective, but Kafka has them beat on the ecosystem and the sheer inertia of moving from one platform to another. Kafka for example was built in a time of spinning rust hard disks, and expects to be run on general purpose compute nodes, where Redpanda will actually look at your hardware and optimize the number of threads it spawns for the box it is on, assuming it is going to be the only real app running there, which is true for anything but a toy deployment.

This is my experience from running platform teams and being head of messaging at multiple companies.


What's wrong with kafka or what WILL BE wrong with kafka?


So much that we presume in the modern cloud wasn't a given when Apache Kafka was first released in 2011.

kevstev wrote just above about Kafka being written to run on spinning disks (HDDs), while Redpanda was written to take advantage of the latest hardware (local NVMe SSDs). He has some great insights.

As well, Apache Kafka was written in Java, back in an era when you weren't quite sure what operating system you might be running on. For example, when Azure first launched they had a Windows NT-based system called Windows Azure. Most everyone else had already decided to roll Linux. Microsoft refused to budge on Linux until 2014, and didn't release its own Azure Linux until 2020.

Once everyone decided to roll Linux, the "write once run everywhere" promise of Java was obviated. But because you were still locked into a Java Virtual Machine (JVM) your application couldn't optimize itself to the underlying hardware and operating system you were running on.

Redpanda, for example, is written in C++ on top of the Seastar framework (seastar.io). The same framework at the heart of ScyllaDB. This engine is a thread-per-core shared-nothing architecture that allows Redpanda to optimize performance for hardware utilization in ways that a Java app can only dream of. CPU utilization, memory usage, IO throughput. It's all just better performance on Redpanda.

It means that you're actually getting better utility out of the servers you deploy. Less wasted / fallow CPU cycles — so better price-performance. Faster writes. Lower p99 latencies. It's just... better.

Now, I am biased. I work at Redpanda now. But I've been a big fan of Kafka since 2015. I am still bullish on data streaming. I just think that Apache Kafka, as a Java-based platform, needs some serious rearchitecture.

Even Confluent doesn't use vanilla Kafka. They rewrote their own engine, Kora. They claim it is 10x faster. Or 30x faster. Depending on what you're measuring.

1. https://www.confluent.io/confluent-cloud/kora/

2. https://www.confluent.io/blog/10x-apache-kafka-elasticity/


https://en.wikipedia.org/wiki/Enshittification is helpful if you aren't aware of how late stage capitalism works


Late stage of what?


quite interesting, thanks!


same thing i was thinking lol

