> Perhaps you’re a user of LLMs. I get it, they’re neat tools. They’re useful for certain kinds of learning. But I might suggest resisting the temptation to use them for projects like this. Knowledge is not supposed to be fed to you on a plate
Am I the only one using LLMs as if they were a search engine? So before LLMs I was searching Google for things like "pros cons mysql mongodb". I would read the official documentation of each db, forums, blog posts, stackoverflow entries, etc. It was time consuming on the searching side. The time it took to read all the sources was fine for me (it's learning time, so that's always welcome). Now with LLMs, I simply prompt the same with a little more context: "pros and cons of using mysql vs mongodb when storing photos. Link references". So I get a quick overview of what to keep an eye on, and the references are there to avoid relying on hallucinations.
It's true that sometimes I go ahead and say "give me a data schema for storing photo metadata in postgres. I wanna keep X in a different table, though" (or something like that). But I do that because I know very well what the output should look like; I just don't wanna spend time typing it, and sometimes I forget the actual type I should use (int vs integer?).
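For what it's worth, the skeleton I have in mind looks roughly like this (a minimal sketch using sqlite3 as a runnable stand-in; in Postgres you'd reach for bigserial/timestamptz, and int/integer are the same type there anyway; all names are made up for illustration):

    import sqlite3

    # Rough shape of the schema; in Postgres id would be BIGSERIAL and
    # taken_at would be TIMESTAMPTZ. All names here are invented.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE photos (
            id        INTEGER PRIMARY KEY,
            file_path TEXT NOT NULL,
            taken_at  TEXT,
            width_px  INTEGER,
            height_px INTEGER
        );

        -- the "keep X in a different table" part: EXIF-ish key/value metadata
        CREATE TABLE photo_metadata (
            photo_id INTEGER NOT NULL REFERENCES photos(id),
            key      TEXT NOT NULL,
            value    TEXT,
            PRIMARY KEY (photo_id, key)
        );
    """)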
The few times I've used LLMs as question-answering engines for anything moderately technical, they've given me information that was subtly wrong in important ways, such that taking it at face value would likely have cost me hours or days pursuing something unworkable, even when I ask for references. Whether the "references" actually contain the information I'm asking for or merely something tangentially related has been rather hit or miss too.
The one thing they've consistently nailed has been tip-of-my-tongue style "reverse search" where I can describe a concept in sufficient detail that they can tell me the search term to look it up with.
Absolutely. And I’m finding the same with “agent” coding tools. With the ever-increasing hype around Cursor, I tried to give it a go this week. The first 5 minutes were impressive, when I sent a small trial balloon for a simple change.
But when asking for a full feature, I lost a full day trying to get it to stop chasing its tail. I’m still in the “pro” free trial period so it was using a frontier model.
This was for a Phoenix / Elixir project, which I realize is not as well represented in the training data as other languages and frameworks, but it was supposedly consuming the documentation and other reference code I'd linked in, and I'd connected the Tidewave MCP.
Regardless, in the morning with fresh eyes and a fresh cup of coffee, I reverted all the cursor changes and implemented the code myself in a couple hours.
Yes, you have to be very careful when querying LLMs; you have to assume that they are giving you sort of the average answer to a question. I find them very good at telling me how people commonly solve a problem. I'm lucky in that the space I've been working in has a lot of good forum data in the training set, and the average solution tends to be on the more correct side. But you still have to validate nearly everything it tells you. It's also funny to watch the tokenization "fails": when you ask about things like register names, you can see it choose nonexistent tokens. Atmel libraries have a lot of things like this in them.
And the output will be almost correct code, but instead of an answer being:
PORT_PA17A_EIC_EXTINT1
you'll get:
PORT_PA17A_EIC_EXTINT_NUM
and you can tell that it diverged trying to use similar tokens: since _ sometimes follows EXTINT, it's a "valid" token to try, and once it has emitted EXTINT_, NUM is the most likely thing to follow.
That said, it's massively sped up the project I'm working on, especially since Microchip effectively shut down the forums that chatgpt was trained on.
>The one thing they've consistently nailed has been tip-of-my-tongue style "reverse search" where I can describe a concept in sufficient detail that they can tell me the search term to look it up with.
This is basically the only thing I use it for. It's great at it, especially given that Google is so terrible these days that a search describing what you're trying to recall gets nothing. Especially if it involves a phrase heavily associated with other things.
For example "What episode of <X show> did <Y thing> happen?" In the past, Google would usually pull it up (often from reddit discussion), but now it just shows me tons of generic results about the show.
This. I was skeptical at first, but it is indeed good at searching and answering questions! That said, I still have to double-check results for niche queries or for stuff that is relatively new. Sometimes the "sources" for the answers are just someone's opinions — unsubstantiated by any facts — in an old Reddit post that's only tangentially related to the topic. And sometimes you simply know that manual search and digging through SO answers yourself will yield better results. At this point I've developed a gut feeling that helps me decide whether to prompt Perplexity or just g**gle it.
> the "sources" for the answers are just someone's opinions — unsubstantiated by any facts
Isn't that the nature of the web?
I mean, that's exactly what I expect from web searches, so as long as you don't consider the fancy-looking [1] citations "scientific", it just digs up the same information, but summarized.
I don't expect miracles: Perplexity does the same things I've been doing, just faster. It's like a bicycle I guess.
I agree. Use with caution. One of my personal pet peeves with LLM answers is their propensity to give authoritative or definite answers, when in fact they are best guesses, and sometimes pure fantasy.
> Now with LLMs, I simply prompt the same with a little more context: "pros and cons of using mysql vs mongodb when storing photos. Link references".
In the near future, companies will probably be able to pay lots of money to have their products come up better in the comparison. LLMs are smart enough to make the result seem "organic" -- all verifiable information will be true and supported by references; it will only be about proper framing and emphasis, etc.
I'd say LLMs have helped a lot with this problem actually, by somehow circumventing a lot of the decades of SEO that has now built up. But, I fear it will be short-lived until people figure out LLM optimisation.
I'm very grateful that we have a lot of players training LLMs, including several that are published as open models and open weights.
I fully expect LLM results to start including ads, but because of the competition I hope/believe the incentives are much better than they are for, say, Google's search monopoly.
It could potentially be more insidious though.
We'll probably start sending prompts to multiple models and comparing the results with lower-power local models.
I really hope that they don't include ads in paid tiers. But I'm not sure how much you would actually have to pay to cover the potential lost ad revenue... it might be too much.
This is already the case, SEO content, sponsored comparison sites, influencer marketing, it's all about subtle framing. LLMs just supercharge the problem by making it easier and cheaper to scale.
The real issue isn't that LLMs lie, it's that they emphasize certain truths over others, shaping perception without saying anything factually incorrect. That makes them harder to detect than traditional ads or SEO spam.
Open-source LLMs and transparency in prompt+context will help a bit, but long-term, we probably need something like reputation scores for LLM output, tied to models, data sources, or even the prompt authors.
This already happens unintentionally, e.g. Wikipedia loops, where bad info on Wikipedia gets repeated elsewhere, and then the Wikipedia article gets updated to cite that source.
When LLM-generated content is pervasive everywhere, and the training data for LLMs is coming from the prior output of LLMs, we're going to be in for some fun. Validation and curation of information are soon going to be more important than they've ever been.
But I don't think there'll be too much intentional manipulation of LLMs, given how decentralized LLMs already are. It's going to be difficult enough getting consistency with valid info -- manipulating the entire ecosystem with deliberately contrived info is going to be very challenging.
I use it the same way. The feeling is that I'm back in ~2010 when Googling stuff felt like a superpower. I could find anything back then.
Of course, it didn't last long, and trying to Google now is an exercise in pain and frustration. Lots of people have complained about the various things Google and marketers have done to get there, idk, I just don't like how it works now.
Top LLMs feel amazingly good at rapidly surfacing info online, and as I go through the references they're usually pretty good. I guess the same forces as before will apply, and there will be some window of opportunity before it all goes away again.
I wonder when LLMs and services like ChatGPT will become as bloated as search engines are today, with their own equivalent of SEO/SEM tools and other unwanted stuff distracting from and disturbing accuracy, even once the models finally stop hallucinating.
Hopefully not that fast, but I'm pessimistic. The cost of the human bloat will far surpass the current cost of hallucinations. And like we saw with Google, that bloat can become a feature of the content itself, not just contained in the tool.
If you do a web search and find a random blog post full of spelling errors and surrounded by ads, you're not going to trust that at the same level as a Stack Overflow post with a hundred upvotes, or an article with a long comment thread on HN.
But an LLM digests everything, and then spits out information with the same level of detail, same terminology, and same presentation regardless of where it came from. It strips away a lot of the contextual metadata we use to weigh credibility and trust.
Sure, you can follow references from an LLM, but at that point you're just using it as a fuzzier form of web search.
> Sure, you can follow references from an LLM, but at that point you're just using it as a fuzzier form of web search.
That's exactly the superpower, isn't it? Web search without having to dig deep around (especially now that Google's quality has declined over the years). Sure thing, LLMs are capable of more.
I agree and am perfectly happy using it as fuzzier web search, because it works really well for me.
Finding references is often my main goal, but other times I just want some quick facts, in which case I'll be as thorough with checking as I would when reading a random blog with spelling errors.
> I agree and am perfectly happy using it as fuzzier web search, because it works really well for me.
Personally, I want the exact opposite: web search to go back to how it was 10-15 years ago, with deterministic search syntax, granular include/exclude syntax, boolean parsing, and absolutely no "did you mean?"s.
Actually, someone should design one that can pull in quotes: a separate tool the LLM uses to quote, with the guarantee that the result is just a copy and can't be hallucinated. Then you could see the primary source when it's needed/asked for, similar to articles.
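A rough sketch of the idea, all hypothetical, just to show the shape of such a tool: it fetches the page itself and only ever returns byte-for-byte slices of it, so the quote can't be invented:

    import urllib.request

    def verbatim_quote(url: str, start: str, end: str) -> str:
        """Hypothetical 'quote tool': return text copied verbatim from the page,
        or fail if the requested span isn't actually there. Because the result
        is sliced out of the fetched document, it can't be hallucinated."""
        with urllib.request.urlopen(url) as resp:
            page = resp.read().decode("utf-8", errors="replace")
        i = page.find(start)
        if i == -1:
            raise ValueError("start marker not found on page")
        j = page.find(end, i + len(start))
        if j == -1:
            raise ValueError("end marker not found after start")
        return page[i : j + len(end)]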
There will be a race between the attempts to monetize online LLM services like this and the development of consumer-owned hardware that can run local LLMs powerful enough to deliver the same service, but ad-free.
Combined with RAG, a self-hosted LLM will definitely be able to deliver a more impartial and therefore better solution.
I don't think anyone can make a Google today that works as well as it did back then. Google shaped how new content was created, and that was probably a much bigger deal than any changes to the tool itself.
Sure, but in the context of this thread, where the use case for modern LLMs was described as:
"Now with LLMs, I simply prompt the same with a little bit more of context "pros and cons of using mysql vs mongodb when storing photos. Link references"
locally hosted LLMs with RAG will absolutely be able to do this, better than even Googling back then could, so the prospect of monetized LLMs with ads degrading the user experience for this sort of use case is unlikely.
I do this. But the killer use case for me is having it write all the boilerplate and implement some half-working stuff; that keeps my attention on the actual issue, which lets me complete more complex things.
A recent example is when I implemented a (Kubernetes) CSI driver that makes /nix available in a container so you can run an empty image and skip a lot of infra to manage.
I talked to it a bit and eventually it wrote a Nix derivation that runs the CSI codegen for Python and packages it so I could import it. Then I asked it to implement the gRPC interface it had generated, and managed to get a "Hello World" when mounting this volume (just an empty dir). I also asked it to generate the YAML for the StorageClass, CSIDriver, Deployment and DaemonSet.
So the LLM left me with a CSI driver that does nothing, in Python (rather than Go, which is what everything Kubernetes is implemented in), that I could then rewrite to run a Nix build and copy store paths into a folder that's mounted into the container.
Sure, implementing a gRPC interface might not be the hardest thing in hindsight, but I've never done it before and it's now a fully functional(ish) implementation of what I described.
It even managed to switch gRPC implementations, because the Python one was funky with protoc versions in Nix (the Python package bundles the gRPC codegen, which is so stupid), so I asked it to do the same thing with grpclib instead, which worked.
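From memory, the skeleton it produced looked roughly like this (a sketch, not the actual driver: the csi_pb2/csi_pb2_grpc module names depend on how the codegen is run, and the Identity service plus the other Node RPCs are elided):

    from concurrent import futures
    import os

    import grpc
    # Generated from csi.proto; module names are an assumption about the codegen.
    import csi_pb2
    import csi_pb2_grpc

    class NixNode(csi_pb2_grpc.NodeServicer):
        def NodePublishVolume(self, request, context):
            # "Hello World" version: just make the target path exist as an empty dir.
            # The real driver runs a nix build and copies store paths in here.
            os.makedirs(request.target_path, exist_ok=True)
            return csi_pb2.NodePublishVolumeResponse()

        def NodeUnpublishVolume(self, request, context):
            return csi_pb2.NodeUnpublishVolumeResponse()

    def serve(addr="unix:///csi/csi.sock"):
        server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
        csi_pb2_grpc.add_NodeServicer_to_server(NixNode(), server)
        server.add_insecure_port(addr)
        server.start()
        server.wait_for_termination()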
The problem with Lisp (or at least Clojure) is that abstracting away the boilerplate requires you to correctly identify the boilerplate.
It’s nontrivial to structure your entire AST so that the parts you abstract away are the parts you’re not going to need direct access to three months later. And I never really figured out, or saw anyone else figure out, how to do that in a way which establishes a clear pattern for the rest of your team to follow.
Especially when it comes to that last part, I’ve found pragmatic OOP with functional elements, like Ruby, or task-specific FP, like Elm, to be more useful than Clojure at work or various Lisps for hobby projects. Because patterns for identifying boilerplate are built in. Personal opinion, of course.
Yes, good tooling shouldn't have boilerplate. Minimizing loc (within reason, not code golf) is the best thing you can do for maintainability. Unfortunately things like Java are popular too.
I hear you. But removing boilerplate via abstraction (Lisp) is very different from generating it on demand (LLMs). The former is obviously qualitatively better. But it requires up front design, implementation testing etc. The latter is qualitatively insufficient, but it gets you there with very little effort plus some manual fixes.
> The latter is qualitatively insufficient, but it gets you there with very little effort plus some manual fixes.
I remember years ago, when I worked at a large PC OEM, I had a conversation with one of our quality managers -- if an updated business process consumes half the resources, but fails twice as often, have you improved your efficiency, or just broken even?
"Qualitatively insufficient, but gets you there" sounds like a contradiction in terms, assuming "there" is a well-defined end state you're trying to achieve within equally well-defined control limits.
There’s necessary complexity like error handling, authz, some observability things, etc. which can’t be trivially abstracted away and needs to be present and adjusted for each capability/feature.
i’ve stopped writing “real” code for the most part, i just bang out some pseudo code like:
read all files in directory ending in .tmpl
render these as go templates
if any with kind: deployment
add annotation blah: bar
publish to local kubeapi using sa account foo
and tell it to translate it to x lang.
so i control the logic, it handles the syntax.
asking it to solve problems for you never seems to really work, but it remembers syntax and if i need some kinda reader interface over another or whatever.
can’t help me with code reviews tho, so i spent most of my time reading code instead of remembering syntax. i’m ok with it.
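for example, translated to Python that pseudo code comes out as something like this (a rough sketch with invented names; jinja2 stands in for Go templates here, and the "sa account foo" part is reduced to whatever kubeconfig you load):

    from pathlib import Path

    import jinja2                      # standing in for Go templates in this sketch
    import yaml
    from kubernetes import client, config, utils

    def publish_templates(directory="."):
        config.load_kube_config()      # service-account / token detail goes here
        api = client.ApiClient()
        for path in sorted(Path(directory).glob("*.tmpl")):
            obj = yaml.safe_load(jinja2.Template(path.read_text()).render())
            if obj.get("kind", "").lower() == "deployment":
                obj.setdefault("metadata", {}).setdefault("annotations", {})["blah"] = "bar"
            utils.create_from_dict(api, obj)   # publish to the local kubeapi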
yeah, that’s what works for me also. LLMs are a nightmare for debugging but a breeze for this.
another good use case: have it read a ton of code and summarize it. if you’re dealing with a new (to you) area in a legacy application, and trying to fix a problem in how it interacts with a complex open-source library, have the LLM read them both and summarize how they work. (while fact-checking it along the way.)
> Am I the only one using LLMs as if they were a search engine?
Nope, you're not alone as I also do this. I'm also not using any AI IDE's (yet).
It's funny, I actually recently failed a live technical interview where I was using my LLM of choice to answer Google-like queries instead of using an AI IDE like Cursor. The interviewer told me that he had actually never seen anyone use AI for coding like that before. Up to that point, I assumed that most coders were mainly using AI as a search engine and not necessarily using AI IDEs yet. Are we really that rare?
Not at all, I've been doing this with ChatGPT and Claude for a long time. I only recently (last couple weeks) started playing around with Claude Code on command line (not in an IDE). I didn't like Cursor very much. YMMV
In the past, looking through documentation, SO answers, etc. would hopefully have helped you learn more about the tools and develop the skills required to independently analyze the pros and cons. If you ask an LLM (or a search engine, or a colleague) and take their words as ground truth, then you won't develop those skills. Worse, sooner or later no one will have enough knowledge or analytic skill to form an opinion on any sufficiently deep subject, and everyone will depend on corporate chatbots to spoon-feed them information that may or may not be biased against their interests. Now imagine if ChatGPT tells you to use Azure and Gemini tells you to use GCP…
This, and I use them for code review and for rapidly generating prototypes that I heavily edit. Usually almost none of the LLM code survives. You could ask "Why don't you just write it yourself then?", but sometimes getting started with the skeleton of a working project is the most difficult part of it.
It is nice when it works, but sometimes I run into trouble where I don't know the right word to put in the prompt to get the answer I'm looking for. I've recently been playing around with Raku and had a really cryptic type-signature error, and Claude was of absolutely no help because it didn't know about the interaction of Signature literals and 'Slurpy' sigils in method parameters. Only when I learned about it and included the word Slurpy in my prompt would it actually regurgitate the information I was looking for, but at that point I already knew it.
I think the key difference here is that if you type the wrong thing into Google, it will return poor results that make it fairly clear that you're not on the right track.
LLMs will sometimes just invent something that basically gaslights you into thinking you're on the right track.
This plus easier search of poorly/not-at-all documented APIs is like 80% of my usage too. Besides that, a lot of “here’s my design for xyz system, am I a stupid idiot for using this architecture?”.
Yup, this is where 90% of the productivity benefits come from for me. Instead of needing to spend an hour scouring documentation, I can ask an LLM and have an answer in 5 minutes.
Best metaphor I have found to how I use them is as "a hunting dog".
They can get into small crevices and the foliage and whatnot, and they don't mind getting wet. They can flush rabbits out. And they are somewhat smart. But you still have to make the kill, and you have to lead the dog, not the other way around.
"Copilot" is a great marketing name, but a bit deceiving.
> I would read the official documentation of each db, forums, blog posts, stackoverflow entries, etc. It was time consuming on the searching side. The time it took to read all the sources was fine for me (it's learning time, so that's always welcome).
This learning time that you welcomed is what you will now miss out on. The LLM gives you an answer, you don't know how good it is, you use it, and soon enough, if you run into a similar issue, you will need to ask the LLM again, since you missed out on all that learning the first time which would have enabled you to internalize the concepts.
It's like the regex engine example from the article. An LLM can create such a thing for you. You can read through it, it might even work, but the learning from this is orders of magnitude less than what you get if you build it yourself.
I think it depends. LLMs can link to the references they took the content from, and you can go and read those references. I like what LLMs provide, and at the same time I don't wanna blindly follow them, so I always allocate time for learning, whether it's with LLMs or not.
You'd be surprised. A number of fairly technical people who are just not that familiar with ML I know got confused by this and believed the models were actually being tuned daily. I don't think that's universally understood at all.
That has actual practical implications and isn't just pedantry. People might like some model and avoid better dialog engines like Perplexity, believing they'd have to switch.
I think you either didn't read my response or missed the point. No matter if the LLM output is useful or not, the learning outcome is hugely impacted. Negatively.
It's like copying on your homework assignments. Looks like it gets the job done, but the point of a homework assignment is not the result you deliver, it's the process of creating that result which makes you learn something.
I recently ended up having a multiple hour long conversation with a friend of mine about the potential impact of LLMs and similar tools, which I then made into a blog post: https://blog.kronis.dev/blog/ai-artisans-and-brainrot
Basically, if you use agentic tools (not just to look things up or get surface level answers, but write your code for you), then it's quite likely that your brain no longer does as much heavy lifting as it would have before, nor does anything you do to solve the issues you're working with have much staying power.
Over time, this will decrease our cognitive capabilities in regard to specific things, much like what has largely happened with language syntax knowledge thanks to various IDEs and language servers, auto-complete and so on: you no longer need to memorize as much, so you don't.
While outsourcing some of that doesn't seem too bad, it's hard to tell what will happen to our critical thinking skills in the long term, since we're not just getting rid of some technical implementation details, but making the tool do a lot of thinking for us. All while screwing over those who don't use these tools, because they have to deal with the full friction of solving the problems.
For really basic overviews of different technologies I've found YouTube videos to be useful.
For code examples I've always found GitHub code search to be super useful.
I haven't really found a use case for LLMs where it's any faster than the research tools already available on the internet.
A good example is writing AWS CDK code, which is a pain in the ass. But there are so many examples to pull from on GitHub code search that are being used in real projects that I've found this method to be faster than prompting an LLM that may or may not be correct.
I just used an LLM to sleuth out a concurrency issue earlier - I could see what was happening in the debugger, I just couldn’t really see why - asked the LLM to talk me through lifecycle stuff, and boom, first thing it brought up was apparently the answer. Thank you, glorified fuzzy search! Much quicker than poring through the docs.
I got lucky sure, and sometimes I don’t get so lucky - but it works often enough to have become a part of my regular approach at this point. Why is the thing broke? Maybe the robot knows!
Even when using them to code, I use them as a search engine. Rather than telling them to implement feature X, I clone a repo which has a similar feature and say:
"explore the repo I've cloned at /src/foo and explain how it achieves barFeature. Now look a the project at /src/baz and tell me why it would be difficult to use foo's approach in baz"
I rarely have it do anything novel, just translate ideas from existing projects into mine. Novel work is for me to enjoy coding directly.
I learned about window functions in SQL using an LLM. I hadn't written SQL in over a decade and never ran across them. It explained how they work and the trade-offs. It was great!
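For anyone else who hasn't run across them: the gist is that a window function computes something over a group of related rows without collapsing them the way GROUP BY does. A tiny runnable sketch (sqlite3 supports them in reasonably recent versions; table and column names are invented):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (customer TEXT, amount INTEGER);
        INSERT INTO orders VALUES ('a', 10), ('a', 30), ('b', 20);
    """)
    # Unlike GROUP BY, every row survives; SUM is computed over each row's
    # window (its customer), giving a running total per customer.
    rows = conn.execute("""
        SELECT customer, amount,
               SUM(amount) OVER (PARTITION BY customer ORDER BY amount) AS running_total
        FROM orders
        ORDER BY customer, amount
    """).fetchall()
    print(rows)  # [('a', 10, 10), ('a', 30, 40), ('b', 20, 20)]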
My main use for LLMs is to ask it to give me some boilerplate configuration for some library/tool I am not very familiar with and then look up the docs for the options it spits out. Like:
"give me the terraform configuration for an AWS ECS cluster with 2 services that can talk to each other, with one of them available publicly and the other one private"
Occasionally to give some small self-contained algorithms, for example:
Give me a way to format the difference between dates as human-readable text. For example: 1 day, 16 hours.
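The kind of thing it hands back is roughly this (a sketch, not the actual output; a real version would handle more units and edge cases):

    from datetime import datetime

    def human_delta(a: datetime, b: datetime) -> str:
        """Format the difference between two datetimes, e.g. '1 day, 16 hours'."""
        seconds = int(abs((b - a).total_seconds()))
        parts = []
        for name, size in (("day", 86400), ("hour", 3600), ("minute", 60)):
            count, seconds = divmod(seconds, size)
            if count:
                parts.append(f"{count} {name}" + ("s" if count != 1 else ""))
            if len(parts) == 2:        # keep it readable: at most two units
                break
        return ", ".join(parts) if parts else "less than a minute"

    print(human_delta(datetime(2024, 1, 1), datetime(2024, 1, 2, 16)))  # 1 day, 16 hours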
> Am I the only one using LLMs as if they were a search engine?
I quite like the search-first LLMs like Gemini and Copilot for this reason. They give you links which you can use to verify the output and seem to be less prone to directing you to SEO spam than Google Search/Bing.
Although I do partly think that search engines today deliver poor performance compared to what they delivered historically, so LLMs are benefiting from that.
It's kind of scary how good it is. I haven't completely switched over, but I think a lot of that is just not wanting to admit that this is the new paradigm and the implications. The ability to not only find what you're looking for faster, but have it tailored to your specific context? It's hard to go back from that.
> Am I the only one using LLMs as if they were a search engine?
Google agrees with you: now at the top there is an AI-generated answer, clearly labeled as AI generated, and it cites its sources. I was trying to escape AI, but I have to recognize the implementation by Google is quite good.
My most common use is to treat it like one of those recipe clippers that grabs a recipe out of an overly long blog post. Last weekend I punched '28 years after credits?' into Kagi and got back only what I wanted. Finally I can have a recipe clipper for any topic. I'm happy with it.
i'm currently using it in large part as a question asking tool. feed it a bunch of documents, then ask it to ask me questions to figure out whats been missed -- stuff that i probably know, or could find out, or look for more documents that have the answer, but arent clear from the documents i had handy.
this afternoon i spent some time feeding context, with the idea of making a QnA context for maybe 50 different website routes, so i can have it ask me similar questions per route, so i can have a hell of a lot of pretty accurate documentation for an intern to pick up a couple to work on, or get some LLM to take a try at filling out the details.
im feeding it knowledge, with the output being documents, rather than me getting knowledge from it.
I primarily use it this way or as a rubber duck. I pretty much use it as a replacement for going on a Google adventure. When I use it for code, it's mostly as a glorified auto-complete or write some boilerplate I can't be bothered to type.
That's basically how I use it though I cannot wait for Gemini to be a click away in my browser with the current screen/page(s)/tab(s) embedded so I can ask it stuff about the current long article/documentation page. We're becoming very, very lazy.
I tell to myself to use them as "teachers", not "interns", i.e. ask them questions to guide my process or look for the sources of knowledge needed to understand or do something, instead of asking them to get things done (except tedious, simple tasks).
Make sure to spend some time asking them questions on topics you already know a lot about. (I like to ask the AI about a game I developed called Neptune's Pride.)
A year ago the AI would just make up some completely random stuff. The current crop do a very good job, but still not 100% correct.
They are wrong enough that I would be wary of using them to "teach" me topics I don't know about.
Nope, I do this too (most of the time.) I don’t like working on code I don’t understand. I have started to ask it to use a client API I’ve written to figure out how clients would work with the stuff I write though. It’s great.
I think one of the major issues I (and others) have with LLMs as search is that they are heavily biased towards outdated data. The other day I was trying to have one get me the most up-to-date versions of X, X, X from an old project. I tried several providers, and all of them without exception gave me an "upgrade" to an older version than was in the project for at least 1 item (by years). I'm sure they are choosing LTS releases over the new ones because that is most popular overall in the dataset. However, no matter how "hard" I prompt, they keep giving me those old ones over "latest".
My LLM use case preference is exactly that - a Google replacement. I've modified my prompts to ask for links to direct source material, thereby allowing me to go deeper. I find this in no way different from a search engine.
The downside however, is that at least 50% of the links cited no longer exists. This points to a general issue with LLMs, temporal knowledge. I tend to at least go to archive.org on a per-link basis when so inclined.. But clearly that's not ideal.
Does anyone have a better solution, short of just asking the LLM to remove the extra step, and link to the archive.org cache for the ingestion date (which frankly, I also don't have faith in being accurate).
It's a suggestion engine. An autocomplete (I am not saying GPT is just autocomplete, of course it is more advanced), but at the end of the day you need to verify everything it does right now. That may not be true in the future, though.
Search finds you sources then you can decide if you trust them. AI generates code and you have no idea if what it generated was perfect, slop or almost perfect but bad in some major way.
Nope! I use LLMs like I used to use Google and StackOverflow. Really great to bounce ideas off of, brainstorm, and yes, save time too... But at the end of the day, what will separate programmers in the future is those who bother to put an ounce of effort into things, read, and learn, versus those who want the easy button and the answer handed to them. They're going to be the next generation of people who leak your passwords, social security numbers, and PII, and create massive global security incidents.
To be clear. AI doesn't kill people. People kill people. In this case, lazy people. Welcome to Idiocracy my friends.