s5ma6n's comments | Hacker News

Thanks for sharing it; however, I have an unrelated comment.

Maybe I am in the minority here, but I just wanted to provide this feedback: the background animation of the blog page is really distracting and makes it difficult to focus on the actual content.


I am also using conda, specifically mamba, which has a really fast dependency solver.

However, sometimes repos require system-level packages as well. I tried to run TRELLIS recently and gave up after two hours of tinkering to get it to work on Windows.

Also, whenever I try to run a new repo locally, creating a new virtual environment takes a ton of disk space due to the CUDA and PyTorch libraries. It adds up quickly to hundreds of gigabytes, since most projects use different versions of these libraries.

</rant> Sorry for the rant, can't help myself when it's Python package management...


Same experience. They should really store these blobs centrally under a hash and link to them from the venvs.
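For what it's worth, uv takes roughly this approach: a global content-addressed cache with packages hardlinked into each venv. A minimal sketch of the idea (the `dedupe` helper and store layout are made up for illustration):

```python
import hashlib
import os
from pathlib import Path

def dedupe(file: Path, store: Path) -> None:
    """Replace `file` with a hardlink into a content-addressed store,
    so identical blobs across venvs share one copy on disk."""
    digest = hashlib.sha256(file.read_bytes()).hexdigest()
    store.mkdir(parents=True, exist_ok=True)
    blob = store / digest
    if not blob.exists():
        os.link(file, blob)   # first sighting: publish into the store
    elif not file.samefile(blob):
        file.unlink()         # duplicate: relink to the stored copy
        os.link(blob, file)
```

Hardlinks only work within a single filesystem, which is one reason real tools fall back to copying when the cache and the venv live on different volumes.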


Also relevant: a collection of useful single-file public-domain libraries for C/C++:

https://github.com/nothings/stb


I did not know that cleanrooms have classes. Apparently they used a Class 10,000 cleanroom, which is one of the "dirtiest" grades.

Surely an asteroid sample-return mission is scientifically one of the most important and difficult things to accomplish, and I wonder why they did not use a higher-class cleanroom for this, even though they mentioned that "Researchers recommend enhanced contamination control procedures for future sample-return missions to prevent microbial colonization and ensure the integrity of extraterrestrial samples."


I bet the hardest part is flying a higher-class cleanroom equivalent to an asteroid so it lands immaculate on Earth. Otherwise it doesn't make much sense to upgrade the cleanroom when the sample has already been exposed.


A cleanroom is for people to work in; the samples "just" need to be in a sealed container.


Agreed, providing examples instead of fine-tuning is definitely a useful insight.

While it is not very important for this toy case, it's good to keep in mind that each example provided in the input will increase prediction time and cost compared to fine-tuning.


I am puzzled why they "asked the model" for its confidence instead of using the logprobs of the output tokens to estimate confidence in the responses.

In my use cases and tests, the model itself is not capable of giving a reliable confidence value, whereas logprobs almost always provide a better view of calibration.
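For concreteness, logprobs are per-token natural logs of probabilities, so turning them into a confidence score is just exponentiation. The two generic variants below are a sketch of what "use the logprobs" can mean, not the article's method; the logprob lists themselves would come from the API (e.g. `logprobs=True` on OpenAI chat completions):

```python
import math

def joint_confidence(token_logprobs: list[float]) -> float:
    """Probability of the whole sampled answer (product of token probs)."""
    return math.exp(sum(token_logprobs))

def mean_token_confidence(token_logprobs: list[float]) -> float:
    """Length-normalized variant: geometric mean of the per-token probs,
    so longer answers aren't penalized just for having more tokens."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))
```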


To measure confidence based on the logprobs of a given token, you must first know which token you're measuring - that's why a lot of benchmarks love multiple-choice questions, where the LLM responds with a single token.

But of course that's not the way LLMs are normally used. And it precludes any sort of chain-of-thought reasoning.
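In that multiple-choice setup, the usual trick is to read the logprob of each option letter at the answer position and renormalize, since the options rarely capture all of the probability mass. A sketch (the input dict is assumed to come from an API's `top_logprobs` field):

```python
import math

def choice_distribution(option_logprobs: dict[str, float]) -> dict[str, float]:
    """Renormalize the logprobs of the answer letters (e.g. "A".."D")
    into a proper probability distribution over just those options."""
    probs = {opt: math.exp(lp) for opt, lp in option_logprobs.items()}
    total = sum(probs.values())
    return {opt: p / total for opt, p in probs.items()}
```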

For some questions, like those involving calculations, letting the model talk to itself produces much better results. For example compare https://chatgpt.com/share/67238eda-6b08-8011-8d2d-a945f78e6f... to https://chatgpt.com/share/67235a98-d2c8-8011-b2bf-53c0efabea...


To me it boils down to what is to be measured here. With logprobs we can measure both correctness and "not attempted", i.e. whether the LLM is guessing the response.

Similar to exams, where both the progress toward the solution and the final outcome/value of the calculations are part of the grade.

To have your cake and eat it too with chain-of-thought reasoning, one way is to ask for a "final answer" so that the logprobs of the final response tokens can be evaluated https://chatgpt.com/share/67239d92-b24c-800a-af8c-40da7be1f5...

Another trick is using JSON mode to keep intermediate results and final response separate, so each can be graded accordingly.
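The JSON-mode split might look like this (the schema and field names are just an illustration, not from the thread):

```python
import json

def split_for_grading(raw_response: str) -> tuple[str, str]:
    """Separate chain-of-thought from the answer in a JSON-mode reply,
    so the reasoning and the final answer can be graded independently
    (e.g. logprob-scoring only the tokens of the "final_answer" field)."""
    parsed = json.loads(raw_response)
    return parsed["reasoning"], parsed["final_answer"]
```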


> one way is to ask for a "final answer" so the final response token logprobs can be evaluated

Alas, this won't work.

Imagine I ask an LLM to continue the sentence "Summing those up: 4+6.75+6.52=17.27 litres of pure alcohol. In summary, the total amount of pure alcohol they have is: "

The logprobs of the next token do not represent the LLM's confidence in its own answer. They represent the LLM's confidence in its ability to repeat the total from 18 words previously.


Here are some benchmarks I ran that compare the precision/recall of various LLM error-detection methods, including logprobs and LLM self-evaluation / verbalized confidence:

https://cleanlab.ai/blog/4o-claude/

These approaches can detect errors better than random guessing, but there are other approaches that are significantly more effective in practice.


I wonder what would happen if the token input included the logprob of each selected token (or n/a for input from outside the LLM) and the network were trained with that extra layer of information, especially during the human-feedback training at the end.



Chip Player is awesome. I love the retro-looking UI.


I would be extremely surprised if that's the case. There are "open-source" multimodal LLMs that can extract text from images, which proves the idea works.

Probably the model is hallucinating and adding "Hungarian language is not installed for Tesseract" to the response.


This sounds like a solution I could benefit from a lot as well, since I am using Obsidian for almost everything.

Could you elaborate more on the "local AI assistant parsing" part please?



Apparently PEG is/was one of the reasons for the severe allergic reactions to the mRNA COVID-19 vaccines (Moderna, Pfizer) [0].

That's plastics for you folks...

[0]: [Translated from German](https://www-aerzteblatt-de.translate.goog/archiv/217236/COVI...)


Not a plastic, and it is commonly found in food and cosmetics.

>evidence shows the existence of a detectable level of anti-PEG antibodies in approximately 72% of the population, never treated with PEGylated drugs, based on plasma samples from 1990 to 1999. Due to its ubiquity in a multitude of products and the large percentage of the population with antibodies to PEG, hypersensitive reactions to PEG are an increasing concern.[39][40] Allergy to PEG is usually discovered after a person has been diagnosed with an allergy to an increasing number of seemingly unrelated products, including processed foods, cosmetics, drugs, and other substances that contain PEG or were manufactured with PEG

https://en.wikipedia.org/wiki/Polyethylene_glycol


PEG is not plastics...


It is a petroleum-based polymer; tomayto vs. tomahto...


Tons of common plastics are made from plants. PLA, the most common FDM 3D-printing plastic, is made from corn. Ethylene in particular is also naturally occurring; it's one of the most important plant hormones.

"Plastic" implies it is solid, which is wrong.


many grades of peg are liquid at room temperature; the others are highly water-soluble. you've probably never seen an object manufactured from peg in your life. that's why people don't normally call it 'plastic'


You're forgetting about Aluminum

