s5ma6n's comments | Hacker News

Thanks for sharing it; however, I have an unrelated comment.

Maybe I am in the minority here, but I just wanted to provide this feedback: the background animation of the blog page is really distracting and makes it difficult to focus on the actual content.


I am also using conda, specifically mamba, which has a really fast dependency solver.

However, sometimes repos require system-level packages as well. I tried to run TRELLIS recently and gave up after two hours of tinkering to get it to work on Windows.

Also, whenever I try to run a new repo locally, creating a new virtual environment takes a ton of disk space due to the CUDA and PyTorch libraries. It adds up quickly to hundreds of gigabytes, since most projects use different versions of these libraries.

</rant> Sorry for the rant, can't help myself when it's Python package management...


Same experience. They should really store these blobs centrally under a hash and link to them from the venvs.
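For what it's worth, uv takes roughly this approach: a global content-addressed cache with packages hardlinked into each venv. A minimal sketch of the idea (the `dedupe` helper and store layout are made up for illustration):

```python
import hashlib
import os
from pathlib import Path

def dedupe(file: Path, store: Path) -> None:
    """Replace `file` with a hardlink into a content-addressed store,
    so identical blobs across venvs share one copy on disk."""
    digest = hashlib.sha256(file.read_bytes()).hexdigest()
    store.mkdir(parents=True, exist_ok=True)
    blob = store / digest
    if not blob.exists():
        os.link(file, blob)   # first sighting: publish into the store
    elif not file.samefile(blob):
        file.unlink()         # duplicate: relink to the stored copy
        os.link(blob, file)
```

Hardlinks only work within a single filesystem, which is one reason real tools fall back to copying when the cache and the venv live on different volumes.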


Also relevant: a collection of useful single-file public-domain libraries for C/C++:

https://github.com/nothings/stb


I did not know that cleanrooms have classes. Apparently they used a Class 10,000 cleanroom, which is one of the "dirtiest" grades.

Surely an asteroid sample-return mission is scientifically one of the most important and difficult things to accomplish, and I wonder why they did not use a higher-class cleanroom for this, even though they mentioned that "Researchers recommend enhanced contamination control procedures for future sample-return missions to prevent microbial colonization and ensure the integrity of extraterrestrial samples."


I bet the hardest part is flying a higher-class cleanroom equivalent to an asteroid so it lands immaculate on Earth. Otherwise it doesn't make much sense to upgrade the cleanroom when the sample has already been exposed.


A cleanroom is for people to work in; the samples "just" need to be in a sealed container.


Agreed, providing examples instead of fine-tuning is definitely a useful insight.

While it is not very important for this toy case, it's good to keep in mind that each example provided in the input will increase prediction time and cost compared to fine-tuning.


I am puzzled why they "asked the model" for its confidence instead of using the logprobs of the output tokens to estimate confidence in the responses.

In my use cases and tests, the model itself is not capable of giving a reliable confidence value, whereas logprobs almost always provide a better view of calibration.
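For concreteness, logprobs are per-token natural logs of probabilities, so turning them into a confidence score is just exponentiation. The two generic variants below are a sketch of what "use the logprobs" can mean, not the article's method; the logprob lists themselves would come from the API (e.g. `logprobs=True` on OpenAI chat completions):

```python
import math

def joint_confidence(token_logprobs: list[float]) -> float:
    """Probability of the whole sampled answer (product of token probs)."""
    return math.exp(sum(token_logprobs))

def mean_token_confidence(token_logprobs: list[float]) -> float:
    """Length-normalized variant: geometric mean of the per-token probs,
    so longer answers aren't penalized just for having more tokens."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))
```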


To measure confidence based on the logprobs of a given token, you must first know which token you're measuring - that's why a lot of benchmarks love multiple-choice questions, where the LLM responds with a single token.

But of course that's not the way LLMs are normally used. And it precludes any sort of chain-of-thought reasoning.
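In that multiple-choice setup, the usual trick is to read the logprob of each option letter at the answer position and renormalize, since the options rarely capture all of the probability mass. A sketch (the input dict is assumed to come from an API's `top_logprobs` field):

```python
import math

def choice_distribution(option_logprobs: dict[str, float]) -> dict[str, float]:
    """Renormalize the logprobs of the answer letters (e.g. "A".."D")
    into a proper probability distribution over just those options."""
    probs = {opt: math.exp(lp) for opt, lp in option_logprobs.items()}
    total = sum(probs.values())
    return {opt: p / total for opt, p in probs.items()}
```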

For some questions, like those involving calculations, letting the model talk to itself produces much better results. For example compare https://chatgpt.com/share/67238eda-6b08-8011-8d2d-a945f78e6f... to https://chatgpt.com/share/67235a98-d2c8-8011-b2bf-53c0efabea...


To me it boils down to what is to be measured here. With logprobs we can measure both correctness and "not attempted", i.e. whether the LLM is guessing the response.

Similar to exams, where both the progress toward the solution and the final outcome/value of the calculations are part of the grade.

To have your cake and eat it too with chain-of-thought reasoning, one way is to ask for a "final answer" so that the logprobs of the final response tokens can be evaluated https://chatgpt.com/share/67239d92-b24c-800a-af8c-40da7be1f5...

Another trick is using JSON mode to keep intermediate results and final response separate, so each can be graded accordingly.
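The JSON-mode split might look like this (the schema and field names are just an illustration, not from the thread):

```python
import json

def split_for_grading(raw_response: str) -> tuple[str, str]:
    """Separate chain-of-thought from the answer in a JSON-mode reply,
    so the reasoning and the final answer can be graded independently
    (e.g. logprob-scoring only the tokens of the "final_answer" field)."""
    parsed = json.loads(raw_response)
    return parsed["reasoning"], parsed["final_answer"]
```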


> one way is to ask for a "final answer" so the final response token logprobs can be evaluated

Alas, this won't work.

Imagine I ask an LLM to continue the sentence "Summing those up: 4+6.75+6.52=17.27 litres of pure alcohol. In summary, the total amount of pure alcohol they have is: "

The logprobs of the next token do not represent the LLM's confidence in its own answer. They represent the LLM's confidence in its ability to repeat the total from 18 words previously.


Here are some benchmarks I ran that compare the precision/recall of various LLM error-detection methods, including logprobs and LLM self-evaluation / verbalized confidence:

https://cleanlab.ai/blog/4o-claude/

These approaches can detect errors better than random guessing, but there are other approaches that are significantly more effective in practice.


I wonder what would happen if the token input included the logprob of each selected token (or n/a for input from outside the LLM) and the network were trained with that extra layer of information, especially during the human-feedback training at the end.



Chip Player is awesome. I love the retro-looking UI.


I would be extremely surprised if that's the case. There are "open-source" multimodal LLMs that can extract text from images, which proves the idea works.

Probably the model is hallucinating and adding "Hungarian language is not installed for Tesseract" to the response.


This sounds like a solution I could benefit from a lot as well, since I am using Obsidian for almost everything.

Could you elaborate more on the "local AI assistant parsing" part please?



Apparently PEG is/was one of the reasons for the severe allergic reactions to the mRNA COVID-19 vaccines (Moderna, Pfizer) [0].

That's plastics for you folks...

[0]: [Translated from German](https://www-aerzteblatt-de.translate.goog/archiv/217236/COVI...)


Not a plastic, and it is commonly found in food and cosmetics.

>evidence shows the existence of a detectable level of anti-PEG antibodies in approximately 72% of the population, never treated with PEGylated drugs, based on plasma samples from 1990 to 1999. Due to its ubiquity in a multitude of products and the large percentage of the population with antibodies to PEG, hypersensitive reactions to PEG are an increasing concern.[39][40] Allergy to PEG is usually discovered after a person has been diagnosed with an allergy to an increasing number of seemingly unrelated products, including processed foods, cosmetics, drugs, and other substances that contain PEG or were manufactured with PEG

https://en.wikipedia.org/wiki/Polyethylene_glycol


PEG is not plastics...


It is a petroleum-based polymer; tomayto vs. tomahto...


Tons of common plastics are made from plants. PLA, the most common FDM 3D-printing plastic, is made from corn. Ethylene in particular is also naturally occurring; it's one of the most important plant hormones.

"Plastic" implies it is solid, which is wrong.


many grades of peg are liquid at room temperature; the others are highly water-soluble. you've probably never seen an object manufactured from peg in your life. that's why people don't normally call it 'plastic'


You're forgetting about Aluminum

