Hacker News | MurizS's comments

Native since v146, about:config -> browser.tabs.splitView.enabled


Which in turn lets the people at Moonshot AI, the only provider of this model as of now, worry about that for them.


I think what you're referring to is also known as Stigler's law of eponymy [1], which is interestingly self-referential and ironic in its own naming. There's also the related "Matthew effect" [2] in the sciences.

[1] https://en.wikipedia.org/wiki/Stigler's_law_of_eponymy

[2] https://en.wikipedia.org/wiki/Matthew_effect


The most annoying instance, to me, of Stigler's Law is De Morgan's Laws, which say the following:

1. If two things are not both true, then one or both of them must be false. (And the reverse.)

2. If neither of two things is true, then both of them are false. (And the reverse.)

You might notice that both statements are blindingly obvious, but we've named them after Augustus de Morgan anyway.
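
For anyone who wants to see the two statements checked mechanically, here is a minimal sketch that tries every truth assignment for two propositions. The function name `de_morgan_holds` is made up for illustration; it is a brute-force check, not a proof technique anyone in the thread proposed.

```python
from itertools import product


def de_morgan_holds() -> bool:
    """Check both laws over every truth assignment of two propositions."""
    for a, b in product([False, True], repeat=2):
        # Law 1: "not both true" is equivalent to "at least one is false".
        if (not (a and b)) != ((not a) or (not b)):
            return False
        # Law 2: "neither is true" is equivalent to "both are false".
        if (not (a or b)) != ((not a) and (not b)):
            return False
    return True


print(de_morgan_holds())
```

With only four assignments to try, the exhaustive check is the whole argument, which is roughly why the laws feel obvious.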


I think GP was probably referring to "Scaling Data-Constrained Language Models" (2305.16264) from NeurIPS 2023, which looked primarily at how to optimally scale LLMs when training data is limited. There is a short section on mixing code (Python) into the training data and the effect this has on performance on e.g. natural language tasks. One of their findings was that training data can be up to 50% code without degrading performance, and that in some cases (benchmarks like bAbI and WebNLG) performance even improves (probably because these tasks have an emphasis on what they call "long-range state tracking capabilities").

For reference: In the Llama 3 technical report (2407.21783), they mention that they ended up using 17% code tokens in their training data.


Is the network only trained on the source code, or does it have access to the results of running the code, too?


Also, GPT-3.5 was another extreme, if I remember correctly. They first trained only on code, then on other text. I can't seem to find the source though.


Most likely "The Brothers Karamazov".



It is not that the HN community is diminishing your credentials in applied category theory or in any other field; rather, the inherent snarkiness in your short comment does no one a favor and surely doesn't add any value to the discussion.


Funnily enough, someone tells him after a minute he should "slow down and breathe a little bit".


Out of curiosity, what was your first act and what made you transition to software engineering?


Bioengineering in academia. Moved to software engineering for an extra $100,000/year. In hindsight, my overall standard of living has gone down despite the extra money, due to the housing crunch in the Bay Area.


Your comment reminds me of a study [1] published in 2019, demonstrating that people with the strongest views against genetically modified foods know the least about science but believe they know the most.

Implying that the Big Bang theory enjoys its popularity because it "sells" is, I think, showing complete arrogance toward landmark discoveries made in the past century, such as the cosmic microwave background radiation [2] (for which the 1978 Nobel Prize in Physics was awarded), which supports the thesis of the Big Bang.

[1] Extreme opponents of genetically modified foods know the least but think they know the most. https://www.nature.com/articles/s41562-018-0520-3

[2] https://en.wikipedia.org/wiki/Cosmic_microwave_background

