> Principal component analysis of 200 GPT2, 500 Vision Transformers, 50 LLaMA-8B, and 8 Flan-T5 models reveals consistent sharp spectral decay - strong evidence that a small number of weight directions capture dominant variance despite vast differences in training data, objectives, and initialization.
Intuitively it makes sense that within each individual model a small number of weights/parameters dominate, but it's still super interesting that these can be swapped between all the models without loss of performance.
It isn’t obvious that these parameters are universal across all models.
This general idea shows up all over the place though. If you do 3D scans of thousands of mammal skulls, you'll find that a few PCs account for the vast majority of the variance. If you do frequency-domain analysis of various physiological signals... same thing. Ditto for many, many other natural phenomena in the world. Interesting (maybe not surprising?) to see it in artificial phenomena as well.
It's almost an artifact of PCA. You'll find "important" principal components everywhere you look. It takes real effort to construct a dataset where you don't. That doesn't mean though, for instance, that throwing away the less important principal components of an image is the best way to compress an image.
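A minimal numpy sketch of that point, with made-up sizes and a planted five-direction structure (not from the paper):

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 500, 100

    # Isotropic Gaussian data: no true dominant directions exist...
    X = rng.standard_normal((n, d))
    evals = np.linalg.eigvalsh(np.cov(X.T))[::-1]  # descending
    # ...yet sampling noise alone spreads the spectrum, so the top
    # components still look "important" next to the bottom ones.
    print("isotropic, top 3 variance ratios:", evals[:3] / evals.sum())

    # Data with 5 planted directions: here the cliff is real.
    W = rng.standard_normal((d, 5))
    Y = rng.standard_normal((n, 5)) @ W.T + 0.1 * rng.standard_normal((n, d))
    evals_y = np.linalg.eigvalsh(np.cov(Y.T))[::-1]
    print("planted, top 6 variance ratios:", evals_y[:6] / evals_y.sum())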
Not really. If the models are trained on different datasets - like one ViT trained on satellite images and another on medical X-rays - one would expect their parameters, which were randomly initialized, to be completely different, or even orthogonal.
Now I wonder how much this "Universal Subspace" corresponds to the same set of scraped Reddit posts and pirated books that apparently all the bigcorps used for model training. Is it 'universal' because it's universal, or because the same book-pirating torrents got reused all over?
Every vision task needs edge/contrast/color detectors and these should be mostly the same across ViTs, needing only a rotation and scaling in the subspace. Likewise with language tasks and encoding the basic rules of language which are the same regardless of application. So it is no surprise to see intra-modality shared variation.
The surprising thing is inter-modality shared variation. I wouldn't have bet against it but I also wouldn't have guessed it.
I would like to see interpretability work on whether these subspace vectors can be read as low-level or high-level abstractions. Are they picking up low-level "edge detectors" that are somehow invariant to modality (and if so, why?), or are they picking up higher-level concepts like distance vs. closeness?
It hints there may be common higher-level abstraction and compression processes in human consciousness.
The "human" part of that matters. This is all human-made data, collected from human technology, which was created to assist human thinking and experience.
So I wonder if this isn't so much about universals or Platonic ideals. More that we're starting to see the outlines of the shapes that define - perhaps constrict - our own minds.
> Datasets. We construct a diverse and high-quality collection of video datasets to train STARFlow-V. Specifically, we leverage the high-quality subset of Panda (Chen et al., 2024b) mixed with an in-house stock video dataset, with a total number of 70M text-video pairs.
That has nothing to do with it, and Apple wouldn't train on user content; they're not Google. If they ever did, there would be an opt-in at best. There's a reason they're walking and observing, not running and trying to be the forefront cloud-AI leader like some others.
I have them and like them. I don't wear them constantly, but on days when I'm doing something interesting, they help me document much more than I otherwise would.
Python has about 40 keywords; I'd say I regularly use about 30 and irregularly use about another 5. Hardly seems like a "junkyard".
Further, this lack of first-class support for lazy importing has spawned multiple CPython forks that implement their own lazy importing, or a modified version of the previously rejected PEP 690. Reducing the real-world need for forks seems worth the price of one keyword.
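For context, the closest stdlib approximation today is the `importlib.util.LazyLoader` recipe from the importlib docs, which is considerably clunkier than a keyword (the `lazy_import` helper name here is just for illustration):

    import importlib.util
    import sys

    def lazy_import(name):
        # The module object is created now, but its code only
        # runs on first attribute access.
        spec = importlib.util.find_spec(name)
        loader = importlib.util.LazyLoader(spec.loader)
        spec.loader = loader
        module = importlib.util.module_from_spec(spec)
        sys.modules[name] = module
        loader.exec_module(module)
        return module

    json = lazy_import("json")   # cheap: the module body hasn't run yet
    print(json.dumps({"a": 1}))  # it actually loads here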
Hard keywords:
False await else import pass
None break except in raise
True class finally is return
and continue for lambda try
as def from nonlocal while
assert del global not with
async elif if or yield
Soft keywords:
match case _ type
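Both lists come straight from the stdlib, so the count is easy to check:

    import keyword
    print(len(keyword.kwlist), keyword.kwlist)          # 35 hard keywords
    print(len(keyword.softkwlist), keyword.softkwlist)  # 4 soft keywords on 3.12+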
I think nonlocal/global are the only hard keywords I now barely use; of the soft ones, I rarely use pattern matching, so 5 seems like a good estimate.
> The choice to introduce a new `lazy` keyword reflects the need for explicit syntax. Lazy imports have different semantics from normal imports: errors and side effects occur at first use rather than at the import statement. This semantic difference makes it critical that laziness is visible at the import site itself, not hidden in global configuration or distant module-level declarations. The lazy keyword provides local reasoning about import behavior, avoiding the need to search elsewhere in the code to understand whether an import is deferred. The rest of the import semantics remain unchanged: the same import machinery, module finding, and loading mechanisms are used.
This functionality is highly desired, and it does appear to actually need a new (soft) keyword. Sorry you don't like it.
The PEP didn't mention considering reusing `async` instead of `lazy`. That would have conveyed the same thing to me without a new keyword, and would have been similar to HTML's usage of `async`.
It is a 'soft keyword' as the PEP explains. I would not think that this has any major impact on anyone who just chooses to ignore this feature. Assuming that you want this behavior, I wonder how this could have been done in a better fashion without now having 'lazy' in the specific context of an import statement.
soft keyword for anyone not familiar like I was ...
"A new soft keyword lazy is added. A soft keyword is a context-sensitive keyword that only has special meaning in specific grammatical contexts; elsewhere it can be used as a regular identifier (e.g., as a variable name). The lazy keyword only has special meaning when it appears before import statements..."
> Python is quickly turning into a crowded keyword junkyard
* Javascript (ECMAScript) has 63 keywords.
* Rust has 50 keywords.
* Java has 51 keywords + 17 contextually reserved words, for a total of 68.
* Python has now 35 keywords + 4 'soft' keywords, for a total of 39.
* Go has 25 keywords.
Speed matters everywhere. How much compute is spent on things that could easily be 100x faster than they are? Compare running a battery of unit tests with VMware plus pip versus Firecracker plus uv: it's orders of magnitude quicker, and avoids a whole suite of issues related to persistent state on the machine.
Possibly for some workflows, though personally I find the emphasis on speed baffling, and it's a big part of the reason I don't find most of these uv testimonials credible. I'm a regular Python user across multiple environments, and I've never considered waiting for pip to be a material part of my time; it's trivial to the point of being irrelevant. The fact that so many people come out of the woodwork to talk about how fast it is means one of two things. Either there's some big group somewhere with a niche use case that gets them bogged down in pip's dependency resolving or whatever else gets sped up (obviously the actual downloading can't be faster), or it's just a talking point that (presumably) Rust zealots who don't actually use Python arrive with en masse. Either way, it's an extremely ineffective way of promoting the product to most Python users, who don't have speed of package installation as anything close to a pain point.
It's fast enough that sometimes dependencies can be checked, resolved, and installed at program runtime rather than needing to be a separate step.
You can go from no virtual environment to just "uv run myfile.py", and it does everything that's needed, nearly instantly.
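For instance, a script can carry its own dependencies via PEP 723 inline metadata, which uv understands; `requests` here is just an example dependency:

    # /// script
    # dependencies = ["requests"]
    # ///
    import requests

    print(requests.get("https://example.com").status_code)

Then `uv run myfile.py` resolves and installs `requests` into an ephemeral environment and runs the script, with no manual venv step.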
lol who is using pip so much that 0.36s of startup time matters to them? If all uv does here is do nothing slightly faster, that's an absolutely meaningless benefit.
In general, whenever you introduce a cache to make software faster (along any dimension), you have to think about cache invalidation and eviction. If your software is fast enough to not need caching, this problem goes away.
It's funny because superior caching is also highly relevant to uv's outperformance. (But invalidation/eviction isn't generally a real problem for a cache of installed packages; the cache can be cleaned up whenever and just rebuilt, and the cache has a separate entry per version of a library, where each version is immutable.)
Isn't it obvious?