Hacker News | mathisfun123's comments

> it is nuts that in an object method, there is a performance enhancement through caching a member value

i don't understand what you think is nuts about this. it's an interpreted language and the word `self` is not special in any way (it's just convention - you can call the first param to a method anything you want). so there's no way for the interpreter/compiler/runtime to know you're accessing a field of the class itself (let alone that that field isn't a computed property or something like that).
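A minimal sketch of the caching under discussion, assuming a hypothetical `Accumulator` class (not from the article). The first parameter is deliberately named `this` to show that `self` is only a convention, and the second method hoists the attribute read into a local before the loop.

```python
class Accumulator:
    def __init__(this, step):
        # `this` behaves exactly like `self`; the name is only a convention.
        this.step = step

    def total_uncached(this, n):
        total = 0
        for _ in range(n):
            total += this.step  # attribute lookup on every iteration
        return total

    def total_cached(this, n):
        step = this.step  # one lookup, then a plain local variable
        total = 0
        for _ in range(n):
            total += step
        return total


acc = Accumulator(3)
print(acc.total_uncached(1000), acc.total_cached(1000))
```

The two methods agree here only because `step` is a plain instance attribute that nothing rebinds during the loop; the interpreter cannot assume that in general.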

lots of hottakes that people have (like this one) are rooted in just a fundamental misunderstanding of the language and programming languages in general <shrugs>.


If you dig into JS engine implementations they deal with a lot of the same sorts of things. Simple objects with straightforward properties are tagged such that they skip the dynamic machinery with fallback paths to deal with dynamism when it is necessary.

A common approach is hidden classes that work much like classes in other languages. Reading a simple int property just reads bytes at an offset from the object pointer directly. Upon entry to the method bits of the object are tested and if the object is not known to be simple it escapes into the full dynamic machinery.

I don't know if those exact techniques would work for Python but this is not an either-or situation.

See also: modern Objective-C objc_msgSend, whose fast path is so fast on modern hardware that it is rarely a performance bottleneck, despite the runtime being able to add dynamic subclasses or forward messages.


What's nuts is that the language doesn't guarantee that successive references to the same member value within the same function body are stable. You can look it up once, go off and do something else, and look it up again and it's changed. It's dynamism taken to an unnecessary extreme. Nobody in the real world expects this behaviour. Making it just a bit less dynamic wouldn't change the fundamentals of the language but it would make it a lot more tractable.

> What's nuts is that the language doesn't guarantee that successive references to the same member value within the same function body are stable. You can look it up once, go off and do something else, and look it up again and it's changed.

There is no such thing as 'successive references to the same member value' here. It's not that you look up the same object and it can change, it's that you are not referring to the same object at all.

self.x is actually a call to type(self).__getattribute__(self, 'x') (with __getattr__ as the fallback), which can in fact return a different thing each time. `self.x` IS a string lookup and that is not an implementation detail, but a major design goal. This dynamism is one of the selling points of Python: it allows you to change and modify interfaces to reflect state. It's nice for some things and it is what makes Python Python. If you don't want that, use another language.
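A small sketch of that lookup, using a hypothetical `Ticker` class. In CPython's data model the hook that runs on every `obj.name` expression is `__getattribute__`; `__getattr__` is only the fallback when normal lookup fails. Either one is ordinary code and may return something different on each call.

```python
import itertools


class Ticker:
    """Hypothetical class whose attribute `x` is computed on every access."""

    _counter = itertools.count()

    def __getattribute__(self, name):
        # Runs for every `obj.name` expression, whether or not the name exists.
        if name == "x":
            return next(Ticker._counter)
        return object.__getattribute__(self, name)

    def __getattr__(self, name):
        # Fallback, reached only when normal lookup raises AttributeError.
        return f"<no attribute {name!r}>"


t = Ticker()
first, second = t.x, t.x
missing = t.y
print(first, second, missing)
```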


ok, then it is nuts that __getattr__ (itself a specially blessed function) is not required to be pure at least from the caller point of view.

If it was it wouldn't be Python. It can never be pure because __getattr__ is just another method that anyone can overwrite.

In Python attribute accesses aren't stable! `self.x` where `x` is a property is not guaranteed to refer to the same thing.
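For instance, with a hypothetical `Sensor` class whose `reading` looks like a field but runs code on each access:

```python
class Sensor:
    """Hypothetical class: `reading` is a property, not a stored value."""

    def __init__(self):
        self._calls = 0

    @property
    def reading(self):
        self._calls += 1
        return self._calls * 10


s = Sensor()
a = s.reading
b = s.reading
print(a, b)  # two reads of the "same" attribute give two different values
```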

And getting rid of descriptors would be a _fundamental change to the language_. An immense one. Loads of features are built off of descriptors or descriptor-like things.
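A rough illustration of how much rests on descriptors. `Positive` and `Order` are made-up names for this sketch; the last few lines just check that plain functions, `property`, `classmethod` and `staticmethod` all implement `__get__`.

```python
class Positive:
    """Hypothetical data descriptor that validates on assignment."""

    def __set_name__(self, owner, name):
        self.slot = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class itself
        return getattr(obj, self.slot)

    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError("must be positive")
        setattr(obj, self.slot, value)


class Order:
    quantity = Positive()

    def __init__(self, quantity):
        self.quantity = quantity  # routed through Positive.__set__


order = Order(5)

# Methods, properties, classmethods and staticmethods are all descriptor-based.
builtin_descriptors = [
    hasattr(kind, "__get__")
    for kind in (type(Order.__init__), property, classmethod, staticmethod)
]
print(order.quantity, builtin_descriptors)
```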

And what you're complaining about is also not true in Javascript world either... I believe you can build descriptor-like things in JS now as well.

_But_ if you want that you can use stuff like mypyc + annotations to get that for you. There are tools that let you get to where you want. Just not out of the box because Python isn't that language.

Remember, this is a scripting language, not a compiled language. Every optimization for things you talk about would be paid on program load (you have pyc stuff but still..)

Gotta show up with proof that what you're saying is verifiable and works well. Up until ~6 or 7 years ago CPython had a concept of being easy to onboard onto. Dataflow analyses make the codebase harder to deal with.

Having said all of that.... would be nice to just inline RPython-y code and have it all work nicely. I don't need it on everything and proving safety is probably non-trivial but I feel like we've got to be closer to doing this than in the past.

I ... think in theory the JIT can solve for that too. In theory


>Remember, this is a scripting language, not a compiled language

This is the fundamental issue and "elephant in the room" that everyone seems to be overlooking and sweeping under the carpet.

The extreme compiled, typed-language guys are going gung-ho on Rust, which is very slow to compile and complicated (more so than C++), while the rest of the world gladly hacks its shiny ML/AI code in a scripting language, namely Python, "the glue/duct-tape language", with most if not all of the fast engine libraries (e.g. PyTorch) written in unsafe C/C++.

The problem is that Python was meant for scripting not properly designed software system engineering. After all it's based on ABC language for beginners with an asterisk attached "intended for teaching or prototyping, but not as a systems-programming language" [1].

In ten years time people will most probably look in horror at their python software stacks tech debt that they have to maintain for the business continuity. Or, for their own sanity, they will rewrite the entire thing in a much more stable, modern compiled language ecosystem with fast development, like D, with native engine libraries and seamless integration with C and (to some extent) C++ if necessary.

[1] ABC (programming language)

https://en.wikipedia.org/wiki/ABC_(programming_language)


> In ten years time people will most probably look in horror at their python software stacks tech debt that they have to maintain for the business continuity.

I regret to inform you that there are _loads_ of multi-decades-old Python stacks at this point.

On the micro level I'll be like "ugh wish I wasn't paying the costs of Python" decently enough. But on the macro level I don't regret Python stacks. At least not when looking at the alternatives.

Tho I will admit I'm a bit mystified at data science stuff in particular persisting in Python. Lots of CPU churn even if the underlying libs are all C extensions.


> In ten years time people will most probably look in horror at their python software stacks tech debt that they have to maintain for the business continuity

Yes, like they did for JavaScript!


> The problem is that Python was meant for scripting not properly designed software system engineering.

What something was meant to do has never, ever stopped people. People find creative ways to use tools in unintended ways all the time. It's what we do.

We can call this dumb or get misanthropic about it, or we can try to understand why people all over the world choose to use Python in "weird" ways, and what this tells us about the way people relate to computing.


> What's nuts is that the language doesn't guarantee that successive references to the same member value within the same function body are stable.

The language supports multiple threads and doesn’t have private fields (https://docs.python.org/3/tutorial/classes.html#private-vari...), so the runtime cannot rule out that the value gets changed in-between.
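A quick sketch of the "no private fields" part, using a hypothetical `Account` class. Double-underscore names are only mangled, not protected, so code outside the class can still rebind them.

```python
class Account:
    def __init__(self):
        self.__balance = 100  # stored on the instance as _Account__balance

    def balance(self):
        return self.__balance


acct = Account()
acct._Account__balance = -1  # outside code can still rebind the "private" field
print(acct.balance())
```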

And yes, it often is obvious to humans that’s not intended to happen, and almost never what happens, but proving that is often hard or even impossible.


wouldn't a concurrent change without synchronization be UB anyway? Also parent wants to cache the address, not the value (but you have to cache the value if you want to optimize manually)

Why would it be UB? All objects are behind (thin) pointers, which can be overwritten atomically.

Not necessarily UB, but absolutely "spooky action" nondeterministic race conditions that make things difficult to understand.

> Nobody in the real world expects this behaviour.

For example, numbers and strings are immutable objects in Python. If self.x is a number and its numeric value is changed by a method call, self.x will be a different object after that. I'd dare say people expect this to work.
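A sketch of that case with a hypothetical `Counter` class: `+=` on an int builds a new object and rebinds the attribute, so a value cached before the call goes stale.

```python
class Counter:
    def __init__(self):
        self.x = 1000

    def bump(self):
        self.x += 1  # creates a new int and rebinds self.x to it


c = Counter()
cached = c.x
c.bump()
print(cached, c.x, cached is c.x)
```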


basically all object oriented languages work like that. You access a member; you call a method which changes that member; you expect that change is visible lower in the code, and there're no statically computable guarantees that particular member is not touched in the called method (which is potentially shadowed in a subclass). It's not dynamism, even c++ works the same, it's an inherent tax on OOP. All you can do is try to minimize cost of that additional dereference. I'm not even touching threads here.
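The shadowing case can be sketched like this (hypothetical `Base` and `Noisy` classes). Looking only at `Base.run`, there is no way to tell whether `self.limit` survives the call to `self.log()`.

```python
class Base:
    def __init__(self):
        self.limit = 10

    def log(self):
        pass  # harmless here

    def run(self):
        before = self.limit
        self.log()  # may be overridden by a subclass
        after = self.limit
        return before, after


class Noisy(Base):
    def log(self):
        self.limit *= 2  # the override touches the member


print(Base().run(), Noisy().run())
```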

now, functional languages don't have this problem at all.


OOP has nothing to do with it. In your C++ example, foo(bar const&); is basically the same as bar.foo();. At the end of the day, whether passing it in as an argument or accessing this via the method call syntax it's just a pointer to a struct. Not to mention, a C++ compiler can, and often does, choose to put even references to member variables in registers and access them that way within the method call.

This is a Python specific problem caused by everything being boxed by default and the interpreter does not even know what's in the box until it dereferences it, which is a problem that extends to the "self" object. In contrast in C++ the compiler knows everything there's to know about the type of this which avoids the issue.


That's not true. I mean: it's true that it has little to do with OOP, but most imperative languages (only exception I know is Rust) have the issue, it's not "Python specific". For example (https://godbolt.org/z/aobz9q7Y9):

    #include <cstdio>

    struct S {
        const int x;
        int f() const;
    };

    int S::f() const {
        int a = x;
        printf("hello\n");
        int b = x;
        return a - b;
    }

The compiler can't reuse 'x' unless it's able to prove that it definitely couldn't have changed during the `printf()` call - and it's unable to prove it. The member is loaded twice. C++ compilers can usually only prove it for trivial code with completely inlined functions that doesn't mutate any external state, or mutates it in a definitely-not-aliasing way (strict aliasing). (And the `const` doesn't make any difference here at all.)

In Python the difference is that it can basically never prove it at all.


> This is a Python specific problem caused by everything being boxed

I would say it is part python being highly dynamic and part C++ being full of undefined behavior.

A c++ compiler will only optimize member access if it can prove that the member isn't overwritten in the same thread. Compatible pointers, opaque method calls, ... the list of reasons why that optimization can fail is near endless, C even added the restrict keyword because just having write access to two pointers of compatible types can force the compiler to reload values constantly. In python anything is a function call to some unknown code and any function could get access to any variable on the stack (manipulating python stack frames is fun).
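A minimal sketch of the stack-frame point, assuming CPython (`sys._getframe` is an implementation detail and may not exist elsewhere). The helper name `peek_caller_local` is made up for this example.

```python
import sys


def peek_caller_local(name):
    # sys._getframe(1) is the frame of whoever called this function.
    caller = sys._getframe(1)
    return caller.f_locals[name]


def compute():
    secret = 42
    # Nothing is passed in, yet the callee can read our local variable.
    return peek_caller_local("secret")


print(compute())
```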

Then there is the fun the C++ compiler gets up to with variables that are modified by different threads: while(!done) turning into while(true) because you didn't tell the compiler that done needs to be thread-safe is always fun.


What is going on here is not just that an attribute might be changed concurrently, so the interpreter can't optimize the access. That is also a consideration. But the major issue is that an attribute doesn't really refer to a single thing at all; it means whatever object is returned by a function call that implements a string lookup. __getattr__ is not an implementation detail of the language, but something that an object can implement however it wants to, just like __len__ or __gt__. It's part of the object's behaviour, not part of the static interface. This is a fundamental design goal of the Python language.

> This is a Python specific problem caused by everything being boxed by default and the interpreter does not even know what's in the box until it dereferences it

That's not the whole story. Every attribute access is a function call to the object's lookup hook (__getattribute__, falling back to __getattr__), which can return whatever object it wants.

bar.foo (...) is actually bar.__getattr__ ('foo') (bar, ...)
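Spelled out for a plain method on a hypothetical `Bar` class, and only as a rough approximation of what CPython does: the lookup finds a function on the class, and the descriptor protocol turns it into a bound method that already carries `bar`.

```python
class Bar:
    def foo(self, n):
        return n + 1


bar = Bar()

# Roughly what `bar.foo(41)` does for a plain method:
func = type(bar).__dict__["foo"]      # the function stored on the class
bound = func.__get__(bar, type(bar))  # descriptor protocol builds a bound method
results = (bar.foo(41), bound(41), func(bar, 41))

# Each attribute access builds a fresh bound-method object in CPython.
same_object = bar.foo is bar.foo
print(results, same_object)
```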

This dynamism is what makes Python Python and it allows you to wrap domain state in interface structure.


> same member value within the same function body are stable

Did you miss the part where I explained to you there's no way to identify that it's a member variable?

> Nobody in the real world expects this behaviour

As has already been explained to you by a sibling comment you are in fact wrong and there are in fact plenty of people in the real world who do actually expect this behavior.

So I'll repeat myself: lots of hottakes from just pure, unadulterated, possibly willful, ignorance.


The above is a very thick response that doesn't address the parent's points, just sweeps them under the rug with "that's just how it was designed/it works".

"Did you miss the part where I explained to you there's no way to identify that it's a member variable?"

No, you did miss the case where that in itself can be considered nuts - or at least an unfortunate early decision.

"this just how things are dunn around diz here parts" is not an argument.


> No, you did miss the case where that in itself can be considered nuts - or at least an unfortunate early decision.

This is not a side implementation detail that they got wrong; this is a fundamental design goal of Python. You can find that nuts, but then just don't use Python, because that is (one of) the things that make Python Python.


> considered nuts - or at least an unfortunate early decision

Please explain to us then how exactly you would infer a variable with an arbitrary name is actually a reference to the class instance in an interpreted language.


>Please explain to us then how exactly you would infer a variable with an arbitrary name is actually a reference to the class instance in an interpreted language.

Did I stutter when I wrote about "an unfortunate early decision"? Who said it has to be "an arbitrary name"?

Even so, you could add a bloody marker announcing an arbitrary name (which 99% would be self anyway) as so, as an instruction to the interpreter. If it fails, it fails, like countless other things that can fail during runtime in Python today.


But now you are no longer talking about the way Python works, but the way you want Python to work - and that has nothing to do with Python.

"The way our economy works is bad"

"That's how it's been since forever, it's an essential part of country X"

"Yes, and it's a badly designed part".

"But now you are no longer talking about the way country X works, but the way you want country X to work - and that has nothing to do with country X."

See how the argument quickly degenerates?


ok then be the change you want to see in the world and send a PR instead of just proclaiming things lololol

Ah, the idiotic "it's FOSS, send a PR" argument, lololol

> the word `self` is not special in any way (it's just convention - you can call the first param to a method anything you want).

The name `self` is a convention, yes, but interestingly in python methods the first parameter is special beyond the standard "bound method" stuff. See for example PEP 367 (New Super) for how `super()` resolution works (TL;DR the super function is a special builtin that generates extra code referencing the first parameter and the lexically defining class)
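A small sketch of that special treatment, with hypothetical `Base`, `Child` and `detached` names. Zero-argument `super()` works with a first parameter called `this`, but fails in a function defined outside a class body, because the compiler only creates the hidden `__class__` cell for functions defined inside one.

```python
class Base:
    def greet(self):
        return "base"


class Child(Base):
    def greet(this):
        # Zero-argument super() uses the first parameter, whatever its name,
        # plus the hidden __class__ cell the compiler adds for this method.
        return "child+" + super().greet()


def detached(obj):
    # Defined outside any class body, so there is no __class__ cell.
    return super().greet()


result = Child().greet()
try:
    detached(Child())
    error = None
except RuntimeError as exc:
    error = str(exc)
print(result, error)
```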


I don't think it's a hot take to say much of Python's design is nuts. It's a very strange language.

> changing our memory allocator

they've been using jemalloc (and employing "je") since 2009.


> Jane Street and Two Sigma are sucking up all the talent.

This is the most made up thing I've ever seen on hn. Those firms hire probably 10 new grads a year (maybe combined!). Unless you're saying the collective talent graduating "high-tier CS programs" numbers in the 10s, this is literally impossible.


Way, way more than 10, but I agree with you that they are not taking even 1% of tech talent per year.

yeah and 2s has not been doing too hot for a few years now. Jane street I buy - they tend to recruit a lot of CMU students. But definitely fewer than 15 of the new grads they hire each year are from CMU. They maybe hire on the order of 50-100 new grad SWEs a year.

Lol I guess you weren't around in the goatse days

And goatse is harmless compared to the shit that's out there, especially because it only affects the dude himself. Even 3 guys 1 hammer isn't the worst yet (though pretty bad already).

it's literally the prototypical example for `Assuming`

https://reference.wolfram.com/language/ref/Assuming.html


> spends a ton of money

Bruh lol these courses are marketing material designed by fresh grad communications majors. You're falling for exactly the scam they want you to fall for by giving so much benefit of the doubt to entities which deserve none.

Edit: no I don't do this kind of work but my mother does so I know exactly how the sausage is made.


this is a pointless (valueless) reductive take

> Asm is simple enough that "mental execution" is far easier, if more tedious, than in HLLs

Ya totally I can also keep 32 registers, a memory file, and stack pointer all in my head at once ...fellow human... (In 2026 I might actually be an LLM, in which case I really can keep all that context in my "head"!)


there's an interesting new API skill for the human cortex v1.0, that allows for a much larger context window, it's called pen and paper.

For real! I occasionally write assembly because, for some reason, I kind of enjoy it, and also to keep my brain sharp. But yes, there is no way I could do it without pencil and paper (unless I’m on a site like CPUlator that visually shows everything that’s happening).

What do the words "mental execution" mean?

Using your brain and not the machine.

8 registers are sufficient; if you forget what one holds, looking up at the previous write to it is enough.

Contrast this with trying to figure out all the nested implicit actions that a single line of some HLL like C++ will do.


> can tap into your RAM pool

lol no it can't - there's a small (40MB) SRAM that can DMA to DRAM and then each of the tiles is another DMA away from that SRAM.


The title of the article says "science", not "computer science".


Yes, and it opens by talking about STEM fields. I consider CS part of both STEM and science generally.

