I disagree with that hot take because it ignores the momentum around pandas, scipy and other key libraries that were just easier to use in Python than (for instance) R, and the network effects.
In the years I was doing active ML and data science/engineering work, I met exactly _one_ team that was using Julia, and only came across R when dealing with academia. Mainstream/corporate customers never even considered any other alternative (other than MatLab, which had an established base in manufacturing, oil, gas, etc.).
I think you do have to factor in ecosystem inertia; however, I also feel that numpy, scipy, and scikit-learn are pretty new, all things considered. We could have (and indeed did) take different paths in the past, it's just that none of the others (xlispstat) really took. To me, the chain of events that got us here goes way further back.
C was much, much better suited to squeezing performance out of the hardware of the late 1970s and 1980s (C fast; lisp slow). From there, the foundations of basically every operating system and system library we know and love today were built... in C.
Python never tried to break free of being a veneer over C, with vaguely C sensibilities. Python displaced Perl as the prevailing veneer language because it had objects, and objects were all the rage right as people were learning about Python on Usenet and later Slashdot.
The initial assumptions haven't held. Just about any serious language in the lisp family runs circles around Python. Lisp is no longer slow. But now we've been thinking in "C-ish" for so long that lisp is weird. We broke our collective brains on the gross abstractions of C and C-veneer so thoroughly that we can't even see how gross they are.
Will Julia ultimately gain traction? I don't know. They're trying to be a lisp that doesn't trigger our weird reflex. Bindings to a lower level language seem more like a liability in 2023 than 1993. And just in case, Julia has gone to some pains to allow reuse of C code, way down at the LLVM level, which is something Python can't say. But while Julia was getting built, Python built an ecosystem.
I love this story, except that it wasn't quite so regarding early C compilers.
"Oh, it was quite a while ago. I kind of stopped when C came out. That was a big blow. We were making so much good progress on optimizations and transformations. We were getting rid of just one nice problem after another. When C came out, at one of the SIGPLAN compiler conferences, there was a debate between Steve Johnson from Bell Labs, who was supporting C, and one of our people, Bill Harrison, who was working on a project that I had at that time supporting automatic optimization...The nubbin of the debate was Steve's defense of not having to build optimizers anymore because the programmer would take care of it. That it was really a programmer's issue.... Seibel: Do you think C is a reasonable language if they had restricted its use to operating-system kernels? Allen: Oh, yeah. That would have been fine. And, in fact, you need to have something like that, something where experts can really fine-tune without big bottlenecks because those are key problems to solve. By 1960, we had a long list of amazing languages: Lisp, APL, Fortran, COBOL, Algol 60. These are higher-level than C. We have seriously regressed, since C developed. C has destroyed our ability to advance the state of the art in automatic optimization, automatic parallelization, automatic mapping of a high-level language to the machine. This is one of the reasons compilers are ... basically not taught much anymore in the colleges and universities."
-- Fran Allen interview, Excerpted from: Peter Seibel. Coders at Work: Reflections on the Craft of Programming
As I understand it (and this is before my time), garbage collection in lisp languages was really a downer on 70s and even 80s era hardware. I think a case could be made that C wasn't the only language at the time that could have birthed UNIX, but I'm not sure a lisp family language is in that list. (Edit: noting that C very much happened together with unix)
Java was the first garbage collected language that really hit it bigtime in the mainstream. And even then, I remember the zeitgeist of the early days of Java being that Java was perhaps fatally slow.
A lot has changed since then, and I'm firmly in camp "why can't we have nice things like lisp?" but I'm not sure I'd have picked a lisp (or C) back when the foundations of our current *nix paradigm were laid.
> As I understand it (and this is before my time), garbage collection in lisp languages was really a downer on 70s and even 80s era hardware.
It mostly was. The main reasons were that GC lacked necessary features and memory was extremely small and expensive.
The first really usable GCs arrived with processors like the 68020+MMU or the 68030. Even on a Lisp Machine the GC got much more usable with the introduction of the ephemeral GC, which used hardware support to track memory changes in RAM. As soon as the GC needed to work on virtual memory on disk, it was much slower.
BASIC interpreters for 8-bit microcomputers, having 48 KB of RAM or less and running at 1 MHz, used garbage collection for strings just fine. That software was tailor-made to the machines from scratch.
I would say it was the size and complexity of Lisp systems, developed in ivory towers on big-iron hardware, that had trouble fitting into emerging low-cost microchip-based hardware.
When microcomputers emerged, they had the capabilities of mainframes from 15 (or more) years before. Current software from mainframes just wouldn't fit. That's how a lot of the languages and operating systems became swiftly relegated to the past.
(Why do we still have Unix and Unix-like systems today? Unix started relatively late, on small minicomputers that were not so far off from subsequent microcomputers. Unix made the jump from the PDP-7 and PDP-11 to the DEC VAX, and on to 680x0 Sun boxes and such.)
A Lisp system measuring its heaps and image sizes in hundreds of kilobytes or megabytes simply wouldn't fit into a system measured in tens of kilobytes. People working with Lisp machines in the 80's couldn't get customers (or not mass market customers), because mass market customers didn't want to buy expensive hardware. Only some big companies or governments.
Today your GNU Bash may hit a 20 megabyte VM footprint, and you don't even notice, let alone pause to think about how "wrong" that is.
Personally I think Pandas gets too much credit, credit which really belongs with Numpy, which was probably the reason Pandas was built on Python rather than, say, Ruby, which in the late 2000s was a contender. This is especially true in the world of ML. Pandas is incredibly convenient (it started out, after all, as a clone of R's data frame, which is also incredibly convenient for feature engineering), but it's Numpy that does the heavy lifting.
Not just the directly ML adjacent libraries but all the other tertiary libraries too. Having a single language codebase/pipeline that goes from scraping the web to collect data (using scrapy!), to data cleaning, model training and then prediction serving with flask or fastapi is sort of magical.
> pandas, scipy and other key libraries that were just easier to use in Python than (for instance) R
Depends on what you're doing. dplyr/ggplot2 are so so much better than pandas but sklearn is pretty rocking (and better than caret).
R the language is pretty annoying relative to python, but I'd actually argue that pandas is the worst of both R and Python, in terms of the API at least.
Ecosystem >> language, to the point that the language is almost irrelevant. If another language had got critical mass for its ML and data science ecosystem, it could have dominated, but I see no evidence it would have made a difference to the pace of ML advances.
Also, python for ML is basically a wrapper around low-level GPU routines with some "scientist"-friendly features; it's less about its strengths as a language.
Better for whom? Better for computer science researchers perhaps, but certainly not better for working geologists or chemists trying to apply ML to solve practical, real-world problems in their respective domains.
This is disregarding the development of said ecosystems, though. The point is that Python has been quite inhibitory to the development of this ecosystem. There are many corpses of automatic differentiation libraries (starting from autograd and tangent, then things like theano, and finally tensorflow and pytorch) and many corpses of JIT compilers and accelerators (Cython, Numba, pypy, TensorFlow XLA, now PyTorch v2's JIT, etc.).
What has been found over the last decade is that a large part of that is due to the design of the languages. Jan Vitek for example has a great talk which describes how difficult it is to write a JIT compiler for R due to certain design choices in the language (https://www.youtube.com/watch?v=VdD0nHbcyk4, or the more detailed version https://www.youtube.com/watch?v=HStF1RJOyxI). There are certain language constructs that void lots of optimizations and then have to be worked around, which is why Python JITs choose subsets of the language, avoiding the specific parts that are hard or impossible to optimize. This is why each takes a domain-specific subset, a different subset of the language for numba vs jax vs etc., choosing something that is nice for ML versus for more generic code.
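To make the "subset" point concrete, here's a rough sketch (assuming numba is installed; the functions are purely illustrative and not taken from any of those projects). The nopython mode happily compiles a plain numeric loop over a NumPy array, but code that leans on Python's dynamic object model falls outside the subset it can type:

    import numpy as np
    from numba import njit

    @njit  # inside the subset: scalars, NumPy arrays, plain loops
    def pairwise_sum(x):
        total = 0.0
        for i in range(x.shape[0]):
            total += x[i]
        return total

    @njit
    def count_types(values):
        # outside the subset: a dict keyed by arbitrary Python type objects
        counts = {}
        for v in values:
            counts[type(v)] = counts.get(type(v), 0) + 1
        return counts

    print(pairwise_sum(np.arange(1e6)))  # compiles and runs fast
    # count_types([1, "a", 2.0])         # fails to compile: nopython mode can't type it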
With all of that, it's perfectly reasonable to point out that there have been languages designed to not have these compilation difficulties, which has resulted in a single (JIT) compiler for the whole language. And by extension, it has made building machine learning and autodiff libraries not something that's a Google- or Meta-scale project (for example, PyTorch involves building GPU code bindings and a specialized JIT, not something very accessible). Julia is a language to point to here, but I think well-designed static languages like Rust also deserve a mention. How much further would we have gone if every new ML project didn't build a new compiler and a new automatic differentiation engine? What if the development was more modular and people could easily just work on the one thing they cared about?
As a nice example, for the last NeurIPS we put out a paper on automatic differentiation of discrete stochastic models, i.e. extending AD to automatically handle cases like agent-based models. The code is open source (https://github.com/gaurav-arya/StochasticAD.jl), and you can see it's almost all written by a (talented) undergraduate over a span of about 6 months. It requires JIT compilation because it works on a lot of things that are not solely big matrix-multiplication GPU kernels, but Julia provides that. And multiple dispatch gives GPU support. Done. The closest thing in PyTorch, storchastic, gets exponential scaling instead of StochasticAD's linear scaling, and isn't quite compatible with a lot of what's required for ML, so it benchmarks as thousands of times slower than the simple Julia code. Of course, when Meta needs it they can and will put the minds of 5-10 top PhDs on it to build it out into a feature of PyTorch over 2 years and have a nice release. But at the end of the day we really need to ask: is that how it should be?
Software engineering is a hard problem. How is Julia in a codebase with multiple authors? A language needs to consider the ergonomics of multiple users working on the same thing. Reading >> writing.
C++/Python still hold their ground in this regard, sadly. One side for low-level control, the other for glue code.
If you look at Julia open source projects you'll see that the projects tend to have a lot more contributors than the Python counterparts, even over smaller time periods. A package for defining statistical distributions has had 202 contributors (https://github.com/JuliaStats/Distributions.jl), etc. Julia Base even has had over 1,300 contributors (https://github.com/JuliaLang/julia) which is quite a lot for a core language, and that's mostly because the majority of the core is in Julia itself.
This is one of the things that was noted quite a bit at this SIAM CSE conference, that Julia development tends to have a lot more code reuse than other ecosystems like Python. For example, the various machine learning libraries like Flux.jl and Lux.jl share a lot of layer intrinsics in NNlib.jl (https://github.com/FluxML/NNlib.jl), the same GPU libraries (https://github.com/JuliaGPU/CUDA.jl), the same automatic differentiation library (https://github.com/FluxML/Zygote.jl), and of course the same JIT compiler (Julia itself). These two libraries are far enough apart that people say "Flux is to PyTorch as Lux is to JAX/flax", but while in the Python world those share almost zero code or implementation, in the Julia world they share >90% of the core internals but have different higher-level APIs.
If one hasn't participated in this space it's a bit hard to fathom how much code reuse goes on and how that is influenced by the design of multiple dispatch. This is one of the reasons there is so much cohesion in the community: it doesn't matter if one person is an ecologist and the other is a financial engineer, you may both be contributing to the same library like Distances.jl, just adding a distance function which is then used in thousands of places. With the Python ecosystem you tend to have a lot more "megapackages", PyTorch, SciPy, etc., where the barrier to entry is generally a lot higher (and sometimes requires handling the build systems, fun times). But in the Julia ecosystem you have a lot of core development happening in "small" but central libraries, like Distances.jl or Distributions.jl, which are simple enough for an undergrad to get productive in within a week but are then used everywhere (Distributions.jl for example is used in every statistics package, in definitions of prior distributions for Turing.jl's probabilistic programming language, etc.). I had never seen anything like that before in the R or Python space; by comparison, Python almost feels like it's all solo projects.
Python did not win the ML language wars because of anything to do with the front-end, but rather because it does both scripting and software engineering well enough. ML usually requires an exploration/research (scripting) stage and a production (software engineering) stage, and Python combines these more seamlessly than many ML languages before it (Matlab, Java, R). Notebooks became the de facto frontend of ML Python development, and to me that's evidence that frontend in ML is inherently messy.
Do I wish a better language like Julia had won out? Sure, but it came out 10+ years into this modern age of ML, which is an eternity in computing. By the time it really gained traction it was too late.
I agree, but can you imagine that Matlab, and then R, were the de facto ML languages before Python really took off? Putting R models into production was an absolute nightmare. Before R, I was writing bash scripts which called Perl scripts that loaded data and called C code that loaded and ran models that were custom built by Matlab and C. Python (and the resulting software ML ecosystem) was a huge breath of fresh air.
Python proves that there is a use case for a tool with imprecise semantics (no meaningful type system) even if that tool has terrible performance compared to other options (GIL). That use case is gluing together C++.
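A toy illustration of that glue role (a sketch, assuming a Linux box with glibc so the library path below exists): the computation lives in a compiled C library, and Python only declares the signature and forwards the call.

    import math
    from ctypes import CDLL, c_double

    libm = CDLL("libm.so.6")        # path is an assumption: glibc on Linux
    libm.cos.argtypes = [c_double]  # describe the C signature
    libm.cos.restype = c_double

    # Python does no numerics here; it just marshals arguments to C and back.
    assert abs(libm.cos(0.5) - math.cos(0.5)) < 1e-12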
Hottest take? In 5-10 years, ML advances will result in Python being mostly replaced by LLM based “programming glue” systems. Much Python doesn’t need to be “correct” the way statically typed languages are correct, it only needs to be correct for the few things you ask it to do (i.e. train this particular NN). LLMs are similarly imprecise, but more powerful. We will move from a world of Python gluing together C++, to LLMs gluing together some hopefully-better language (I bet $10 on Swift). Main delay will be from getting LLM based solutions to run performantly-enough on developer hardware.
Related to your prediction, I think that's a major part of why python took over - looking English-like, a lot more people saw it and were like "hey, I can do this!" than with (most of) what came before.
>Hotter take: ML would have advanced faster if another front-end language had been available and widely adopted instead of Python.
One that is interactive yet fast & compilable, multithreaded (no GIL), isn't bloated, doesn't care about white spaces,...
It would have to have been statically typed to get any real benefit. The programming world would be so much better if OCaml were the dominant programming language. With 5.0 it checks off every single box for an amazing programming language.
The problem with static typing, imho, is that you often need to pre-define all your input data types ahead of time, which is a massive pain, and matrix operations don't really work with the type system.
As a result your language is only type safe outside of the data and NN graph components, which makes it fairly useless when 99% of the work is on the data and NN components (a small sketch of what I mean is below).
edit: I've done a fair amount of ML, Data and Scala including ML/Data in Scala. So I've tried it and the type system wasn't of much help to be honest.
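To illustrate the matrix point (a minimal sketch; the shapes are made up): a checker like mypy is perfectly happy with this, because the array annotations say nothing about shape, and the only thing that catches the bug is actually running it.

    import numpy as np

    def linear(weights: np.ndarray, x: np.ndarray) -> np.ndarray:
        # The annotations say "array in, array out" and nothing more;
        # no checker sees that the inner dimensions must agree.
        return weights @ x

    w = np.zeros((10, 784))
    x = np.zeros((100,))   # wrong shape; should be (784,)
    linear(w, x)           # type-checks fine, raises ValueError at runtime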
> is the programming language something that even matters?
It absolutely does. I think one of the major reasons all these massive software companies grind to a halt on innovation is bad software architecture, and a bad choice of programming language contributes to that. Also, underlying systems built in non-performant languages hobble the company for years.
It's a complete delusion that a company will rewrite its systems as it gains product-market fit.
I have a strong list of requirements, but there is no perfect language today. For general software, any language I start a project with today must have:
1. Fast iteration / compilation times.
2. Null safety.
3. Some form of algebraic data types (and by implication a static type system).
4. Garbage collection (unless the domain specifically requires the performance benefits gained from forgoing a garbage collector).
5. A robust standard library and package ecosystem.
All of the above contribute to two very important things:
1. The ability to iterate quickly.
2. The ability to describe state in such a way that invalid states become impossible to represent, due to the type checker (see the sketch after this list).
Nice to have productivity boosters:
1. Actual value types.
2. Pattern matching.
3. A good class system with traits or interfaces.
There isn't a perfect language that meets these requirements so there will always be tradeoffs with language choice.
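On point 2 above, here's a rough sketch of "invalid states unrepresentable" using Python's bolted-on tools (dataclasses, unions, and mypy as the checker; the order-state example is made up), rather than a language with real ADTs:

    from dataclasses import dataclass
    from typing import Union

    @dataclass(frozen=True)
    class Pending:
        pass

    @dataclass(frozen=True)
    class Shipped:
        tracking_number: str   # only a shipped order carries a tracking number

    @dataclass(frozen=True)
    class Cancelled:
        reason: str

    OrderState = Union[Pending, Shipped, Cancelled]

    def describe(state: OrderState) -> str:
        # Each variant carries only the data valid for it, so a
        # "pending order with a tracking number" cannot even be constructed.
        if isinstance(state, Pending):
            return "waiting"
        if isinstance(state, Shipped):
            return f"shipped ({state.tracking_number})"
        return f"cancelled: {state.reason}"

    print(describe(Shipped("1Z999")))

A language with first-class algebraic data types and exhaustive pattern matching gives you this without the ceremony, which is the point of requirement 3.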
> I think one of the major reasons that all these massive software companies grind to a halt with innovation is because of bad software architecture and a bad choice of programming language contributes to that.
I see social structures having far more influence on architecture than any underlying technical details, so I feel your thesis is built on a false premise.
> I see social structures having far more influence [...]
I don't know. If that were true, then we as a society would just architect our C/C++ based projects such that we don't bump into memory safety issues. Instead we see something like 70% of security issues coming from memory unsafety across the board.
The premise seems legit when thinking about it in terms of the issues with C.
This isn't true. A lot of the reason Julia would be so good for ML is that you can write your own high level kernels in Julia that compile down directly to device code with CUDA.jl. The reason Python is in a lot of ways a really bad fit for ML is that you can't write fast python. You instead have to call a C/C++ library.
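For contrast, this is roughly what a custom element-wise GPU op looks like from Python with CuPy (a sketch assuming CuPy and a CUDA device are available; it's the squared-difference toy from CuPy's documentation). The kernel body handed to ElementwiseKernel is CUDA C in a string, not Python:

    import cupy as cp

    # The third argument is CUDA C source, compiled on first call;
    # the Python layer only describes the signature and launches the kernel.
    squared_diff = cp.ElementwiseKernel(
        'float32 x, float32 y',      # inputs
        'float32 z',                 # output
        'z = (x - y) * (x - y)',     # kernel body: C, not Python
        'squared_diff')

    x = cp.arange(10, dtype=cp.float32)
    y = cp.ones(10, dtype=cp.float32)
    print(squared_diff(x, y))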
Language (and, equally important, language culture) absolutely does matter. There is a reason why Julia ML development is pushing academic boundaries and doing really cool experiments with automatic differentiation, while Python ML development is more focused on applying ML to different fields and developing and deploying simpler, more robust, models to production in industry.
As an ML researcher I think about learning a new language like rust whenever I see articles about it appearing here. However, when I consider which parts of my job could be easily replaced with another language besides Python, the list is nearly empty.
I think people are talking about two very different things when they are talking about ML "advancing".
Python helped ML 'advance' in the sense that it is no longer a niche academic endeavour, but rather a common programming technique used practically in basically every field you care to name. Doing basic ML using scikit-learn and PyTorch can be taught to basically anybody with basic programming skills in a couple of days, and is really no harder than any other type of basic programming. Python is very much responsible for this development.
Python might at the same time have held back advancement of ML, in the sense that very little ML being done by most ML developers is "interesting" in an academic sense. It's mostly just people more or less randomly chaining together black box neural networks in the obvious way and seeing what comes out. The ML development coming out of the Julia camp for example is in many ways a lot more interesting and they're trying new things and pushing the boundaries of what can be done. Unfortunately not a lot of this is directly relevant to the countless people who just need to implement 'simple' solutions to 'simple' ML problems right now.
In a world where python ML never happened and a Lisp became the dominant ML language, ML might have been slightly more interesting and 'advanced' from an academic perspective, but it would also be a much more academic endeavour overall and nowhere near as widespread in industry as it is today.
> It's mostly just people more or less randomly chaining together black box neural networks in the obvious way and seeing what comes out.
But would that not be the same in any language if it had the same library support as Python? For ML as it stands, I don't think it matters much what programming language you are working in; you are not really using the features of the language anyway, compared with, say, a medium-sized SaaS product. So it really doesn't (and wouldn't have) mattered, so just pick the most popular and that's it. You are basically just working with the syntax of Python to stuff things into numpy/torch etc.; you are not getting the joys of idiomatic Python anyway. And as I read more and more modern ML code, it really is just that: open file, set up for GPU work, next step which is more of the same.
Fully agree. Python has been key, imo (and I dislike Python and have avoided it for 2 decades, full disclosure), to the recent successes. It is a very accessible language, and that proved to be the secret sauce that let it trivially traverse domain boundaries. Lots of data + GPUs + accessible language. Compare that with the AI Winter stack.
On the contrary.
No scientist, except computer scientists, wants to deal with the quirks of most programming languages. Python has several advantages that you do not find in that combination in any other language:
1. It is very readable, IMHO the most readable of all languages.
2. It has the most useful higher-level, generic data structures built-in, e.g. dictionaries, lists, sets.
3. It is interpreted, which is the only way to make exploratory programming comfortable. Most useful for learners and scientists.
4. Its late-binding approach to data type safety makes ad-hoc programming less painful and encourages the use of tests when production readiness is needed.
Of course, Python is not perfect. I would, for example, appreciate an immutable-datatypes-first design and a more functional-centric approach.
But one can lament about everything.
> On contrary. No scientist, except computer scientists, wants to deal with the quirks of ~~most~~ any programming language.
Fixed that for you. IME, in academia (outside of CS), Python is not especially loved at all: it's just there, with Java, R & Perl, and is more seen as a fact of life to deal with rather than anything else.
> R is too specific and it lacks the vast ecosystem of Python
That's the first time I've ever heard this criticism of R: most staunch R critics will acknowledge R's vast breadth and depth of statistical libraries outclasses Python. Python has far more general use programming libraries (so deploying models is easier) and used to have a slight advantage in ML specific libraries, but there aren't major differences nowadays between the two ecosystems.
Counterfactual hypotheses are untestable and are of little interest other than for social banter and light conversation. "If only we had done this, this might have happened..."
Reading the tea leaves and forecasting the future has never been a very successful enterprise, and doing so in hindsight isn't much better. This behavioral tendency was explained by Frank Herbert, among others:
> "Deep in the human unconscious is a pervasive need for a logical universe that makes sense. But the real universe is always one step beyond logic." ― Dune
I think it's useful to revisit things with the benefit of hindsight to better understand why things happened a certain way and how they might have played out differently. But I also completely agree that such exercises offer no proof of alternative outcomes and should be limited.
Something like the viral adoption of a language for ML has too many complex psychosocial factors for a hot take counterfactual sound bite. And twitter allows for just enough space to stake a claim on some rhetorical ground, not much left to explore the detail required.
ML is mainly limited by poor data quality. It is hard to convince a business to fix this technical debt unless it is truly data-driven. Most businesses believe they are data-driven if they have a few fancy-looking dashboards, even if those are built on quicksand.
If things were totally different, things would have turned out differently. There is a universe where the ecosystem would have led to faster development, and there are many where it would have led to slower development.
There are lots of things Python could've done better that would make it easier to work with..., but the Chief AI Scientist here cares about white space instead...
Seriously: virtualenvs and multiple versions of Python are a mess (even with pyenv), packaging and package installation are a mess (why was there ever a need to run a setup.py script just to figure out dependencies?), dependency management is a mess (pip-tools? poetry? pipenv? why three imperfect tools instead of a single perfect one?), and deployment is a mess (virtualenvs aren't easily transferable between machines; docker makes it a bit more palatable)...
This is just another "X eats Y for breakfast" snowclone, where somebody thinks deeply about Y, so they conveniently ignore X as it relates to why something they do/don't like with respect to Y is unpopular/popular.
"Culture eats strategy for breakfast" (Drucker's law)
"Data eats algorithms for breakfast" (Andrew Ng's deep learning lesson)
"Computation eats models for breakfast" (Sutton's bitter lesson)
"Descriptive eats prescriptive for breakfast" (Linguists everywhere)
and here, another
"Ecosystem and community eats language features for breakfast"
If you take the thread's stated pros (low barrier to entry, experimental scripting workflow) and cons (not fast, not compilable, has a GIL) at face value, it's interesting evidence for what really matters compared to what merely needs to be acceptable in an ML language.
The thing neither of them lists, which is absolutely enormous, is the breadth and quality of boring general-purpose stuff in Python. It's a big deal that you can write a production-grade web app or API in the same language you do modeling work in.
Why does anyone care? Change is inevitable. It comes when it comes. Change comes faster than it ever has before. The vehicle doesn’t matter.
Go read Only the Paranoid Survive and compare it to what's happening today. The preface can be applied today to the strategic shift. Imagine, then, waking up one day to this significant change. Nobody can predict it. Sure, plenty of people can speculate now that it has already happened.
> The Lesson is, we all need to expose ourselves to the winds of change. - Andy Grove
Personally I'm a fan of any ecosystem having at least 2 languages that work well, specifically ones with very distinct set of strengths and weaknesses. No one language can be everything to everyone, though I admit Python tries incredibly hard by using C FFI to hide some of the speed issues.
I was a fan of LeCun back in the early 2000's when his review papers showed me the way to build industrial text classifiers.
That said, in 1999 I rage-quit Python because somebody retabbed a file and sent it back to me, and I spent a while debugging a logic problem caused by a spacing change that was invisible in my text editor. I got dragged back into Python in the mid-2010s because there was so much ML and science work that people wanted me to do. That's how Python survived the near-death experience of the 2 -> 3 transition: it found a new community of users who could start working in Python 3 and who often didn't have the usual industrial concerns about backwards compatibility.
If you were going to bitch about Python today, though, it would be about (1) generic brokenness in environments because of (a) the site-local directory, (b) misconfigured charsets, and (c) other misconfigurations, (2) pip not correctly resolving complex projects, and (3) distractions like conda that claim to make things better but actually make them slightly worse.
I worked at a place that was developing ML models in Python and struggled with all of the above; by the time I really understood the problems, the company had decided to move away from Python. It was worse back then because we were running models in different versions of Tensorflow that required different versions of CUDA. If you installed software the way NVIDIA tells you to, that would involve a lot of logging in, click, click, clicking and watching progress bars crawl slowly, but I did figure out how to package the CUDA libs in conda, although conda drove me nuts because it uses bzip2 compression that had me watching progress bars crawl slowly all the time.
Today there has been a lot of reform and it is not as bad as it was, but there is still the insanity that the best the Python libs can do is code like
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
and it is just unacceptable to me in an industrial environment to have the script itself download a 400MB model as opposed to managing models in the deployment system. (Somehow schemes like the above wind up downloading the model 20 times when you only need to download it once, stashing 20 copies of the model in different files, corrupting the files, etc.)
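What I'd rather see (a sketch; the directory below is a hypothetical location managed by the deployment system, not anything the library mandates): the model gets fetched and pinned once at build/deploy time, and the application only ever loads it from local disk.

    from sentence_transformers import SentenceTransformer

    # Baked into the image / artifact store by the deployment pipeline,
    # not downloaded at runtime by the application itself.
    MODEL_DIR = "/opt/models/all-MiniLM-L6-v2"   # hypothetical path

    model = SentenceTransformer(MODEL_DIR)       # loads from disk, no network
    embeddings = model.encode(["an example sentence"])

As far as I know, SentenceTransformer accepts a local directory like this, so the fix is mostly a matter of deployment discipline rather than library support.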
Hottest take: it’s not a useful question, certainly not for exploration on Twitter.
If you want to argue the murky depths of counterfactuals where little can be proved then it’s going to take more space than awkwardly formatted twitter conversation threads.
The 2-3 transition was a non-issue for a lot of people, and a major stumbling block for folks who staunchly avoided any kind of change to their codebase. But I don't think Python was already in a dominant position -- many people switched stacks.
However, for me there are two key points around it: 1) it's over and done with 2) it wasn't that big of a deal if you planned for it. 90% of the things I was working on just worked with minor changes. The rest was mostly a matter of a few critical dependencies that had to be refactored around.
> The 2-3 transition was a non-issue to a lot of people,
It was a non-issue for relatively small scripts. But if you were dependent on non-converted libraries, or e.g. if you handled a lot of non-English strings, it could get iffy very fast – and blow up in your face months later, in a rarely used code path.
> don't think Python was already in a dominant position
IME, it was very much in a, if not dominant, at least powerful position everywhere.
It was roaring in webdev, most of academia had switched from Perl to Python, ML/DS was starting to seriously take off outside of research, and it was winning the war against Ruby as the go-to language for quick 'n dirty between-bash-and-C programs and scripts.
Every language community apparently craves “pain” in some aspect of their chosen golden hammer while simultaneously pointing fingers at other languages and their pain points. There is an element of masochism to programming that is often overlooked. (Sometimes this manifests as sadistic tendencies inflicting pain on the user.) Python is so painless at the source level that naturally the required pain had to be injected elsewhere. Of course no one can touch the JavaScript community as far as programming masochism is concerned.
The worst thing that ever happened to Python was being adopted as a scripting language in Linux distributions, which meant you couldn't ever update the "python" binary. "python3" was the worst idea since Solomon proposed cutting a baby in half because it leads inevitably to "python3.5", "pip3.6", "python3.7-pypy", etc. It's one more reason you can't write scripts that "just work".
Fortunately Python is doing the right things to restore sanity (pip refusing to install anything outside of a venv) but it's been a long time.