Matplotlib (matplotlib.org)
64 points by tosh on Nov 12, 2023 | 42 comments


By far my favorite feature of matplotlib is that it produces readable and accessible plots by default. Font sizes and linewidths are not too small, the default color palette is colorblind-friendly, and the default colormaps are perceptually uniform. Other software (for example Matlab and Mathematica) has absolutely terrible defaults and requires a lot of fiddling to get the plots right.


IMO the font size is still too small (ditto for pretty much all stat software). But matplotlib makes it quite easy to change the default look, here is mine:

    import matplotlib
    
    # Personal defaults: dashed grid, opaque legend with a shadow, larger fonts
    theme = {'axes.grid': True,
             'grid.linestyle': '--',
             'legend.framealpha': 1,
             'legend.facecolor': 'white',
             'legend.shadow': True,
             'legend.fontsize': 14,
             'legend.title_fontsize': 16,
             'xtick.labelsize': 14,
             'ytick.labelsize': 14,
             'axes.labelsize': 16,
             'axes.titlesize': 20,
             'figure.dpi': 100}
    
    # Applies to every figure created after this call
    matplotlib.rcParams.update(theme)
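
If you would rather not change the defaults for the whole session, the same kind of dictionary can be applied to a single block of code. A minimal sketch, assuming a shortened version of the theme and the non-interactive Agg backend (both are choices made here for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Shortened copy of the theme; any valid rcParams keys work here.
theme = {"axes.grid": True, "axes.labelsize": 16, "axes.titlesize": 20}

default_labelsize = matplotlib.rcParams["axes.labelsize"]

# rc_context applies the overrides only inside the with-block.
with matplotlib.rc_context(theme):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])
    ax.set_xlabel("x")
    ax.set_title("Themed plot")
    labelsize_inside = matplotlib.rcParams["axes.labelsize"]

# Outside the block the previous defaults are back.
labelsize_after = matplotlib.rcParams["axes.labelsize"]
plt.close(fig)
```

This keeps the global state untouched, which matters if other code in the same session relies on the stock defaults.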


That fiddling used to drive me nuts in any of the tools I used to work with. It's part of the problem I'm trying to solve with my current open source project (evidence.dev) where we're tackling viz-as-code. You might find it interesting.

Previous HN discussions:

https://news.ycombinator.com/item?id=35645464 (97 comments)

https://news.ycombinator.com/item?id=28304781 (91 comments)


Tangential, but a thing about defaults: when I was at uni, a lot of labs did their data analysis and plotting in something called Origin. I absolutely loved the defaults there. To this day I can recognize a plot made with Origin, and I had a visceral reaction whenever I saw something done in Excel. Sane defaults are very important, and I'm not sure why they are so often an afterthought.


My only issue with it is that you need a lot of tweaking to get a compact layout. Other than that, it's a key part of what makes Python such an incredibly useful language.


> the default color palette is colorblind-friendly

No, it very much isn't. The second and third colors, the orange and the green, look extremely similar to protanopes (red deficiency). Fortunately, there's a plan to fix this for Matplotlib 4.0.
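
If the default cycle is a problem for you, one workaround is to switch to a palette designed with color-vision deficiencies in mind. A small sketch using the `tableau-colorblind10` style sheet that ships with Matplotlib (whether it suits a given figure is a judgment call, and the data is made up):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Apply the style only within this block instead of globally.
with plt.style.context("tableau-colorblind10"):
    cycle_colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]
    fig, ax = plt.subplots()
    for i in range(3):
        ax.plot([0, 1, 2], [i, i + 1, i + 2], label=f"series {i}")
    ax.legend()
    # Each new line takes the next color from the active cycle.
    line_colors = [line.get_color() for line in ax.lines]
plt.close(fig)
```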


It's a great library. I only wish that it didn't have two completely different, incompatible interfaces to muddy the documentation, examples, and SO answer threads.


By which you mean these I think?

1. the pyplot API (the old one, that is designed to mimic MATLAB - it has functions like plot() and xlabel()).

2. The Axes API - fig, ax = plt.subplots(); then call ax.plot(), ax.set_xlabel() and so on

One should always prefer the Axes way of doing it, it's refactorable and uses less hidden global state.
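
For anyone who has not seen the two side by side, a minimal sketch of the same plot written both ways (the data and labels are made up):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y = [0, 1, 4, 9]

# 1. pyplot API: functions act on an implicit "current" figure and axes.
plt.figure()
plt.plot(x, y)
plt.xlabel("x")
plt.ylabel("x squared")
implicit_ax = plt.gca()  # the hidden global state, made visible

# 2. Axes API: the figure and axes are explicit objects you can pass around.
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlabel("x")
ax.set_ylabel("x squared")

plt.close("all")
```

Note the naming mismatch that trips people up: `plt.xlabel(...)` on one side, `ax.set_xlabel(...)` on the other.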


> One should always prefer the Axes way of doing it, it's refactorable and uses less hidden global state.

I fully agree, but half the time you're looking up how to do something, you find methods documented using the other approach, with slightly different method names that don't even have an alias in the axes API. In fact those are usually the ones you find, because people answering SO questions seem to prefer the brevity.


generally when i'm using matplotlib instead of d3 it's because i'm less concerned about things like refactorability and hidden global state than about getting some data points on the screen in as few keystrokes as possible; %pylab inline and the pyplot api are far superior for that

i wish the axes api didn't exist


And when I use matplotlib it's to clean up some non-default tweaks to seaborn plots. Refactorability here means that it's easy to go from one plot into one that's split into facets (multiple subplots etc.)


true, the subplot api is especially painful because i have to edit each line of code i copy-paste to give it a different subplot number
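
For what it's worth, the numbering can be avoided by asking for the whole grid at once and looping over it. A sketch with made-up data (the facet names are placeholders):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Hypothetical facets: one small data series per panel.
facets = {
    "a": [1, 2, 3],
    "b": [3, 1, 2],
    "c": [2, 3, 1],
    "d": [1, 3, 2],
}

# One call creates the 2x2 grid; axes is a 2-D NumPy array of Axes objects.
fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)

# axes.flat iterates over the panels in row-major order,
# so no panel needs a hard-coded subplot number.
for ax, (name, values) in zip(axes.flat, facets.items()):
    ax.plot(values)
    ax.set_title(name)

titles = [ax.get_title() for ax in axes.flat]
plt.close(fig)
```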


The old API was meant to mimic MATLAB. It's in the name -- the "mat" in "matplotlib" refers to MATLAB. And yes, those of us who were brought up on MATLAB gravitated toward it in the early days because it was so familiar. For better or worse, MATLAB was and still is popular in academia (less so these days, but it is still used, especially in control engineering).

MATLAB's plotting is not the best, but it's familiar to MATLAB legacy folks, who made up many of the early folks who moved over to Numpy, Matplotlib, SciPy. It was a bridge.


Matplotlib is amazing and nice to have when you need it but can be way too complicated for the average user and visualization. Most users' needs, I think, would be better served by something like Plotly Express or other higher-level dataviz packages (for visualizations-thru-code) and Tableau, PowerBI, and Excel (for interactive WYSIWYG dataviz).


People don't seem to understand that matplotlib is the lower-level backend/API and not really intended to be a visualization interface for the average user. It's a power tool. Matplotlib's users are other python developers making more high-level visualization libraries, like seaborn, pandas (df.plot()), plotly, ggplot (the python port/version). If you know the matplotlib API, then you can directly modify figures from seaborn, etc.
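
To illustrate the last point without pulling in another dependency, here is a sketch in which `high_level_plot` is a hypothetical stand-in for a call such as one of seaborn's axes-level functions or pandas' df.plot(). All that matters is that it hands back a regular Matplotlib Axes:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt


def high_level_plot(values):
    """Hypothetical stand-in for a high-level plotting call that returns an Axes."""
    fig, ax = plt.subplots()
    ax.plot(values, label="data")
    return ax


ax = high_level_plot([1, 3, 2, 4])

# From here on it is plain Matplotlib: tweak the artists the wrapper created.
line = ax.lines[0]
line.set_linestyle("--")
line.set_linewidth(3)
ax.set_title("Tweaked after the fact")
ax.legend(loc="upper left")

plt.close(ax.figure)
```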


What about it? Are you trying to point to something specific?


I haven’t used matplotlib in a while (partly because I was never fluent with it) and am now using it regularly in Code Interpreter w/ GPT-4.

Now I’m thinking there must be many more libraries that are now “in reach” for me for casual use which weren’t just a few weeks ago.


Same, I just had the problem of wanting to display crosstabs between n raters, which calls for a grid view of heatmaps. It was a 15min GPT job rather than struggling for 2 hours on my own. I got to worry about the salient parts of the problem rather than the minutiae of matplotlib.


Probably it's their favorite library?!


I always wondered how different Matplotlib would have been if John Hunter didn’t die so young. His passing was right near the boom of Python too.


Bokeh has a much more sensible API, and it generates an interactive HTML plot by default.

Unfortunately it's missing quite a few specialized plots from matplotlib, in particular the popular histogram plot is strangely difficult to draw.


Bokeh requires a lot more typing to do the same thing. Also, for interactive work, I often want the GUI, not a webpage with the plot on it.


Have you tried Altair? I am considering switching away from Plotly and have been eyeing Altair. (I have no idea how good it is.)


I'm moving the opposite direction: Altair -> Plotly. I find altair to be too "grammar of graphics" for its own good. And the vega backend makes it hard to hack around. Saving to pdf or high-quality also takes extra steps with additional dependencies.


Matplotlib is a fantastic library even if I don’t love the API.

The name and API come from MATLAB, and once you realize that it makes more sense. It was originally meant to replicate the plotting functionality in python to be familiar to MATLAB users. IMO this is what makes it feel non-pythonic, but it was never really a python API in the first place.


Exactly! Typing "import matplotlib" at the python REPL instantly gave you the capability of Matlab without the requirement of a commercial license. If I recall correctly, matplotlib was built after Matlab increased the prices for its commercial license for academics.

Between matplotlib and pandas I believe we have a sufficient explanation as to how python became the language of choice for data analysis.


Both have excellent features and confusing APIs.


My pet-peeve with matplotlib is the terrible layout of multiple plots on a grid. It's tedious (requires quite a bit of redundant typing), setting aspect ratios is tricky, getting margins right is an effort of trial and error -- setting reasonable font-sizes will almost certainly get you overlapping axes labels (especially when using tight_layout) by default.

Don't get me wrong, matplotlib still gives the best looking publication-ready plots (people mentioned bokeh, which is great for interactive plots on the web, but completely unsuitable for creating plots for pdf articles). I just sometimes wish less tweaking was required.

I found that proplot (https://proplot.readthedocs.io/en/stable/) is a nice wrapper around matplotlib that alleviates quite a few of these issues.
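
Within plain matplotlib, one thing that sometimes helps with overlapping labels is the constrained layout engine instead of tight_layout. A sketch, not a cure-all; the font sizes, figure size and grid shape are arbitrary choices:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

with matplotlib.rc_context({"axes.labelsize": 16, "axes.titlesize": 18}):
    # constrained_layout=True asks Matplotlib to size the panels so that
    # tick labels, axis labels and titles do not collide.
    fig, axes = plt.subplots(2, 3, figsize=(9, 5), constrained_layout=True)
    for i, ax in enumerate(axes.flat):
        ax.plot([0, 1, 2], [0, i, 2 * i])
        ax.set_xlabel("time (s)")
        ax.set_ylabel("signal")
        ax.set_title(f"panel {i}")
    fig.canvas.draw()  # run the layout so the final positions are computed

# Bounding boxes of the panels in figure coordinates (0..1).
boxes = [ax.get_position() for ax in axes.flat]
plt.close(fig)
```

Aspect ratios and margins still need manual attention in some cases, so this only addresses part of the complaint.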


"Artist" in Matplotlib - something I wanted to know before spending tremendous hours on googling how-tos. - DEV

This looks like a really nice explainer https://dev.to/skotaro/artist-in-matplotlib---something-i-wa...


ggplot > matplotlib, cmm


Yep! I even started using the python clone, plotnine.


Not my favorite library but it gets the job done.


Ditto. Awfully convoluted compared to, say, R.


odd how it's still the de facto standard python library for plots despite much better libraries like ggplot or plotly


Plotly might just be one of the libraries that manages to make documentation less readable than matplotlib. API reference consists of 90% autogenerated doc trash 90 pages long.


Pgfplots produces the highest quality plots that I have seen.


I find Pgfplots quite hacky since TeX is really not made for handling data (it doesn't even support floating-point arithmetic). A much cleaner solution is to enable LaTeX text rendering in Matplotlib (https://matplotlib.org/stable/users/explain/text/usetex.html). In this way, TeX is used for what it is good at (text rendering), and everything else is handled by Matplotlib.


The idea is to process it all in Python and export the final data points to text files to be plotted by pgfplots. Then you can import the pgfplots code directly in the latex document or slide, and adjust it there. It will all be latex in one document, and you will have full control over the plot and its annotation. With Tikz, you can superimpose diagrams.
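
The Python half of that workflow can be very small. A sketch under stated assumptions: `write_pgfplots_table` is a hypothetical helper, the sine data is made up, and the file name `sine.dat` is a placeholder:

```python
import math
import os
import tempfile


def write_pgfplots_table(path, columns):
    """Write named columns as a whitespace-separated table with a header row."""
    names = list(columns)
    rows = zip(*(columns[name] for name in names))
    with open(path, "w") as handle:
        handle.write(" ".join(names) + "\n")
        for row in rows:
            handle.write(" ".join(f"{value:.6g}" for value in row) + "\n")


# Made-up data: one period of a sine wave.
xs = [i * 2 * math.pi / 20 for i in range(21)]
ys = [math.sin(x) for x in xs]

# Hypothetical output location; in practice this sits next to the .tex file.
out_path = os.path.join(tempfile.mkdtemp(), "sine.dat")
write_pgfplots_table(out_path, {"x": xs, "y": ys})

# On the LaTeX side the file would then be read with something like:
#   \addplot table[x=x, y=y] {sine.dat};
```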


Or use Matplotlib's PGF backend. You can even import a preamble so the notation macros from your paper are usable in the legend, axes labels, etc.
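
A sketch of what that can look like. The settings are assumptions, `\vect` is a hypothetical macro standing in for whatever notation your paper defines, and actually saving the file needs a working TeX installation:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for everything except the save
import matplotlib.pyplot as plt

# Assumed settings; adjust the TeX system and preamble to match your document.
pgf_settings = {
    "pgf.texsystem": "pdflatex",
    "pgf.rcfonts": False,  # let the LaTeX document choose the fonts
    "pgf.preamble": r"\newcommand{\vect}[1]{\mathbf{#1}}",
}


def save_as_pgf(path):
    """Draw a small figure and write it with the PGF backend (needs TeX)."""
    with matplotlib.rc_context(pgf_settings):
        fig, ax = plt.subplots()
        ax.plot([0, 1, 2], [0, 1, 4], label=r"$\|\vect{x}\|^2$")
        ax.set_xlabel(r"$\vect{x}$")
        ax.legend()
        fig.savefig(path, backend="pgf")
        plt.close(fig)
```

Calling `save_as_pgf("figure.pgf")` should produce a file that can be pulled into the paper with `\input{figure.pgf}`, provided the document loads the pgf package and defines the same macro.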


How is this relevant? Pgfplots isn't even a matplotlib competitor really, they're completely different in terms of use cases and ecosystems.


can you provide some links to examples so we can see what you mean


See examples in the PGFPlots manual:

https://ctan.org/pkg/pgfplots

However, examples from pgfplots and matplotlib should be compared in the context of an actual paper, taking into account the effort of preparing and revising the figures.



