Tqdm (Python) (tqdm.github.io)
738 points by manjana on Dec 16, 2021 | 162 comments


I really love the wave of new tools that are gaining traction in the Python world: tqdm, rich (and soon textual), fastapi, typer, pydantic, shiv, toga, doit, diskcache...

With the better error messages in 3.10 and 3.11, plus the focus on speed, it's a fantastic era for the language, and it's a ton of fun.

I didn't expect to recapture the "too-good-to-be-true" feeling I had when starting with 2.4, and yet here we are.

In fact, being a dev is kinda awesome right now, no matter where you look. JS and PHP are getting more ergonomic, you get a load of new low-level languages to get the most out of your hardware, Java is modern now, C# runs on Unix, the Rust and Go communities are busy shipping fantastic tools (ripgrep, fdfind, docker, cue, etc.), Windows has a decent terminal and WSL, my mother is actually using Linux, and Apple came up with the M1. IDEs and browsers are incredible, and they do eat a lot of space, but I have a 32 GB RAM + 1 TB SSD laptop that's as slim as a sheet of paper.

Not to mention there is a lot of money to be made.

I know it's trendy right now to say everything is bad in IT, but I disagree: overall, we have it SO good.


I had to look these up (just me doing Google searches):

- tqdm: progress bars (https://tqdm.github.io/)

- rich: text formatting (https://github.com/willmcgugan/rich)

- textual: TUI, using rich (https://github.com/willmcgugan/textual)

- fastpi: Rest APIs (https://fastapi.tiangolo.com/)

- typer: CLI Library, uses Click (https://typer.tiangolo.com/)

- pydantic: Custom data types (https://pydantic-docs.helpmanual.io/)

- shiv: Create Python zipapps (https://shiv.readthedocs.io/en/latest/)

- toga: GUI Toolkit (https://toga.readthedocs.io/en/latest/)

- doit: Task runner (https://pydoit.org/)

- diskcache: A disk cache (https://github.com/grantjenks/python-diskcache/)


You can find most of those here: https://taoofmac.com/space/dev/python.

Those that aren't should be there momentarily :)


Just wanted to add Austin: Python frame stack sampler for CPython written in pure C (https://github.com/P403n1x87/austin)


Thanks for the links!

FWIW Python has a built-in disk cache: shelve (https://docs.python.org/3/library/shelve.html)
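
A minimal sketch of using it as a persistent dict (the file name is arbitrary):

    import shelve

    # Values are pickled transparently and persist across runs.
    with shelve.open("my_cache") as db:
        db["expensive_result"] = [1, 2, 3]

    with shelve.open("my_cache") as db:
        print(db["expensive_result"])  # [1, 2, 3]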


Diskcache has the amazing benefit of transparent sharding, so you can use it in multithreaded and multiprocessing programs.
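
A minimal sketch of the sharded variant, assuming diskcache's documented FanoutCache (the directory path is a placeholder):

    from diskcache import FanoutCache

    # shards=8 splits the store across several SQLite files so concurrent
    # writers don't contend on a single one.
    cache = FanoutCache("/tmp/demo-cache", shards=8)
    cache["answer"] = 42        # safe to call from threads or processes
    print(cache.get("answer"))  # 42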


Damn, so many goodies to play with. Thanks for the list.


TQDM isn't all that new; it's one of the first things I started to play with when learning to write code. That said, it is an amazing tool.


New features get added, and these posts are a nice reminder to check the docs of a tool I literally never need to look up, because it's so habitual to wrap any long-running iterable inside a `tqdm()`.

TIL tqdm now has a dead simple Telegram hook: https://tqdm.github.io/docs/contrib.telegram/
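
A sketch based on that docs page; the token and chat_id are placeholders you'd get from BotFather and your own chat:

    from tqdm.contrib.telegram import tqdm

    # The bar edits a single Telegram message in place as it progresses.
    for _ in tqdm(range(10_000), token="TOKEN", chat_id="CHAT_ID"):
        pass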


It never occurred to me that chat posts could be quickly edited to make them dynamic. Maybe other chat bots do that too. It's definitely a cool add-on.


Yeah, I've been using it for a while, but with the barrage of memes and politics on the internet nobody finds out about the good stuff until years later.


tqdm is indeed 8 years old, but it's been popping up everywhere for only a couple of years now. Hence the "gaining traction".


I used an early version of this in 2009.


I've used tqdm 20 years ago


I've used tqdm 30 years ago!

Although the first release on PyPI was in 2013: https://pypi.org/project/tqdm/#history


Way before @noamraph put it up on PyPI, it was available internally in our company. To my knowledge he hadn't written it 30 years ago, but if he had, it probably wasn't for Python!


Wise of you...


OP didn't take issue with "gaining traction" and clearly was responding to "new tools". Tqdm isn't a new tool.


It’s been making some tqdm lately


Despite all the hate Python gets (pkg system), it's still my go-to language. Most fun. Most rewarding. Getting things done fast.

I can go home with Python at 5pm. And have a good time with my family.


I hate python, I just hate everything else more.

Jokes aside, it's a great language. I would love to see a better package management ecosystem for it, as that is also my biggest issue. Python also does something no other mainstream scripting language does: it allows you to install extensions to the language right in the same package manager, since it can compile libraries from source when wheels are unavailable; this makes it a much harder challenge. At the same time, I'm really happy that PyPI is a non-profit organization and won't have to go through the issues that something like NPM did.


I love Python (flexibility, ecosystem, datamodel). I love Java (verbosity, explicitness, robustness). Now, kill me. I don't let others define what I should and shouldn't think about a tool, in this case a programming language. I make up my own mind based on my own experience with the tools. Remember, they're just tools. Not the end game.


For me at least, Poetry [1] automates the process almost on par with npm, cargo, etc.

[1]: https://python-poetry.org/


I really like Poetry for my own uses, but as a semi-infrequent Python user, my struggle is dealing with other people's Python repositories that aren't using it.

I can never remember all the differences and subtleties between virtualenv, pipenv, venv, pyenv-virtualenv, workon, conda, and so on when I encounter them in a random git repo.


Right?

Doing advent of code in Python is almost cheating. It feels like playing a video game.


I did AoC 2021 through Day 7 in Python before switching to Go (for personal learning; I can't remember if the times listed below were with "go run" or from running after "go build"). I actually did Day 7 in both, to try to solve a problem I was having with my Python performance. It ended up being a big learning moment for me when it came to structuring my logic.

My Python code in Part 2 (some minor adjustments from Part 1) took about 40 seconds to run. Terrible, but usable to get a proper answer. I was able to bring it down to ~25 seconds with some optimization, by adding a calculated lookup dictionary per loop. The exact same logic in Go ran in about 0.8 seconds.

However, I wasn't satisfied with this and realized that if I moved the dictionary to be global rather than per-loop, I could realize significant performance gains by completely eliminating redundant calculations. This change dropped the runtime from 25 seconds to 0.35 seconds (and applying the same logic in Go brought it down from 0.8s to 0.05s).

Because of the performance ceiling of the Python interpreter, it can actually lead you down paths of learning better optimization strategies that you might initially write off in other languages (depending on the use case) because they perform inherently better. It made me think a bit more about what I was doing and how I could improve it, since (in this particular case) the impact of not doing so was pretty drastic.


Days 6 and 7 are fantastic for that. They're a learning moment, where thinking about the problem for a few minutes instead of brute-forcing gives you a dramatic speedup.

When I read Day 7, I saw that brute-forcing would lead to bad performance. Then I remembered that in high school, I learned a formula to calculate the sum of the first n terms of a sequence without having to add up the entire sequence.

I couldn't remember the formula, nor the name of the concept, so I googled around until I found some tutorials and relearned what I was taught as a kid: arithmetic sums.

The consumption for a crab can then be calculated in constant time:

    def crab_consumption(crab, target):
        # Arithmetic sum 1 + 2 + ... + d with d = abs(target - crab);
        # substituting n = d - 1 gives (n**2 + 3*n + 2) / 2 = d*(d + 1)/2.
        n = abs(target - crab) - 1
        return (n**2 + 3*n + 2) // 2  # integer division: fuel is a whole number
And the Python solution finishes instantly.

Bottom line, I could keep using Python for all problems and benefit from the amazing productivity of it.

I've been using Python since 2.4 now, and it's not always fast enough. But it very, very often is.


Interesting, I'll have to look at that. Actual algorithms are a big weak point for me; I'm not a developer by day, so I don't spend a lot of time learning coding practices or computer science/math topics. This was my solution to Day 7 Part 2 (positions = sorted int list of the input data):

    def calculate_fuel(positions):
        fuel = None
        low_fuel_value = None

        calculated = dict()  # cache: distance -> fuel consumed over that distance
        for value in range(positions[0], positions[-1] + 1):
            consumed = 0
            for position in positions:
                diff = abs(value - position)
                try:
                    consumed += calculated[diff]
                except KeyError:
                    if position != value:
                        consumption = sum(range(1, diff + 1))
                        calculated[diff] = consumption
                        consumed += consumption
            # 'is not None' so a legitimate fuel value of 0 isn't treated as unset
            if fuel is not None:
                if consumed < fuel:
                    fuel = consumed
                    low_fuel_value = value
            else:
                fuel = consumed
                low_fuel_value = value

        print(f"Aligning to: {low_fuel_value}")
        return fuel
And on Day 6 I fell for the brute-force bait and had the thought "It can't be as easy as changing 80 to 256, right?". Then I realized the pain I had created for myself. BUT! My 6p2 code ran faster than my 6p1 by a good margin, which I was happy about.


You can even calculate the target without brute-forcing: n^2 + n is approximated by n^2 when n is large, so you can take the mean of all the positions and use that (or perhaps +/- 1).

    val crabs = lines.first().split(",").map { it.toInt() }
    val avg = crabs.sum() / crabs.size
    return crabs.sumOf { abs(it - avg).let {dst -> (dst * (dst + 1)) / 2} } 
And similarly for part 1, take the median. Why it kinda works:

Part 1: The median made sense intuitively; in my head I thought about an example à la 1, 1, 3, 100. It never makes sense to use x > 3, because even though the crab at x=100 can then walk a shorter distance, there are 3 others that then have to walk longer. And x = 1, 2, or 3 doesn't matter; it just symmetrically changes which side has to walk one step less or one step more.

And for part 2 I thought similarly, except the cost grows quadratically, and therefore I want to minimize the average move and not the total moves, thus taking the mean.


The better optimization with Python is to not use Python. Half joking. If you want awesome performance, transform your problem into a numerical one and use NumPy. NumPy is awesome. I always miss it when using languages other than Python.
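
For instance, here's a sketch of the Day 7 part 2 cost vectorized with NumPy, using the example puzzle input; broadcasting computes every (target, crab) distance at once:

    import numpy as np

    crabs = np.array([16, 1, 2, 0, 4, 2, 7, 1, 2, 14])  # example input
    targets = np.arange(crabs.max() + 1)
    dist = np.abs(crabs[None, :] - targets[:, None])  # shape (targets, crabs)
    cost = dist * (dist + 1) // 2                     # arithmetic-sum fuel
    print(cost.sum(axis=1).min())                     # minimum total fuel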


I don't understand your solution to Day 7. What kind of dictionary, and how come even a naive solution would be so slow?

My Kotlin solution runs in about a second. And it was even so stupid that I didn't calculate the sum of the arithmetic series directly, but through a loop. Can't fathom something being slower.

    line.split(",").map { it.toInt() }.let { crabs ->
        (0..crabs.maxOf { it }).minOf { pos ->
            crabs.sumOf {
                (0..abs(it - pos)).sum() // slower than calculating arithmetic sum, but quicker to write 
            }
        }
    }
My proper Kotlin solution runs in less than a ms, though.

I do think that "it's good that python is slow because it forces you to optimize" is a weird take, though.


> I do think that "it's good that python is slow because it forces you to optimize" is a weird take, though.

My take wasn't that it forces you to, but that non-optimal code paths can be greatly exaggerated in comparison to other languages, particularly compiled ones. You can still ignore it (within reason), but it can give you that extra push to really look a bit deeper and understand what's going on. And of course, there are optimized libraries written in C/C++ that you can take advantage of for even better number crunching than standard CPython.

> What kind of dictionary, and how come even a naive solution would be so slow?

My naive solution was literally going through every single element on every loop and not storing any data besides the fuel buildup and the alignment number that generated it. The dictionary was added to act as a cache for already-computed fuel consumption values, initially per loop and then moved one level up to be global (because the summations wouldn't differ).

I'm not saying my method (posted in a sibling comment) is the best solution, but it's the way my brain walked through the problem.


Posted my optimized Kotlin solution in that sibling thread :)

Cool of you to participate without being a developer! Knowing lots of computer science topics makes it easier; it's hard without them. For instance, graph searching / Dijkstra has been relevant this week.


Yeah, I was doing a CS minor in college but had to drop it, as it was consuming too much of my time from my other discipline in the non-intro courses. Big O / time complexity was my usual failing in the intro algorithms course I took.

I'm not unfamiliar with programming, but I come from the sysadmin side of things. "Glue" work is usually where things are focused and the 'fun' nitty-gritty of algorithms can be a bit out-of-scope, though I'm not a sysadmin in my current role anymore so any dev-related work I do is purely personal now.

I've had to take a break from AoC; I only got through Day 10, and didn't get Part 2 for Days 8 and 9. It's a fun way to keep the mind going and to slip back into the coding space, to at least not lose skills, even if the solutions are simple/non-optimal.


Yes! Tasks like AoC (relatively small, without stringent performance requirements but still requiring the correct algorithm) are where Python is not only a reasonable choice, it's unreasonably effective.

Doubly so this year, with the theme being linear algebra.

  import numpy as cheat
:)


Your positivity and enthusiasm are infectious. Thanks for being an optimist.


doit [0] is a superb toolkit for building task-oriented tools with dependencies. It isn't too complex to get started and gives you a lot to work with. I've used it for an internal tool used by everyone in the company for 5+ years and it has never given me a headache.

[0]: https://pydoit.org


And surprisingly underrated.

I mean, it's declarative, works on Windows, easy things are (very) easy, and because you can mix and match bash and Python actions, hard things are suspiciously easy too.

Given how complicated the alternatives are (Maven, Ninja, Make, Gulp...), you'd think it would have taken over the world years ago.

Yet I've only started to see people in the Python core dev team use it this year. It's only getting traction now.


"hard things are suspiciously easy too."

Care to share an example?


Here is a task that groups all static files from a web project into a directory.

It has to make sure a bunch of directories exist, run a Node.js command to build some files from the proper dir, then run a Python command to regroup them plus all the static files from all the Django apps into one dir. Simple.

But then I had a problem I had to hack around, which required me to change an entry in a generated TOML file on the fly at every build.

doit just lets me add a 5-line Python function that does whatever I want, insert it between my bash tasks, and I'm done.

    import toml  # third-party 'toml' package, used below
    from pathlib import Path

    def task_bundle_static_files():

        def update_manifest():
            conf = Path("var/static/manifest.toml")
            data = toml.loads(conf.read_text())
            data["./src/main.js"] = data["index.html"]
            conf.write_text(toml.dumps(data))

        return {
            "actions": [
                "mkdir -p ./var/static/",
                "rm -fr ./var/static/* ",
                "cd frontend/; npm run build",
                "python manage.py collectstatic --noinput",
                update_manifest,
                "cp -r ./var/static/* ",
            ]
        }


Or if you prefer Rust, there's just: https://github.com/casey/just/

better yet, use both and just doit


For simple tasks — e.g. a list of commands — that are fine staying in sh, https://taskfile.dev is excellent


How does this compare to pyinvoke, or are they completely different?


The Pythonic replacement for Makefiles?


Yes. Like make, it lets you define tasks with dependencies and targets, and rebuild only if one of them has changed (see the sketch after this list).

But I prefer it because:

- it runs anywhere you have Python.

- it uses a clean syntax, and the parser is great at telling you about errors, since it's Python.

- you get access to Python's stdlib: string formatting, math, etc.

- you get access to Python's ecosystem. Want to deal with timezones, hashing, crypto? Sure you can.

- you get access to Python's tooling (debugger, formatter, linter).

- you can still just use bash if you want. Easy things stay easy.
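
As a sketch (file names are hypothetical), a dodo.py task that only reruns when its dependency changes looks like:

    # dodo.py -- run with `doit`
    def task_render():
        return {
            "actions": ["pandoc report.md -o report.html"],
            "file_dep": ["report.md"],   # rerun only if this changed
            "targets": ["report.html"],
        }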


That would be invoke tasks: https://docs.pyinvoke.org/en/stable/getting-started.html, which I first encountered in Fabric (fabfile.org), which has the fab utility and fabfiles (analogous to Makefiles) but does so much more.
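
A minimal invoke sketch (the shell command is a placeholder); put it in tasks.py and run `invoke build`:

    from invoke import task

    @task
    def build(c):
        # c is the invoke Context; run() shells out, like a Makefile recipe
        c.run("python -m build")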


It is a pain to work with invoke now. It is all fine and dandy for the basic features, but you're going to hit the seams as soon as you start trying advanced stuff. It looks like the project is going to be abandoned.


What is advanced?

BTW, the project is old and it is useful with the current feature set. There is an issue with the bus factor, but that is common for many tools.


How does this compare to snakemake?


https://github.com/keleshev/schema, a library for validating Python data structures via schema definitions, gave me a wow moment. (I have a slightly less original one here: https://github.com/MoserMichael/kwchecker)


Love the idea of Schema, but I'm not a fan of that syntax; it doesn't seem Pythonic to me, it seems more like something you'd see out of the JavaScript world.

Like looking at that very first example, I have no clue what "len" means in that context. Is it implicitly checking that it's not an empty string? Then on the next line, how come `int` has `Use()` around it, but on the previous line `str` didn't? I guess that int is being used as a converter, like on the next line with str.lower, but the str was being used as a type check?

Not a fan of that API.
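
To spell out what I eventually pieced together: bare types act as isinstance checks, other callables act as predicates, and Use() wraps a converter. A sketch, assuming schema's documented API:

    from schema import Schema, And, Use

    s = Schema({
        "name": And(str, len),  # must be a str AND non-empty (len as predicate)
        "age": Use(int),        # converted via int(), e.g. "28" -> 28
    })
    print(s.validate({"name": "Sue", "age": "28"}))  # {'name': 'Sue', 'age': 28}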


I'll add BeeWare (https://beeware.org/) as a pretty nice way to deploy Python applications cross-platform easily. It's more like a suite of tools, though.


I love Python for many things: scripts, data science, prototypes, etc. But I would never use it to build a large backend system; I just can't handle all the warts. Yes, it has been used by large companies, but not without problems.


You use Python to rapidly build a small backend system and then, if any warts appear that need to be smoothed over, you carve that function off to a more robust solution.

Python is great for programming at the speed of inspiration.


To me Python is fast just like not writing tests is fast.


For prototypes and MVPs, time-to-learning and time-to-market are king, and programmer time is the most expensive thing. In that niche, Python is helpful and tests just slow you down.

Tests are super helpful and wise in any sort of long-term or safety-critical software, obviously.

So tests (and Python) are not inherently a win or a loss; they just give you different trade-offs.

There are times to write testless Python and times to write test-heavy Go, Rust, etc.


Generally I believe in engineering trade-offs, but I think some tests are always important, even with MVPs. Writing tests might double your time to market, but they will allow you to move faster and cause fewer regressions once in production. The early days are some of the least stable, feature-wise, and having tests in place to verify things when massively changing a codebase is nice. Also, it can greatly add confidence for newcomers who don't know the codebase.


Understood. There's definitely an art to judging when it's too early to write tests vs. the perfect time. It's easy for anyone to recognize once it's become "too late", and therefore more painful and costly to add them than it otherwise would have been.

I therefore try to put each situation quickly into 1 of 3 buckets then move forward on that basis:

1. heck no

2. heck yes

3. either way. a gray area

I can generally spot cases 1 and 2 with confidence. Therefore, by process of elimination, that also lets me deduce when it's case 3. And in those cases you can't go wrong. :)

"Impossible to predict, the future is." - Yoda


It's interesting because even FAANG companies can't agree on testing. I've heard Facebook and Google are very different in this regard.


What do you think about using Python for prototyping of microservices in large backend systems?


Sounds reasonable. Especially for a ML group. I just know that many times the prototype turns into the production product.


I think that is also reasonable, if the production system does not have to service a heavy load, and does not need to be scalable. Scalability is often assumed to be a requirement, however it is not always required in practice.


Good point about scalability. The first company I worked for was tiny, and this really hits home :)


tqdm has been around for ages now; it's not exactly new.


tqdm is indeed 8 years old, but it's been popping up everywhere for only a couple of years now. Hence the "gaining traction".


You said "wave of new tools that are gaining traction" maybe you meant to say "wave of tools newly gaining traction".


A new wave "of tools gaining traction"


How would you separate this new wave from any old waves? I've never been aware of "waves of traction".

Please just admit you made a mistake instead of responding to everyone trying to defend your use of the word "new" on an old tool. :D


I thought I wrote it correctly, but I'm French, so I'm going to assume I didn't.


I'd chip in dramatiq as a nice Python task/message-passing library.


Rich is great: it's exactly the library I used to solve the same problem tqdm solves, and it includes time estimation (ETA) too!

Rich is just an amazing library.


Too bad Python is the least energy-efficient programming language, by a lot. It's like the rolling coal of programming.


It's energy inefficient like a school bus, not inefficient for the sake of it, but because it's a decent way to shuttle passengers of all experience levels through all weather.


Do you seriously think that 20 cars are more economical than a single bus?

Python performance problems are exaggerated for many use cases: hot paths are not written in Python. E.g., matching regexes happens in C code (re, regex), parsing XML too (ElementTree, lxml), and the same goes for sqlite, numpy, and scipy (C, Fortran). Cython makes it trivial to drop to the C level where necessary.


Python might not be energy efficient in the hardware, but it is definitely efficient in the wetware.


Fire is also really nice for easily building CLIs.


tqdm has been around for like 5 years or something now


Might as well toss out Plumbum: https://plumbum.readthedocs.io/en/latest/

No more bash! No more subprocess! Write shell-like code with the convenience of Python!

Seriously, it's a great module. A bit of a learning curve, but then it feels natural.


I raise you xonsh: bash in your Python and vice versa, all in the shell! https://xon.sh/

Also have to mention the fantastic Python Prompt Toolkit, which xonsh is based on - https://python-prompt-toolkit.readthedocs.io/en/master/

I mirror another commenter's excitement on this thread about the cool libraries we have at this age, and the tools that are made possible by them.


I have a subprocess wrapper that might also be of help: https://github.com/MoserMichael/subb


Wow, I didn't know this existed. I've been using IPython as a shell-ish replacement for not-so-serious things. I need to take a detailed look at this after my initial glance. Thanks!


Plumbum is great but I actually use the bang notation in Jupyter even more.

Actually, for me Jupyter (I use lab but I'm sure NB is amazing too) is the tool that boosts my productivity the most. (And omg vim mode.)


Also sh.


`from tqdm import tqdm; [i for i in tqdm(range(10))]`

...bit of a learning curve?


I think what OP is referring to is how fiddly it is to run a subprocess and capture the output. By the way, this is also super annoying in Java: you have to faff about with a thread that drains stdout before it gets jettisoned by the OS (and you can't even really do that in Python, which, as much as I love the language, is quite pathetic). In both cases, long story short, it's easy to write something that works when the output is short, but getting it to work on long output is an exercise in frustration. It's one thing neatly solved by Plumbum, so I'm definitely with OP that its use makes code measurably less shit.


If you like tqdm, it's worth checking out pqdm, a parallelized version. If you have embarrassingly parallel work to process in a script, it makes it dead simple to parallelize it and monitor its progress. Highly recommend:

https://github.com/niedakh/pqdm
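
A minimal sketch, assuming pqdm's documented process-based API:

    from pqdm.processes import pqdm

    def square(x):
        return x * x

    # Parallel map across 4 worker processes, with a tqdm-style progress bar.
    results = pqdm(range(100), square, n_jobs=4)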


+1 for pqdm, I use it a ton and most of the time it just works. I did have some rare cases where pqdm was much slower than a direct joblib implementation, but that could well be a fluke on my end. Either way, amazing package!


My current issue with tqdm is nested progress bars in multithreading/multiprocessing causing deadlocks, plus tqdm.write just being broken in those contexts, either deadlocking or the UI just being wrong. Does pqdm do a better job?


One library that has been really useful for me lately is dataset: https://dataset.readthedocs.io/en/latest/

It makes it a lot easier to casually use a database (e.g. SQLite) to persist dictionaries without explicitly building a schema.

Great for CLI apps and ad hoc data crunching.
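
A minimal sketch (the file and table names are arbitrary); tables and columns are created on the fly from the dicts you insert:

    import dataset

    db = dataset.connect("sqlite:///scratch.db")
    db["people"].insert({"name": "Ada", "born": 1815})  # schema inferred
    print(list(db["people"].find(born=1815)))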


Python is such an amazing language. On one hand, it's easy enough that you could probably train a non-technical person on it within about two months.

Yet you can still make 200K writing it. I don't know when I'll next create a command-line application in Python, but I'll keep this little tool in mind. I hope Python eats the world.


It may be easy to get going with Python, but it takes a non-trivial amount of time to understand what is going on. I have an advanced Python 3 course https://github.com/MoserMichael/python-obj-system that explains some of the more advanced concepts. One of the things covered is decorators (tqdm is a decorator).


This is amazing. I just gave a quick glance and love your style.

I just wish you had provided ipynb instead of md. It would be so much fun to just modify and run things as I learn. Thank you!


Thank you! I am not an expert in Jupyter Notebook, may look into it now.


This is gold, thank you!


Big fan of tqdm in Python, so I ported it to Ruby 5 years ago:

https://github.com/powerpak/tqdm-ruby

(shameless plug and an invitation for pull requests)


Good stuff, thank you!


Check out Will Crichton's talk "Type-Driven API Design in Rust", where he live-codes a tqdm-like progress bar for Rust iterators. Solid presentation, and it's eye-opening how straightforward it is to extend the core language through traits.


Came here to comment the same thing. Really awesome to see how the type system can be leveraged to produce syntactically lightweight abstractions.


Fun fact, tqdm has an official (pre-alpha) C++ port!

https://github.com/tqdm/tqdm.cpp


See also the shell command pv(1).


Indeed, this comes to mind right away. The question is: are we missing something? Of course, tqdm brings this functionality right into the Python REPL, which is new and looks great.

pv and tqdm would look even better if they were called implicitly (with an opt-out), since I always end up regretting not using pv when my command is taking too long. Too late.


tqdm & click make a nice CLI pairing for taking a simple idea and turning into a reusable tool that you can feel good about sharing in minutes.


I have such mixed feelings about Click; I've reached the point of using it so much that I now hate it. Which I think probably means that it's great, but only as a stopgap until you learn how to bend argparse to your will. It's just too much magic, too much global state, and too much spooky action at a distance.


I actually don't think Click has too much global state and spookiness, at least in relation to a lot of popular Python libraries.

I do wish I could hook into it to test better; the only thing you can really do right now is have it print stuff out and assert on the output string. It's not really necessary to just build a CLI with Click, but I want to build a library that integrates with it, and testing the integration is a PITA.

I want to write a config-loading library for CLI apps like Golang's Viper lib, but for Python



I went the opposite way. Used to do everything with argparse, then discovered click and never looked back. While argparse lets you do anything, click forces you to build a good CLI.


Click inherits the eternal argparse limitation of "you can't stick --verbose everywhere", doesn't it?


I had the same feeling. I wish there were an equivalent that tried to minimize the magic and global state a bit while still letting you make decent CLIs with a tiny amount of code.


I wasn't aware of Click, it looks promising.

For reference, it's a backronym for 'Command Line Interface Creation Kit': https://click.palletsprojects.com/en/8.0.x/


Click is really good, typer is great...


I love Typer, it's my go-to tool for building CLIs, but I'm worried about its future: it's developed by a single developer, there are too many issues left unattended, there's been no development for months, and it lacks a good API to extend it or to interact with Click.


I use tqdm (with argparse) in a PyInstaller-packaged exe. Five stars, it's great! I call the exe from Java to do some ML and forward the tqdm progress and status messages to a Swing progress bar. It makes the user experience seamless. Depending on the task and the user's settings, the tool usually takes about 7-8 seconds (including the PyInstaller extract), but it can also take up to a minute. When it's 7-8 seconds, the progress messages fly by and the tool feels snappy. When it's 50-60 seconds, the users are very grateful for the progress bar. Meanwhile, I can develop and test the tool from the command line and see progress info, and when I want to run the ML code in Jupyter notebooks, the progress bars can still be made to work.


Python Fire is my favorite tool for a dead simple way to create a CLI without using ugly argparse


Does click.progressbar use tqdm underneath?

In Click projects I usually use that rather than tqdm directly.


tqdm is one of the very few Python packages that makes it into every script I write. It's a very high ROI for managing and tracking simple loops.

My only complaint is the smoothing parameter; by default it predicts the estimated time remaining based on the most recent updates, so it can fluctuate wildly. smoothing=0 predicts based on the total runtime, which makes more sense given the law of large numbers.
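
For example (0.3 is the documented default):

    import time
    from tqdm import tqdm

    # smoothing=0 bases the ETA on average speed over the whole run rather
    # than exponentially weighting the most recent iterations.
    for _ in tqdm(range(1000), smoothing=0):
        time.sleep(0.001)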


The older I get, and the more I come across these issues, the more sympathy I have for the Windows file-copy progress dialog...


Yet the estimate of when you'll become fully sympathetic to the Windows file copy dialog changes wildly by the day, right :)


Yes, assuming i.i.d. samples. If the first batch is "warmup samples", this goes out the window.


Even if it's not i.i.d., a small number of samples with much higher/lower times will just average out as noise.


Well, it seems to me that it's quite likely that speed varies across the process, so the more recent updates are probably a more useful default; if the first part of the process is slower/faster, this won't permanently mess up the estimates.


It has no dependencies, which is pretty cool.


Is there a way to globally disable all tqdm progress bars?

Something like the NO_COLOR environment variable?

These progress bars are nice when you've launched a single loop yourself, but when you are running an automated battery of many things, they become annoying and pollute your terminal too much. I know that you can silence each particular program that uses tqdm by setting a 'disable' option, but this requires editing the program's Python source code.


Something like this should work:

    import os
    import sys
    from tqdm import tqdm as _tqdm

    def tqdm(*args, **kwargs):
        try:
            disable = bool(int(os.environ['NO_PROGRESSBARS']))
        except KeyError:
            disable = not sys.stdout.isatty()
        except (ValueError, TypeError):
            disable = False
        kwargs.setdefault('disable', disable)
        return _tqdm(*args, **kwargs)
Then import that instead of tqdm.tqdm.

sys.stdout.isatty() isn't a perfect answer to what people ask when they want to know "am I running in an automated environment, or is a human user looking at my output?", but it's close. More nuance is available online.


Thanks for the answer!

> import that instead of tqdm.tqdm

But it's not me who is importing tqdm to begin with! I call many programs in parallel from shell scripts (out of my control) and they all call tqdm individually. I need to stop tqdm output from outside these programs.

Your code should be part of tqdm itself, not written by individual programmers.

> am I running in an automated environment, or is a human user looking at my output?

But I want to stop tqdm output precisely because I'm a human looking at it. If you have more than one or two progress bars simultaneously, it becomes useless clutter.

As much as I like to use tqdm myself for my programs, I'm sad that as tqdm becomes more and more popular, my terminal output becomes more and more cluttered, to an absurd degree. Piping the output to a file does not help and is totally the wrong idea. I'm precisely interested in seeing, in real time, the part of the output that does not come from tqdm, such as warnings and errors.


That's a reasonable request. Discussion about a feature along those lines seems to be happening in https://github.com/tqdm/tqdm/issues/614; perhaps you could weigh in there?


chime [1] is another python package that fans of tqdm might like.

[1] https://github.com/MaxHalford/chime/


Fantastic share, and a great subsequent discussion in this thread. Threads like these are why I love HN.

My Python skills have improved so much thanks to HN.


I always wonder how a post like this from a year ago didn't go anywhere, but this post makes the front page. Love the Python package. https://news.ycombinator.com/item?id=23160774


I like how tqdm can manage multiple progress bars in the same terminal. I have used it to track the progress of multi-day processes. It also produces a nice animation in Jupyter notebooks.


I just had the pleasant experience of importing tqdm into a script, checking HN, and seeing tqdm here :)


Tqdm is awesome and I use it all the time!

Another great alternative is fastprogress: https://github.com/fastai/fastprogress

It often works better in Jupyter Notebooks.


For notebook I am using tqdm.notebook[0]. Works quite well for me.

[0] https://tqdm.github.io/docs/notebook/


Did you know that tqdm is pronounced like the Arabic word تقدم, which means progress?


Yes! The same as the Hebrew word תקדם for any Hebrew speakers.

It makes the module so much funner to use for some reason :)


I wish there was something this good for Golang.


While it doesn't have all the tqdm features, I find mpb to be quite good, actually.

https://github.com/vbauerster/mpb


I thought the name was familiar: I use it with TensorFlow. Nice project!


Whoaaaa, this tqdm thing is, like, um, toats kewl!!! It's waaay better than the <blink> tag old people used to use, like, before I was even born lol. IM so so happy idont have to think too hard to get flashing light thingys to show up on my screen.


If you are trying to say that youngsters are stupid for adding visualisations to their programs and terminals: sorry, but you are wrong.


Wo dude, not at all. But you are literally saying exactly what my gramps used to say "but you are wrong." I figured his brane was too old to tell me why I was wrong. Computers are really fast now. Why cant they just give the answer right away?


They are stupid for using visualisations that break standard python logging.


It's for interactive sessions not background ones


Yeah, exactly. The software I write is complex enough to require logging of hundreds of progress bars at the same time.

The great thing is that just using the `logging` module is enough to log something like `Downloading file A, 55% [55Mb/100Mb]` (absolutely equivalent in terms of information to a "graphic bar"), and it also happens to be composable, in that I can reuse that package as part of anything that is non-interactive.


tqdm is one of the (many) really useful packages that make me miss using Python (I’m stuck writing TypeScript for Node.js in my current job).


It doesn't have every bell and whistle of the original, but I found nqdm recently and it works:

https://www.npmjs.com/package/nqdm


Love tqdm; I used to use it in all my Python scripts.


tqdm is really great. (Also a great name, BTW.) It has a lot of features: a CLI, a Jupyter notebook widget, flexibility, etc.

I have been using it for 4-5 years now.


tqdm also works on Colab (Jupyter Notebooks) yay


To clarify, the tqdm.notebook progress bar (which is much prettier) works on Colab. If you want to make the script more agnostic, you can import tqdm.auto.
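
So a script can stay environment-agnostic with a single import:

    # Picks the notebook widget under Jupyter/Colab, the text bar in a terminal.
    from tqdm.auto import tqdm

    for _ in tqdm(range(100)):
        pass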

A fun quirk is that tqdm.notebook didn't work with Colab's dark mode, so the text wasn't readable; this was very recently fixed.


I think the author doesn't fully appreciate how insanely difficult it is to write accurate progress meters for anything slightly more complicated than a single for-loop.


This looks like a fantastic tool! I would love to use it in some Python CLIs to help users understand the progress.


tqdm is great, except I really like my type hints, and tqdm has no type hints :(


How would it know how long a loop is ahead of time? Is this solving the halting problem?


The halting problem is easy to solve most of the time. For instance, you can easily tell whether the following two programs halt:

    while True:
        pass

    while False:
        pass
What's impossible is solving the halting problem 100% of the time with 100% accuracy. Most people don't need to do that. Solving the halting problem most of the time and then saying "I don't know, that's too hard" the rest of the time is of immense practical value, and many practical systems (the kernel's eBPF verifier, the thing in your browser that detects stuck pages, etc.) do exactly that.

In this particular case, tqdm solves the halting problem in cases where it's easy: https://tqdm.github.io/docs/tqdm/#__init__

> total: int or float, optional

> The number of expected iterations. If unspecified, len(iterable) is used if possible. If float("inf") or as a last resort, only basic progress statistics are displayed (no ETA, no progressbar).


For things with a fixed length, it looks it up ahead of time (i.e., things that return a length when you call len() on them). For other things, such as generators, you can supply a total=N value when you create the progress bar. If you don't know the total ahead of time, it doesn't give you a percent completion, but it still tells you things like the number of iterations per second and the number of iterations finished.
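
For example, with a generator (the count here is just for illustration):

    from tqdm import tqdm

    lines = (str(i) for i in range(50_000))  # a generator: len() unavailable
    for line in tqdm(lines, total=50_000):   # total= enables percent and ETA
        pass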


Have you ever used a for loop? You don't know how long it'll take, but you know how many iterations, and at any given point you can know how long past iterations have taken. Extrapolation is not hard, even if not always accurate.


This is not how for loops work in Python, which does not have ranged loops.

The example works because range objects implement __len__, which is where tqdm gets its total/count information from.


There are plenty of for loops in which you don’t know how many iterations you will have. With a generator, it can even be infinite.


In which case the visualisation will adjust and show this fact.

E.g., passing an unbounded range will just show some info about the past iterations.


Wait... so Python takes several seconds to simply count from 0 to 9 million?


Hey, I know this is a bit off-topic, but I can't wrap my mind around one thing: I'm getting a ton of Google ads on that website. Where do they come from? Is GitHub adding them? Did the devs add them to their docs?


I'd be stunned if GitHub added them, so I checked to verify: the devs added them. I saw four matches for "adsbygoogle" in https://github.com/tqdm/tqdm.github.io/blob/gh-pages/index.h...



