
What, specifically, is bad about the docs? This whole thread is people who just looked at the home page, saw that it is "DataFrames", but didn't know what that means and came here to complain. Nobody has said anything about issues with the docs for someone who understands what a data frame is (or spent like two minutes looking that up) but is struggling to figure out how to use this library specifically.


I think your experience is probably making it difficult to understand the noob side of things. For me, I've struggled with simply slicing up a dataframe. And as I specified, these aren't tools I use a lot, so the "who understands what a data frame is" probably doesn't apply to me very well and we certainly don't need the pejorative nature suggesting that it is trivially understood or something I should know through divine intervention. I'm sure it's not difficult, but it can take time for things to click.

Hell, I can do pretty complex integrals and derivatives, and so much of that seems trivial to me now, but I did struggle when learning it. Don't shame people for not already knowing things when they are explicitly trying to learn things. Shame the people that think they know and refuse to learn. There's no reason to not be nice.

Having done a lot of teaching, I have a note: don't expect noobs to be able to articulate their problems well. They're noobs. They have the capacity to complain, but it takes expertise to clarify that complaint, turning it into a critique. I get that this is frustrating, but being nice turns noobs into experts and often friends too.


I really think this is a misunderstanding of the purpose of different kinds of documentation. The documentation of a new tool for a mature technique is just not the primary place to focus on writing a beginners' tutorial / course on using that technique. Certainly, "the more the merrier" is a good mantra for documentation, so if they do add such material, all the better. But it is very sensible for it to not be the focus. The focus should be, "how can you use this specific iteration of a tool for this technique to do the things you already know how to do".

Nobody is suggesting that you should be an expert on data frames "through divine intervention". But the place to expect to learn about those things is the many articles, tutorials, courses, and books on the subject, not the website of one specific new tool in the space.

If you're really interested in learning about this, a fairly canonical place to start would be "Python for Data Analysis"[0] by Wes McKinney, the creator of pandas and one of the creators of the Arrow in-memory columnar data format that most of these projects build atop now.

This is a (multiple-) book length topic, not a project landing page length topic.

0: https://wesmckinney.com/book/


> But it is very sensible for it to not be the focus.

Sure. I mean, devs can do whatever they want. But the package is a few years old now and they do frequently advertise, so I don't think it would hurt to make it more approachable for... you know... noobs.

This is a bit of a difficult conversation too, because you've moved the goalposts. I've always held the context of noob, but now you've shifted to just being dismissive of noobs. Totally fine, but different from the last comment.

> But the place to expect to learn about those things is the many articles, tutorials, courses, and books on the subject, not the website of one specific new tool in the space.

I actually disagree. This is the outsourcing I expressed previously, but it's clear from the number of complaints that this is not sufficient for a novice. You do seem passionate about this issue, and so maybe you have the opportunity to fill that gap. But I very much think that official documentation is supposed to be the best place, frankly because it is written by the people who have a full understanding of the system and how it all integrates together. I'm sure you've run into tons of Medium tutorials that get the job done but are also utter garbage and misinform users. That isn't surprising, since most of these are written by people still in the process of learning. They are better than nothing, but they are entirely insufficient. The whole point of creating a library is to save people time. That time includes onboarding. For an example of good docs, I highly recommend the vim docs. Even man pages are often surprisingly good.


> now you've shifted to just be dismissive of noobs

No, I'm sorry, this is getting ridiculous. I'm not being dismissive of noobs, I'm saying "noobs should seek introductory material when attempting to learn an entirely new subject, like books, courses, or tutorials on the subject matter".

It's just so freaking weird for you to expect every single tool in some space to create that introductory material.

I promise you that the ruby on rails website did not assume total ignorance of the term "web application" when I first came across it as a "noob". I was a total noob at ruby on rails, but I had to understand why I might be interested in "web applications, but easier".

I could spend all day coming up with examples that are just like this. And this is not some kind of failure of imagination in how to document specific projects, it's just specialization. The website of a new tool for something that has been done a bunch of times over multiple decades is not the right place to put the canonical text on what the thing you're doing is; you put that in a book or in college courses or other kinds of training materials.

Unless what you have made is a brand new entirely unfamiliar thing (which is very rare) with no introductory materials for your brand new novel concept available anywhere, it makes more sense to focus your documentation on "why choose this specific solution over the other ones people are already familiar with" rather than "what even is the thing that we're doing here from first principles". Sure, add some links to the best introductory materials, but don't try to write them yourself, that's crazy!

> I actually disagree. This is the outsourcing I expressed previously, but it's clear from the number of complaints that this is not sufficient for a novice. You do seem passionate about this issue, and so maybe you have the opportunity to fill that gap.

No, I'm not passionate about this issue. I think people who actually want to learn things will continue doing research and reading books and taking classes to learn about new subjects, and that people who just want to complain will continue to do so. There is no "gap" to fill. There are tons of great materials that will describe in great depth what "data frames" are, and how to work with them, for anyone who is even the tiniest bit interested.

> I very much think that official documentation is supposed to be the best place. Frankly because it is written by the people who have a full understanding of the system and how it all integrates together.

I think what you seem to be confused by is the difference between this one library - polars - and an entire large subject - tabular data analysis using data frames. It certainly does make sense for the polars website to document the polars library, which (in my view) it already does. But if you want to learn the subject, you need to do that in the normal way that people have always learned new subjects. I'm sorry, because you seem resistant to this, but again, the way to do that is with books and courses, not by reading the documentation of one tool comprising a tiny sliver of a very large subject.

> I'm sure you've run into tons of Medium tutorials that get the job done but are also utter garbage and misinform users.

No, Medium tutorials should not be your go-to source for learning about a new subject! Your go-to source should be books and courses.

This is why I keep commenting here. I want to get through to you that you seem to be going about the acquisition of knowledge in a very weird and fundamentally misguided way. It just isn't the case that knowledge is mostly found in the documentation of tools! There is way more foundational knowledge to learn than it would ever make sense for every little tool to document themselves.

This is, in a very literal sense, why people write books about things, and why schools exist. We don't teach algebra by linking to the Mathematica documentation.


I can't speak for the Python side of the Polars docs but coming from Python and Pandas to Rust and Polars hasn't always been easy. To be fair, that isn't just about docs but also finding articles or Stack Overflow answers for people doing similar things.


That certainly makes sense!


I'm a dataframes noob. I saw this post and the performance claims attracted me. I went to ChatGPT to understand what dataframes were about. Then on Udemy, I searched for a polars course. The course had prerequisites: a bit about Jupyter notebooks and pandas. Then I went through a few modules of a pandas course. Now, I'm going through a polars course. Altogether, I spent about 2-3 hours to set up the environment and learn what this is all about.

A little bit of context would have helped attract a lot more noobs.


Your first paragraph makes perfect sense! I was nodding along. But then your concluding sentence was a bit of a record scratch for me. This all worked as intended! You knew what the project was about - "data frames" - and what might make it attractive to you - the performance claims - and then you went and followed exactly the right path to get the context you needed to understand what's going on with it. It's a big topic that you were able to spin up on to a basic level in 2-3 hours, by pulling on strings starting at this landing page. This is a very successful outcome.

I'd also recommend this book: https://wesmckinney.com/book/. It's not about polars, but you'd be able to transfer its ideas to polars easily once you read it.


"How To Be A Pandas Expert"[1] is a good primer on dataframes. There's a certain mental model you need to use dataframes effectively but it's not apparent from reading the official docs. The video makes it explicit: dataframes are about like-indexed one-dimensional data, and every dataframe operation can be understood in terms of what it does to the index.

[1] https://www.youtube.com/watch?v=oazUQPrs8nw
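
To make the "like-indexed" idea concrete, here is a minimal plain-Python sketch, not pandas or polars code. The dict-based columns, the add_aligned helper, and the sample data are all hypothetical, made up for illustration. It only shows that an operation between two labelled columns pairs values by index label and not by position, which is roughly what pandas does when you add two Series (pandas marks the unmatched labels as NaN where this sketch uses None).

```python
# Hypothetical stand-in for a one-dimensional, labelled column:
# a dict mapping index label -> value. This is NOT the pandas API.
def add_aligned(left, right):
    """Add two labelled columns by matching index labels, not positions."""
    labels = sorted(set(left) | set(right))  # result index = union of both indexes
    return {
        label: left[label] + right[label]
        if label in left and label in right
        else None  # label missing on one side, so there is nothing to add
        for label in labels
    }

# Made-up sample data: the two columns share only the "pears" label.
sales_q1 = {"apples": 10, "pears": 4}
sales_q2 = {"pears": 6, "plums": 3}

total = add_aligned(sales_q1, sales_q2)
print(total)  # {'apples': None, 'pears': 10, 'plums': None}
```

Seen this way, the interesting question about any operation is what happens to the index: here it becomes the union of the two input indexes, and only labels present on both sides get a value.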


The Rust docs are for some reason much worse than the Python docs, or at least that used to be the case.



