Hacker News | macobo's comments

In practice, product requirements can get in the way of technically ideal solutions. One example: analytics products allow users to pass in an arbitrary number of user properties and run analyses on them - more than even a columnar database can handle. The current solution of storing JSON as a column does carry a very significant performance trade-off, but it's also needed to power the queries users need to run. This is also why we're really excited about the new Object data type that landed in 22.3, as it handles these cases gracefully by creating dynamic subcolumns.
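As a sketch of the difference (illustrative ClickHouse DDL with invented table names; in 22.3 the Object type was still experimental and gated behind a setting):

```sql
-- Before: arbitrary user properties stored as one JSON string column,
-- parsed at query time on every read.
CREATE TABLE events_json (
    uuid UUID,
    event String,
    properties String  -- raw JSON blob
) ENGINE = MergeTree ORDER BY (event, uuid);

-- 22.3+: the experimental Object type creates dynamic subcolumns per
-- JSON path, so reads touch only the paths a query actually needs.
SET allow_experimental_object_type = 1;
CREATE TABLE events_object (
    uuid UUID,
    event String,
    properties JSON
) ENGINE = MergeTree ORDER BY (event, uuid);
```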

On JOINs - again, requirements bite us in different ways. Product analytics ingestion pipelines can get quite complicated because they need to handle merging anonymous and signed-in users, and user properties changing over time. Handling that via JOINs as a go-to-market approach avoids that upfront cost by centralising the logic in SQL, but it does come with a significant cost in scalability. Delaying that cost in turn lets you build the tools users need. That said, every loan needs to be paid at some point, and we're currently knee-deep in re-architecting everything to avoid these joins.

Also note that, in our experience, JOINs don't work quite the way you described - rather, the right-hand side of the join gets loaded into memory. The bottleneck there is memory pressure rather than I/O, given a good ORDER BY on the table.

All that said, what a great summary of all the different things to keep an eye on. Thanks for reading and sharing your thoughts!


I talked about memory latency with respect to joins - that the random access into the hash table is much slower per "row" than vectorized operations on columns. I didn't say I/O would be slow; my mention of a hash table implies that one side is fully loaded into RAM.
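The behaviour both comments describe can be sketched in a few lines (illustrative Python, not ClickHouse internals): the right-hand side is materialised into an in-memory hash table, and the left side probes it row by row.

```python
def hash_join(left, right, key):
    """Join two lists of dicts on `key`.

    The entire right-hand side is built into an in-memory hash table,
    which is why memory pressure (not I/O) becomes the bottleneck as
    the right side grows; each probe is a random access per row.
    """
    table = {}
    for row in right:  # right side fully loaded into RAM
        table.setdefault(row[key], []).append(row)

    for row in left:  # left side can be streamed
        for match in table.get(row[key], []):  # random access per row
            yield {**row, **match}

users = [{"id": 1, "name": "ada"}, {"id": 2, "name": "bob"}]
events = [{"id": 1, "event": "pageview"}, {"id": 1, "event": "click"}]
joined = list(hash_join(events, users, "id"))
```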


> I'm convinced the only people who have good things to say about using linux (or BSD for that matter, been there done that, no thanks) on a laptop are the kind of people who keep their "laptops" on the same desk, plugged in to ethernet, and are effectively using a desktop with poor thermals

Good for you for making your own decisions, but don't be a condescending arschloch. Personally I prefer Linux because it works fine, and I consider Apple an overpriced piece of spyware and many of their users smug idiot hipsters.

Happy holidays!


> don't be a condescending arschloch

> many of their users smug idiot hipsters

Maybe take your own advice?


Examples, suggestions?


I've never created a reading list, so this was a fun exercise.

Programming:

* Analyzing Computer System Performance with Perl::PDQ - Gunther

* The Mythical Man Month - Brooks

Philosophy:

* A Journey Around my Room - de Maistre

* Anger, Mercy, and Revenge - Seneca

* What is Life? - Schrödinger

* Man's Search for Meaning - Frankl

* Essays - Montaigne

* Ethical Intuitionism - Huemer

* The Consolations of Philosophy - de Botton

* A Manual for Living - Epictetus

* Meditations - Aurelius

Psychology / Meaning / Purpose / Science:

* Purpose and Meaning in the Workplace - Dik, Byrne & Steger

* The Case Against Education - Caplan

* Selfish Reasons to Have More Kids - Caplan

* The Selfish Gene - Dawkins

* A Confession - Tolstoy

* Enlightenment Now - Pinker

* The Better Angels of our Nature - Pinker

* The Improving State of the World - Goklany

* The Skeptical Environmentalist - Lomborg

* Religion for Atheists - de Botton

* Ending Aging - de Grey

* Gut Feelings - Gigerenzer

Fiction:

* Heart of Darkness - Conrad

* Candide - Voltaire

* Brave New World - Huxley

* Selected Works - Goethe

* 1984 & Animal Farm - Orwell

Politics:

* Obedience to Authority - Milgram

* The Problem of Political Authority - Huemer

* The Communist Manifesto - Marx

* Socialism - von Mises

* Just One Child - Greenhalgh

* The God That Failed - Crossman

* Death by Government - Rummel

Thought-provoking:

* Free to Learn - Gray

* The Beautiful Tree - Tooley

* Education and the State - West

* The Machinery of Freedom - Friedman

* Against Intellectual Monopoly - Boldrin & Levine

* From Mutual Aid to the Welfare State - Beito

* The Not So Wild, Wild West - Hill

* More Guns, Less Crime - Lott

* Race & Economics - Williams

* Emancipating Slaves, Enslaving Free Men - Hummel


Why take up 60% of the rating space with negative ratings? It seems like what you really care about is degrees of goodness.

An alternative approach:

1 - I disliked it.

2 - It's OK.

3 - This is good.

4 - This is great.

5 - This is a must-read.


That's the problem with star systems: we'll always have different definitions. Your 3s live next to the other poster's 3s and mean very different things.

That said, I think the world has also suffered from ratings inflation. I tend to assume anything under a 4 means "bad" or "meh" myself.


The best rating system I have ever seen is the "best of two" system that pixoto.com uses to rate photos.
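Pairwise "best of two" votes can be aggregated into a ranking with, for example, an Elo-style update (a sketch; I don't know what pixoto actually uses, and the function name and k-factor are invented):

```python
def elo_update(rating_a, rating_b, a_won, k=32):
    """Apply one pairwise 'best of two' vote to both items' ratings.

    The winner gains what the loser loses; an upset (lower-rated item
    winning) moves ratings more than an expected result.
    """
    expected_a = 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1 - score_a) - (1 - expected_a))
    return new_a, new_b
```

Repeated over many votes, this converges to a total ordering without ever asking anyone to interpret what "3 stars" means.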


Bigger numbers don't help much either; rating systems out of 10 tend towards anything under an 8, or maybe a 7, being average.


This is exactly how the current Goodreads rating is supposed to work (and I'm personally OK with it). But my guesstimate is that for 95% of Goodreads users, everything below 4 stars means the book sucks.


How do you know that's how the rating system is "supposed" to work?


(Not parent)

If you go to rate a book on Goodreads and hover over each of the 5 stars, here are the "title" attributes of the links:

* title="did not like it"

* title="it was ok"

* title="liked it"

* title="really liked it"

* title="it was amazing"


Interesting, thanks. I use the app mainly, and it doesn't have those descriptors as far as I know.

I think it's interesting that the middle/neutral rating (3 stars, the middle of the available range) is described as "liked it" (a positive response).


https://fishshell.com/ for sure

Just having sensible defaults on a shell works wonders on my day-to-day productivity.

Add a couple of aliases for productivity and off you go.

  abbr --add s "git status"
  abbr --add gap "git add --patch"
  abbr --add gco "git checkout"
  abbr --add gd "git diff"

  alias recent="git for-each-ref --sort=-committerdate refs/heads/ --format='%(color:yellow)%(refname:short)|%(color:bold green)%(committerdate:relative)|%(color:blue)%(subject)%(color:reset)' | column -ts'|' | tac"
  alias r="recent"


I absolutely love fish, but it's worth warning anyone taking a first look and excited to try it: some syntax differences from more familiar shells like Bash can cause frustration, both in terms of muscle memory and any time you need to copy-paste commands. It hasn't been enough of a problem to make me switch back (or to zsh); the benefits far outweigh the frustration, but I do think a small warning is warranted.
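For anyone weighing the switch, a few examples of where the syntax diverges (bash shown, fish equivalents in comments; illustrative, not exhaustive):

```shell
# Variable assignment: bash uses NAME=value, fish uses `set`.
greeting="hello"              # fish: set greeting hello
export EDITOR=vim             # fish: set -gx EDITOR vim

# Command substitution: bash uses $(...), fish uses plain (...).
today=$(date +%Y)             # fish: set today (date +%Y)

# Exit status: bash uses $?, fish uses $status.
true
echo "exit: $?"               # fish: echo "exit: $status"
echo "$greeting world"
```

It's mostly this kind of thing that bites when pasting snippets from Stack Overflow.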


I use zsh and found https://github.com/zsh-users/zsh-autosuggestions to be very comparable to fish's autosuggestions, while keeping the zsh niceties.


Second that. Would additionally recommend fzf for fuzzy reverse history search. Absolute gold.


This one's tricky. I've driven fish daily for years, and it's definitely snappier than my old oh-my-zsh setup (I much prefer this style of history), but you can get bitten by certain tools not using shebangs properly.


In Europe, https://eagronom.com/ is taking an interesting approach: rather than throwing existing farming practices out with the bathwater, build on top of them and modernise to achieve sustainable farming.


This smells a lot like Meteor. Would love to see an actual semi-large app built with this approach.


There's an infinite amount of detail that's impossible to capture in a comment and which invariably changes over time and doesn't hold in the future.

For my team, the solution has been writing longer commit messages detailing not only what has changed, but also the why, along with other considerations, potential pitfalls, and so forth.

So in this case, a good commit message might read like:

```
Created square root approximation function

This is needed for rendering new polygons in renderer Foo in an
efficient way, as those don't need a high degree of accuracy.

The algorithm used is Newton-Raphson approximation; the accuracy was
chosen by initial testing:

[[Test code here showing why a thing was chosen]]

Potential pitfalls here include foo and bar. X and Y were also
considered, but left out due to unclear benefit over the simpler
algorithm.
```
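For concreteness, the function such a commit message might accompany could look like this (a sketch with invented names and tolerance, not code from any real renderer):

```python
def approx_sqrt(x, tolerance=1e-6):
    """Approximate sqrt(x) via Newton-Raphson: g <- (g + x/g) / 2.

    Each iteration roughly doubles the number of correct digits, so a
    loose tolerance converges in a handful of steps.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0
    guess = x
    while abs(guess * guess - x) > tolerance:
        guess = (guess + x / guess) / 2
    return guess
```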

With an editor with good `git blame` support (or using GitHub to dig through the layers), this gives me a lot of confidence when reading code, as I can go back in time and read what the author was originally thinking. That way I can properly evaluate whether conditions have changed, rather than worry about the next Cthulhu comment that no longer applies.


Now you have to look in two places for the information: the code and the commit messages.


How so? The code still documents what is happening, the commits however lay out the whys.

The point is that these two are separate questions, and that trying to use comments as a crutch to join the two religiously is a headache. It's impossible to keep everything in sync and I don't want to read needless or worse misleading information.

What's worse, comments often omit important details such as why the change was made, what other choices were considered, and how the thing was benchmarked.

That said, comments still have a place. Just not everywhere for everything and especially not for documenting history.


I disagree. I think the "whys" belong in the comments - in fact, that's the most important part of the comment if the code is cleanly written. I don't want to be happily coding along, get to a glob, and have to go to the repo pane, hunt for the commit that explains this particular thing, then read a commit message. Put it in a comment in the code. Pretty please.


You need a queue in front of your database regardless of write latency. Otherwise you tie your availability to the database's availability; downtime (even for upgrades) is often unavoidable, and network problems are common.
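A minimal in-process sketch of that decoupling (real pipelines use Kafka or similar rather than an in-memory queue, and all names here are invented):

```python
import queue
import time

writes = queue.Queue()

def ingest(event):
    """Accept a write immediately; availability no longer depends on the DB."""
    writes.put(event)

def flush_one(db_write, retry_delay=0.01):
    """Drain one buffered event into the database, retrying while it's down."""
    event = writes.get()
    while True:
        try:
            db_write(event)
            return
        except ConnectionError:
            time.sleep(retry_delay)  # DB unavailable: the queue absorbs the lag

# Simulate a database that is down for the first two write attempts.
stored, failures_left = [], [2]

def flaky_db_write(event):
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise ConnectionError("db down")
    stored.append(event)

ingest({"event": "pageview"})
flush_one(flaky_db_write)
```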

Dan gave a pretty good talk on the high-level details of the how and why of Postgres a couple of years back: https://www.youtube.com/watch?v=NVl9_6J1G60.

One reason for Postgres is that SQL is really powerful for building complex queries.


Note that working remotely takes some getting used to, just as working at an office does. Yes, you can dump someone into unfamiliar conditions and draw conclusions from that, but that's just confirming your prior assumptions without actually trying.

