In practice, product requirements can get in the way of technically ideal solutions. One example: analytics products let users pass in and analyze an arbitrary number of user properties - more than even a columnar database can handle as individual columns. The current solution of storing JSON in a single column does carry a very significant performance trade-off, but it's also needed to power the queries users need to run. This is also why we're really excited about the new Object data type that landed in 22.3, as it handles these cases gracefully by creating dynamic subcolumns.
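For anyone curious, here's a minimal sketch of what that looks like (the type is experimental in 22.3 and has to be enabled explicitly; the table and column names here are illustrative, not our actual schema):

```
-- The Object/JSON type is experimental in 22.3:
SET allow_experimental_object_type = 1;

CREATE TABLE events
(
    timestamp  DateTime,
    event      String,
    -- Arbitrary user properties land here; ClickHouse creates a dynamic
    -- subcolumn for each JSON path it encounters during ingestion.
    properties JSON
)
ENGINE = MergeTree
ORDER BY (event, timestamp);

-- Subcolumns can then be read directly, without JSONExtract* calls:
SELECT properties.plan, count()
FROM events
GROUP BY properties.plan;
```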
On JOINs - again, requirements bite us in different ways. Product analytics ingestion pipelines can get quite complicated, since they need to handle merging anonymous and signed-in users, plus user properties changing over time. Handling that via JOINs as a go-to-market strategy avoids that upfront cost by centralising the logic in SQL, but it does indeed come with a significant cost in scalability. Delaying that work in turn lets you focus on building the tools users need. That said, every loan needs to be paid back at some point, and we're currently knee-deep in re-architecting everything to avoid these joins.
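To make that concrete, the query-time approach looks roughly like this (a simplified sketch - the table and column names are illustrative):

```
-- Resolve each event's (possibly anonymous) distinct_id to a canonical
-- person at query time, instead of rewriting events at ingestion time.
SELECT p.person_id, count() AS event_count
FROM events AS e
INNER JOIN person_distinct_id AS p
    ON e.distinct_id = p.distinct_id
GROUP BY p.person_id;
```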
Also note that, in our experience, JOINs don't work the way you described - rather, the right-hand side of the join gets loaded into memory. With a good ORDER BY on the table, the bottleneck there is memory pressure rather than I/O.
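In practice that means keeping the smaller table on the right-hand side and, when that isn't enough, telling ClickHouse to trade speed for bounded memory (a sketch, not a tuning recommendation):

```
SELECT e.event, p.person_id
FROM events AS e
INNER JOIN person_distinct_id AS p  -- right side is built into an in-memory hash table
    ON e.distinct_id = p.distinct_id
-- 'partial_merge' sorts and merges instead of building the hash table
-- in RAM, bounding memory at the cost of speed:
SETTINGS join_algorithm = 'partial_merge';
```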
All that said, what a great summary of all the different things to keep an eye on. Thanks for reading and sharing your thoughts!
I talked about memory latency with respect to joins - that the random access into the hash table is much slower per "row" than vectorized operations on columns. I didn't say I/O would be slow. The fact that I said "hash table" implies that one side is fully loaded into RAM.
> I'm convinced the only people who have good things to say about using linux (or BSD for that matter, been there done that, no thanks) on a laptop are the kind of people who keep their "laptops" on the same desk, plugged in to ethernet, and are effectively using a desktop with poor thermals
Good for you for making your own decisions, but don't be a condescending asshole. Personally, I prefer Linux because it works fine, and I consider Apple an overpriced piece of spyware and many of their users smug idiot hipsters.
It's the problem with star systems, which is that we'll always have different definitions. Your 3s are living next to the other poster's 3s and mean very different things.
That said, I think the world has also suffered from ratings inflation. I tend to assume anything under a 4 means "bad" or "meh" myself.
This is exactly how the current Goodreads rating is supposed to work (and I'm personally OK with it). But my guesstimate is that for 95% of Goodreads users, everything below 4 stars means the book sucks.
I absolutely love Fish, but it's worth warning anyone who's taking a first look and is excited to try it that some syntax differences from more familiar shells like Bash can cause frustration, both in terms of muscle memory and any time you need to copy-paste. It hasn't been enough of a problem for me to switch back (or to ZSH), since the benefits far outweigh the frustration, but I do think it's worth a small warning.
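For a sense of what I mean, a few of the Bash-isms that don't carry over (a small illustrative sample, not an exhaustive list):

```
# Exporting a variable: bash's `export EDITOR=vim` becomes:
set -x EDITOR vim

# The exit status of the last command is $status, not $?:
false; echo $status

# Command substitution is plain parentheses:
echo (date)        # bash: echo $(date)

# Arithmetic goes through the math builtin, not $(( )):
math 2 + 2         # bash: echo $((2 + 2))
```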
This one's tricky. I've driven Fish daily for years, and it's definitely snappier than my old oh-my-zsh setup (I much prefer this style of history), but you can get bitten by certain tools not using shebangs properly.
In Europe, https://eagronom.com/ is taking an interesting approach - rather than throwing all existing farming practices out with the bathwater, they build on top of them and modernize toward sustainable farming.
There's an infinite amount of detail that's impossible to capture in a comment and which invariably changes over time and doesn't hold in the future.
For my team, the solution has been writing longer commit messages detailing not only what has changed, but also the why, other considerations, potential pitfalls, and so forth.
So in this case, a good commit message might read like:
```
Created square root approximation function

This is needed for rendering new polygons in renderer Foo
in an efficient way, as those don't need a high degree of accuracy.

The algorithm used is Newton-Raphson approximation; the accuracy was
chosen by initial testing:

[[Test code here showing why a thing was chosen]]

Potential pitfalls here include foo and bar. X and Y were also
considered, but left out due to unclear benefit over the simpler
algorithm.
```
With an editor with good `git blame` support (or using GitHub to dig through the layers), this gives me a lot of confidence when reading code, as I can go back in time and read what the author was originally thinking. This way I can properly evaluate whether conditions have changed, rather than worry about the next Cthulhu comment that no longer applies.
How so? The code still documents what is happening; the commits, however, lay out the whys.
The point is that these two are separate questions, and trying to religiously join them with comments as a crutch is a headache. It's impossible to keep everything in sync, and I don't want to read needless, or worse, misleading information.
What's worse, in comments we often omit the important details, such as why the change was made, what other choices were considered, how the thing was benchmarked, etc.
That said, comments still have a place. Just not everywhere for everything and especially not for documenting history.
I disagree. I think the "whys" belong in the comments - in fact, that's the most important part of the comment if the code is cleanly written. I don't want to be happily coding along, get to a glob of code, and have to go to the repo pane, hunt for the commit that explains this particular thing, then read a commit message. Put it in a comment in the code. Pretty please.
You need a queue in front of your database regardless of write latency. Otherwise you tie your availability to database availability, and downtime (even for upgrades) is often unavoidable while network problems are common.
Note that working remotely takes some getting used to just as with working at an office. Yes, you can dump someone into unfamiliar conditions and draw conclusions from that, but that's just confirming your prior assumptions without actually trying.