I still have nightmares of SQL servers optimizer taking down production systems ...

fennecfoxen · on Jan 18, 2022

I worked for a payments firm that submitted transfers in bulk every day, but had a monthly surge day, and a late cut-off period to submit or cancel payments. It doesn't matter if you can do everything else in 15 minutes, if the query planner decides to take a 30-second query and turn it into a 4-hour nightmare because different customers are making different numbers of payments today, and it trips over some Postgres heuristic.

The Postgres query planner was, quite frankly, the enemy. By the time I left we were at the point that we considered it a business risk and were looking at alternatives. If you need queries to run in a predictable amount of time — forget fast, simply predictable — then Postgres is quite simply not fit for purpose.

slt2021 · on Jan 18, 2022

these query issues are most of the times not SQL Server problem, they are rather symptom of poor data model architecture, table design.

A lot of times I see in operational DB hot data gets mixed with warm and cold data, and your hot path query with ton of JOINs/subqueries will get rekt.

Proper redesign and rearchitecture helps to provide large buffer against these problems.

Also not everything should be inside SQL server, if you run large query every 5 seconds or something - probably consider using in-memory cache or denormalized model

fennecfoxen · on Jan 18, 2022

Excuse me. You are replying with advice that is fundamentally inappropriate for a batch workload (describing a "hot path query" and suggesting in-memory caches), and it shows that you have not read and comprehended the post I wrote.

Besides which, if you need to re-architect and denormalize by moving records in and out of different tables just so you can run what is essentially a monthly report in a predictable (not fast! just predictable) amount of time -- well then, why bother using Postgres in the first place, you might as well bite the bullet and go with something NoSQL at that point, because at least when you read from a given Cassandra partition you know all the results will be right next to each other, 100%; why leave it to chance? you've been burned by Postgres before

slt2021 · on Jan 18, 2022

As a developer I never had to blame SQL server, it was all the time develop mistake: poor architecture, poor query, some data quality issues. Dont blame the platform for developers mistakes, thats what I was trying to say

fennecfoxen · on Jan 18, 2022

The wise man builds his house upon the rock. The foolish man builds his house upon the sand. I will 100% blame the platform when what it provides is deficient, for selecting a deficient platform is one of the gravest mistakes a developer can make.

And Postgres, again, is materially deficient in its query performance predictability, and its developers are insistent that they have no desire to allow the approaches that would mitigate it. If this matters to your application, then the prudent developer will drop Postgres and use a real database.

SigmundA · on Jan 18, 2022

Must not have been working with it very long, off the top of my head having to disable the "new" optimizer in Sql 2014 and go back to the 2012 version due to bad plans https://sqlservergeeks.com/sql-server-2014-trace-flags-9481/

I have been working with it since way back in 6.5 and it has made many obviously wrong decisions in the optimizer over the years, its has gotten better but is by no means perfect, good thing is it has hints unlike PG!

slt2021 · on Jan 18, 2022

So whats the issue with marginal difference in cardinality estimators? Both queries seem perfectly fine speed-wise.

Just splitting hairs, again most of the time developers fail to notice the significant data distribution drift, and dont rearchitect their data models then blame the engine for developers omissions and invalid assumptions

SigmundA · on Jan 18, 2022

https://www.sqlconsulting.com/archives/big-performance-probl...

fragmede · on Jan 18, 2022

Blaming the table architecture doesn't get rid of the fact that the PostgreSQL planner can make decision that will cause a query to take an unknown amount of time.

newlisp · on Jan 18, 2022

I guess that is why cloud databases like spanner charge big bucks.