For every programmer like you there are ten more at places where I've been employed who claim they need big data tools for a few million (or in some cases a few hundred thousand) rows in MySQL. I get why you could feel attacked when this message is repeated so often, but apparently it still isn't repeated anywhere near enough.
I hope the NoSQL hype is over by now and people are back to choosing relational as the default choice. (The same people will probably be chasing blockchain solutions to everything by now...)
Why should relational be the default choice? There are many cases where people are storing non-relational data and a nosql database can be the right solution regardless of whether scaling is the constraint.
Most nosql SaaS platforms are significantly easier to consume than the average service oriented RDBMS platform. If all the DBMS is doing is handling simple CRUD transactions, there's a good chance that relational databases are overkill for the workload and could even be harmful to the delivery process.
The key is to take the time to truly understand each workload before just assuming that one's preferred data storage solution is the right way to go.
You can have minimally relational data such as URL : website, but that's still improved by going URL : ID, ID : website because you can insert those IDs into the website data.
Now plenty of DBs have terrible designs, but I have yet to hear of actually non-relational data.
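A minimal sketch of that URL : ID, ID : website split, using Python's built-in sqlite3 module. The table names (urls, pages, links) and the url_id helper are made up for illustration; they are not from any particular system.

```python
import sqlite3


def build_db():
    """Create an in-memory database with the URL : ID / ID : website split."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE urls (
            id  INTEGER PRIMARY KEY,
            url TEXT NOT NULL UNIQUE          -- URL : ID
        );
        CREATE TABLE pages (
            url_id INTEGER PRIMARY KEY REFERENCES urls(id),
            body   TEXT NOT NULL              -- ID : website
        );
        CREATE TABLE links (                  -- IDs stand in for URLs in the page data
            from_id INTEGER NOT NULL REFERENCES urls(id),
            to_id   INTEGER NOT NULL REFERENCES urls(id),
            PRIMARY KEY (from_id, to_id)
        );
    """)
    return conn


def url_id(conn, url):
    """Return the integer ID for a URL, inserting it if it is new."""
    conn.execute("INSERT OR IGNORE INTO urls (url) VALUES (?)", (url,))
    return conn.execute("SELECT id FROM urls WHERE url = ?", (url,)).fetchone()[0]


conn = build_db()
a = url_id(conn, "https://example.com/")
b = url_id(conn, "https://example.com/about")
conn.execute("INSERT INTO pages (url_id, body) VALUES (?, ?)", (a, "<html>...</html>"))
conn.execute("INSERT INTO links (from_id, to_id) VALUES (?, ?)", (a, b))
```

Each URL string is stored once in urls; everything else refers to it by integer, so "which pages does this page link to" becomes a join on IDs rather than a string match.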
That's fair and I'll concede that my terminology is incorrect. I suppose I'm really considering data for which the benefits of normalization are outweighed by the benefits that are offered by database management systems that do not fall under the standard relational model (some of those benefits being lack of schema definition/enforcement* and the availability of fully managed SaaS database management systems).
I'm also approaching this from the perspective of someone who spends more time in the ops world than development. I won't argue that NoSQL would ever "outperform" in the realms of data science and theory, but I question whether a business is going to see more positive impact from finding the perfect normal form for their data or having more flexibility in the ability to deliver new features in their application.
* I'm fully aware that this can be as much of a curse as a blessing depending on the data and the architecture of the application, which reinforces understanding the data and the workload as a significant requirement.
Because there are URLs in the website data and/or you want to do something with it. Looking up integers is also much faster than looking up strings. And as I said, you can replace URLs in the data with IDs, saving space.
But, there are plenty of other ways to slice and dice that data, for example a URL is really protocol, domain name, port, path, and parameters, etc. So, it's a question of how you want to use it.
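As a rough illustration of slicing a URL into those parts, here is a sketch using Python's standard urllib.parse module. The split_url helper and its dictionary keys are hypothetical names chosen to match the wording above.

```python
from urllib.parse import urlsplit, parse_qsl


def split_url(url):
    """Break a URL into the pieces you might store as separate columns."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        "port": parts.port,              # None when the URL has no explicit port
        "path": parts.path,
        "parameters": parse_qsl(parts.query),
    }


example = split_url("https://example.com:8443/a/b?x=1&y=2")
```

Whether any of these pieces deserves its own column or table depends on the queries you expect, e.g. grouping by domain is a good reason to store the domain separately.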
PS: Using a flat table structure (ID, URL, Data) with indexes on URL and ID is really going to be 2 or 3 tables behind the scenes depending on the type of indexes used.
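One way to see this for yourself, sketched with SQLite via Python's sqlite3 module (the table and index names are made up). In SQLite specifically, an INTEGER PRIMARY KEY column is the key of the table's own b-tree, so the flat table plus a URL index ends up as two separate on-disk structures; engines that keep rows in a heap with a separate primary-key index would give you three.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT NOT NULL, data TEXT);
    CREATE INDEX idx_pages_url ON pages (url);
""")

# sqlite_master lists every stored structure along with the root page of its b-tree.
structures = conn.execute(
    "SELECT type, name, rootpage FROM sqlite_master ORDER BY rootpage"
).fetchall()

for kind, name, rootpage in structures:
    print(kind, name, rootpage)
```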
> The key is to take the time to truly understand each workload before just assuming that one's preferred data storage solution is the right way to go.
Although that is true in principle, in reality that results in the messes I see around me, where a small startup (but this often goes for larger corps too) has a plethora of tech running that it fundamentally does not need. If your team's expertise is Laravel with MySQL, then even if some project might be a slightly better fit for node/mongo (does that happen?), I would still go for what you know over the better fit, as the unfamiliar stack will likely bite you later on. Unfortunately people go for the more modern and (maybe) slightly better fit, and it does bite them later on.
For most CRUD stuff you can just take an ORM and it will handle everything as easily as NoSQL anyway. If your delivery and deployment process already includes an RDBMS, it will be natural anyway and likely easier than anything NoSQL, unless it is something that is only a library and not a server.
Also, when in doubt, you should pick an RDBMS imho, not a NoSQL store as a lot of people do. A modern RDBMS is far more likely to fit whatever you will be doing, even if the problem appears to fit NoSQL better at first. All modern DBs have document or JSON storage built in or added on (plugin or ORM), and you probably do not have the workload that requires the scale-out that NoSQL promises. If you do, then maybe it is a good fit; however, if you are conflicted, it probably is not anyway.
> There are many cases where people are storing non-relational data
No, there are not. In 99% of applications, the data is able to be modeled relationally and a standard RDBMS is the best option. Non-relational data is the rare exception, not the rule.