Guess the only valid point is 3: "Not novel at all — it represents a specific implementation of well known techniques developed nearly 25 years ago"
Probably true.
The Map/Reduce paradigm was being discussed in papers on functional programming languages in the early 1980s.
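For anyone who hasn't seen the functional-programming version, the paradigm really is just two higher-order functions. Here's a minimal Python sketch of the canonical word-count example (my illustration, not taken from any of those papers):

    from functools import reduce

    # The two functions MapReduce is named after: map transforms each
    # element, reduce (a fold) combines the results into one value.
    words = ["the", "map", "reduce", "the", "paradigm"]
    counts = reduce(
        lambda acc, w: {**acc, w: acc.get(w, 0) + 1},  # reduce: fold one word in
        map(str.lower, words),                         # map: normalize each word
        {},                                            # initial accumulator
    )
    print(counts)  # {'the': 2, 'map': 1, 'reduce': 1, 'paradigm': 1}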
There really are cases where a "full scan" is the fastest way to do something: when it works, sequential I/O can be orders of magnitude faster than the random-access I/O you get with indexes, particularly if you'd have to build the index just to do the job. I've written systems that process hundreds of millions of facts, and I can do a "full scan" of these in 20 minutes on an ordinary desktop computer, whereas it takes about four days to load them into an index in MySQL or an RDF database.
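A toy illustration of that trade-off (my sketch, not the system described above; the file name, record format, and sizes are made up, and an in-memory dict badly understates the random-I/O cost of a real on-disk index, but the shape of the comparison is the same):

    import os
    import time

    PATH = "facts.tsv"        # assumed flat file of tab-separated facts
    N_RECORDS = 1_000_000     # scale up to make the gap more visible

    def make_facts(path, n):
        """Write n synthetic (key, value) records, one per line."""
        with open(path, "w", encoding="utf-8") as f:
            for i in range(n):
                f.write(f"subject{i % 10_000}\tfact number {i}\n")

    def full_scan(path, predicate):
        """One sequential pass; OS read-ahead keeps the I/O sequential."""
        hits = 0
        with open(path, "r", encoding="utf-8") as f:
            for line in f:
                if predicate(line):
                    hits += 1
        return hits

    def build_index(path):
        """The up-front cost you pay before an indexed lookup is possible."""
        index = {}
        with open(path, "r", encoding="utf-8") as f:
            for lineno, line in enumerate(f):
                key = line.split("\t", 1)[0]
                index.setdefault(key, []).append(lineno)
        return index

    if __name__ == "__main__":
        if not os.path.exists(PATH):
            make_facts(PATH, N_RECORDS)
        t0 = time.perf_counter()
        hits = full_scan(PATH, lambda rec: "fact number 500000\n" in rec)
        print(f"full scan: {hits} hit(s) in {time.perf_counter() - t0:.2f}s")
        t0 = time.perf_counter()
        idx = build_index(PATH)
        print(f"index build: {len(idx)} keys in {time.perf_counter() - t0:.2f}s")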
Now we know that SQL databases can be parallelized quite a bit, and commercial products exist that do it, which leaves two questions for extra credit: (i) why do the "cool kids" completely ignore these commercial products, and (ii) why are there no Open Source projects in this direction?
> Guess the only valid point is 3: "Not novel at all — it represents a specific implementation of well known techniques developed nearly 25 years ago" Probably true.
Unfortunately, the USPTO does not agree. MapReduce was patented in 2010.