As someone studying this intensely, it's quite the opposite. Basic ML can be (and has been) commodified with good toolkits and APIs. Additionally, much of practical ML is just the application of already-invented algorithms to fields that haven't seen them yet.
But that said, the deeper message is in interpretation and discovery from data: large data, small data, highly structured data, or just regularized DB pulls. The heart of it is statistical pattern recognition, and that has only begun to be broached (even academically) in the last 25 years.
I respectfully disagree. Tools like Weka, nltk, etc. are fine for exploratory data analysis, but it's risky to rely on them for problems that need to scale, problems that differ from the norm, or homegrown solutions built around data that does not yet exist. Because a large portion of HN users are interested in bringing their ideas to life, I'd suspect that the latter case particularly resonates with them.
The problem facing people who intend to work with data that does not yet exist becomes one of feature selection: what data matters, and how do we use it? For NLP tasks, does stemming matter? What about part-of-speech tagging? Some classification problems are not linearly separable, which rules out plain linear classifiers unless you apply (and know to apply) the kernel trick.
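As a concrete illustration of that last point, here is a minimal sketch. scikit-learn and the synthetic XOR-style data are my own illustrative choices, not anything from the thread: a linear SVM sits near chance on data that is not linearly separable, while the same learner with an RBF kernel (the kernel trick) separates it easily.

    # Minimal sketch of the kernel trick; scikit-learn and the synthetic
    # XOR-style data are illustrative choices, not anything from the thread.
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.RandomState(0)
    X = rng.uniform(-1, 1, size=(400, 2))
    y = (X[:, 0] * X[:, 1] > 0).astype(int)  # XOR-like labels: not linearly separable

    linear = SVC(kernel="linear").fit(X, y)
    rbf = SVC(kernel="rbf", gamma=2.0).fit(X, y)

    print("linear kernel accuracy:", linear.score(X, y))  # roughly chance level
    print("RBF kernel accuracy:   ", rbf.score(X, y))     # close to 1.0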
In the end, I think my reply here is tautological: ML is too complex to be transformed into a set of APIs a la Google Maps and Google Search.
The problem facing people who intend to work with data that does not yet exist becomes one of feature selection: what data matters and how do we use it? For NLP tasks, does stemming matter? What about part-of-speech tagging?
Indeed, I worked on machine learning in NLP (fluency ranking, parse disambiguation). As a general rule, roughly 90% of the improvement in a model comes from clever feature engineering and from exploiting the underlying system to get more interesting information that improves classification; the remaining 10% comes from using cleverer machine learning techniques than, say, a standard maxent learner with a Gaussian prior (for linearly separable data).
For instance, the last relatively large boosts in the accuracy of the parser developed by our research group came from feature engineering.
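To make that 90/10 split concrete, here is a hedged sketch. scikit-learn's LogisticRegression with an L2 penalty stands in for a maxent learner with a Gaussian prior; the token task, labels, and feature templates are invented for illustration and are not the parser features referred to above. The point is only that a richer, hand-designed template generalizes to unseen words where the identity-only template cannot, with the learner held fixed.

    # Hedged sketch: richer hand-designed features vs. the same fixed learner.
    # LogisticRegression with an L2 penalty plays the role of a maxent learner
    # with a Gaussian prior; the task and feature templates are invented.
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    def baseline_features(token):
        # Minimal template: the word identity only.
        return {"word=" + token.lower(): 1}

    def engineered_features(token):
        # Same template plus cheap, hand-designed cues.
        feats = baseline_features(token)
        feats["suffix2=" + token[-2:].lower()] = 1
        if token[0].isupper():
            feats["is_capitalized"] = 1
        return feats

    # Toy task: is the token a likely past-tense verb? (labels are illustrative)
    train_tokens = ["walked", "Paris", "jumped", "table", "analyzed",
                    "Berlin", "parsed", "window", "ranked", "chair"]
    train_labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
    test_tokens = ["climbed", "London", "sorted", "bottle"]
    test_labels = [1, 0, 1, 0]

    for extract in (baseline_features, engineered_features):
        model = make_pipeline(DictVectorizer(), LogisticRegression(C=1.0))
        model.fit([extract(t) for t in train_tokens], train_labels)
        acc = model.score([extract(t) for t in test_tokens], test_labels)
        print(extract.__name__, "held-out accuracy:", acc)

The word-identity template sees only unknown words at test time and falls back to the intercept, while the suffix and capitalization cues carry over to words never seen in training.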
I think we just disagree on what "basic" ML means. I think a lot of real problems have solutions which involve very simple applications of poorly tuned ML algorithms.
Engineering even a basic ML solution is challenging; feature engineering especially.
Actually, the Google Prediction API is very simple and it already covers supervised learning (regression and classification). I can imagine very simple extensions (of the API itself; the algorithms would be completely different) to cover a lot of the unsupervised and semi-supervised ground as well.
The algorithms are not disclosed, but the docs hint that they are properly regularized, so throwing more features at them shouldn't hurt.
You still need to be able to reformulate the problems so that they fit a standard ML setting and then know how to tune things, but it looks like the API can get you pretty far.
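For what "reformulate the problem so it fits a standard ML setting" looks like in code, here is a rough sketch. It does not show the Prediction API's own calls; scikit-learn stands in for whatever hosted learner you would hand the data to, and the churn scenario, field names, and records are all invented.

    # Hedged sketch of reformulating a raw "DB pull" into a standard
    # supervised-learning problem. scikit-learn stands in for a hosted
    # service; the churn scenario and every field name are invented.
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # One dict per customer, plus the outcome we want to predict.
    records = [
        {"plan": "free", "logins_last_30d": 2, "tickets": 3},
        {"plan": "pro", "logins_last_30d": 25, "tickets": 0},
        {"plan": "free", "logins_last_30d": 1, "tickets": 5},
        {"plan": "pro", "logins_last_30d": 18, "tickets": 1},
    ]
    churned = [1, 0, 1, 0]  # supervised labels: 1 = churned

    # Reformulation: categorical fields become one-hot features, numeric
    # fields pass through unchanged; once the data is in this shape, the
    # learner behind the API is largely interchangeable.
    model = make_pipeline(DictVectorizer(), LogisticRegression())
    model.fit(records, churned)

    print(model.predict([{"plan": "free", "logins_last_30d": 0, "tickets": 4}]))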