The biggest problem with machine learning occurs when people subscribe to the belief that it's a black-box solution. The truth is that you can't just drag-and-drop your data into a pre-existing solution. The types of algorithms you use depend on the types of problems you're trying to solve (e.g., classification, regression, clustering). The data you collect depends on the algorithms you use.
Sure, prediction APIs could arise that give detailed use cases for each algorithm, but then there's a problem with the fringe cases: you might not know that two pieces of data are so heavily correlated that they completely shatter a conditional independence assumption, for example.
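To make the fringe case concrete, here is a minimal sketch of how a naive Bayes posterior drifts when the "independent" features are really copies of one signal. The function name `naive_bayes_posterior` and all the probabilities are made up for illustration; none of this comes from a real prediction API.

```python
def naive_bayes_posterior(prior_pos, likelihoods_pos, likelihoods_neg):
    """P(positive | features), assuming features are conditionally independent."""
    joint_pos = prior_pos
    joint_neg = 1.0 - prior_pos
    # Naive Bayes multiplies per-feature likelihoods as if each one
    # carried fresh, independent evidence.
    for p_pos, p_neg in zip(likelihoods_pos, likelihoods_neg):
        joint_pos *= p_pos
        joint_neg *= p_neg
    return joint_pos / (joint_pos + joint_neg)


# Hypothetical numbers: one feature with P(x | pos) = 0.8, P(x | neg) = 0.4.
single = naive_bayes_posterior(0.5, [0.8], [0.4])

# The same feature fed in three times (perfectly correlated copies).
# No new information was added, yet the evidence is counted three times.
tripled = naive_bayes_posterior(0.5, [0.8] * 3, [0.4] * 3)

print(round(single, 3), round(tripled, 3))  # 0.667 0.889
```

The posterior climbs from about 0.67 to about 0.89 even though the three inputs say exactly the same thing, which is the kind of silent overconfidence a black-box user would have no reason to suspect.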
As a hacker who originally subscribed to the belief that a thorough understanding of machine learning was overkill, I admit without hesitation that I was 100% wrong. The truth of the matter is that when it's done properly, artificial intelligence and machine learning ought to be inextricably linked with your core business processes.
Agreed. I had a dataset at work that no one had yet been able to use to categorize two effects (one category was 98% of all the data). The values looked too "Gaussian normal", with everything mixed together, and the two classes couldn't be separated out. By combining an SVM with in-depth knowledge of the source of the data, I was able to find a generalized model that accurately categorized parts 80%+ of the time for the small set, without misclassifying the other 98%. All other methodologies had failed up to that point, and a blind approach with linear regression or SVMs resulted in at best 70% accuracy on all categories: not very good, and not implementable in a production setting (it means that in the bulk of cases, the 98%, I was only correct 70% of the time).
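A rough sketch of why overall accuracy is a poor yardstick on a 98/2 split like this one. The labels below are synthetic and the helper `per_class_recall` is a hypothetical name; this is not the dataset or the model described above, just a "predict the majority class every time" baseline.

```python
def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    totals, hits = {}, {}
    for truth, guess in zip(y_true, y_pred):
        totals[truth] = totals.get(truth, 0) + 1
        if truth == guess:
            hits[truth] = hits.get(truth, 0) + 1
    return {label: hits.get(label, 0) / totals[label] for label in totals}


# Synthetic labels with the same 98/2 imbalance (assumed for illustration).
y_true = ["common"] * 98 + ["rare"] * 2

# A "blind" baseline that always predicts the majority class.
y_pred = ["common"] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)

print(accuracy)  # 0.98
print(recall)    # {'common': 1.0, 'rare': 0.0}
```

The baseline scores 98% accuracy while never finding a single rare case, so per-class numbers like the 80%+ on the small set are the ones worth reporting.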
I can certainly see a role for somebody who understands the tradeoffs of each of these algorithms and who understands how to properly select and prepare datasets. But I wonder how many people will really need to be able to actually implement these algorithms.
I think the regress you're talking about is super important (black box AI only goes so far), but I also think there's great benefit to just applying the first layer of broken, incorrectly paired ML to a new field.
My prediction is that even the most black box ML, creatively applied, is and will be an incredible skill. Increasing levels of sophistication will continually kill off the current practices of black box ML, but the willingness to apply statistical pattern recognition to new and interesting areas can't help but be incredible.