Your last comment on the seemingly circular relationship between Clara and a "Turing-test-passing bot" is especially salient. This reference from the second footnote in the post, http://www.ijcai.org/Proceedings/95-1/Papers/125.pdf, offers a lot of interesting perspectives on the role of machine learning vs. human intelligence.
On the note of bots and your denouncement of the Turing Test as a metric, I do remember an article on here a couple of days ago hinting that one of the biggest frustrations of customer service calls is dealing with scripted responses, bots if you will, instead of real humans.
I do wonder whether alleviating this user frustration will amount to passing some "similar" but possibly weaker test.
We think it's a good idea too ;) Machines and humans truly have different talents: machines are great at memory, keeping track of state, and distributing information, while people are great at understanding subtle nuance in natural language.
And judging by the number of papers in natural language understanding, AI will (probably) start understanding subtle nuances too in another 3-5 years.
Unfortunately, "machines" (AI software, really) are genuinely bad at nuance, subtle or blatant.
The number of papers in the field is a very good example of why sheer volume proves little: until you've read (enough of) them, you have no idea what the state of the field actually is.
Edit: By this I mean that your assessment about the 3-5 years to "subtle nuance" is extremely, unrealistically optimistic. We're nowhere near AI understanding language. Try 300 to 500 years and you might be closer.
Even modern methods (deep recurrent networks, etc.) can do pretty well with these kinds of tasks (a very large ontology, for instance) if you have enough annotations of the nuance!
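To make that concrete, here's a minimal sketch of the kind of annotation-hungry setup I mean: a small recurrent classifier trained purely on human-labeled examples. Everything in it (the labels, the data, the model sizes) is a hypothetical stand-in, not anything Clara has described.

    # Illustrative only: a recurrent classifier that learns "nuance" labels
    # purely from human annotations; vocabulary, labels, and data are fake.
    import torch
    import torch.nn as nn

    class NuanceClassifier(nn.Module):
        def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, num_labels=3):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
            self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
            self.out = nn.Linear(hidden_dim, num_labels)

        def forward(self, token_ids):
            embedded = self.embed(token_ids)    # (batch, seq, embed_dim)
            _, hidden = self.rnn(embedded)      # hidden: (1, batch, hidden_dim)
            return self.out(hidden.squeeze(0))  # (batch, num_labels)

    model = NuanceClassifier(vocab_size=10_000)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # Toy batch of annotated messages: 32 sequences of 20 token ids each,
    # each paired with a label id (e.g. decline / reschedule / confirm).
    annotated_batch = torch.randint(1, 10_000, (32, 20))
    labels = torch.randint(0, 3, (32,))

    for _ in range(5):
        optimizer.zero_grad()
        loss = loss_fn(model(annotated_batch), labels)
        loss.backward()
        optimizer.step()

The whole thing lives or dies on the volume and consistency of those labels, which is exactly why the annotation source matters so much.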
>> Even modern methods (deep recurrent networks, etc.) can do pretty well with these kinds of tasks (a very large ontology, for instance) if you have enough annotations of the nuance!
Where do you get that from? NLP, with neural networks or not, stays as safely away from meaning as is humanly possible while working in an area very closely connected to it.
Also, ontologies? Very few people are interested in those nowadays, though the few that are include the team that made Watson. The push instead is to do away with all that and rely on statistical approximation.
It's certainly possible. One advantage of our setup is that rather than getting OK-to-noisy labels from customers, our CRAs understand the end goal of the application and generate pretty great data. We are also able to incentivize them to produce fewer errors.
Clara has a 1-hour SLA for processing an incoming message. While I cannot give numbers on the speed (or volume) of annotators, I can say that our platform is designed to enable quick and accurate work via incentive mechanisms. We avoid fatigue in part by making it easy for CRAs to navigate and work with data. With respect to overlapping annotator schemes, these are known to be effective; a rough sketch of one such scheme is below. We'll be writing more about how our human backend works in future posts.
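For the curious, here is a rough sketch of one common overlapping-annotation scheme: majority vote with an agreement threshold and escalation on disagreement. It is purely illustrative; the names and thresholds are invented and this is not a description of Clara's actual backend.

    # Illustrative only: aggregate overlapping CRA annotations by agreement
    # threshold; "Annotation" and "resolve_label" are hypothetical names.
    from collections import Counter
    from typing import List, NamedTuple, Optional

    class Annotation(NamedTuple):
        annotator_id: str
        label: str

    def resolve_label(annotations: List[Annotation], min_agreement: float = 0.75) -> Optional[str]:
        """Return the majority label if enough annotators agree, else None (escalate)."""
        if not annotations:
            return None
        counts = Counter(a.label for a in annotations)
        label, votes = counts.most_common(1)[0]
        return label if votes / len(annotations) >= min_agreement else None

    # Three CRAs annotate the same incoming message; two agree, one disagrees.
    overlap = [
        Annotation("cra_1", "reschedule"),
        Annotation("cra_2", "reschedule"),
        Annotation("cra_3", "decline"),
    ]
    print(resolve_label(overlap))                      # None at 0.75 -> escalate for review
    print(resolve_label(overlap, min_agreement=0.6))   # "reschedule"

The agreement threshold is the knob that trades throughput against label quality: raise it and more messages get a second look, which costs time against a 1-hour SLA.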
Thanks, in the future post I would also be interested in the trade-offs between building an internal system vs. using a 3rd party like Mechanical Turk (maybe you do?), and in the incentive structure.