Alan Turing - BTW: a movie about him will be hitting theaters soon...
- First off, the answers are screwy, and it's clear that the computer misinterpreted much of what it heard.
- Then they presented the AI as if it were an adolescent from war-torn Ukraine.
- And they also used the lowest possible threshold to gauge success. This threshold, taken from a part of Turing's paper on the subject, suggests that success be declared if on average at least 30% of the humans judging the AI are fooled into thinking it is human. So the AI, named Eugene, scored 33% - but only because the judges lowered the bar, thinking he was a semi-illiterate teen.
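To make the arithmetic behind that threshold concrete, here is a minimal sketch. The function names and the panel of verdicts are hypothetical, chosen only so the result lands near the 33% figure above; they are not the actual judging data from the event.

```python
def fooled_share(verdicts):
    """Fraction of judges who believed the machine was human."""
    if not verdicts:
        raise ValueError("need at least one verdict")
    return sum(verdicts) / len(verdicts)


def passes(verdicts, threshold=0.30):
    """True if the share of fooled judges meets the threshold."""
    return fooled_share(verdicts) >= threshold


# Hypothetical panel (an assumption for illustration):
# True means the judge was fooled into thinking the AI was human.
verdicts = [True] * 10 + [False] * 20

print(f"{fooled_share(verdicts):.0%} fooled, passes: {passes(verdicts)}")
```

Raising `threshold` shows how much the verdict depends on where the bar is set rather than on anything the machine actually did.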
More important than all of this, of course, is the central question of whether the test is actually an accurate way to assess machine intelligence in the first place. In a way, every system that has competed in one of these tests to date has been purpose-built to pass the test. But would passing it make the system intelligent? The technology necessary for a machine to "think" through a conversation the way a human does simply does not exist, nor are we even close to understanding what that model would look like. The systems trying to pass the Turing Test are simply conversational "hacks"; in other words, they rely on built-in tricks like responding to a question with a question or working off of keyword cues. What's missing, of course, is any continuity of thought - any consciousness - and even the most simplistic conversation requires that. None of these systems can think and none of them can really learn.
Now it may be that conversation hacking becomes sophisticated enough in coming years that many of these systems pass the Turing Test threshold of 30% on a regular basis. But that test as it is now defined will never provide us with an accurate assessment of whether a machine has in fact achieved some innate level of intelligence. There is no way to determine through the conversation whether the system has "added value" to the topic rather than simply replied phrase by phrase in a rather one-sided dialectic. It is difficult to assess or acknowledge any growth or change. And nothing in a simple conversation lets you determine whether you are in fact conversing with a self-aware entity.
In the movie Her, this guy falls in love with his operating system (and it didn't come from the Apple store!)
The first thing we need to do before we tackle how we might achieve AI is to determine what the appropriate assessment or validation for human-like intelligence really needs to be. We are going to suggest one and explain the rationale for it...
The Technovation AI Test -
AI Test Prerequisites / Expectations
- The Test is not meant to assess acquired knowledge per se; it is meant to assess cognitive ability. In other words, it is not about preparation or repetition of learned information, but is concerned with the potential for and / or application of any particular knowledge set.
- The Test does not have to occur in one sitting, but can take place over any duration (within reason).
- The Test isn't merely concerned with correct answers or maturity at a point in time, but can also assess the ability to grow over time based upon responses to various aspects of the test (or other stimuli encountered within the time-frame of the test).
- The Test is not merely a linguistic exercise - the machine must not merely demonstrate the ability to communicate like a human, it must also demonstrate it can learn.
- Above all else, though, the machine must demonstrate the one trait most closely associated with human intelligence (as opposed to raw computing power) - it must demonstrate intuition. In this context, intuition represents shorthand problem-solving (which we will discuss in much more depth in a future post).
- One last aspect of the test that must be included is a review of the code to ensure that "conversational snippets" are not allowed to be pre-programmed. This implies that the majority of dialog is generated in real time by the machine. Now, that would not prevent the machine from reviewing logs of previously generated dialog (in some database), but that review could not lead to verbatim quoting - rather, the machine must paraphrase or otherwise restate previous points.
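As a rough illustration of the "no verbatim quoting" rule, the sketch below flags a reply that repeats a long run of words from previously logged dialog. Everything here is an assumption made for illustration: the function names, the six-word window, and the sample sentences are hypothetical, and this checks output against a log rather than reviewing the code itself, so it is only one small piece of the review described above.

```python
import re


def words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shares_run(reply, logged, n=6):
    """True if reply and logged share any run of n consecutive words."""
    a, b = words(reply), words(logged)
    logged_runs = {tuple(b[i:i + n]) for i in range(len(b) - n + 1)}
    return any(tuple(a[i:i + n]) in logged_runs
               for i in range(len(a) - n + 1))


def quotes_log(reply, log, n=6):
    """True if reply repeats an n-word run from any logged entry."""
    return any(shares_run(reply, entry, n) for entry in log)


# Hypothetical log of previously generated dialog.
log = ["I think the hardest part of the job is managing competing deadlines."]

quoted = "As I said, the hardest part of the job is managing competing deadlines!"
paraphrased = "Juggling several due dates at once is what I find most difficult."

print(quotes_log(quoted, log), quotes_log(paraphrased, log))
```

A check like this only catches word-for-word reuse; entries shorter than the window slip through, and it says nothing about whether a paraphrase reflects any understanding, which is what the rest of the test is meant to probe.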
The AI Test
In a series of panel interviews, the AI must convince the judges or reviewers that it should be hired to perform a complex human role. The type of job and foundational knowledge can cover any number of topics but must be sufficiently complex to avoid "lowering the bar" (so, any job that requires a degree). Also, the interview style must be open (similar to essay tests in written assessments) - the answers must not just be correct, they must demonstrate value-added insight from the intelligence conveying them. And the answers may be entirely subjective (even better, as long as the machine can rationalize them).
This test necessarily implies a very high threshold - perhaps in excess of a 90% rating for a very complex set of conversations. Why raise the bar this high? Simple - this is the one way we can force the development of a system that can both learn and apply that knowledge to problem solving, and do it on the fly. To have human-like intelligence, a machine must be able to understand the nuances of human communication and psychology - thus it must not only be able to interact, it must be able to convince us as well.
Now that we have a more concrete target to aim for, how do we get there? In our next post, we'll delve into Learning - what works and what doesn't, and how human and machine intelligence differ today.
Copyright 2014, Stephen Lahanas
#Semantech
#StephenLahanas
#TechnovationTalks