Artificial Intelligence has let mankind dream of doing things that would have seemed impossible just a decade ago. People now interact daily with systems built on Machine Learning. This calls for a reality check of the testing tools such systems use, the algorithms, the training sets, and so on, so that we know the systems we rely on for our day-to-day activities are as accurate and even-handed as possible.
Training data sets in AI: How strong is the foundation?
Ece Kamar, a researcher in Microsoft’s adaptive systems and interaction group, is working on using a combination of algorithms and human expertise to eliminate data and system imperfections. Kamar says that people already trust AI with important tasks in their daily lives, so it is equally important for us, the developers and researchers, to trace back and see where these systems are making mistakes.
Kamar has pointed out some practical shortcomings of the training data sets that developers use to teach a system a particular task. Such data sets can have blind spots that lead the system to deliver false results. Many developers and researchers rely on off-the-shelf training data instead of building a set of their own based on the system's requirements, which can leave the system unable to learn the specific task at hand. Kamar and her colleagues have therefore worked on an algorithm to identify such blind spots in predictive models, allowing developers and researchers to fill in the cracks before releasing the system; a rough sketch of the idea follows.
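The paper's actual algorithm is not reproduced here, but as a minimal illustration of the concept, the Python sketch below flags candidate "blind spots" as points where a model trained on a gappy data set is confidently wrong on held-out data. The synthetic data, the simulated coverage gap, the model choice, and the 0.9 confidence threshold are all illustrative assumptions, not Kamar's method.

# A minimal sketch (not Kamar's actual algorithm): flag "blind spots" as
# regions where a model is confidently wrong on data it was never trained on.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Simulate a biased training set: drop most examples from one region of the
# feature space, the kind of gap an off-the-shelf corpus might silently have.
rng = np.random.default_rng(0)
mask = (X_train[:, 0] < 1.0) | (rng.random(len(X_train)) < 0.05)
model = LogisticRegression().fit(X_train[mask], y_train[mask])

# Candidate blind spots: held-out points the model gets wrong while being
# highly confident -- mistakes it does not "know" it is making.
confidence = model.predict_proba(X_test).max(axis=1)
predictions = model.predict(X_test)
blind = (predictions != y_test) & (confidence > 0.9)
print(f"{blind.sum()} high-confidence errors out of {len(X_test)} test points")

In this toy setup, the high-confidence errors cluster in the region the training set under-covered, which is exactly the kind of gap a developer would want to fill before shipping.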
In another research paper, Kamar and her team aim to show a direct correspondence between the different types of mistakes inside a complex AI system and the incorrect results such a system produces. With a defined methodology, Kamar hopes to help researchers identify problems, troubleshoot them, and root out potential system failures.
Microsoft Speech Language Translation Corpus
In another effort, this one to test the accuracy of conversational translation, Christian Federmann, a senior program manager on the Microsoft Translator team, and his colleagues have developed the Microsoft Speech Language Translation (MSLT) Corpus, which serves as a standardized data set for testing bilingual conversational speech translation systems. Now released for public use, the MSLT Corpus contains conversational, bilingual speech test and tuning data for English, French, and German, collected by Microsoft Research.
Born of the need for a high-quality data set to enable high-quality testing, the Corpus focuses on creating a standard against which people can measure how well their conversational speech translation systems work. Researchers and developers in this field often fall back on freely available data that is not up to the mark for testing such high-end systems; the sketch below shows what scoring a system against a standard reference set looks like.
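As a rough illustration of how a standardized corpus gets used, the sketch below scores a system's translations against reference translations with BLEU via the sacrebleu library. The file names are hypothetical stand-ins; a real MSLT pipeline would first run speech recognition and machine translation over the corpus audio to produce the hypothesis file.

# A minimal sketch of benchmark-style evaluation, not the MSLT toolchain:
# compare system output against reference translations with corpus-level BLEU.
# Both file names below are hypothetical placeholders, one sentence per line.
import sacrebleu

with open("mslt_references.de.txt", encoding="utf-8") as f:
    references = [line.strip() for line in f]
with open("system_output.de.txt", encoding="utf-8") as f:
    hypotheses = [line.strip() for line in f]

# corpus_bleu takes the hypothesis strings and a list of reference sets
# (here a single reference per sentence).
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.1f}")

Because everyone scores against the same fixed references, results from different systems become directly comparable, which is the point of a standardized benchmark.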
The team at Microsoft hopes the Corpus will benefit the conversational speech translation field and help create even more standardized benchmarks. Both research papers will be presented at the AAAI Conference on Artificial Intelligence, currently being held in San Francisco, and should help developers and researchers working in these fields propel their work forward.