Microsoft is inviting third-party developers to access its Custom Recognition Intelligent Service (CRIS), a tool for adding speech-to-text functionality to their apps. Alongside that, the Redmond giant is releasing public previews of two sets of APIs: the Speaker Recognition APIs and the Video APIs.
The announcements came in a blog post by Ryan Galgon, Senior Program Manager at Microsoft Technology and Research, and are part of Project Oxford. Project Oxford is a Microsoft initiative that makes the company's machine learning and artificial intelligence technology available to developers, so they can add intelligent features to their apps without having to be AI experts.
Developers who get access to CRIS will be able to customize Microsoft's speech recognition system for a particular vocabulary, environment, and/or user population. This means that the problems non-native speakers often face with speech-to-text functionality could be reduced by tuning the system to how those users actually speak.
CRIS can also be tuned for apps meant for people working in noisy environments, such as a loud shop floor or a busy shopping center.
The Speaker Recognition APIs recognize a person from their voice and can be used for stronger authentication. They can also improve the customer service experience by automatically identifying a calling customer, with no manual identification step.
The Video APIs, meanwhile, help analyze and automatically edit videos using Microsoft's video processing algorithms. These algorithms can detect and track faces in a video, detect when motion has occurred in videos with stationary backgrounds, and smooth and stabilize shaky footage.
Developers can get access to these APIs and build them into their own apps. Several Microsoft tools already use similar technology, including its facial recognition tool and its age-detection tool. Last month, Microsoft also launched an emotion-detection tool that can tell whether the person in a picture is happy, sad or angry.