Last April, Microsoft open-sourced its deep learning toolkit, the Computational Network Toolkit (CNTK), and made it available to researchers via CodePlex under a restricted open-source license. The software giant announced on Monday that the toolkit will now be available for anyone to use on GitHub.
According to Microsoft’s chief speech scientist Xuedong Huang, CNTK was developed by the company’s researchers out of necessity, to speed up improvements in how well computers can understand speech. In short, the toolkit is built around deep learning and is designed to make training such models faster and more efficient. Deep learning is currently one of the most popular types of artificial intelligence.
The approach is inspired by biological neural networks, such as the human brain, and by how they learn in real time and store data. It involves training artificial neural networks on a large set of data and then using what they have learned to make predictions about new data.
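To make the idea of "training on data" concrete, here is a minimal sketch in plain Python. It does not use CNTK or any of its APIs; the function names (`train`, `predict`), the toy dataset and the learning rate are all assumptions chosen for illustration. A single artificial neuron is shown labeled examples of a logical OR, adjusts its weights a little after each one, and is then asked about an input it has not seen.

```python
import math
import random


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x1, x2):
    """Output of a single artificial neuron with two inputs."""
    return sigmoid(weights[0] * x1 + weights[1] * x2 + bias)


def train(samples, epochs=5000, learning_rate=0.5, seed=0):
    """Fit the neuron to labeled samples with stochastic gradient descent.

    Hypothetical helper for illustration only; real toolkits train networks
    with many layers and millions of parameters, usually on GPUs.
    """
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in range(2)]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            # Error signal: gradient of the cross-entropy loss with respect
            # to the neuron's pre-activation value.
            error = predict(weights, bias, x1, x2) - target
            # Nudge each parameter against its gradient.
            weights[0] -= learning_rate * error * x1
            weights[1] -= learning_rate * error * x2
            bias -= learning_rate * error
    return weights, bias


# Toy training set (an assumption for this sketch): the logical OR function.
OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights, bias = train(OR_DATA)

# Ask the trained neuron about an input that was not in the training set.
unseen = predict(weights, bias, 0.9, 0.1)
print("score for unseen input (0.9, 0.1):", round(unseen, 3))
```

Toolkits such as CNTK apply the same loop of predicting, measuring the error and adjusting the weights, but to far larger networks and datasets, which is where GPU acceleration becomes important.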
According to Huang, CNTK proved more efficient than four other popular computational toolkits, including Theano and Google’s TensorFlow, in Microsoft’s own benchmarks. By those numbers, there seems to be little competition: CNTK delivers almost double the performance of the other deep learning toolkits tested, though the tests were run on GPU-based systems.
Along with moving the software to GitHub, Microsoft is also getting rid of the restrictive license that came along with CNTK. Now developers are free to make changes and use the toolkit as they please. Microsoft believes that this will strengthen the ecosystem and the toolset.
The Redmond giant uses CNTK internally on a set of powerful computers equipped with graphics processing units, or GPUs. Although GPUs are mainly used for rendering computer graphics, researchers have found that they are well suited to running the algorithms behind deep learning. The blog post announcing the new development reads:
Chris Basoglu, a principal development manager at Microsoft who also worked on the toolkit, said one of the advantages of CNTK is that it can be used by anyone from a researcher on a limited budget, with a single computer, to someone who has the ability to create their own large cluster of GPU-based computers.
The researchers say it can scale across more GPU-based machines than other publicly available toolkits, providing a key advantage for users who want to do large-scale experiments or calculations.
Other companies, including Baidu, Facebook and Google, have released open-source deep learning toolkits in the past. Although Microsoft did the same last year, its toolkit was governed by a restrictive license that limited the software to non-commercial uses.