I gave a talk last week on MMLSpark, Microsoft's latest innovation to scale and operationalise Data Science. Two posts ago you heard about Machine Learning Modelling and Experimentation, which couples fantastically with MMLSpark through a client called Workbench. I described the myriad of features Workbench contains in that post.
In today's short post I will talk a little about the talk and demos I gave last week. MMLSpark is a fantastic library which is built with the idea of fitting into Spark's MLPipelines framework. The library builds on a toolkit called CNTK, which comes out of Microsoft Research and provides some wonderful Deep Learning capabilities including feed-forward DNNs, CNNs, RNNs and LSTMs. The last couple of weeks I've had the chance to break the back of MMLSpark and I am incredibly impressed. I shared the stage with Dash from a company called Qubole, who gave a great talk on Intel's BigDL. The first thing that struck me about BigDL was that it was too configurable and didn't wrap things up in the nice abstractions that MMLSpark gave me.
For example, MMLSpark has OpenCV bindings which allow you to build image manipulation into a stage of the MLPipeline. The BigDL equivalent took two screenloads of code just to turn the images into a bunch of numbers that the algorithms could understand. Clearly Microsoft has thought through the development of this library, with tightly controlled, well-structured classes, each of which uses the composability of Spark's MLPipelines.
The last two weeks allowed me to catch up on some great concepts. I'll replay some of them here based on a sample I created and tweaked in the Microsoft samples gallery for image recognition using MMLSpark.
1. Transfer Learning: This is a great way to take a pre-trained neural network and reuse what it has learned in a new neural network. For example, if I have a set of images I can painstakingly train a model over and over again to recognise some basic things about the image, or I can use the "hidden layers" of an existing network, which hold that information as a set of weights, and transfer what the previous network learned to the new one. I tried this in a few lines of code using MMLSpark with the airplanes vs automobiles image bank and a pre-trained Microsoft model called "ConvNet". The code simply loads the model and transfers its weights to all layers except "l8". When the new model is trained, only l8, which is the final layer, is trained on the new data.
As you would expect, the sheer volume of data behind the Microsoft model allows the new model both to quickstart and to inherit the raw scale of that learning through the transfer learning procedure.
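To make the idea concrete, here is a minimal sketch in plain Python rather than MMLSpark. Everything in it is an assumption for illustration: `FROZEN_WEIGHTS` stands in for the pre-trained layers, the final layer is a tiny logistic regression, and the toy two-number "images" are made up. It is not the code from the sample, just the same shape of idea: keep the early layers fixed and train only the last one.

```python
import math
import random

# Hypothetical stand-in for the pre-trained layers. These weights are never
# updated, playing the role of every layer before the final one.
FROZEN_WEIGHTS = [[0.9, -0.4], [-0.3, 0.8], [0.5, 0.5]]


def frozen_features(x):
    """Apply the frozen layer: a linear map followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in FROZEN_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_final_layer(samples, labels, epochs=200, lr=0.5):
    """Fit only the new final layer (logistic regression) on the frozen features."""
    feats = [frozen_features(x) for x in samples]
    weights = [0.0] * len(FROZEN_WEIGHTS)
    bias = 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            pred = sigmoid(sum(w * v for w, v in zip(weights, f)) + bias)
            err = pred - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * v for w, v in zip(weights, f)]
            bias -= lr * err
    return weights, bias


def predict(x, weights, bias):
    """Score a new input: frozen features first, then the trained final layer."""
    f = frozen_features(x)
    return sigmoid(sum(w * v for w, v in zip(weights, f)) + bias)


def make_toy_data(n=20, seed=0):
    """Made-up data: class 1 clusters near (1, 0), class 0 near (0, 1)."""
    rng = random.Random(seed)
    samples, labels = [], []
    for _ in range(n):
        samples.append([1.0 + rng.uniform(-0.2, 0.2), rng.uniform(-0.2, 0.2)])
        labels.append(1)
        samples.append([rng.uniform(-0.2, 0.2), 1.0 + rng.uniform(-0.2, 0.2)])
        labels.append(0)
    return samples, labels


samples, labels = make_toy_data()
final_weights, final_bias = train_final_layer(samples, labels)
```

Only `final_weights` and `final_bias` change during training, which is why this kind of training is so much cheaper than starting from scratch.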
2. Ensembling: I read about this and did some research, and loved the idea as an optimisation for image recognition. In the Snow Leopard example there were many camera shots of Snow Leopards, and in some instances the images could be grouped together. If some images in a group gave bad results, the other images in the group could average this out and change the output of the model through a simple statistical procedure. This is especially relevant when you have many images in a close time sequence. The model will treat them independently unless you use ensembling.
3. Augmentation: I tried this recently with my own example of kittens (inspired by @andyelastacloud) to see whether I could boost the performance of my simple cat recognition model and it worked! This is simply about flipping the photo around and doubling the number of images so that sideways cats could be learned better!
4. GPU offload: You can offload all of the work to one or more GPU nodes in the cluster, which significantly speeds up training time.
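Two of the ideas above, augmentation and ensembling, are small enough to sketch in plain Python. This is not MMLSpark code: the function names are my own, images are just lists of pixel rows, the per-image scores are made up, and I am assuming the "simple statistical procedure" is a plain average over each group of shots.

```python
from collections import defaultdict


def flip_horizontal(image):
    """Mirror an image stored as a list of pixel rows."""
    return [list(reversed(row)) for row in image]


def augment_with_flips(images, labels):
    """Return the originals plus a mirrored copy of each, doubling the set."""
    aug_images = list(images) + [flip_horizontal(img) for img in images]
    aug_labels = list(labels) + list(labels)
    return aug_images, aug_labels


def ensemble_by_group(group_ids, scores):
    """Average per-image scores within each group (e.g. one burst of camera shots)."""
    buckets = defaultdict(list)
    for gid, score in zip(group_ids, scores):
        buckets[gid].append(score)
    return {gid: sum(vals) / len(vals) for gid, vals in buckets.items()}


# Augmentation: one tiny 2x3 "image" becomes two training examples.
image = [[1, 2, 3], [4, 5, 6]]
train_images, train_labels = augment_with_flips([image], ["cat"])

# Ensembling: burst "a" has one bad frame (0.1) that its neighbours average out.
averaged = ensemble_by_group(["a", "a", "a", "b"], [0.9, 0.8, 0.1, 0.2])
```

Averaging is the simplest choice here. A vote or a median over the group would follow the same pattern.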
Try out this library and have some fun with images. It makes it easy. What a great abstraction!