May 15, 2018

Integrating Machine Learning into your .NET applications

There were a lot of great announcements from Build this year, from Visual Studio Live Share to Azure Sphere and everything in between. One that I was particularly excited by was the preview of ML.NET, a cross-platform, open source machine learning framework for .NET. It lets those of us writing .NET applications build, train and deploy machine learning models directly in our applications, without necessarily having to call out to external APIs or to libraries written in other languages running on the same system. The current preview release allows classification and regression models to be built and used, and also brings with it a first draft of the APIs for training models and the core components of the framework, meaning that popular libraries such as TensorFlow and CNTK can be plugged in in the future.

The process for creating a training pipeline is pretty straightforward, and I've spent some time trying it out by building a classification model for the Nottingham Trams Twitter feed, trying to determine whether a given tweet is a notification about delays, confirmation that the trams are running, marketing, or one of a few other categories. The initial version isn't that accurate, as I only had about 70 tweets to train it from, which is really nowhere near enough, so training is ongoing. But it works, and I can save out the trained model and load it back again later so I can use it without re-training (there's a sketch of that further down). The pipeline for training looks a bit like this.

var pipeline = new LearningPipeline
{
    // Load the training data from a comma-separated file with a header row
    new TextLoader<TweetData>(DataPath, true, "comma"),
    // Convert the text labels in the training data into numeric keys the trainer can use
    new Dictionarizer("Label"),
    // Turn the tweet text into a numeric feature vector
    new TextFeaturizer("Features", "TweetText"),
    // The multi-class classification trainer
    new StochasticDualCoordinateAscentClassifier {FeatureColumn = "Features", LabelColumn = "Label"},
    // Map the predicted numeric label back to the original text label
    new PredictedLabelColumnOriginalValueConverter {PredictedLabelColumn = "PredictedLabel"}
};

var model = pipeline.Train<TweetData, ClassificationPrediction>();
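
I haven't shown the TweetData and ClassificationPrediction classes used above; a rough sketch, based on the columns referenced in the pipeline and in the prediction code further down, looks something like this (the column ordinals are an assumption about how the training CSV happens to be laid out, not something ML.NET mandates).

using Microsoft.ML.Runtime.Api;   // Column and ColumnName attributes

public class TweetData
{
    [Column("0")] public string Id;
    [Column("1")] public string Language;
    [Column("2")] public string TweetText;
    // The category assigned to the tweet in the training data (Delays, Marketing, ...)
    [Column("3")] public string Label;
}

public class ClassificationPrediction
{
    // Filled in with the original text label thanks to the PredictedLabelColumnOriginalValueConverter
    [ColumnName("PredictedLabel")] public string PredictedLabel;
}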

One of the things I had to work out from the couple of examples online is the use of the Dictionarizer: because training only works with numbers and not text labels, you first need to tell the pipeline to convert the labels in your training data to numbers, and then to convert them back again afterwards using the PredictedLabelColumnOriginalValueConverter class. Once it's trained and you have a model, you can run predictions on new pieces of data; in this case I made up a couple of fake tweets and fed them back in.

var tweets = new[]
{
    // A couple of made-up tweets, based on the kind of thing the real feed posts
    new TweetData
    {
        Id = "10001",
        Language = "en",
        TweetText = "We have a disruption to our service due to a fire in a building at the Royal Centre"
    },
    new TweetData
    {
        Id = "10002",
        Language = "en",
        TweetText = "Don’t forget anyone heading to watch Forest v Derby with a match/season ticket can take advantage of our £2 Event Ticket when you travel #TheTramWay"
    },
};

foreach (var tweet in tweets)
{
    var prediction = model.Predict(tweet);
    Console.WriteLine(tweet.TweetText);
    Console.WriteLine($" -- {prediction.PredictedLabel}");
}

Because I based these on some existing tweets, I got the answers I was expecting to see; trying it with some others I completely made up didn't really give me the results I wanted, but like I said, it was only trained on about 70 tweets. The output from this bit of code is the following, with the tweet text written back out and the predicted label below it.

We have a disruption to our service due to a fire in a building at the Royal Centre
  -- Delays
Don't forget anyone heading to watch Forest v Derby with a match/season ticket can take advantage of our £2 Event Ticket when you travel #TheTramWay
  -- Marketing
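
I mentioned earlier that I can save the trained model out and load it back in later so it doesn't need re-training every time. A minimal sketch of that, assuming the WriteAsync and ReadAsync methods on PredictionModel in the current preview (ModelPath here is just a hypothetical variable pointing at wherever you want the model file to live):

// Persist the trained model to disk
await model.WriteAsync(ModelPath);

// ...and later, load it back in and use it without re-training
// (PredictionModel lives in the Microsoft.ML namespace)
var loadedModel = await PredictionModel.ReadAsync<TweetData, ClassificationPrediction>(ModelPath);
var reloadedPrediction = loadedModel.Predict(tweets[0]);
Console.WriteLine(reloadedPrediction.PredictedLabel);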

It's early days for ML.NET, but it's pretty exciting and gives developers a great way to integrate machine learning into their applications. Maybe you want to provide sentiment analysis on customer comments in that call center application, predict prices in your purchasing tools, or enrich data for your users somewhere else entirely; it's worth having a look and giving this library a try.

You can find the project over on GitHub.
