Operationalising data science is something of a difficult question for us here at Elastacloud, given the wealth of options available (AzureML, Spark, Python web services etc.). Recently, as part of Renewables, we've been updating our model to the latest and greatest and wanted to move away from AzureML, which is notoriously bad at scaling. We are investigating a number of long-term solutions to this problem, most likely involving Databricks (Spark) and ML Pipelines. In the short term we wanted to prove out a solution using Azure Machine Learning Model Management. This provides capabilities such as:
- Tracking models in production
- Deploying models to production through the AzureML Compute Environment with Azure Container Service and Kubernetes
- Creating Docker containers with the models and testing them locally
- Automated model retraining
- Capturing model telemetry for actionable insights
Once the appropriate artifacts have been created, it's a small set of simple steps to deploy a given model as a web service. One big advantage is the ability to scale the number of nodes in the container service, giving us significantly more control than Azure ML offered. We can always deploy the derived Docker container into other environments should we choose.
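Once deployed, the model is just an authenticated HTTP endpoint. As a rough sketch of what calling it looks like, here is a minimal Python client; the service URL, key, and the `input_df` payload envelope are assumptions standing in for whatever your scoring script actually expects, not the real values for our service.

```python
import json
import urllib.request

# Hypothetical endpoint and key for a service deployed via Model Management.
SERVICE_URL = "http://<cluster-ip>/api/v1/service/irrmodel/score"  # assumption
API_KEY = "<service-key>"  # assumption

def build_request(rows):
    """Wrap input rows in the JSON envelope the scoring script expects.

    The "input_df" key is an assumption; match it to your score.py.
    """
    return json.dumps({"input_df": rows}).encode("utf-8")

def score(rows):
    """POST a batch of rows to the deployed model and return its predictions."""
    req = urllib.request.Request(
        SERVICE_URL,
        data=build_request(rows),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + API_KEY,
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Sending rows as a batch in a single request, rather than one call per row, matters later when we look at throughput.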
However, I digress: the purpose of this post is primarily how we integrated this into our existing application. Originally, the AzureML model was invoked directly from an Azure Stream Analytics (ASA) job via a UDF. This is not possible with Azure Model Management, so instead we configured ASA to emit the data we want predictions for to an Event Hub. An Azure Function with an Event Hub trigger picks this up (yes, we could call the function directly from ASA, but that would couple us tightly and inhibit our ability to hive off this stream for other purposes). The function emits its results to another Event Hub, from which a second ASA job reads and writes the predictions to a database.
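Stripped of the binding plumbing, the core of that Azure Function is small: unpack the records ASA packed into one Event Hub message, score them as a batch, and forward the result. The sketch below factors that logic into a plain function with the scoring call and output Event Hub client injected as callables; the names and the exact message shapes are illustrative assumptions, not our production code.

```python
import json

def handle_event(event_body, score, send):
    """Process one Event Hub message from ASA.

    ASA batches many records into a single message, so the body is a
    JSON array. `score` calls the deployed model once for the whole
    batch; `send` forwards the predictions to the output Event Hub.
    Both are injected so the logic is testable without live services.
    """
    records = json.loads(event_body)            # list of record dicts
    predictions = score(records)                # one model call per batch
    send(json.dumps({"predictions": predictions}))
```

In the real function, `event_body` comes from the Event Hub trigger binding and `send` wraps the output binding; keeping this core pure made it easy to test the batching behaviour locally.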
One problem that did keep cropping up during testing was a “Too many requests for service irrmodel (overloaded)” error. This, as you can imagine, was somewhat unexpected, given we have a service that should be responsive to whatever we want to throw at it. It took us a while to track down and understand but, long story short, the Azure Function was pulling messages off the Event Hub at too high a rate. Given that by default it will attempt to process 64 messages at a time, and that ASA batches the messages it writes to Event Hub (~600 records packed into a single Event Hub message), we were in effect trying to process ~38,400 records at a time. We reduced this by overriding the defaults in host.json, and coupled with batching the requests inside the Azure Function the error was avoided. The remaining problem is figuring out the optimal values for message throughput and parallel invocation of the deployed model.
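For reference, the Event Hub trigger defaults are overridden in the function app's host.json. A fragment like the following throttles the batch size; the specific numbers here are illustrative starting points, not tuned values, and are exactly the knobs we are still experimenting with.

```json
{
  "eventHub": {
    "maxBatchSize": 8,
    "prefetchCount": 32,
    "batchCheckpointFrequency": 1
  }
}
```

With ~600 records per Event Hub message, `maxBatchSize: 8` caps a single invocation at roughly 4,800 records instead of the ~38,400 we saw with the default of 64.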