Azure Stream Analytics (ASA) allows real-time analytic computations on streaming data.
Stream Analytics allows one to configure both the inputs and the outputs of a job, while the bulk of the logic lives in a SQL-like query (a subset of T-SQL syntax), which defines the streaming job.
The ‘Job Topology’ outlines this in the Azure Portal:
Within the input configuration, Microsoft offers a ‘Reference Data’ input type, which in simple terms is a lookup table for enriching the incoming data (this can be seen above). Reference Data is a finite data set that is usually static or slowly changing in nature. However, sometimes the Reference Data needs to behave dynamically.
To put this into perspective, take analytics performed on specific keyword combinations in online searches. Although most of the reference data would still be slowly changing, the lookup must adapt to search terms that have not been encountered before. The incoming data stream may include one or more search terms that do not exist in the reference data, so what does the Stream Analytics job do in this situation? This is where dynamic reference data comes into play: it handles such cases so that the data is still enriched and the correct result is output.
This was a requirement on a recent project. To achieve Dynamic Reference Data, a query was written along the following lines (this is only an example to demonstrate the idea; there are other ways it could be written):
SELECT T.Value AS <Alias1>,
CASE
WHEN (<Value> LIKE <Value2>) AND (R.Column = <Value3> OR R.Column IS NULL) THEN <Value4>
WHEN …
ELSE <Value5>
END AS <Alias2>
INTO
<Output>
FROM
<EventHubInput> T
LEFT JOIN <ReferenceDataInput> R ON R.Column = T.Value
To reiterate, the above is only one example of how dynamic reference data can be achieved. The LEFT JOIN keeps every incoming event even when it has no match in the reference data, in which case the R columns are NULL. The CASE expression is what allows the result to adapt to the incoming data stream: one can add as many WHEN clauses as needed to cover all the scenarios and logic for the bespoke requirements.
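To make the template more concrete, here is a minimal sketch for the search-term scenario described earlier. All names are hypothetical and chosen only for illustration: SearchEvents (the stream input), TermReference (the reference data input), EnrichedOutput (the output), the SearchTerm and Category columns, and the literal category values. It is not the query used on the project, just one way the pattern might look.

```sql
-- Hypothetical names throughout: SearchEvents, TermReference, EnrichedOutput,
-- SearchTerm, Category, and the category literals are assumptions.
SELECT
    S.SearchTerm,
    CASE
        -- Term found in the reference data: use the stored category.
        WHEN R.SearchTerm IS NOT NULL THEN R.Category
        -- Unseen term matching a known pattern: derive a category in the query.
        WHEN S.SearchTerm LIKE '%azure%' THEN 'Cloud'
        -- Any other unseen term still produces a row, with a default value.
        ELSE 'Uncategorised'
    END AS Category
INTO
    EnrichedOutput
FROM
    SearchEvents S
LEFT JOIN TermReference R
    ON R.SearchTerm = S.SearchTerm
```

Because the join is a LEFT JOIN, an event whose term is missing from TermReference is not dropped; it falls through to the later WHEN clauses or to the ELSE branch instead.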
The ‘Test’ functionality in ASA allows one to check the validity of the query and verify that the correct results are obtained over a sample of data, which gives confidence that the query will behave as expected once the job is started. Provided the output is configured correctly, Microsoft Azure will take care of everything else; all that remains is to start the ASA job!