Aug 9, 2017

Socket Host Service (C#)


We have been working on integrating a TcpListener written in C# within Azure Service Fabric, enabling raw data to be streamed off an Event Hub and back out over a socket.

The main advantage of this, apart from giving the customer a more flexible approach to development and testing, is that it has made our lives so much easier when debugging our Spark code!
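
For context, here is a minimal sketch of how a TcpListener can be hosted inside a Service Fabric stateless service as an ICommunicationListener. The class name, the "ServiceEndpoint" resource name and the plumbing are illustrative assumptions, not our actual implementation:

    // Minimal sketch: a TcpListener wrapped in a Service Fabric ICommunicationListener.
    // Type names and the "ServiceEndpoint" resource are illustrative assumptions.
    using System.Fabric;
    using System.Net;
    using System.Net.Sockets;
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.ServiceFabric.Services.Communication.Runtime;

    public class SocketCommunicationListener : ICommunicationListener
    {
        private readonly int _port;
        private TcpListener _listener;

        public SocketCommunicationListener(ServiceContext context)
        {
            // The port comes from the "ServiceEndpoint" resource declared in ServiceManifest.xml.
            _port = context.CodePackageActivationContext.GetEndpoint("ServiceEndpoint").Port;
        }

        public Task<string> OpenAsync(CancellationToken cancellationToken)
        {
            _listener = new TcpListener(IPAddress.Any, _port);
            _listener.Start();

            // Accept clients in the background; each accepted socket is handed to the streaming code.
            Task.Run(() => AcceptLoopAsync(cancellationToken), cancellationToken);

            return Task.FromResult($"tcp://+:{_port}");
        }

        private async Task AcceptLoopAsync(CancellationToken cancellationToken)
        {
            while (!cancellationToken.IsCancellationRequested)
            {
                TcpClient client = await _listener.AcceptTcpClientAsync();
                // Hand 'client' over to whatever pumps Event Hub data back out to it.
            }
        }

        public Task CloseAsync(CancellationToken cancellationToken)
        {
            _listener?.Stop();
            return Task.CompletedTask;
        }

        public void Abort()
        {
            _listener?.Stop();
        }
    }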

 

Multiple clients can now connect and receive their data off the Event Hub at the same time.
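
A sketch of how that fan-out might look: each payload read off the Event Hub is written to every connected socket, and connections that have gone away are dropped. The ConnectedClientBroadcaster name is a hypothetical helper, not our actual code:

    // Hypothetical helper: broadcast each Event Hub payload to all connected clients.
    using System.Collections.Concurrent;
    using System.IO;
    using System.Net.Sockets;
    using System.Threading.Tasks;

    public class ConnectedClientBroadcaster
    {
        private readonly ConcurrentDictionary<TcpClient, NetworkStream> _clients =
            new ConcurrentDictionary<TcpClient, NetworkStream>();

        public void Add(TcpClient client)
        {
            _clients[client] = client.GetStream();
        }

        public async Task BroadcastAsync(byte[] payload)
        {
            foreach (var pair in _clients)
            {
                try
                {
                    await pair.Value.WriteAsync(payload, 0, payload.Length);
                }
                catch (IOException)
                {
                    // The socket has gone away; remove and close it.
                    NetworkStream removed;
                    _clients.TryRemove(pair.Key, out removed);
                    pair.Key.Close();
                }
            }
        }
    }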

 

A single client can now also request data from a date in the past, too!
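
A rough sketch of what that replay looks like, assuming the Microsoft.ServiceBus.Messaging Event Hub client that was common at the time; the connection details, partition handling and wait time are placeholders:

    // Sketch: create a receiver that starts from a point in time and replays from there.
    // Assumes the Microsoft.ServiceBus.Messaging (WindowsAzure.ServiceBus) client; all
    // parameters are placeholders.
    using System;
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.ServiceBus.Messaging;

    public static class ReplayExample
    {
        public static async Task StreamFromAsync(DateTime startUtc, string connectionString,
                                                 string eventHubName, string partitionId,
                                                 CancellationToken cancellationToken)
        {
            var client = EventHubClient.CreateFromConnectionString(connectionString, eventHubName);

            // A receiver created with a starting timestamp replays everything still retained
            // on the partition since that point in time.
            EventHubReceiver receiver = client.GetDefaultConsumerGroup()
                                              .CreateReceiver(partitionId, startUtc);

            while (!cancellationToken.IsCancellationRequested)
            {
                EventData message = await receiver.ReceiveAsync(TimeSpan.FromSeconds(10));
                if (message == null)
                {
                    continue; // nothing enqueued within the wait window
                }

                byte[] payload = message.GetBytes();
                // Hand the payload to the socket layer here.
            }

            await receiver.CloseAsync();
        }
    }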

 

We are currently working on integrating SSL certificate support into the C# codebase, using X509Certificate to instantiate the certificate and SslStream to handle the data-stream mechanics.
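
As a rough outline of where that is heading (the thumbprint lookup and the helper name are assumptions on our part), the accepted client's NetworkStream gets wrapped in an SslStream that performs the TLS handshake with the server certificate:

    // Sketch: upgrade an accepted TcpClient connection to TLS using X509Certificate2 and SslStream.
    // The thumbprint lookup and helper name are illustrative assumptions.
    using System.Net.Security;
    using System.Net.Sockets;
    using System.Security.Authentication;
    using System.Security.Cryptography.X509Certificates;
    using System.Threading.Tasks;

    public static class TlsUpgrade
    {
        public static async Task<SslStream> SecureAsync(TcpClient client, string certThumbprint)
        {
            // Load the server certificate from the local machine store by thumbprint.
            var store = new X509Store(StoreName.My, StoreLocation.LocalMachine);
            store.Open(OpenFlags.ReadOnly);
            X509Certificate2 certificate = store.Certificates
                .Find(X509FindType.FindByThumbprint, certThumbprint, validOnly: false)[0];
            store.Close();

            // SslStream wraps the raw NetworkStream, performs the TLS handshake and then
            // encrypts/decrypts everything written to or read from it.
            var sslStream = new SslStream(client.GetStream(), leaveInnerStreamOpen: false);
            await sslStream.AuthenticateAsServerAsync(certificate,
                clientCertificateRequired: false,
                enabledSslProtocols: SslProtocols.Tls12,
                checkCertificateRevocation: true);

            return sslStream;
        }
    }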

 

Here is an example of a live stream.

 

 

New Posts
  • Logging is hard, especially in a microservices world. Fortunately, not with Service Fabric, as it's trivially extendable to meet our complicated needs. In my latest journey I needed to track log requests across multiple microservices calling each other via Service Remoting. There is no easy way to do it, so I open-sourced a solution and wrote an article about it on GitHub: https://github.com/aloneguid/logmagic/blob/master/doc/packages/azure-servicefabric.md
  • In one of our Service Fabric services, we host a TCP socket to feed data into another system. The implementation is simple, using a regular TcpListener to accept clients and read from and write to them. As part of the behaviour of this socket, we listen for an initiation command after connecting. To enable the service to be hosted in a live cluster and open up that socket in Azure, we had to add a Load Balancing rule in the Azure Load Balancer, with a Health Probe to determine which nodes the service is deployed to. The probe hits the same TCP socket and, according to the documentation, should just try to connect, perform a 3-way handshake, and mark a node as healthy if that succeeds.

    This process turned out to be a bit more complicated. When our other system, or a tester application, hit the public port, it timed out; it seemed the port simply wasn't opened by the Load Balancer. When we ran our service locally or logged in to a Service Fabric node, the services were working fine. Only when we hit the public port as configured in the Load Balancer rule did we get timeouts. We noticed from our logging that our service was being connected to by the probe at regular intervals - and that the connection succeeded - but the probe was still marking our services as unhealthy, ending up closing the public port entirely. If we moved the probe over to a port on the machines we knew was open (such as RDP on 3389), we could get traffic through and connect as expected. So we knew the problem was in some way related to the handling of the TCP socket in our code.

    After some experimentation, we determined that the problem was that our socket was never forcefully closed. Our socket kept waiting for an initiation command, and the load balancer probe, being unable to send one, timed out its connection after a while. Once we added our own timeout on receiving an initiation command, and gracefully shut down the connection if we didn't receive one, the probe started recognizing our service as healthy again (a rough sketch of that timeout is included after these posts).

    So, while the Azure Load Balancer documentation specifies that a service will get marked as healthy if the TCP handshake succeeds, it actually waits for the connection to close before doing so. If the connection doesn't get closed, the node gets marked as unhealthy and the public port won't pass any traffic through. Lessons learned :)
  • While developing new solutions, we often use Service Fabric services as controllers: executing actions on demand, checking the status of other Azure services, running maintenance routines, cleaning up data - the possibilities with Service Fabric are endless. Often that means we have a cluster of services deployed to different environments, but want to disable a particular service. In the Application Parameters it's possible to set services to run a given instance count, but that number is not allowed to be 0. This has been an open requested issue for a long time ( https://feedback.azure.com/forums/169386-cloud-services-web-and-worker-role/suggestions/743252-allow-a-role-instance-count-of-0 ).

    To allow our services to be disabled, we wrote something we call a NoopService. It's a service implementation that does nothing except wait:

        public class NoopService : StatelessService
        {
            public NoopService(StatelessServiceContext context) : base(context) { }

            protected override async Task RunAsync(CancellationToken cancellationToken)
            {
                await Task.Run(() => cancellationToken.WaitHandle.WaitOne());
            }

    On it, we also find a static method that will check for a config setting:

            public static bool NoopEnabled(StatelessServiceContext context)
            {
                var configurationPackage = context.CodePackageActivationContext.GetConfigurationPackageObject("Config");
                var serviceEnabled = configurationPackage.Settings.Sections["Default"].Parameters["Enabled"].Value;
                return !serviceEnabled.Equals("true", StringComparison.OrdinalIgnoreCase);
            }
        }

    This code is extracted out to a separate project, since all the services in the solution can use it. In time, we could turn it into a NuGet package, so it can be reused by everyone.

    Next, for every service that we might want to disable, we need to configure the config setting that the NoopEnabled() method looks for, in the Application Manifest:

        <Parameters>
          <Parameter Name="MyService_InstanceCount" DefaultValue="1" />
          <Parameter Name="MyService_Enabled" DefaultValue="true" />
        </Parameters>
        <ServiceManifestImport>
          <ServiceManifestRef ServiceManifestName="MyServicePkg" ServiceManifestVersion="1.0.0" />
          <ConfigOverrides>
            <ConfigOverride Name="Config">
              <Settings>
                <Section Name="Default">
                  <Parameter Name="Enabled" Value="[MyService_Enabled]" />
                </Section>
              </Settings>
            </ConfigOverride>
          </ConfigOverrides>
        </ServiceManifestImport>

    And in the service's config file:

        <Section Name="Default">
          <Parameter Name="Enabled" Value="" />
        </Section>

    Once this basic setup is done, each service's Main method can decide which service to instantiate: the NoopService if the service is disabled in the configuration setting, or the real service if it isn't:

        ServiceRuntime.RegisterServiceAsync("MyServiceType", context =>
            (NoopService.NoopEnabled(context)
                ? (StatelessService)new NoopService(context)
                : new MyService(context))).GetAwaiter().GetResult();

    In our Application Parameters file, we can now simply provide a value for the 'Enabled' configuration setting to enable or disable a service per environment.

    As you can see, the principle behind this is simple, yet it offers us a lot of power: we can disable a service without changing its InstanceCount, we can choose exactly which services to enable in which environment, and we can disable a service without impacting any other operation. The service will show up as having healthy instances, but won't do anything. Additionally, this has proven very useful during debugging on the local cluster: we can disable the entire application except the service we want, and so debug it in complete isolation. Give it a try, let us know what you think!
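
For the load balancer probe post above, here is a rough sketch of the kind of initiation-command timeout it describes. The names and the 5-second timeout are assumptions, not the actual code:

    // Sketch: wait a bounded time for the initiation command; if nothing arrives
    // (e.g. a load balancer health probe), close the connection gracefully so the
    // probe sees a clean TCP close. Names and the timeout value are assumptions.
    using System;
    using System.Net.Sockets;
    using System.Threading.Tasks;

    public static class InitiationHandshake
    {
        public static async Task<int> ReadInitiationAsync(TcpClient client)
        {
            NetworkStream stream = client.GetStream();
            var buffer = new byte[256];

            Task<int> readTask = stream.ReadAsync(buffer, 0, buffer.Length);
            Task timeoutTask = Task.Delay(TimeSpan.FromSeconds(5));

            Task first = await Task.WhenAny(readTask, timeoutTask);
            if (first == timeoutTask || readTask.Result == 0)
            {
                // No initiation command received: shut the socket down instead of
                // leaving the probe's connection hanging open.
                client.Close();
                return 0;
            }

            // The initiation command bytes are now in 'buffer'.
            return readTask.Result;
        }
    }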