If you’ve worked with Apache Kafka and .NET, you’re likely aware that the out-of-the-box experience of consuming messages with Confluent’s client library is geared towards processing records sequentially. If you want to process multiple records in parallel, a few options come to mind:
- Create multiple instances of your service, e.g. scale to multiple Kubernetes pods - works, but it’s a waste of resources, and we’re limited by the number of Kafka partitions
- Create multiple instances of the Kafka consumer inside your service - less wasteful, but still not ideal, as we keep multiple connections open unnecessarily, plus we’re still limited by the number of partitions
- Use a single consumer and forward the records to be processed in parallel by multiple threads - the best solution in terms of resources, but it means you now need to implement some non-trivial logic to ensure no records are lost (e.g. an offset being committed before the record was actually processed), and that order is maintained (if that’s relevant)
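To give a feel for the "non-trivial logic" in that last option, here's a minimal sketch of the offset bookkeeping for a single partition: workers can finish in any order, but we only ever commit up to the end of the contiguous run of processed records. `PartitionOffsetTracker` and `MarkProcessed` are names I made up for illustration, they're not part of Confluent's client. The sketch also assumes offsets within the partition have no gaps, which isn't guaranteed in practice (e.g. compacted topics), and it doesn't address ordering at all.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of Confluent.Kafka.
// Tracks which offsets of ONE partition have finished processing, so that
// only a contiguous prefix of processed records is ever committed.
public sealed class PartitionOffsetTracker
{
    private readonly object _gate = new object();
    private readonly HashSet<long> _completed = new HashSet<long>();
    private long _nextToCommit;

    // firstOffset: offset of the first record handed out for this partition.
    public PartitionOffsetTracker(long firstOffset)
    {
        _nextToCommit = firstOffset;
    }

    // Called by a worker once the record at `offset` is fully processed.
    // Returns the offset that is now safe to commit (by Kafka convention,
    // the offset of the next record to consume), or null if an earlier
    // record is still in flight and nothing new can be committed yet.
    public long? MarkProcessed(long offset)
    {
        lock (_gate)
        {
            _completed.Add(offset);

            var advanced = false;
            // Advance over every already-completed offset, stopping at the first gap.
            // Simplifying assumption: offsets in the partition are contiguous.
            while (_completed.Remove(_nextToCommit))
            {
                _nextToCommit++;
                advanced = true;
            }

            return advanced ? _nextToCommit : (long?)null;
        }
    }
}
```

For example, if records 10, 11 and 12 are being processed in parallel and 12 finishes first, `MarkProcessed(12)` returns `null`, because committing at that point could lose 10 and 11 if the service crashed. Only when 10 and then 11 complete does the committable offset move forward, first to 11 and then to 13.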
For some time now, I’ve been thinking about building a proof of concept of the third option, not only because there aren’t that many options for .NET (though there are some, more on that later), but also because it felt like an interesting problem to tackle. Well, this post is about that proof of concept, a little library I called YakShaveFx.KafkaClientTurbocharger 😁.