Detailed Info
Efficient topic partitioning of Apache Kafka for high-reliability real-time data streaming applications
Authors: | Theofanis P. Raptis, Claudio Cicconetti, Andrea Passarella |
Title: | Efficient topic partitioning of Apache Kafka for high-reliability real-time data streaming applications |
Abstract: | Apache Kafka is a widely-used event streaming platform for reliable high-volume real-time data exchange following a producer–consumer pattern. Despite its popularity, Apache Kafka requires expertise and attention to detail, and there are no default guidelines that can be applied to all use cases without careful consideration. In this paper, we propose a novel approach to optimise the number of partitions and brokers in Apache Kafka, which are two key configuration parameters, under the given characteristics and constraints of the target applications. In particular, we consider the distribution of data-intensive real-time flows exchanged between a set of producers and consumers, which is representative of fog computing environments for ML/AI analytics. We introduce a methodology for modelling the topic partitioning process in Apache Kafka and formulate an optimisation problem to determine the optimal number of partitions to satisfy the application requirements and constraints. We propose two efficient heuristics to solve the optimisation problem, considering the trade-off between resource utilisation and application performance. We evaluate the performance of our approach through numerical simulations, and we demonstrate its practicality by implementing a prototype on an Apache Kafka cluster and conducting experiments in three different scenarios focused on mass consumption vs. production and real-time data streaming. To carry out repeatable experiments in controlled conditions, we developed a reusable framework that fully automatises cluster setup and performance assessment, and we make it available to the community as open-source software. |
Publication type: | Journal |
Title of the journal: | Future Generation Computer Systems |
Year of Publication: | 2024 |
Pages: | 173-188 |
Number, date or frequency of the Journal: | Volume 154, May 2024 |
Publisher: | Elsevier |
URL: | https://zenodo.org/records/10489464 |
DOI: | 10.1016/j.future.2023.12.028 |
Funding
This project has received funding from the European Union’s Horizon 2020 Research and Innovation program under grant agreement No 957337. The website reflects only the view of the author(s) and the Commission is not responsible for any use that may be made of the information it contains.