Abstract
Federated learning (FL) offers a privacy-preserving approach to building collaborative intrusion detection systems (IDS), but statistical heterogeneity across clients severely hinders its practical application. In real-world networks, client data are not identically distributed, and this skew degrades the performance of the FedAvg algorithm, resulting in poor detection accuracy, delayed convergence, and model instability. In this paper, we present a comprehensive comparison of the Scaffold algorithm with the FedAvg baseline on the CICIDS2017 dataset under heterogeneous data partitioning. Scaffold, a state-of-the-art federated optimization technique, addresses the client skew problem by using control variates to correct each client's local updates. Our results demonstrate that Scaffold achieved more stable convergence and outperformed FedAvg, with a 15.1% increase in F1-score and a 13.6% higher overall accuracy under highly skewed data distributions. These findings support the Scaffold algorithm as a reliable tool for building high-performance detection systems in a variety of settings. The present evaluation is simulation-based; implementation on a physical testbed remains essential future work to assess real-world deployment challenges.