Large datasets of real network flows acquired from the Internet are an invaluable resource for the research community. Applications include network modeling and simulation, identification of securityattacks, and validation of research results. Unfortunately, network flows carry extremely sensitiveinformation, and this discourages the publication of those datasets. Indeed, existing techniques for network flow sanitization are vulnerable to different kinds of attacks, and solutions proposed for microdata anonymity cannot be directly applied to network traces.
In our previous research, we proposed an obfuscation technique for network flows, providing formal confidentiality guarantees under realistic assumptions about the adversary’s knowledge. In this paper, we identify the threats posed by the incremental release of network flows, we propose a novel defense algorithm, and we formally prove the achieved confidentiality guarantees. An extensive experimental evaluation of the algorithm for incremental obfuscation, carried out with billions of real Internet flows, shows that our obfuscation technique preserves the utility of flows for network traffic analysis.