If you need a pic of me for events, here it is.

Research interests:

  • Information leakage estimation for security & privacy
  • Theory, foundations, and privacy and security properties of Machine Learning
  • Methods for distribution-free confident prediction in supervised learning and anomaly detection (e.g., Conformal Predictors)

Have a look at a list of recent projects.

I co-founded the CTF team TU6PM.

I am a (happy) OpenBSD and QubesOS user.

News

Oct 9, 2024 In November, we will launch a cross-user prompt injection competition, where participants will have to interact with a simulated LLM-capable email client to run commands on a user’s behalf. Watch this space for updates!
Aug 14, 2024 One can get a closed-form approximation of the risk against membership inference for DP-SGD, and we released an interactive tool that uses this idea to help tune DP-SGD’s parameters. We can also get data-dependent guarantees for the risk of attribute inference; code for this is available too. Based on our USENIX ‘24 work.
Aug 10, 2022 Our work on evaluating website fingerprinting in the real world was awarded: i) the Internet Defense Award (2nd place) sponsored by Meta, and ii) a Distinguished Paper Award (USENIX ‘22)!
May 25, 2022 Our work on reconstruction attacks against ML models was accepted by IEEE S&P 2022. Check out Jamie’s wonderful presentation!
Feb 7, 2022 I joined Microsoft Research Cambridge and the Microsoft Security Response Centre (conveniently, both acronymise to “MSRC”). I will work as a Senior Researcher on all things ML, privacy-preserving ML, and security.
Nov 30, 2021 Check out our work on deploying & evaluating website fingerprinting attacks on the Tor network. TL;DR: WF is hard for untargeted attacks. To appear in USENIX ‘22.
May 14, 2021 Our paper “Exact Optimization of Conformal Predictors via Incremental and Decremental Learning” has been accepted for presentation & publication at ICML ‘21. This work has also been accepted as a spotlight talk at the DFUQ ‘21 ICML workshop.

Publications

  1. Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition Debenedetti, Edoardo, Rando, Javier, Paleka, Daniel, Florin, Silaghi Fineas, Albastroiu, Dragos, Cohen, Niv, Lemberg, Yuval, Ghosh, Reshmi, Wen, Rui, Salem, Ahmed, and others, arXiv preprint arXiv:2406.07954 2024 [Paper]
  2. Are you still on track!? Catching LLM Task Drift with Activations Abdelnabi, Sahar, Fay, Aideen, Cherubin, Giovanni, Salem, Ahmed, Fritz, Mario, and Paverd, Andrew arXiv preprint arXiv:2406.00799 2024 [Paper]
  3. Closed-Form Bounds for DP-SGD against Record-level Inference Attacks Cherubin, Giovanni, Köpf, Boris, Paverd, Andrew, Tople, Shruti, Wutschitz, Lukas, and Zanella-Béguelin, Santiago In 33rd USENIX Security Symposium (USENIX Security 24) 2024 [Paper] [Url]
  4. Bayes Security: A Not So Average Metric Chatzikokolakis, Konstantinos, Cherubin, Giovanni, Palamidessi, Catuscia, and Troncoso, Carmela In 2023 IEEE 36th Computer Security Foundations Symposium (CSF) 2023 [Paper]
  5. Approximating full conformal prediction at scale via influence functions Martinez, Javier Abad, Bhatt, Umang, Weller, Adrian, and Cherubin, Giovanni In Proceedings of the AAAI Conference on Artificial Intelligence 2023 [Paper]
  6. [Short paper] How do the performance of a Conformal Predictor and its underlying algorithm relate? Cherubin, Giovanni In Conformal and Probabilistic Prediction with Applications 2023 [Paper]
  7. SoK: Let the Privacy Games Begin! A Unified Treatment of Data Inference Privacy in Machine Learning Salem, Ahmed, Cherubin, Giovanni, Evans, David, Köpf, Boris, Paverd, Andrew, Suri, Anshuman, Tople, Shruti, and Zanella-Béguelin, Santiago In 2023 IEEE Symposium on Security and Privacy (SP) 2023 [Paper]
  8. Disparate vulnerability: On the unfairness of privacy attacks against machine learning Kulynych, Bogdan, Yaghini, Mohammad, Cherubin, Giovanni, Veale, Michael, and Troncoso, Carmela 2022
  9. Online Website Fingerprinting: Evaluating Website Fingerprinting Attacks on Tor in the Real World Cherubin, Giovanni, Jansen, Rob, and Troncoso, Carmela In 31st USENIX Security Symposium (USENIX Security 22) 2022 [Paper] [Url]
  10. Reconstructing Training Data with Informed Adversaries Balle, Borja, Cherubin, Giovanni, and Hayes, Jamie In 2022 IEEE Symposium on Security and Privacy (SP) 2022 [Paper]
  11. Synthetic Data - what, why and how? Jordon, James, Szpruch, Lukasz, Houssiau, Florimond, Bottarelli, Mirko, Cherubin, Giovanni, Maple, Carsten, Cohen, Samuel N, and Weller, Adrian Royal Society 2022 [Paper]
  12. Exact Optimization of Conformal Predictors via Incremental and Decremental Learning Cherubin, Giovanni, Chatzikokolakis, Konstantinos, and Jaggi, Martin In Proceedings of the 38th International Conference on Machine Learning 2021 [Abs] [Paper] [Url]
  13. (Poster) Fast conformal classification using influence functions Bhatt, Umang, Weller, Adrian, and Cherubin, Giovanni In Proceedings of the Tenth Symposium on Conformal and Probabilistic Prediction and Applications 2021 [Abs] [Paper] [Url]
  14. Reconstructing Training Data with Informed Adversaries Balle, Borja, Cherubin, Giovanni, and Hayes, Jamie In NeurIPS 2021 Workshop Privacy in Machine Learning 2021 [Paper] [Url]
  15. Black-box Security: Measuring Black-box Information Leakage via Machine Learning Cherubin, Giovanni PhD thesis 2019 [PDF]
  16. F-BLEAU: Fast Black-box Leakage Estimation Cherubin, Giovanni, Chatzikokolakis, Konstantinos, and Palamidessi, Catuscia In IEEE Symposium on Security and Privacy (S&P) 2019 [Abs] [Paper] [Video]
  17. Exchangeability martingales for selecting features in anomaly detection Cherubin, Giovanni, Baldwin, Adrian, and Griffin, Jonathan In Proceedings of the Seventh Workshop on Conformal and Probabilistic Prediction and Applications 2018 [Abs] [Paper] [Url] [Slides] [Code]
  18. Majority vote ensembles of conformal predictors Cherubin, Giovanni Machine Learning 2018 [Paper] [Url]
  19. Website Fingerprinting Defenses at the Application Layer Cherubin, Giovanni, Hayes, Jamie, and Juarez, Marc Proceedings on Privacy Enhancing Technologies 2017 [Abs] [Paper] [Code]
  20. Bayes, not Naïve: Security Bounds on Website Fingerprinting Defenses Cherubin, Giovanni Proceedings on Privacy Enhancing Technologies 2017 Best student paper [Paper] [Slides] [Code] [Video]
  21. Hidden Markov Models with Confidence Cherubin, Giovanni, and Nouretdinov, Ilia In Conformal and Probabilistic Prediction with Applications - 5th International Symposium, COPA 2016 [Paper] [Slides] [Code]
  22. Conformal Clustering and Its Application to Botnet Traffic Cherubin, Giovanni, Nouretdinov, Ilia, Gammerman, Alexander, Jordaney, Roberto, Wang, Zhi, Papini, Davide, and Cavallaro, Lorenzo In Statistical Learning and Data Sciences (SLDS) 2015 Best student paper [Paper] [Slides]
  23. Bots detection by Conformal Clustering Cherubin, Giovanni MSc thesis, Royal Holloway University of London 2014 [PDF]

Research Visits

Research Engineer, HP Labs Security Lab, Bristol (August-November 2017)
Supervisors: Jonathan Griffin, Adrian Baldwin

Research Visitor, École Polytechnique, Paris (May; November 2017)
Supervisors: Prof. Catuscia Palamidessi, Kostas Chatzikokolakis

Research Intern, Cornell Tech (June-September 2016)
Supervisor: Prof. Thomas Ristenpart

Academic Service

PC chair of the annual conference on conformal prediction, COPA 2020 and COPA 2021. Guest editor of the 2022 Annals of Mathematics and Artificial Intelligence special issue on Conformal Prediction. Co-organiser of the PriML workshop 2021 (@NeurIPS). PC member: NeurIPS 2024, ICLR 2024, IEEE S&P 2022-23 and 2024-25, USENIX 2022-24, SaTML 2023-24, ACM CCS 2021, IEEE Euro S&P 2021-22, PETS 2019-21, COPA 2018. I have also reviewed for various ML & security conferences and journals (e.g., ICML 2022, Machine Learning journal, Neurocomputing, Financial Cryptography). I was a notable reviewer for the SaTML 2023 and 2024 editions.

I was a teaching assistant for the Machine Learning and Data Analysis courses at Royal Holloway University of London (2014-17), and for the courses on C programming and Linear Algebra and Geometry at the University of Pavia (2011-12). In 2023 and 2024, I gave a lecture on Privacy Preserving Machine Learning at the KU Leuven Summer School on Security & Privacy in the Age of AI.

Awards

  • 2022, Internet Defense Prize: 2nd-place prize awarded at USENIX ‘22 and sponsored by Meta: “Online Website Fingerprinting: Evaluating Website Fingerprinting Attacks on Tor in the Real World”
  • 2022, Distinguished Paper: USENIX ‘22: “Online Website Fingerprinting: Evaluating Website Fingerprinting Attacks on Tor in the Real World”
  • 2017, Best Paper: Andreas Pfitzmann Best Student Paper Award at PETS: “Bayes, not Naïve: Security Bounds on Website Fingerprinting Defenses”
  • 2017, First place at Capture The Flag (CTF) security challenge organised by NCC Group at the Cambridge2Cambridge event
  • 2015, Best Paper: Best student paper award sponsored by HP at SLDS conference: “Conformal Clustering and Its Application to Botnet Traffic”
  • 2014, Best Finalist: Best MSc in Big Data finalist in memory of Prof. Alexey Chervonenkis (Royal Holloway University of London)

Short bio

Since 2022, I have been a Senior Researcher at Microsoft Research (Cambridge) and with the Microsoft Security Response Centre. Before that, I was a Research Fellow (Safe & Ethical AI) at the Alan Turing Institute in London (2020-21), and a postdoctoral fellow (2019-21) at EPFL (Switzerland) with an EcoCloud grant, collaborating with Carmela Troncoso at the SPRING lab and Martin Jaggi at the MLO lab. I hold a PhD in Machine Learning and Information Security from Royal Holloway University of London, obtained within the Centre for Doctoral Training (CDT), where I was supervised by Alex Gammerman and advised by Kenny Paterson. I received an MSc in Machine Learning from Royal Holloway University of London in 2014, and a BSc in Mechatronics and Computer Engineering from the University of Pavia in 2013.

My research focuses on the privacy and security properties of machine learning models, as well as the theoretical and empirical study of their information leakage; I am especially interested in which metrics one should use to measure leakage (i.e., the risk with respect to attacks) in these contexts. Additionally, I work on distribution-free uncertainty estimation for machine learning, such as Conformal Prediction and distribution-free learning, and I have a personal interest in the use of Kolmogorov complexity as a basis for machine learning (e.g., Algorithmic Learning Theory).
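To give a flavour of the distribution-free guarantees mentioned above, here is a minimal sketch of split (inductive) Conformal Prediction for regression. The toy data and least-squares model are hypothetical, chosen only to keep the example self-contained; the conformal step itself works with any underlying predictor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): y = 2x + Gaussian noise, split into a
# training set (to fit the model) and a calibration set (held out).
x_train, x_cal = rng.uniform(0, 1, 200), rng.uniform(0, 1, 100)
y_train = 2 * x_train + rng.normal(0, 0.1, 200)
y_cal = 2 * x_cal + rng.normal(0, 0.1, 100)

# Underlying model: least-squares slope fitted on the training split only.
slope = x_train @ y_train / (x_train @ x_train)
predict = lambda x: slope * x

# Nonconformity scores on the calibration split: absolute residuals.
scores = np.abs(y_cal - predict(x_cal))

# Conformal quantile at miscoverage level alpha. The resulting interval
# covers the true label with probability >= 1 - alpha, assuming only
# exchangeability of the data (no distributional assumptions).
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Prediction interval for a new point.
x_new = 0.5
lo, hi = predict(x_new) - q, predict(x_new) + q
print(f"90% prediction interval at x={x_new}: [{lo:.2f}, {hi:.2f}]")
```

The same recipe extends to classification (prediction sets instead of intervals) and to any choice of nonconformity score.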