Giovanni Cherubin
Senior Researcher in Machine Learning & Security at Microsoft (Cambridge)
Research interests:
- Information leakage estimation for security & privacy
- Theory, foundations, and privacy-security properties of Machine Learning
- Methods for distribution-free confident prediction in supervised learning and anomaly detection (e.g., Conformal Predictors)
Have a look at a list of recent projects.
I co-founded the CTF team TU6PM.
I am a (happy) OpenBSD and QubesOS user.
News
Oct 9, 2024 | In November, we will launch a cross-user prompt injection competition, where participants will have to interact with a simulated LLM-capable email client to run commands on a user’s behalf. Watch this space for updates! |
Aug 14, 2024 | One can get a closed-form approximation of the risk against membership inference for DP-SGD, and we released an interactive tool that uses this idea to help trim DP-SGD’s parameters. We can also get data-dependent guarantees for the risk of attribute inference; code for this is available too. Based on our USENIX ‘24 work. |
Aug 10, 2022 | Our work on evaluating website fingerprinting in the real world was awarded: i) the Internet Defense Prize (2nd place) sponsored by Meta, and ii) a Distinguished Paper Award (USENIX ‘22)! |
May 25, 2022 | Our work on reconstruction attacks against ML models was accepted by IEEE S&P 2022. Check out Jamie’s wonderful presentation! |
Feb 7, 2022 | I joined Microsoft Research Cambridge and the Microsoft Security Response Centre (conveniently, both acronymise to “MSRC”). I will work as a Senior Researcher on all things ML, privacy-preserving ML, and security. |
Nov 30, 2021 | Check out our work on deploying and evaluating website fingerprinting attacks on the Tor network. TL;DR: WF is hard for untargeted attacks. To appear in USENIX ‘22. |
May 14, 2021 | Our paper “Exact Optimization of Conformal Predictors via Incremental and Decremental Learning” has been accepted for presentation and publication at ICML ‘21. This work has also been accepted as a spotlight talk at the DFUQ ‘21 ICML workshop. |
Publications
-
Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition arXiv preprint arXiv:2406.07954 2024 [Paper]
-
Are you still on track!? Catching LLM Task Drift with Activations arXiv preprint arXiv:2406.00799 2024 [Paper]
-
Bayes Security: A Not So Average Metric In 2023 IEEE 36th Computer Security Foundations Symposium (CSF) 2023 [Paper]
-
Approximating full conformal prediction at scale via influence functions In Proceedings of the AAAI Conference on Artificial Intelligence 2023 [Paper]
-
[Short paper] How do the performance of a Conformal Predictor and its underlying algorithm relate? In Conformal and Probabilistic Prediction with Applications 2023 [Paper]
-
SoK: Let the Privacy Games Begin! A Unified Treatment of Data Inference Privacy in Machine Learning In 2023 IEEE Symposium on Security and Privacy (SP) 2023 [Paper]
-
Disparate vulnerability: On the unfairness of privacy attacks against machine learning 2022
-
Reconstructing Training Data with Informed Adversaries In 2022 IEEE Symposium on Security and Privacy (SP) 2022 [Paper]
-
Black-box Security: Measuring Black-box Information Leakage via Machine Learning PhD thesis 2019 [PDF]
Research Visits
Research Engineer, HP Labs Security Lab, Bristol (August-November 2017)
Supervisors: Jonathan Griffin, Adrian Baldwin
Research Visitor, École Polytechnique, Paris (May; November 2017)
Supervisors: Prof. Catuscia Palamidessi, Kostas Chatzikokolakis
Research Intern, Cornell Tech (June-September 2016)
Supervisor: Prof. Thomas Ristenpart
Academic Service
PC chair of the annual conference on conformal prediction (COPA 2020, COPA 2021). Guest editor of the 2022 Annals of Mathematics and Artificial Intelligence special issue on Conformal Prediction. Co-organiser of the PriML workshop 2021 (@NeurIPS). PC member: NeurIPS 2024, ICLR 2024, IEEE S&P 2022-23 and 2024-25, USENIX 2022-24, SaTML 2023-24, ACM CCS 2021, IEEE Euro S&P 2021-22, PETS (2019-2021), COPA 2018. I have also reviewed for various ML and security conferences and journals (e.g., ICML 2022, Machine Learning journal, Neurocomputing, Financial Cryptography). I was a notable reviewer for the 2023 and 2024 editions of SaTML.
I was a teaching assistant for the courses Machine Learning and Data Analysis at Royal Holloway University of London (2014-17), and for the courses on C programming and on Linear Algebra and Geometry at University of Pavia (2011-12). In 2023 and 2024, I gave a lecture on Privacy Preserving Machine Learning at the KU Leuven Summer School on Security & Privacy in the Age of AI.
Awards
- 2022, Internet Defense Prize: Prize awarded by USENIX ‘22 and sponsored by Meta: “Online Website Fingerprinting: Evaluating Website Fingerprinting Attacks on Tor in the Real World”
- 2022, Distinguished Paper: USENIX ‘22: “Online Website Fingerprinting: Evaluating Website Fingerprinting Attacks on Tor in the Real World”
- 2017, Best Paper: Andreas Pfitzmann Best Student Paper Award at PETS: “Bayes, not Naïve: Security Bounds on Website Fingerprinting Defenses”
- 2017, First place at Capture The Flag (CTF) security challenge organised by NCC Group at the Cambridge2Cambridge event
- 2015, Best Paper: Best student paper award sponsored by HP at SLDS conference: “Conformal Clustering and Its Application to Botnet Traffic”
- 2014, Best Finalist: Best MSc in Big Data finalist in memory of Prof. Alexey Chervonenkis (Royal Holloway University of London)
Short bio
Since 2022, I have been a Senior Researcher at Microsoft Research (Cambridge) and with the Microsoft Security Response Centre. Before that, I was a Research Fellow (Safe & Ethical AI) at the Alan Turing Institute in London (2020-21), and a postdoctoral fellow (2019-21) at EPFL (Switzerland) with an EcoCloud grant, collaborating with Carmela Troncoso at the SPRING lab and Martin Jaggi at the MLO lab. I have a PhD in Machine Learning and Information Security from Royal Holloway University of London with the Centre of Doctoral Training (CDT), where I was supervised by Alex Gammerman and advised by Kenny Paterson. I received an MSc in Machine Learning from Royal Holloway University of London in 2014, and a BSc in Mechatronics and Computer Engineering from University of Pavia in 2013.
My research focuses on the privacy and security properties of machine learning models, as well as the theoretical and empirical study of their information leakage; I am especially interested in what metrics one should use to measure the leakage (i.e., the risk with respect to attacks) in these contexts. Additionally, I work on distribution-free uncertainty estimation for machine learning (e.g., Conformal Prediction) and on distribution-free learning. I also have a personal interest in the use of Kolmogorov complexity as the basis for machine learning (e.g., Algorithmic Learning Theory).