INDY Lab - Robust Collaborative Filtering

Recommender systems are at the heart of the modern internet. Because the amount of content produced far exceeds what any human can process (or even see), we depend on these algorithms to select what information each user is shown. This gives them tremendous power over what each of us sees and, as more and more people use social media as a source of information, over society itself. Recommendation algorithms are therefore at the center of informational warfare, as illustrated by the recent annulment of the Romanian presidential election following its manipulation through TikTok [1, 2]. In this context, developing more secure recommendation algorithms is a crucial challenge for the future of democracy.

To pick which content is recommended to which user, most algorithms use collaborative filtering [3]. This term covers all the techniques that leverage the feedback provided by users so that users with similar tastes receive similar recommendations. However, because collaborative filtering uses the data from some users to decide what to show to others, it is inherently vulnerable to poisoning attacks: some users (or fake accounts) can simply lie about their preferences in order to influence the recommendations of everyone else, or those of a specific group of users.
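To make the vulnerability concrete, here is a minimal sketch of a poisoning attack on a toy user-based collaborative filter. It is only an illustration: the similarity-weighted average, the `predict` function, and the rating matrix are all assumptions chosen for brevity, not the matrix factorization model of [3] nor any production system.

```python
import numpy as np

def predict(ratings, user, item):
    """Predict ratings[user, item] as a similarity-weighted average of the
    ratings other users gave to `item`. A rating of 0 means "not rated".
    (Hypothetical helper, written for this illustration only.)"""
    others = [u for u in range(ratings.shape[0])
              if u != user and ratings[u, item] > 0]
    if not others:
        return 0.0
    target = ratings[user]
    sims = []
    for u in others:
        # Cosine similarity on co-rated items, excluding the item to predict.
        mask = (target > 0) & (ratings[u] > 0)
        mask[item] = False
        if not mask.any():
            sims.append(0.0)
            continue
        a, b = target[mask], ratings[u][mask]
        sims.append(float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b))))
    sims = np.array(sims)
    if sims.sum() == 0:
        return 0.0
    return float(sims @ ratings[others, item] / sims.sum())

# Toy data: rows are users, columns are items, values are ratings in 1..5.
ratings = np.array([
    [5, 4, 0, 1],  # user 0: the target, has not rated item 2
    [5, 5, 1, 1],  # user 1: similar taste, dislikes item 2
    [4, 5, 2, 1],  # user 2: similar taste, dislikes item 2
    [1, 1, 5, 5],  # user 3: different taste
], dtype=float)

honest = predict(ratings, user=0, item=2)

# Attack: three fake accounts copy user 0's known ratings (to look similar)
# and give the maximum rating to item 2.
fake_profile = np.array([5, 4, 5, 1], dtype=float)
poisoned_ratings = np.vstack([ratings, np.tile(fake_profile, (3, 1))])
poisoned = predict(poisoned_ratings, user=0, item=2)

print(f"predicted rating without fake accounts: {honest:.2f}")
print(f"predicted rating with fake accounts:    {poisoned:.2f}")
```

In this toy setting, the fake accounts pull the prediction for user 0 well above what the honest users alone would produce, simply because they mimic the target's profile and therefore receive the highest similarity weights.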

In this project, you will work towards designing a robust collaborative filtering technique that provably bounds the influence of each user on the others [4]. Such an algorithm would ensure that a minority of accounts cannot significantly influence the content the platform displays to everyone else. You will read the relevant literature, think about how existing algorithms can be modified, and run experiments to guide your thinking.
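As a rough intuition for what bounded influence can look like, the sketch below compares two ways of aggregating the ratings of a single item: the mean, which any account can shift arbitrarily, and the median, which stays within the range of the honest ratings as long as fake accounts are a strict minority. The median is used here only as a simple stand-in for a robust aggregator; it is not the method of [4], and the data and function names are assumptions made for this example.

```python
import numpy as np

def mean_score(item_ratings):
    """Non-robust aggregation: every account shifts the result linearly."""
    return float(np.mean(item_ratings))

def median_score(item_ratings):
    """Robust stand-in: with a strict minority of fake accounts, the result
    stays between the smallest and largest honest rating."""
    return float(np.median(item_ratings))

# Toy data (assumed): 7 honest accounts rate an item low,
# 3 fake accounts all give it the maximum rating.
honest = np.array([1.0, 2.0, 2.0, 1.0, 3.0, 2.0, 1.0])
fake = np.full(3, 5.0)
all_ratings = np.concatenate([honest, fake])

print("mean,   honest only:", mean_score(honest))
print("mean,   with fakes: ", mean_score(all_ratings))
print("median, honest only:", median_score(honest))
print("median, with fakes: ", median_score(all_ratings))
```

The guarantee illustrated here concerns a single item score; extending this kind of bound to a full collaborative filtering pipeline, where users also influence each other through learned similarities or embeddings, is the open part of the project.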

[1] Did TikTok influence Romania's presidential election? https://www.dw.com/en/did-tiktok-influence-romanias-presidential-election/a-70954832

[2] What happened on TikTok around the Romanian elections? https://globalwitness.org/en/campaigns/digital-threats/what-happened-on-tiktok-around-the-annulled-romanian-presidential-election-an-investigation-and-poll/

[3] Mnih, A., & Salakhutdinov, R. R. (2007). Probabilistic matrix factorization. Advances in neural information processing systems, 20.

[4] Allouah, Y., et al. (2024). Robust sparse voting. International Conference on Artificial Intelligence and Statistics. PMLR.

Requirements:

Python
Linear Algebra
Probability

To apply, please send your CV and transcript to oscar.villemaud@epfl.ch