INDY
Recommender System with hierarchical structure
Daichi Kuroda

Matrix factorization is one of the most widely used techniques for recommender systems. This method assumes that the user-item interaction matrix has a low-rank structure. We aim to refine this assumption by exploring hierarchical structures within the user-item matrix and investigating whether incorporating hierarchical relationships can improve recommendation performance. We also aim to make the system more resilient to the influence of malicious users.
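
As a rough illustration of the low-rank assumption mentioned above, the sketch below builds a small, fully observed toy matrix from two latent factors and recovers it with a truncated SVD. The data, the function name `low_rank_approximation`, and the choice of SVD are assumptions made for illustration only; real recommender data has missing entries, and this is not the project's method.

```python
import numpy as np


def low_rank_approximation(ratings: np.ndarray, rank: int) -> np.ndarray:
    """Return the best rank-`rank` approximation (Frobenius norm) via truncated SVD."""
    u, s, vt = np.linalg.svd(ratings, full_matrices=False)
    # Keep only the `rank` largest singular values and their singular vectors.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


# Hypothetical toy data: 4 users x 3 items generated from 2 latent factors.
rng = np.random.default_rng(0)
user_factors = rng.normal(size=(4, 2))
item_factors = rng.normal(size=(3, 2))
ratings = user_factors @ item_factors.T

approx = low_rank_approximation(ratings, rank=2)
# The matrix has rank 2 by construction, so the reconstruction error is
# at the level of floating-point precision.
print(np.linalg.norm(ratings - approx))
```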

More details
SAT‑Driven Exact and Heuristic Treewidth & Decomposition
Sepehr Elahi

The goal of this project is to leverage SAT/MaxSAT for exact and approximate tree decompositions, evaluated against PACE 2017 Treewidth benchmarks.

More details
Dynamic Bayesian Optimization for Improving the Performance of Cellular Networks
Anthony Bardou

In this project, the student will use the dynamic Bayesian optimization framework for power control in cellular networks.

More details
Learning Rich Choice Models From Data
Surya Sankagiri

The Bradley-Terry-Luce model (aka Multinomial Logit model) is a classical tool for modelling how humans make choices when presented with a class of alternatives. This project involves exploring generalisations of this model that take into consideration effects of 'human irrationality'.
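
For readers unfamiliar with the model, here is a minimal sketch of the multinomial logit choice rule: each alternative has a utility, and the probability of choosing it is proportional to the exponential of that utility. The function name and the example utilities are hypothetical, and the generalisations studied in the project are not shown.

```python
import math


def mnl_choice_probabilities(utilities):
    """Choice probability of each alternative under the multinomial logit model."""
    # Subtract the maximum utility before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    shift = max(utilities)
    weights = [math.exp(u - shift) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical utilities for three alternatives.
probabilities = mnl_choice_probabilities([1.0, 0.0, -1.0])
print(probabilities)
```

With only two alternatives, the rule reduces to the Bradley-Terry pairwise form, in which the probability of preferring one item depends only on the difference of the two utilities.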

More details
Matrix Factorisation With Comparison Data
Surya Sankagiri

Matrix Factorisation is a classical modelling framework in recommender systems. In this project, you will perform simulations to test the performance of gradient descent on a class of nonconvex problems arising from matrix factorisation.

More details
Computational methods for Chance-Constrained Optimization
Saeed Masiha

Numerical experiments with several known methods for solving chance-constrained optimization problems.

More details
Stochastic proximal first-order algorithm
Saeed Masiha

In this project, we aim to find a batch-free stochastic proximal first-order algorithm that achieves O(1/ε) gradient oracle complexity when the objective function is smooth and satisfies 2-PL (or quadratic growth).

More details
Robustness and Performance of Spectral Clustering Methods for Community Detection
Daichi Kuroda

This project examines the robustness and performance of various spectral clustering methods, focusing on the effects of different Laplacian operators. It aims to clarify when and why performance differences emerge. The goal is to provide practitioners with principled guidelines for selecting spectral methods in graph clustering.
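
To make the role of the Laplacian operator concrete, the sketch below bipartitions a toy graph using the sign of the second eigenvector (the Fiedler vector) of either the unnormalized Laplacian D - A or the symmetric normalized Laplacian D^{-1/2}(D - A)D^{-1/2}. The toy graph and all function names are assumptions for illustration; the methods actually compared in the project may differ.

```python
import numpy as np


def laplacian(adjacency: np.ndarray, kind: str = "unnormalized") -> np.ndarray:
    """Build a graph Laplacian from a symmetric adjacency matrix (no isolated nodes)."""
    degrees = adjacency.sum(axis=1)
    lap = np.diag(degrees) - adjacency
    if kind == "unnormalized":
        return lap
    if kind == "symmetric":
        d_inv_sqrt = np.diag(1.0 / np.sqrt(degrees))
        return d_inv_sqrt @ lap @ d_inv_sqrt
    raise ValueError(f"unknown Laplacian kind: {kind}")


def fiedler_bipartition(adjacency: np.ndarray, kind: str = "unnormalized") -> np.ndarray:
    """Split the nodes in two groups by the sign of the second-smallest eigenvector."""
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigenvectors = np.linalg.eigh(laplacian(adjacency, kind))
    return eigenvectors[:, 1] >= 0


# Hypothetical toy graph: two 4-cliques (nodes 0-3 and 4-7) joined by the edge 3-4.
adjacency = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                adjacency[i, j] = 1.0
adjacency[3, 4] = adjacency[4, 3] = 1.0

for kind in ("unnormalized", "symmetric"):
    print(kind, fiedler_bipartition(adjacency, kind))
```

On this easy example both operators separate the two cliques; differences between Laplacians only show up on harder graphs, for instance with heterogeneous degrees.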

More details
Web scraping for hierarchical clustering
Daichi Kuroda

Hierarchical clustering is the task of organizing data into a tree representation. Although the field has been studied intensively, the notion of hierarchy had not been clearly defined. Recently, we proposed a rigorous definition of hierarchy. In this project, you will collect text and metadata from Wikipedia and/or arXiv and evaluate how well our definition of hierarchy agrees with the hierarchy derived from the metadata.

More details
Bayesian Optimization for Soft Clustering
Anthony Bardou

In this project, the student will participate in the design, the implementation and the evaluation of clustering algorithms that exploit Bayesian optimization.

More details
Exploring good transformations of edge weights for spectral clustering
Daichi Kuroda

This project seeks to identify optimal edge weight transformations for a range of spectral clustering algorithms, with a focus on maximizing community separation in the embedded space.

More details
Estimating the Number of Clusters in Mixture Models using Data Thinning
Maximilien Dreveton

Determining the optimal number of clusters, K, for a dataset of n data points (X1,…,Xn) is a complex task, especially when K is unknown. A common approach involves running a clustering algorithm for various K values and selecting the one that minimizes an objective function. However, this method can be flawed, as it simultaneously fits and validates the model on the same data, potentially leading to overfitting and biased results.

More details
Distinguishing Random Graphs via Distance Measures
Paula Murmann

This project uses First Passage Percolation (FPP) to distinguish between tree, Erdős–Rényi, and complete graphs based on shortest-path distances from a source node. By simulating FPP on random graphs, we aim to identify structural thresholds and accuracy bounds for graph classification.
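
A minimal sketch of how such a simulation could look: every edge receives an independent random passage time, and the first-passage time of a node is its shortest-path distance from the source under those weights. The Exp(1) weight distribution, the dictionary-based graph format, and the function name are assumptions made for illustration, not details taken from the project.

```python
import heapq
import random


def first_passage_times(adjacency, source, rng):
    """Simulate FPP on an undirected graph given as {node: [neighbours]}.

    Each edge gets an i.i.d. Exp(1) passage time (an assumed choice), and
    Dijkstra's algorithm returns the first-passage time of every reachable node.
    """
    weights = {}
    for u, neighbours in adjacency.items():
        for v in neighbours:
            edge = frozenset((u, v))
            if edge not in weights:
                weights[edge] = rng.expovariate(1.0)

    times = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if t > times.get(u, float("inf")):
            continue  # stale heap entry
        for v in adjacency[u]:
            candidate = t + weights[frozenset((u, v))]
            if candidate < times.get(v, float("inf")):
                times[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return times


# Hypothetical example: the complete graph on 5 nodes.
complete = {u: [v for v in range(5) if v != u] for u in range(5)}
print(first_passage_times(complete, source=0, rng=random.Random(0)))
```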

More details
Analyzing Network Cascades to Infer Graph Properties
Maximilien Dreveton

Given the infection times of vertices from one or multiple cascades (such as waves of an epidemic) spreading through an unknown graph, what can be inferred about the graph's structure?

More details
Clustering Community Detection Methods and Characterizing Them
Daichi Kuroda

Grouping several community detection algorithms, and investigating the properties of communities each algorithm finds.

More details
Generating random graphs with a given treewidth k
Sepehr Elahi

The goal of this project is to design and implement methods for generating random graphs that have a given treewidth k, in either Python or Julia.
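
One standard construction, shown here only as a possible starting point, is the random k-tree: begin with a clique on k + 1 vertices and repeatedly attach a new vertex to all vertices of a randomly chosen existing k-clique. A k-tree on at least k + 1 vertices has treewidth exactly k. The function name and the uniform choice over cliques are assumptions; the project may target other distributions over graphs.

```python
import itertools
import random


def random_k_tree(n, k, rng):
    """Return the edge set of a random k-tree on vertices 0..n-1 (treewidth exactly k)."""
    if k < 1 or n < k + 1:
        raise ValueError("need k >= 1 and n >= k + 1")
    # Start from a clique on the first k + 1 vertices.
    edges = set(itertools.combinations(range(k + 1), 2))
    # Track every k-clique a new vertex may attach to.
    cliques = list(itertools.combinations(range(k + 1), k))
    for new in range(k + 1, n):
        base = rng.choice(cliques)
        edges.update((old, new) for old in base)
        # Each (k-1)-subset of `base` together with `new` forms a new k-clique.
        for subset in itertools.combinations(base, k - 1):
            cliques.append(subset + (new,))
    return edges


edges = random_k_tree(n=10, k=3, rng=random.Random(0))
print(len(edges))
```

Deleting edges from a k-tree gives a partial k-tree, whose treewidth is at most k but may be smaller, so edge deletion alone does not guarantee treewidth exactly k.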

More details
Collaborative item scoring for recommender systems
Oscar Villemaud

The project consists in studying how to best collect and combine the feedback of different users, which is collected in the form of ratings and comparisons.

More details
Routing trains in Julia
Guillaume Dalle

The Flatland challenge was developed by the Swiss, German and French railway companies. It consists in finding itineraries for a set of trains so that they reach their destination as fast as possible... without colliding. The only problem is that the simulator is coded in Python, which makes it very slow: your job will be to make it fast by translating it to Julia!

More details
Graph algorithms in Julia
Guillaume Dalle

Contribute to a major part of the Julia ecosystem by implementing efficient graph algorithms. Knowledge of Julia or graph theory is not a prerequisite: consider this project as an opportunity to learn!

More details
Generalization Performance of Stochastic Gradient Methods
Saeed Masiha

The first goal of this project is to use different notions of algorithmic stability to study the generalization behavior of stochastic first-order optimization and the effect of variance reduction and adaptation on generalization error. The second goal is to investigate whether SGD can be seen as performing regularized empirical risk minimization, i.e., to study implicit regularization, a popular theory for why SGD generalizes so well.

More details
Collecting a large dataset on human similarity and preference perception

Are you enthusiastic about data science? Do you have strong web development skills? Then you'd be an ideal candidate to help us conduct a large online study on human similarity and preference choices. The dataset will be the backbone of research on probabilistic choice models and recommender systems.

More details
Few-Shot Learning

Using prior knowledge, Few-Shot Learning aims to (rapidly) generalize to new tasks containing only a few samples of information. We have multiple possible projects available around this topic.

More details
Robust Collaborative Filtering
Oscar Villemaud

Recommender systems are at the heart of the modern internet. They are the ones deciding what people see online. However, most of them are vulnerable to manipulation. Your task will be to develop a personalized recommendation algorithm that limits the influence each user can have on the others.

More details
SIS Model Simulator with Dynamic Vaccination and Visualization
Sepehr Elahi

Develop an SIS model simulator in Python or Julia that incorporates advanced vaccination strategies and dynamic adjustments of infection and recovery rates. Create interactive visualization tools to monitor and analyze the effects of vaccinations and track disease dynamics in real-time.
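
As a starting point, a bare-bones discrete-time SIS simulation might look like the sketch below: at each step, a susceptible node becomes infected with probability `beta` per infected neighbour, and an infected node recovers (becoming susceptible again) with probability `gamma`. The discrete-time formulation, parameter names, and graph format are assumptions for illustration; vaccination strategies and time-varying rates are deliberately left out.

```python
import random


def simulate_sis(adjacency, initially_infected, beta, gamma, steps, rng):
    """Run a discrete-time SIS process and return the infected count at each step."""
    infected = set(initially_infected)
    history = [len(infected)]
    for _ in range(steps):
        next_infected = set()
        for node, neighbours in adjacency.items():
            if node in infected:
                # An infected node stays infected unless it recovers.
                if rng.random() >= gamma:
                    next_infected.add(node)
            else:
                # Each infected neighbour independently transmits with probability beta.
                for neighbour in neighbours:
                    if neighbour in infected and rng.random() < beta:
                        next_infected.add(node)
                        break
        infected = next_infected
        history.append(len(infected))
    return history


# Hypothetical example: a ring of 6 nodes with one initial infection.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(simulate_sis(ring, [0], beta=0.5, gamma=0.2, steps=10, rng=random.Random(0)))
```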

More details
Bounding the Regret of a Dynamic Bayesian Optimization Algorithm
Anthony Bardou

The goal of this project is to contribute to the theoretical understanding of a dynamic Bayesian optimization algorithm.

More details
Enhancing Unsupervised Learning Through Data Thinning: An Exploration of Sample Splitting
Maximilien Dreveton

Clustering a set of n data points (X1,…,Xn) into an optimal number of clusters K is challenging when K is unknown. A common approach involves running a clustering algorithm for different K values and selecting the K that minimizes the objective function. This method is flawed as it uses the same dataset for both model fitting and validation.

More details
Designing and implementing an object to handle mixed graphs
Sepehr Elahi

The aim of this project is to design and implement a class for handling mixed graphs, in either Python or Julia.

More details
Simulation of an Infectious Process on Graphs
Paula Murmann

Simulate a mathematical model of an infectious process on a graph and compare the simulator to state-of-the-art simulators on real-world data.

More details
Addressing Data Staleness in Dynamic Bayesian Optimization
Anthony Bardou

The goal of this project is to implement and study the performance of a criterion measuring data staleness in a dynamic Bayesian optimization context.

More details
Does degree heterogeneity help or handicap graph clustering?
Maximilien Dreveton

Real graphs tend to show a lot of heterogeneity in the node degrees (few hubs with large degrees and a lot of low-degree nodes). Does this heterogeneity help or degrade the performance of community detection algorithms?

More details
Sparsifying a graph by keeping only the shortest paths
Maximilien Dreveton

The goal of this project is to study different applications of graph sparsification.

More details
Recommendation Systems that Learn from Comparisons
Surya Sankagiri

The goal of this project is to design and implement recommendation system algorithms that can learn users' preferences based on data where users select which items they prefer when given a choice of items. In particular, we focus on multi-armed bandit algorithms.

More details
Second-Order Methods in Deep RL

The project seeks to apply second-order methods in complex environments (such as Atari games) and compare their performances with first-order methods empirically in terms of sample complexity and robustness to changes in initializations.

More details
Clustering temporal networks
Maximilien Dreveton

This project seeks to propose and study methods for clustering networks in which the interactions between the nodes vary with time.

More details
How does the presence of communities affect the source localisation?
Maximilien Dreveton

This project seeks to observe how the community structure of a network affects the identification of the source of a spreading process (disease spreading, etc.).

More details
Full-stack Web Developer for Climpact

We are looking for a full-stack Web developer (Python, JavaScript, HTML, CSS) to continue the development of a platform for Climpact, a research project about machine learning and climate. You will be paid CHF 26.- / hour during the Fall semester, on the campus or remotely.

More details
How does the second-order derivative information affect generalization error or test error?
Saeed Masiha

The goal of the project is to compare empirically the generalization error of two stochastic optimization algorithms used in Reinforcement Learning (SCRN and momentum-based SGD).

More details
Global Convergence in Reinforcement Learning

The project seeks to check global properties of the objective function (the expected return) in various reinforcement learning (RL) settings.

More details
Framing in Wikipedia

How does Wikipedia frame controversial topics and how does it change over time?

More details
Predicting Swiss Votes Through Machine Learning

Can we predict the outcome of Swiss popular votes from information available on the web before the vote?

More details
Early Stopping for Time-series Applications

For time-series applications, standard cross-validation, where data is randomly sampled into train and test partitions, is problematic. In this project, we would like to explore possible replacements for cross-validation in such settings.
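
One well-known alternative to random splitting, given here only as an example of the kind of replacement that could be explored, is rolling-origin (forward-chaining) evaluation: each fold trains on an initial segment of the series and tests on the block that immediately follows it, so the model never sees the future. The function name and its parameters are hypothetical.

```python
def rolling_origin_splits(n_samples, n_folds, test_size):
    """Yield (train_indices, test_indices) pairs that respect temporal order."""
    for fold in range(n_folds):
        # Test blocks are contiguous and the last one ends at the final sample.
        test_end = n_samples - (n_folds - 1 - fold) * test_size
        test_start = test_end - test_size
        if test_start <= 0:
            raise ValueError("not enough samples for the requested folds")
        yield list(range(test_start)), list(range(test_start, test_end))


for train, test in rolling_origin_splits(n_samples=10, n_folds=3, test_size=2):
    print(train, test)
```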

More details
Algorithms for epidemic contact tracing on networks

The goal of the project is to develop heuristics for a toy model that captures many of the challenges of COVID-19 contact tracing.

More details
Neural Architecture Search without Training using Sensitivity

In this project, we aim to study the output sensitivity in the NAS-Bench-201 search space. The project requires a literature review over NAS algorithms. The sensitivity metric should then be compared to the state-of-the-art NAS benchmarks in terms of cost and performance.

More details
Who Makes Law? Understanding the Structure of Lobbying in Brussels

Lobbying is sometimes considered the hidden part of the iceberg in political decisions. In this project, we'll try to understand to what extent this statement is true, using quantitative methods from computer science and machine learning.

More details
LawGit: Visualize the Evolution of European Laws

The goal of this data-mining project is to visualize the evolution of European Laws. Computer scientists are lucky to benefit from version-control systems. Let's make the rest of the world do so!

More details
Mining International Climate Negotiations

The United Nations climate negotiations (COP) started in 1995 in Berlin, and the 25th COP will be held in December 2019 in Madrid. Interactions between countries taking part in these negotiations provide a rich environment to study the global competitive dynamics of our world. In this project, we aim to collect and analyze a new dataset of collaborations and conflicts between countries, and to develop models of such dynamics.

More details
Few-shot Semi-Supervised Learning with Meta-Learning

In few-shot classification, we are interested in learning algorithms that train a classifier from only a handful of labeled examples. Meta-learning is a successful approach for few-shot learning. In this project we'll investigate, implement and compare state-of-the-art meta-learning algorithms in the related but more natural setting of semi-supervised learning.

More details
A Closer Look at Meta-Learning: Fast Adaptation of Deep Neural Networks

Meta-learning, also known as learning-to-learn, is a paradigm that exploits cross-task information and training experience to perform well on a new unseen task. The goals of this project are (1) constructing a unifying framework to compare meta-learning algorithms for classification and regression, and (2) investigating algorithmic and theoretical improvements related to meta-learning.

More details
Implementing source localization algorithms for benchmarking

The goal of this project is to implement existing source localization algorithms and add them to the benchmark software recently developed in the INDY lab.

More details
Reinforcement learning for active comparison-based search

Reinforcement learning for active comparison-based search.

More details
Climpact: understanding people's perception of their carbon footprint

Flying to New York is worse than taking a long shower. But is it 10 times worse or 1000 times worse? In this project, we aim at understanding the perception that people have of their actions. Your task will be to i) develop an application to collect relevant data and ii) implement a statistical model of people's perception.

More details
Implementing variants of sparsified back-propagation (MeProp) in deep neural networks

The goal of this project is first to compare MeProp with other regularization techniques such as Dropout, and second to experiment with improvements or extensions to MeProp (for instance, a non-constant k over the training epochs, top-k versus random-k selection, applying MeProp to CNNs, etc.).

More details
Robustness analysis of the Metric Dimension

The goal of the project is to analyse how the Metric Dimension of a graph changes under random perturbations. This project is aimed at students with a strong theoretical background.
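
For reference, the metric dimension of a connected graph is the size of a smallest set of landmark nodes such that every node is uniquely identified by its vector of shortest-path distances to the landmarks. The brute-force sketch below computes it on small graphs; it takes exponential time, the function names are hypothetical, and it is only meant to make the definition concrete.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adjacency, source):
    """Hop distances from `source` in an unweighted graph {node: [neighbours]}."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in distances:
                distances[v] = distances[u] + 1
                queue.append(v)
    return distances


def metric_dimension(adjacency):
    """Brute-force metric dimension of a small connected graph."""
    nodes = list(adjacency)
    dist = {u: bfs_distances(adjacency, u) for u in nodes}
    for size in range(1, len(nodes)):
        for landmarks in combinations(nodes, size):
            # A landmark set resolves the graph if all distance vectors differ.
            signatures = {tuple(dist[l][v] for l in landmarks) for v in nodes}
            if len(signatures) == len(nodes):
                return size
    return 0  # graphs with at most one node


# Hypothetical examples: a path, a cycle, and a complete graph.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 5] for i in range(5)}
cycle = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
complete = {i: [j for j in range(4) if j != i] for i in range(4)}
print(metric_dimension(path), metric_dimension(cycle), metric_dimension(complete))
```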

More details
EPFL people search via profile photo comparisons using machine learning techniques

In this project, we are interested in developing a brand new search framework for navigating through the database of EPFL people via comparing their profile photos.

More details
Mining of Political Texts

This project is about scraping, mining, understanding, and modelling political texts. The goal will be to draw new insights from an existing large data set, extending it with new data when needed. It is an open project where all ideas are encouraged.

More details
Detecting epidemic sources in presence of misinformation

How to deal with misinformation about node states when we want to detect the source of an epidemic?

More details
Machine Learning for Networks

The project is a mix of Machine Learning and Networking. In a hybrid network that consists of Wi-Fi and Power Line Communication, the idea is to find the optimal end-to-end route between access points when the pairwise links' capacities are known. The main task would be to predict the end-to-end throughput (a regression problem). No background in Networking is required (but it is of course a plus).

More details
School of Computer and Communication Sciences | Information and Network Dynamics Group