Information and Network Dynamics group

Our research group is part of the School of Computer and Communication Sciences at EPFL in Lausanne, Switzerland. The group is lead by Matthias Grossglauser and Patrick Thiran. Our research focuses broadly on the statistical modeling of large dynamical systems involving both human and technical agents. Examples include social and information networks, epidemic processes, human mobility and transportation, and recommender systems. Our work lies at the intersection of machine learning, probabilistic modeling, large-scale data analytics, and performance analysis. Here are the research areas we work on:

Graph Mining

Network alignment, network assembly, and network inference

Mobility Mining

Prediction and transfer learning in populations

Epidemics

Monitoring, prediction, and source inference

Distributed Processes on Graphs

Gossiping, voting, and optimization

Discrete-Choice Models

Large-scale inference and ranking

Active Learning

Multi-armed bandits, online optimization, active learning

Wireless and Hybrid Networking

Wireless networking, power-line communication, hybrid networking

Applications

In computational biology, data privacy, medical data analytics, etc.

Recent publications

Multi-Armed Bandit in Action: Optimizing Performance in Dynamic Hybrid Networks
S. Henri, C. Vlachou and P. Thiran
IEEE/ACM Transactions on Networking, 2018-07-27.
[view at publisher]

Abstract: Today's home networks are often composed of several technologies such as Wi-Fi or power-line communication (PLC). Yet, current network protocols rarely fully harness this diversity and use each technology for a specific, pre-defined role, for example, wired media as a backbone and the wireless medium for mobility. Moreover, a single path is generally employed to transmit data; this path is established in advance and remains in use as long as it is valid, although multiple possible paths offer more robustness against varying environments. We introduce HyMAB, an algorithm that explores different multipaths and finds the best one in a mesh hybrid network, while keeping congestion under control. We employ the multi-armed-bandit framework and prove that HyMAB achieves optimal throughput under a static scenario. HyMAB design also accounts for real-network intricacies and dynamic conditions; it adapts to varying environments and switches multipaths when needed. We implement HyMAB on a PLC/Wi-Fi test bed. This is, to the best of our knowledge, the first implementation on a real test bed of multi-armed-bandit strategies in the context of routing. Our experimental results confirm the optimality of HyMAB and its ability to adapt to dynamic network conditions, as well as the gains provided by employing multi-armed-bandit strategies.

Can Who-Edits-What Predict Edit Survival?
A. B. Yardım, V. Kristof, L. Maystre and M. Grossglauser
24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, United Kingdom, August 19-23, 2018.
[view at publisher]

As the number of contributors to online peer-production systems grows, it becomes increasingly important to predict whether the edits that users make will eventually be beneficial to the project. Existing solutions either rely on a user reputation system or consist of a highly specialized predictor that is tailored to a specific peer-production system. In this work, we explore a different point in the solution space that goes beyond user reputation but does not involve any content-based feature of the edits. We view each edit as a game between the editor and the component of the project. We posit that the probability that an edit is accepted is a function of the editor's skill, of the difficulty of editing the component and of a user-component interaction term. Our model is broadly applicable, as it only requires observing data about who makes an edit, what the edit affects and whether the edit survives or not. We apply our model on Wikipedia and the Linux kernel, two examples of large-scale peer-production systems, and we seek to understand whether it can effectively predict edit survival: in both cases, we provide a positive answer. Our approach significantly outperforms those based solely on user reputation and bridges the gap with specialized predictors that use content-based features. It is simple to implement, computationally inexpensive, and in addition it enables us to discover interesting structure in the data.

On the Delays in Time-Varying Networks: Does Larger Service-Rate Variance Imply Larger Delays?
S. Henri, S. Shneer and P. Thiran
Eighteenth ACM International Symposium on Mobile Ad Hoc Networking and Computing, Los Angeles, CA, USA, June 26 - 29, 2018.
[view at publisher]

In all networks, link or route capacities fluctuate for multiple reasons, e.g., fading and multi-path effects on wireless channels, interference and contending users on a shared medium, varying loads in WAN routers, impedance changes on power-line channels. These fluctuations severely impact packet delays. In this paper, we study delays in time-varying networks. Intuitively, we expect that for a given average service rate, an increased service rate variability yields larger delays. We find that this is not always the case. Using a queuing model that includes time-varying service rates, we show that for certain arrival rates, a queue with larger service rate variance offers smaller average delays than a queue with the same average service rate and lower service rate variance. We also verify these findings on a wireless testbed. We then study the conditions under which using simultaneously two independent paths helps in terms of delays, for example, in hybrid networks where the two paths use different physical layer technologies. We show that using two paths is not always better, in particular for low arrival rates. We also show that the optimal traffic splitting between the two paths depends on the arrival rate.

A General Framework for Sensor Placement in Source Localization
B. Spinelli, E. Celis and P. Thiran
IEEE Transactions on Network Science and Engineering, 2017.
[full text] [view at publisher]

When an epidemic spreads in a given network of individuals or communities, can we detect its source using only the information provided by a small set of nodes? We propose a general framework that incorporates two dimensions. First, we can either rely exclusively on a set of selected nodes (i.e., sensors) which always reveal their state independently of any particular epidemic (these are called static), or we can add some sensors (called dynamic) as an epidemic spreads, depending on which additional information is required. Second, the method can either localizes the source after an epidemic has spread through the entire network (offline), or while the epidemic is ongoing (online). We empirically study the performance of offline and online localization both with and without dynamic sensors. Our analysis shows that, by using dynamic sensors, the number of sensors necessary to localize the source is reduced by up to a factor of 10 and that, even with high-variance transmission delays, the source can be localized by using fewer than 5% of the nodes as sensors.

We have open positions!

We are hiring postdocs and PhD students in all our research areas.