2022

TB Hackathon: Development and Comparison of Five Models to Predict Subnational Tuberculosis Prevalence in Pakistan
S. Alba, E. Rood, F. Mecatti, J. M. Ross, P. J. Dodd, S. Chang, M. Potgieter, G. Bertarelli, N. J. Henry, K. E. LeGrand, W. Trouleau, D. Shaweno, P. MacPherson, Z. Z. Qin, C. Mergenthaler, F. Giardina, E. Augustijn, A. Q. Baloch and A. Latif
Tropical Medicine and Infectious Disease, 2022.
[view at publisher]

Pakistan's national tuberculosis control programme (NTP) is among the many programmes worldwide that recognize the importance of subnational tuberculosis (TB) burden estimates to support disease control efforts, but do not have reliable estimates. A hackathon was thus organised to solicit the development and comparison of several models for small-area estimation of TB. The TB hackathon was launched in April 2019. Participating teams were requested to produce district-level estimates of bacteriologically positive TB prevalence among adults (over 15 years of age) for 2018. The NTP provided case-based data from their 2010-2011 TB prevalence survey, along with data relating to TB screening, testing and treatment for the period between 2010-2011 and 2018. Five teams submitted district-level TB prevalence estimates, methodological details and programming code. Although the geographical distribution of TB prevalence varied considerably across models, we identified several districts with consistently low notification-to-prevalence ratios. The hackathon highlighted the challenges of generating granular spatiotemporal TB prevalence forecasts based on cross-sectional prevalence survey data and other data sources. Nevertheless, it provided a range of approaches to subnational disease modelling. The NTP's use of and plans for these outputs show that, limitations notwithstanding, they can be valuable for programme planning.

On the robustness of the metric dimension of grid graphs to adding a single edge
S. Mashkaria, G. Odor and P. Thiran
Discrete Applied Mathematics, 2022-07-31.
[view at publisher]

The metric dimension (MD) of a graph is a combinatorial notion capturing the minimum number of landmark nodes needed to distinguish every pair of nodes in the graph based on graph distance. We study how much the MD can increase if we add a single edge to the graph. The extra edge can either be selected adversarially, in which case we are interested in the largest possible value that the MD can take, or uniformly at random, in which case we are interested in the distribution of the MD. The adversarial setting has already been studied for general graphs by [Eroh et al., 2015], who found an example in which the MD doubles when a single edge is added. By constructing a different example, we show that this increase can be as large as exponential. However, we believe that such a large increase can occur only in specially constructed graphs, and that in most interesting graph families, the MD at most doubles on adding a single edge. We prove this for $d$-dimensional grid graphs, by showing that $2d$ appropriately chosen corners and the endpoints of the extra edge can distinguish every pair of nodes, no matter where the edge is added. For the special case of $d=2$, we show that it suffices to choose the four corners as landmarks. Finally, when the extra edge is sampled uniformly at random, we conjecture that the MD of 2-dimensional grids converges in probability to $3+\mathrm{Ber}(8/27)$, and we give an almost complete proof.
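
A landmark set with this distinguishing property is called a resolving set, and the MD is the size of the smallest one. As a minimal illustration (ours, not the paper's code), the sketch below checks whether a candidate landmark set resolves a graph; it assumes a precomputed all-pairs distance table `dist` given as a dict of dicts.

```python
from itertools import combinations

def resolves(dist, landmarks, nodes):
    """True iff every pair of nodes has a distinct vector of distances to the landmarks."""
    sig = {v: tuple(dist[l][v] for l in landmarks) for v in nodes}
    return all(sig[u] != sig[v] for u, v in combinations(nodes, 2))
```

The MD is then the size of the smallest landmark set for which `resolves` returns True.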

The power of adaptivity in source identification with time queries on the path
V. Lecomte, G. Odor and P. Thiran
Theoretical Computer Science, 2022-04-08.
[view at publisher]

We study the problem of identifying the source of a stochastic diffusion process spreading on a graph based on the arrival times of the diffusion at a few queried nodes. In a graph $G=(V,E)$, an unknown source node $v^* \in V$ is drawn uniformly at random, and unknown edge weights $w(e)$ for $e\in E$, representing the propagation delays along the edges, are drawn independently from a Gaussian distribution of mean $1$ and variance $\sigma^2$. An algorithm then attempts to identify $v^*$ by querying nodes $q \in V$ and being told the length of the shortest path between $q$ and $v^*$ in graph $G$ weighted by $w$. We consider two settings: \emph{non-adaptive}, in which all query nodes must be decided in advance, and \emph{adaptive}, in which each query can depend on the results of the previous ones. Both settings are motivated by an application of the problem to epidemic processes (where the source is called patient zero), which we discuss in detail. We characterize the query complexity when $G$ is an $n$-node path. In the non-adaptive setting, $\Theta(n\sigma^2)$ queries are needed for $\sigma^2 \leq 1$, and $\Theta(n)$ for $\sigma^2 \geq 1$. In the adaptive setting, somewhat surprisingly, only $\Theta(\log\log_{1/\sigma}n)$ queries are needed when $\sigma^2 \leq 1/2$, and $\Theta(\log \log n)+O_\sigma(1)$ when $\sigma^2 \geq 1/2$. This is the first mathematical study of source identification with time queries in a non-deterministic diffusion process.

2021

Sequential metric dimension for random graphs
G. Odor and P. Thiran
Journal of Applied Probability, 2021.
[view at publisher]

In the localization game on a graph, the goal is to find a fixed but unknown target node v* with the least number of distance queries possible. In the j-th step of the game, the player queries a single node v_j and receives, as an answer to their query, the distance between the nodes v_j and v*. The sequential metric dimension (SMD) is the minimal number of queries that the player needs to guess the target with absolute certainty, no matter where the target is. The term SMD originates from the related notion of metric dimension (MD), which can be defined the same way as the SMD except that the player’s queries are non-adaptive. In this work we extend the results of Bollobás, Mitsche, and Prałat on the MD of Erdős–Rényi graphs to the SMD. We find that, in connected Erdős–Rényi graphs, the MD and the SMD are a constant factor apart. For the lower bound we present a clean analysis by combining tools developed for the MD and a novel coupling argument. For the upper bound we show that a strategy that greedily minimizes the number of candidate targets in each step uses an asymptotically optimal number of queries in Erdős–Rényi graphs. Connections with source localization, binary search on graphs, and the birthday problem are discussed.
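
One natural reading of the greedy strategy is sketched below (our illustration, not the paper's code): among all nodes, query the one whose answer shrinks the candidate target set the most in the worst case. `dist` is an assumed precomputed all-pairs distance table.

```python
def greedy_query(dist, candidates, nodes):
    """Pick the query node minimizing the worst-case number of remaining candidates."""
    def worst_case(q):
        buckets = {}
        for t in candidates:
            buckets[dist[q][t]] = buckets.get(dist[q][t], 0) + 1
        return max(buckets.values())
    return min(nodes, key=worst_case)
```

After observing the answer d, the candidate set shrinks to the targets t with dist[q][t] == d, and the step repeats until a single candidate remains.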

Switchover phenomenon induced by epidemic seeding on geometric networks
G. Odor, D. Czifra, J. Komjáthy, L. Lovász and M. Karsai
Proceedings of the National Academy of Sciences, 2021.
[view at publisher]

It is a fundamental question in disease modeling how the initial seeding of an epidemic, spreading over a network, determines its final outcome. One important goal has been to find the seed configuration that infects the most individuals. Although the identified optimal configurations give insight into how the initial state affects the outcome of an epidemic, they are unlikely to occur in real life. In this paper we identify two important seeding scenarios, both motivated by historical data, that reveal a complex phenomenon. In one scenario, the seeds are concentrated on the central nodes of a network, while in the second one, they are spread uniformly in the population. Comparing the final size of the epidemic started from these two initial conditions through data-driven and synthetic simulations on real and modeled geometric metapopulation networks, we find evidence for a switchover phenomenon: When the basic reproduction number R0 is close to its critical value, more individuals become infected in the first seeding scenario, but for larger values of R0, the second scenario is more dangerous. We find that the switchover phenomenon is amplified by the geometric nature of the underlying network and confirm our results via mathematically rigorous proofs, by mapping the network epidemic processes to bond percolation. Our results expand on the previous finding that, in the case of a single seed, the first scenario is always more dangerous and further our understanding of why the sizes of consecutive waves of a pandemic can differ even if their epidemic characters are similar.
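
The mapping to bond percolation can be illustrated with a toy simulation (ours, with an assumed adjacency-list input): each contact transmits independently with probability p, so the final epidemic started from a seed set is the cluster reachable from the seeds through retained edges; comparing hub seeds against uniformly random seeds at various p reproduces the qualitative switchover.

```python
import random

def outbreak_size(adj, seeds, p, rng=random):
    """Size of the set reachable from `seeds` when each transmission succeeds with prob. p."""
    infected, frontier = set(seeds), list(seeds)
    while frontier:
        u = frontier.pop()
        for v in adj[u]:
            if v not in infected and rng.random() < p:
                infected.add(v)
                frontier.append(v)
    return len(infected)
```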

Disparity Between Batches as a Signal for Early Stopping
M. Forouzesh and P. Thiran
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2021), Bilbao, Basque Country, Spain, September 13-17, 2021.
[view at publisher]

We propose a metric for evaluating the generalization ability of deep neural networks trained with mini-batch gradient descent. Our metric, called gradient disparity, is the l2 norm distance between the gradient vectors of two mini-batches drawn from the training set. It is derived from a probabilistic upper bound on the difference between the classification errors over a given mini-batch, when the network is trained on this mini-batch and when the network is trained on another mini-batch of points sampled from the same dataset. We empirically show that gradient disparity is a very promising early-stopping criterion (i) when data is limited, as it uses all the samples for training and (ii) when available data has noisy labels, as it signals overfitting better than the validation data. Furthermore, we show in a wide range of experimental settings that gradient disparity is strongly related to the generalization error between the training and test sets, and that it is also very informative about the level of label noise.
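
A minimal sketch of the metric (ours, assuming a standard PyTorch model, loss function, and (x, y) mini-batches; not the authors' implementation):

```python
import torch

def gradient_disparity(model, loss_fn, batch_a, batch_b):
    """l2 distance between the full parameter gradients of two mini-batches."""
    grads = []
    for x, y in (batch_a, batch_b):
        model.zero_grad()
        loss_fn(model(x), y).backward()
        grads.append(torch.cat([p.grad.detach().flatten()
                                for p in model.parameters()]))
    return torch.linalg.norm(grads[0] - grads[1]).item()
```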

Generalization Comparison of Deep Neural Networks via Output Sensitivity
M. Forouzesh, F. Salehi and P. Thiran
25th International Conference on Pattern Recognition, Milan, Italy, January 10-15, 2021.
[view at publisher]

Although recent works have brought some insights into the performance improvement of techniques used in state-of-the-art deep-learning models, more work is needed to understand their generalization properties. We shed light on this matter by linking the loss function to the output's sensitivity to its input. We find a rather strong empirical relation between the output sensitivity and the variance in the bias-variance decomposition of the loss function, which hints at using sensitivity as a metric for comparing the generalization performance of networks, without requiring labeled data. We find that sensitivity is decreased by applying popular methods which improve the generalization performance of the model, such as (1) using a deep network rather than a wide one, (2) adding convolutional layers to baseline classifiers instead of adding fully-connected layers, (3) using batch normalization, dropout and max-pooling, and (4) applying parameter initialization techniques.
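
The paper's exact definition of sensitivity should be taken from the text; as a rough illustration of the idea only, one can estimate how much a network's output moves under small input perturbations:

```python
import torch

def output_sensitivity(model, x, eps=1e-2, n_samples=10):
    """Average normalized output change under small Gaussian input noise (illustrative)."""
    with torch.no_grad():
        base = model(x)
        deltas = [torch.linalg.norm(model(x + eps * torch.randn_like(x)) - base)
                  for _ in range(n_samples)]
    return (torch.stack(deltas) / eps).mean().item()
```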

War of Words II: Enriched Models of Law-Making Processes
V. Kristof, A. Suresh, M. Grossglauser and P. Thiran
The Web Conference 2021 (WWW '21), Ljubljana, Slovenia, April 19-23, 2021.
[view at publisher]

The European Union law-making process is an instance of a peer-production system. We mine a rich dataset of law edits and introduce models predicting their adoption by parliamentary committees. Edits are proposed by parliamentarians, and they can be in conflict with edits of other parliamentarians and with the original proposition in the law. Our models combine three different categories of features: (a) explicit features extracted from data related to the edits, the parliamentarians, and the laws, (b) latent features that capture bi-linear interactions between parliamentarians and laws, and (c) text features of the edits. We show experimentally that this combination enables us to accurately predict the success of the edits. Furthermore, it leads to model parameters that are interpretable, and hence provides valuable insight into the law-making process.

A Variational Inference Approach to Learning Multivariate Wold Processes
J. Etesami, W. Trouleau, N. Kiyavash, M. Grossglauser and P. Thiran
24th International Conference on Artificial Intelligence and Statistics (AISTATS), San Diego, California, USA, April 13-15, 2021.
[full text]

Temporal point-processes are often used for mathematical modeling of sequences of discrete events with asynchronous timestamps. We focus on a class of temporal point-process models called multivariate Wold processes (MWP). These processes are well suited to model real-world communication dynamics. Statistical inference on such processes often requires learning their corresponding parameters using a set of observed timestamps. In this work, we relax some of the restrictive modeling assumptions made in the state-of-the-art and introduce a Bayesian approach for inferring the parameters of MWP. We develop a computationally efficient variational inference algorithm that allows scaling up the approach to high-dimensional processes and long sequences of observations. Our experimental results on both synthetic and real-world datasets show that our proposed algorithm outperforms existing methods.

Metric dimension of critical Galton–Watson trees and linear preferential attachment trees
J. Komjáthy and G. Odor
European Journal of Combinatorics, 2021.
[view at publisher]

The metric dimension of a graph G is the minimal size of a subset R of vertices of G that, upon reporting their graph distance from a distinguished (source) vertex v⋆, enable unique identification of the source vertex v⋆ among all possible vertices of G. In this paper we show a Law of Large Numbers (LLN) for the metric dimension of some classes of trees: critical Galton–Watson trees conditioned to have size n, and growing general linear preferential attachment trees. The former class includes uniform random trees, the latter class includes Yule-trees (also called random recursive trees), m-ary increasing trees, binary search trees, and positive linear preferential attachment trees. In all these cases, we are able to identify the limiting constant in the LLN explicitly. Our result relies on the insight that the metric dimension can be related to subtree properties, and hence we can make use of the powerful fringe-tree literature developed by Aldous and Janson et al.

2020

Augmenting and Tuning Knowledge Graph Embeddings
R. Bamler, F. Salehi and S. Mandt
35th Uncertainty in Artificial Intelligence (UAI) Conference, Tel Aviv, Israel, Jul 22-25, 2019.

Knowledge graph embeddings rank among the most successful methods for link prediction in knowledge graphs, i.e., the task of completing an incomplete collection of relational facts. A downside of these models is their strong sensitivity to model hyperparameters, in particular regularizers, which have to be extensively tuned to reach good performance [Kadlec et al., 2017]. We propose an efficient method for large-scale hyperparameter tuning by interpreting these models in a probabilistic framework. After a model augmentation that introduces per-entity hyperparameters, we use a variational expectation-maximization approach to tune thousands of such hyperparameters with minimal additional cost. Our approach is agnostic to details of the model and results in a new state of the art in link prediction on standard benchmark data.

A meta-learning approach for genomic survival analysis
Y. L. Qiu, H. Zheng, A. Devos, H. Selby and O. Gevaert
Nature Communications, 2020.
[full text] [view at publisher]

RNA sequencing has emerged as a promising approach in cancer prognosis as sequencing data becomes more easily and affordably accessible. However, it remains challenging to build good predictive models especially when the sample size is limited and the number of features is high, which is a common situation in biomedical settings. To address these limitations, we propose a meta-learning framework based on neural networks for survival analysis and evaluate it in a genomic cancer research setting. We demonstrate that, compared to regular transfer-learning, meta-learning is a significantly more effective paradigm to leverage high-dimensional data that is relevant but not directly related to the problem of interest. Specifically, meta-learning explicitly constructs a model, from abundant data of relevant tasks, to learn a new task with few samples effectively. For the application of predicting cancer survival outcome, we also show that the meta-learning framework with a few samples is able to achieve competitive performance with learning from scratch with a significantly larger number of samples. Finally, we demonstrate that the meta-learning model implicitly prioritizes genes based on their contribution to survival prediction and allows us to identify important pathways in cancer.

MPGM: Scalable and Accurate Multiple Network Alignment
E. Kazemi and M. Grossglauser
IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2020.
[view at publisher]

Protein-protein interaction (PPI) network alignment is a canonical operation to transfer biological knowledge among species. The alignment of PPI-networks has many applications, such as the prediction of protein function, detection of conserved network motifs, and the reconstruction of species' phylogenetic relationships. A good multiple-network alignment (MNA), by considering the data related to several species, provides a deep understanding of biological networks and system-level cellular processes. With the massive amounts of available PPI data and the increasing number of known PPI networks, the problem of MNA is gaining more attention in systems-biology studies. In this paper, we introduce a new scalable and accurate algorithm, called MPGM, for aligning multiple networks. The MPGM algorithm has two main steps: (i) SeedGeneration and (ii) MultiplePercolation. In the first step, to generate an initial set of seed tuples, the SeedGeneration algorithm uses only protein sequence similarities. In the second step, to align remaining unmatched nodes, the MultiplePercolation algorithm uses network structures and the seed tuples generated from the first step. We show that, with respect to different evaluation criteria, MPGM outperforms the other state-of-the-art algorithms. In addition, we guarantee the performance of MPGM under certain classes of network models. We introduce a sampling-based stochastic model for generating k correlated networks. We prove that for this model, if a sufficient number of seed tuples are available, the MultiplePercolation algorithm correctly aligns almost all the nodes. Our theoretical results are supported by experimental evaluations over synthetic networks.

Scalable and Efficient Comparison-based Search without Features
D. Chumbalov, L. Maystre and M. Grossglauser
37th International Conference on Machine Learning (ICML 2020), Online, July 13-18, 2020.

We consider the problem of finding a target object t using pairwise comparisons, by asking an oracle questions of the form “Which object from the pair (i, j) is more similar to t?”. Objects live in a space of latent features, from which the oracle generates noisy answers. First, we consider the non-blind setting where these features are accessible. We propose a new Bayesian comparison-based search algorithm with noisy answers; it has low computational complexity yet is efficient in the number of queries. We provide theoretical guarantees, deriving the form of the optimal query and proving almost sure convergence to the target t. Second, we consider the blind setting, where the object features are hidden from the search algorithm. In this setting, we combine our search method and a new distributional triplet embedding algorithm into one scalable learning framework called LEARN2SEARCH. We show that the query complexity of our approach on two real-world datasets is on par with the non-blind setting, which is not achievable using any of the current state-of-the-art embedding methods. Finally, we demonstrate the efficacy of our framework by conducting an experiment with users searching for movie actors.

Self-Supervised Prototypical Transfer Learning for Few-Shot Classification
C. Medina, A. Devos and M. Grossglauser
7th ICML Workshop on Automated Machine Learning (AutoML 2020), Vienna, Austria, Jul 12, 2020 – Jul 18, 2020.
[video] [code]

Recent advances in transfer learning and few-shot learning largely rely on annotated data related to the goal task during (pre-)training. However, collecting sufficiently similar and annotated data is often infeasible. Building on advances in self-supervised and few-shot learning, we propose to learn a metric embedding that clusters unlabeled samples and their augmentations closely together. This pre-trained embedding serves as a starting point for classification with limited labeled goal task data by summarizing class clusters and fine-tuning. Experiments show that our approach significantly outperforms state-of-the-art unsupervised meta-learning approaches, and is on par with supervised performance. In a cross-domain setting, our approach is competitive with its classical fully supervised counterpart.
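
The class-summarization step follows the standard prototypical-network rule that this line of work builds on: represent each class by the mean of its embedded support samples and assign queries to the nearest prototype. A generic sketch (ours):

```python
import torch

def nearest_prototype(support_emb, support_labels, query_emb):
    """Classify queries by Euclidean distance to per-class mean embeddings."""
    classes = support_labels.unique()
    protos = torch.stack([support_emb[support_labels == c].mean(dim=0)
                          for c in classes])
    return classes[torch.cdist(query_emb, protos).argmin(dim=1)]
```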

Sub-Matrix Factorization for Real-Time Vote Prediction
A. Immer, V. Kristof, M. Grossglauser and P. Thiran
The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining - KDD ’20, Virtual Event, August 23–27, 2020.
[view at publisher]

We address the problem of predicting aggregate vote outcomes (e.g., national) from partial outcomes (e.g., regional) that are revealed sequentially. We combine matrix factorization techniques and generalized linear models (GLMs) to obtain a flexible, efficient, and accurate algorithm. This algorithm works in two stages: First, it learns representations of the regions from high-dimensional historical data. Second, it uses these representations to fit a GLM to the partially observed results and to predict unobserved results. We show experimentally that our algorithm is able to accurately predict the outcomes of Swiss referenda, U.S. presidential elections, and German legislative elections. We also explore the regional representations in terms of ideological and cultural patterns. Finally, we deploy an online Web platform (www.predikon.ch) to provide real-time vote predictions in Switzerland and a data visualization tool to explore voting behavior. A by-product is a dataset of sequential vote results for 330 referenda and 2196 Swiss municipalities.
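
A stripped-down sketch of the two-stage idea (our illustration with assumed shapes, not the deployed system): learn low-dimensional region representations from a historical vote-by-region matrix, then, for a new vote, fit a linear model on the regions observed so far and predict the rest.

```python
import numpy as np

def region_representations(R_hist, d=10):
    """R_hist: historical results, shape (n_votes, n_regions), values in (0, 1)."""
    logits = np.log(R_hist / (1 - R_hist))
    U, S, Vt = np.linalg.svd(logits - logits.mean(axis=0), full_matrices=False)
    return S[:d] * Vt[:d].T          # one d-dimensional vector per region

def predict_new_vote(X, observed, y_obs):
    """Regress observed regional logits on representations X, predict all regions."""
    w, *_ = np.linalg.lstsq(X[observed], np.log(y_obs / (1 - y_obs)), rcond=None)
    return 1 / (1 + np.exp(-X @ w))
```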

War of Words: The Competitive Dynamics of Legislative Processes
V. Kristof, M. Grossglauser and P. Thiran
The Web Conference - WWW ’20, Taipei, Taiwan, April 20-24, 2020.
[video] [code] [data] [view at publisher]

A body of law is an example of a dynamic corpus of text documents that are jointly maintained by a group of editors who compete and collaborate in complex constellations. Our goal is to develop predictive models for this process, thereby shedding light on the competitive dynamics of parliamentarians who make laws. For this purpose, we curated a dataset of 450 000 legislative edits introduced by European parliamentarians over the last ten years. An edit modifies the status quo of a law, and could be in competition with another edit if it modifies the same part of that law. We propose a model for predicting the success of such edits, in the face of both the inertia of the status quo and the competition between overlapping edits. The parameters of this model can be interpreted in terms of the influence of parliamentarians and of the controversy of laws.

2019

A User Study of Perceived Carbon Footprint
V. Kristof, V. Quelquejay-Leclère, R. Zbinden, L. Maystre, M. Grossglauser and P. Thiran
Climate Change Workshop at NeurIPS, Vancouver, BC, Canada, December 8-14, 2019.

We propose a statistical model to understand people's perception of their carbon footprint. Driven by the observation that few people think of CO2 impact in absolute terms, we design a system to probe people's perception from simple pairwise comparisons of the relative carbon footprint of their actions. The formulation of the model enables us to take an active-learning approach to selecting the pairs of actions that are maximally informative about the model parameters. We define a set of 18 actions and collect a dataset of 2183 comparisons from 176 users on a university campus. The early results reveal promising directions to improve climate communication and enhance climate mitigation.

Analysis of a Canonical Labeling Algorithm for the Alignment of Correlated Erdős-Rényi Graphs
O. E. Dai, D. Cullina, N. Kiyavash and M. Grossglauser
Joint International Conference on Measurement and Modeling of Computer Systems - SIGMETRICS '19, Phoenix, AZ, USA, June, 2019.
[view at publisher]

Graph alignment in two correlated random graphs refers to the task of identifying the correspondence between vertex sets of the graphs. Recent results have characterized the exact information-theoretic threshold for graph alignment in correlated Erdős–Rényi graphs. However, very little is known about the existence of efficient algorithms to achieve graph alignment without seeds. In this work we identify a region in which a straightforward O(n^{11/5} log n)-time canonical labeling algorithm, initially introduced in the context of graph isomorphism, succeeds in aligning correlated Erdős–Rényi graphs. The algorithm has two steps. In the first step, all vertices are labeled by their degrees and a trivial minimum distance alignment (i.e., sorting vertices according to their degrees) matches a fixed number of highest degree vertices in the two graphs. Having identified this subset of vertices, the remaining vertices are matched using an alignment algorithm for bipartite graphs. Finally, we show that the implementation of a variant of this algorithm allows for the efficient alignment of large graphs under limited noise.

Learning Hawkes Processes from a Handful of Events
F. Salehi, W. Trouleau, M. Grossglauser and P. Thiran
33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada, December 8-14, 2019.
[code]

Learning the causal-interaction network of multivariate Hawkes processes is a useful task in many applications. Maximum-likelihood estimation is the most common approach to solve the problem in the presence of long observation sequences. However, when only short sequences are available, the lack of data amplifies the risk of overfitting and regularization becomes critical. Due to the challenges of hyper-parameter tuning, state-of-the-art methods only parameterize regularizers by a single shared hyper-parameter, hence limiting the power of representation of the model. To solve both issues, we develop in this work an efficient algorithm based on variational expectation-maximization. Our approach is able to optimize over an extended set of hyper-parameters. It is also able to take into account the uncertainty in the model parameters by learning a posterior distribution over them. Experimental results on both synthetic and real datasets show that our approach significantly outperforms state-of-the-art methods under short observation sequences.

Learning Hawkes Processes Under Synchronization Noise
W. Trouleau, J. Etesami, M. Grossglauser, N. Kiyavash and P. Thiran
36th International Conference on Machine Learning, Long Beach, California, USA, June 9-15, 2019.

Multivariate Hawkes processes (MHP) are widely used in a variety of fields to model the occurrence of discrete events. Prior work on learning MHPs has only focused on inference in the presence of perfect traces without noise. We address the problem of learning the causal structure of MHPs when observations are subject to an unknown delay. In particular, we introduce the so-called synchronization noise, where the stream of events generated by each dimension is subject to a random and unknown time shift. We characterize the robustness of the classic maximum likelihood estimator to synchronization noise, and we introduce a new approach for learning the causal structure in the presence of noise. Our experimental results show that our approach accurately recovers the causal structure of MHPs for a wide range of noise levels, and significantly outperforms classic estimation methods.
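
For context, both Hawkes-process papers above start from the standard multivariate intensity with exponential excitation kernels, λ_i(t) = μ_i + Σ_j Σ_{t_k^j < t} α_ij β exp(−β (t − t_k^j)); a direct, non-optimized evaluation (our sketch):

```python
import numpy as np

def hawkes_intensity(t, events, mu, alpha, beta):
    """events[j]: array of past event times in dimension j; mu, alpha, beta: parameters."""
    lam = mu.copy()
    for j, times in enumerate(events):
        past = times[times < t]
        lam += alpha[:, j] * np.sum(beta * np.exp(-beta * (t - past)))
    return lam
```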

Pairwise Comparisons with Flexible Time-Dynamics
L. Maystre, V. Kristof and M. Grossglauser
25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), Anchorage, AK, August 04-08, 2019.
[view at publisher]

Inspired by applications in sports where the skill of players or teams competing against each other varies over time, we propose a probabilistic model of pairwise-comparison outcomes that can capture a wide range of time dynamics. We achieve this by replacing the static parameters of a class of popular pairwise-comparison models by continuous-time Gaussian processes; the covariance function of these processes enables expressive dynamics. We develop an efficient inference algorithm that computes an approximate Bayesian posterior distribution. Despite the flexibility of our model, our inference algorithm requires only a few linear-time iterations over the data and can take advantage of modern multiprocessor computer architectures. We apply our model to several historical databases of sports outcomes and find that our approach a) outperforms competing approaches in terms of predictive performance, b) scales to millions of observations, and c) generates compelling visualizations that help in understanding and interpreting the data.
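
A toy sketch of the modeling idea (ours, not the paper's inference code): draw each competitor's time-varying skill from a Gaussian process, here with a squared-exponential covariance, and score a match with the usual Bradley-Terry-style logistic link.

```python
import numpy as np

def se_kernel(t, ell=1.0, var=1.0):
    """Squared-exponential covariance matrix over a time grid t."""
    d = t[:, None] - t[None, :]
    return var * np.exp(-0.5 * (d / ell) ** 2)

rng = np.random.default_rng(0)
grid = np.linspace(0, 10, 200)
K = se_kernel(grid) + 1e-9 * np.eye(len(grid))
skill_i = rng.multivariate_normal(np.zeros(len(grid)), K)   # skill trajectories
skill_j = rng.multivariate_normal(np.zeros(len(grid)), K)

def p_i_beats_j(k):
    """Win probability for a match played at time grid[k]."""
    return 1 / (1 + np.exp(-(skill_i[k] - skill_j[k])))
```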

Controlling Polarization in Personalization: An Algorithmic Framework
L. E. Celis, S. Kapoor, F. Salehi and N. Vishnoi
ACM Conference on Fairness, Accountability, and Transparency (FAT), Atlanta, GA, Jan 29-31, 2019.
[view at publisher]

Personalization is pervasive in the online space as it leads to higher efficiency for the user and higher revenue for the platform by individualizing the most relevant content for each user. However, recent studies suggest that such personalization can learn and propagate systemic biases and polarize opinions; this has led to calls for regulatory mechanisms and algorithms that are constrained to combat bias and the resulting echo-chamber effect. We propose a versatile framework that allows for the possibility to reduce polarization in personalized systems by allowing the user to constrain the distribution from which content is selected. We then present a scalable algorithm with provable guarantees that satisfies the given constraints on the types of the content that can be displayed to a user, but, subject to these constraints, will continue to learn and personalize the content in order to maximize utility. We illustrate this framework on a curated dataset of online news articles that are conservative or liberal, show that it can control polarization, and examine the trade-off between decreasing polarization and the resulting loss to revenue. We further exhibit the flexibility and scalability of our approach by framing the problem in terms of the more general diverse content selection problem and test it empirically on both a News dataset and the MovieLens dataset.

Classification with Fairness Constraints: A Meta-Algorithm with Provable Guarantees
L. E. Celis, L. Huang, V. Keswani and N. K. Vishnoi
ACM Conference on Fairness, Accountability, and Transparency (FAT), Atlanta, GA, Jan 29-31, 2019.
[view at publisher]

Developing classification algorithms that are fair with respect to sensitive attributes of the data is an important problem due to the increased deployment of classification algorithms in societal contexts. Several recent works have focused on studying classification with respect to specific fairness metrics, modeled the corresponding fair classification problem as constrained optimization problems, and developed tailored algorithms to solve them. Despite this, there still remain important metrics for which there are no fair classifiers with theoretical guarantees; primarily because the resulting optimization problem is non-convex. The main contribution of this paper is a meta-algorithm for classification that can take as input a general class of fairness constraints with respect to multiple non-disjoint and multi-valued sensitive attributes, and which comes with provable guarantees. In particular, our algorithm can handle non-convex "linear fractional" constraints (which includes fairness constraints such as predictive parity) for which no prior algorithm was known. Key to our results is an algorithm for a family of classification problems with convex constraints along with a reduction from classification problems with linear fractional constraints to this family. Empirically, we observe that our algorithm is fast, can achieve near-perfect fairness with respect to various fairness metrics, and the loss in accuracy due to the imposed fairness constraints is often small.

A dashboard for controlling polarization in personalization
L. E. Celis, S. Kapoor, F. Salehi, V. Keswani and N. K. Vishnoi
AI Communications, 2019.
[view at publisher]

Personalization is pervasive in the online space as it leads to higher efficiency for the user and higher revenue for the platform by individualizing the most relevant content for each user. However, recent studies suggest that such personalization can learn and propagate systemic biases and polarize opinions; this has led to calls for regulatory mechanisms and algorithms that are constrained to combat bias and the resulting echo-chamber effect. We present our balanced news feed via a demo that displays a dashboard through which users can view the political leaning of their news consumption and set their polarization constraints. The balanced feed, as generated by the user-defined constraints, is then showcased side-by-side with the unconstrained (polarized) feed for comparison.

[Re] Meta learning with differentiable closed-form solvers
A. Devos, S. Chatel and M. Grossglauser
ReScience C, 2019.

In this work, we present a reproduction of the paper of Bertinetto et al., “Meta-learning with differentiable closed-form solvers”, as part of the ICLR 2019 Reproducibility Challenge. In successfully reproducing the most crucial part of the paper, we reach a performance that is comparable with or superior to the original paper on two benchmarks for several settings. We evaluate new baseline results, using a new dataset presented in the paper. Yet, we also provide multiple remarks and recommendations about reproducibility and comparability. After we brought our reproducibility work to the authors' attention, they updated the original paper on which this work is based and released their code as well. Our contributions mainly consist of reproducing the most important results of their original paper, giving insight into its reproducibility, and providing a first open-source implementation.

A General Framework for Sensor Placement in Source Localization
B. Spinelli, E. Celis and P. Thiran
IEEE Transactions on Network Science and Engineering, 2019.
[full text] [view at publisher]

When an epidemic spreads in a given network of individuals or communities, can we detect its source using only the information provided by a small set of nodes? We propose a general framework that incorporates two dimensions. First, we can either rely exclusively on a set of selected nodes (i.e., sensors) which always reveal their state independently of any particular epidemic (these are called static), or we can add some sensors (called dynamic) as an epidemic spreads, depending on which additional information is required. Second, the method can either localize the source after an epidemic has spread through the entire network (offline), or while the epidemic is ongoing (online). We empirically study the performance of offline and online localization both with and without dynamic sensors. Our analysis shows that, by using dynamic sensors, the number of sensors necessary to localize the source is reduced by up to a factor of 10 and that, even with high-variance transmission delays, the source can be localized by using fewer than 5% of the nodes as sensors.

2018

SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient
A. Mishkin, F. Kunstner, D. Nielsen, M. Schmidt and M. E. Khan
32nd Conference on Neural Information Processing Systems (NIPS), Montreal, Canada, Dec 02-08, 2018.

Uncertainty estimation in large deep-learning models is a computationally challenging task, where it is difficult to form even a Gaussian approximation to the posterior distribution. In such situations, existing methods usually resort to a diagonal approximation of the covariance matrix despite the fact that these matrices are known to result in poor uncertainty estimates. To address this issue, we propose a new stochastic, low-rank, approximate natural-gradient (SLANG) method for variational inference in large, deep models. Our method estimates a "diagonal plus low-rank" structure based solely on back-propagated gradients of the network log-likelihood. This requires strictly fewer gradient computations than methods that compute the gradient of the whole variational objective. Empirical evaluations on standard benchmarks confirm that SLANG enables faster and more accurate estimation of uncertainty than mean-field methods, and performs comparably to state-of-the-art methods.
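
One appeal of the "diagonal plus low-rank" structure Σ = diag(d) + U Uᵀ is that Gaussian samples never require forming the full covariance matrix; a sketch of the reparameterized draw (ours, not the SLANG code):

```python
import numpy as np

def sample_diag_plus_lowrank(mean, d, U, rng):
    """Draw x ~ N(mean, diag(d) + U @ U.T) in O(n k) time, for U of shape (n, k)."""
    eps_d = rng.standard_normal(mean.shape)
    eps_u = rng.standard_normal(U.shape[1])
    return mean + np.sqrt(d) * eps_d + U @ eps_u
```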

Coordinate Descent with Bandit Sampling
F. Salehi, P. Thiran and L. E. Celis
32nd Conference on Neural Information Processing Systems (NIPS), Montreal, Canada, Dec 02-08, 2018.

Coordinate descent methods usually minimize a cost function by updating a random decision variable (corresponding to one coordinate) at a time. Ideally, we would update the decision variable that yields the largest decrease in the cost function. However, finding this coordinate would require checking all of them, which would effectively negate the improvement in computational tractability that coordinate descent is intended to afford. To address this, we propose a new adaptive method for selecting a coordinate. First, we find a lower bound on the amount the cost function decreases when a coordinate is updated. We then use a multi-armed bandit algorithm to learn which coordinates result in the largest lower bound by interleaving this learning with conventional coordinate descent updates, except that the coordinate is selected proportionately to the expected decrease. We show that our approach improves the convergence of coordinate descent methods both theoretically and experimentally.
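
A toy version of the scheme (ours; the paper's bandit algorithm and bound are more refined): track a running reward per coordinate based on the standard decrease lower bound g_i^2 / (2 L_i) and sample the next coordinate in proportion to it.

```python
import numpy as np

def bandit_cd(grad_fn, x, L, n_iters, rng, eta=0.1):
    """Coordinate descent with coordinates sampled proportionally to estimated decrease."""
    rewards = np.ones_like(x)
    for _ in range(n_iters):
        i = rng.choice(len(x), p=rewards / rewards.sum())
        g = grad_fn(x)[i]
        x[i] -= g / L[i]                                        # standard CD step
        rewards[i] += eta * (g * g / (2 * L[i]) - rewards[i])   # lower bound g^2/(2L)
    return x
```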

Stochastic Optimal Control of Epidemic Processes in Networks
L. Lorch, A. De, S. Bhatt, W. Trouleau, U. Upadhyay and M. Gomez-Rodriguez
Machine Learning for Health (ML4H) Workshop at NeurIPS 2018, Palais des Congrès de Montréal, Montréal, Canada, December 8, 2018.

We approach the development of models and control strategies of susceptible-infected-susceptible (SIS) epidemic processes from the perspective of marked temporal point processes and stochastic optimal control of stochastic differential equations (SDEs) with jumps. In contrast to previous work, this novel perspective is particularly well-suited to make use of fine-grained data about disease outbreaks and lets us overcome the shortcomings of current control strategies. Our control strategy resorts to treatment intensities to determine who to treat and when to do so to minimize the amount of infected individuals over time. Preliminary experiments with synthetic data show that our control strategy consistently outperforms several alternatives. Looking into the future, we believe our methodology provides a promising step towards the development of practical data-driven control strategies of epidemic processes.

Optimal Number of Paths with Multipath Routing in Hybrid Networks
S. Henri and P. Thiran
19th IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), Chania, GREECE, Jun 12-15, 2018.
[view at publisher]

In recent years, multipath routing, i.e., employing several paths simultaneously, has emerged as an efficient way to provide significant throughput gains in local networks. This has been observed both with technologies that are not subject to interference, such as Ethernet, and with technologies that are, such as WiFi, power-line communications (PLC) and LTE. With technologies that are subject to interference, adding more paths is not always beneficial. We investigate the number of simultaneous paths necessary to reach maximal throughput when using multipath routing in multi-hop mesh networks with several self-interfering technologies. We show analytically, numerically and experimentally that the optimal number of paths M_opt is tightly linked with the number of technologies K. For certain classes of networks (in particular, for typical home networks), we prove analytically that M_opt = K, and our analytical findings are verified both with simulations and with experiments on a testbed composed of PLC and two orthogonal WiFi channels. In general networks, our numerical and experimental results show that the throughput loss caused by using at most K simultaneous paths is very small: The relative loss is smaller than 0.05 in 97% of the networks and smaller than 0.1 in 99% of the networks.

Profit Maximizing Logistic Regression Modeling for Credit Scoring
A. Devos, J. Dhondt, E. Stripling, B. Baesens, S. vanden Broucke and G. Sukhatme
IEEE Data Science Workshop (DSW), Lausanne, Switzerland, 2018.
[view at publisher]

On the Combinatorial Version of the Slepian-Wolf Problem
D. Chumbalov and A. Romashchenko
IEEE Transactions on Information Theory, 2018.
[view at publisher]

Load Balancing Video Clients in Enterprise Wireless Networks
A. Suresh, S. Mena, D. Tomozei, L. Granai, X. Zhu and S. Ferrari
IEEE Wireless Communications and Networking Conference (WCNC), Apr 15-18, 2018.
[view at publisher]

Given the increasing popularity of real-time video communication over enterprise wireless networks, ensuring good quality of experience is becoming critical. A common problem in such networks is that the clients typically adopt the strategy of associating to the access point with the highest signal strength, which can lead to load imbalance, unfair throughput allocation, and low overall video quality. In this paper, we propose to improve load balancing across video clients by selectively triggering directed roams of chosen clients, while explicitly accounting for the cost of roaming. We formulate the problem of identifying the directed roams to perform as an optimization problem where the objective function reflects both the overall utility of all video clients and the cost due to roaming. We subsequently apply relaxations to the original NP-Hard problem, and propose a three-stage algorithm that efficiently computes an approximate solution. Simulations show that by selectively roaming just 5% of the clients, we increase throughput relative to an initial strongest signal based association by at least 10% for over half of the clients and by at least 20% for over a quarter of the clients. We also increase aggregate throughput, improve fairness and thereby increase the overall video quality.

Multi-Armed Bandit in Action: Optimizing Performance in Dynamic Hybrid Networks
S. Henri, C. Vlachou and P. Thiran
IEEE/ACM Transactions on Networking, 2018-07-27.
[view at publisher]

Today's home networks are often composed of several technologies such as Wi-Fi or power-line communication (PLC). Yet, current network protocols rarely fully harness this diversity and use each technology for a specific, pre-defined role, for example, wired media as a backbone and the wireless medium for mobility. Moreover, a single path is generally employed to transmit data; this path is established in advance and remains in use as long as it is valid, although multiple possible paths offer more robustness against varying environments. We introduce HyMAB, an algorithm that explores different multipaths and finds the best one in a mesh hybrid network, while keeping congestion under control. We employ the multi-armed-bandit framework and prove that HyMAB achieves optimal throughput under a static scenario. HyMAB design also accounts for real-network intricacies and dynamic conditions; it adapts to varying environments and switches multipaths when needed. We implement HyMAB on a PLC/Wi-Fi test bed. This is, to the best of our knowledge, the first implementation on a real test bed of multi-armed-bandit strategies in the context of routing. Our experimental results confirm the optimality of HyMAB and its ability to adapt to dynamic network conditions, as well as the gains provided by employing multi-armed-bandit strategies.

Can Who-Edits-What Predict Edit Survival?
A. B. Yardım, V. Kristof, L. Maystre and M. Grossglauser
24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, United Kingdom, August 19-23, 2018.
[code] [view at publisher]

As the number of contributors to online peer-production systems grows, it becomes increasingly important to predict whether the edits that users make will eventually be beneficial to the project. Existing solutions either rely on a user reputation system or consist of a highly specialized predictor that is tailored to a specific peer-production system. In this work, we explore a different point in the solution space that goes beyond user reputation but does not involve any content-based feature of the edits. We view each edit as a game between the editor and the component of the project. We posit that the probability that an edit is accepted is a function of the editor's skill, of the difficulty of editing the component and of a user-component interaction term. Our model is broadly applicable, as it only requires observing data about who makes an edit, what the edit affects and whether the edit survives or not. We apply our model on Wikipedia and the Linux kernel, two examples of large-scale peer-production systems, and we seek to understand whether it can effectively predict edit survival: in both cases, we provide a positive answer. Our approach significantly outperforms those based solely on user reputation and bridges the gap with specialized predictors that use content-based features. It is simple to implement, computationally inexpensive, and in addition it enables us to discover interesting structure in the data.
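
One plausible parameterization consistent with this description (our sketch; all names are illustrative): a logistic model combining editor skill, component difficulty, and a bilinear user-component interaction.

```python
import numpy as np

def p_accept(skill_u, difficulty_c, u_vec, c_vec):
    """P(edit survives) = sigmoid(editor skill - component difficulty + interaction)."""
    z = skill_u - difficulty_c + np.dot(u_vec, c_vec)
    return 1 / (1 + np.exp(-z))
```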

On the Delays in Time-Varying Networks: Does Larger Service-Rate Variance Imply Larger Delays?
S. Henri, S. Shneer and P. Thiran
Eighteenth ACM International Symposium on Mobile Ad Hoc Networking and Computing, Los Angeles, CA, USA, June 26 - 29, 2018.
[view at publisher]

In all networks, link or route capacities fluctuate for multiple reasons, e.g., fading and multi-path effects on wireless channels, interference and contending users on a shared medium, varying loads in WAN routers, impedance changes on power-line channels. These fluctuations severely impact packet delays. In this paper, we study delays in time-varying networks. Intuitively, we expect that for a given average service rate, an increased service rate variability yields larger delays. We find that this is not always the case. Using a queuing model that includes time-varying service rates, we show that for certain arrival rates, a queue with larger service rate variance offers smaller average delays than a queue with the same average service rate and lower service rate variance. We also verify these findings on a wireless testbed. We then study the conditions under which using simultaneously two independent paths helps in terms of delays, for example, in hybrid networks where the two paths use different physical layer technologies. We show that using two paths is not always better, in particular for low arrival rates. We also show that the optimal traffic splitting between the two paths depends on the arrival rate.

2017

Multiphase digitally controlled oscillator for future 5G phased arrays in 90 nm CMOS
A. Devos, M. Vigilante and P. Reynaert
IEEE Nordic Circuits and Systems Conference (NORCAS), Copenhagen, Denmark, 2017.
[view at publisher]

ChoiceRank: Identifying Preferences from Node Traffic in Networks
L. Maystre and M. Grossglauser
International Conference on Machine Learning, Sydney, Australia, August 6-11, 2017.
[code] [full text]

Understanding how users navigate in a network is of high interest in many applications. We consider a setting where only aggregate node-level traffic is observed and tackle the task of learning edge transition probabilities. We cast it as a preference learning problem, and we study a model where choices follow Luce's axiom. In this case, the O(n) marginal counts of node visits are a sufficient statistic for the O(n^2) transition probabilities. We show how to make the inference problem well-posed regardless of the network's structure, and we present ChoiceRank, an iterative algorithm that scales to networks that contain billions of nodes and edges. We apply the model to two clickstream datasets and show that it successfully recovers the transition probabilities using only the network structure and marginal (node-level) traffic data. Finally, we also consider an application to mobility networks and apply the model to one year of rides on New York City's bicycle-sharing system.
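
Under Luce's axiom, each node j carries a latent strength π_j and the probability of transitioning from i to a neighbor j is proportional to π_j; ChoiceRank inverts this map from marginal traffic counts. A sketch of the forward model only (ours):

```python
def luce_transitions(adj, pi):
    """P(i -> j) = pi[j] / sum(pi[k] for k in adj[i]), for each out-neighbor j of i."""
    return {i: {j: pi[j] / sum(pi[k] for k in nbrs) for j in nbrs}
            for i, nbrs in adj.items()}
```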

Just Sort It! A Simple and Effective Approach to Active Preference Learning
L. Maystre and M. Grossglauser
International Conference on Machine Learning, Sydney, Australia, August 6-11, 2017.
[full text]

We address the problem of learning a ranking by using adaptively chosen pairwise comparisons. Our goal is to recover the ranking accurately but to sample the comparisons sparingly. If all comparison outcomes are consistent with the ranking, the optimal solution is to use an efficient sorting algorithm, such as Quicksort. But how do sorting algorithms behave if some comparison outcomes are inconsistent with the ranking? We give favorable guarantees for Quicksort for the popular Bradley-Terry model, under natural assumptions on the parameters. Furthermore, we empirically demonstrate that sorting algorithms lead to a very simple and effective active learning strategy: repeatedly sort the items. This strategy performs as well as state-of-the-art methods (and much better than random sampling) at a minuscule fraction of the computational cost.
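
The strategy is easy to state in code; below is a sketch (ours) with an assumed noisy comparison oracle, e.g. one whose outcomes follow the Bradley-Terry probabilities P(i beats j) = θ_i / (θ_i + θ_j). Repeatedly re-sorting with fresh comparisons gives the active-learning strategy described above.

```python
import random

def noisy_quicksort(items, noisy_less):
    """Quicksort driven by a possibly inconsistent comparison oracle."""
    if len(items) <= 1:
        return list(items)
    i = random.randrange(len(items))
    pivot, rest = items[i], items[:i] + items[i + 1:]
    left = [x for x in rest if noisy_less(x, pivot)]
    right = [x for x in rest if not noisy_less(x, pivot)]
    return (noisy_quicksort(left, noisy_less) + [pivot]
            + noisy_quicksort(right, noisy_less))
```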

Back to the Source: an Online Approach for Sensor Placement and Source Localization
B. Spinelli, L. E. Celis and P. Thiran
26th International World Wide Web Conference (WWW), 2017.
[full text] [view at publisher]

Source localization, the act of finding the originator of a disease or rumor in a network, has become an important problem in sociology and epidemiology. The localization is done using the infection state and time of infection of a few designated sensor nodes; however, maintaining sensors can be very costly in practice. We propose the first online approach to source localization: We deploy a priori only a small number of sensors (which reveal if they are reached by an infection) and then iteratively choose the best location to place new sensors in order to localize the source. This approach allows for source localization with a very small number of sensors; moreover, the source can be found while the epidemic is still ongoing. Our method applies to a general network topology and performs well even with random transmission delays.

How CSMA/CA With Deferral Affects Performance and Dynamics in Power-Line Communications
C. Vlachou, A. Banchs, J. Herzen and P. Thiran
IEEE/ACM Transactions on Networking, 2017.
[view at publisher]

Power-line communications (PLC) are becoming a key component in home networking, because they provide easy and high-throughput connectivity. The dominant MAC protocol for high data-rate PLC, the IEEE 1901, employs a CSMA/CA mechanism similar to the backoff process of 802.11. Existing performance evaluation studies of this protocol assume that the backoff processes of the stations are independent (the so-called decoupling assumption). However, in contrast to 802.11, 1901 stations can change their state after sensing the medium busy, which is regulated by the so-called deferral counter. This mechanism introduces strong coupling between the stations and, as a result, makes existing analyses inaccurate. In this paper, we propose a performance model for 1901, which does not rely on the decoupling assumption. We prove that our model admits a unique solution for a wide range of configurations and confirm the accuracy of the model using simulations. Our results show that we outperform current models based on the decoupling assumption. In addition to evaluating the performance in steady state, we further study the transient dynamics of 1901, which is also affected by the deferral counter.

2016

Just One More: Modeling Binge Watching Behavior
W. Trouleau, A. Ashkan, W. Ding and B. Eriksson
22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, California, USA, August 13-17, 2016.
[full text] [view at publisher]

Easy accessibility can often lead to over-consumption, as seen in food and alcohol habits. On video on-demand (VOD) services, this has recently been referred to as binge watching, where potentially entire seasons of TV shows are consumed in a single viewing session. While a user viewership model may reveal this binging behavior, creating an accurate model has several challenges, including censored data, deviations in the population, and the need to consider external influences on consumption habits. In this paper, we introduce a novel statistical mixture model that incorporates these factors and presents a first-of-its-kind characterization of viewer consumption behavior using a real-world dataset that includes playback data from a VOD service. From our modeling, we tackle various predictive tasks to infer the consumption decisions of a user in a viewing session, including estimating the number of episodes they watch and classifying if they continue watching another episode. Using these insights, we then identify binge watching sessions based on deviation from normal viewing behavior. We observe different types of binging behavior, that binge watchers often view certain content out-of-order, and that binge watching is not a consistent behavior among our users. These insights and our findings have application in VOD revenue generation, consumer health applications, and customer retention analysis.

Assignment Techniques for Crowdsourcing Sensitive Tasks
L. E. Celis, S. P. Reddy, I. P. Singh and S. Vaya
19th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW), San Francisco, CA, Feb 27 - Mar 2, 2016.
[view at publisher]

Protecting the privacy of crowd workers has been an important topic in crowdsourcing, however, task privacy has largely been ignored despite the fact that many tasks, e.g., form digitization, live audio transcription or image tagging often contain sensitive information. Although assigning an entire job to a worker may leak private information, jobs can often be split into small components that individually do not. We study the problem of distributing such tasks to workers with the goal of maximizing task privacy using such an approach. We introduce information loss functions to formally measure the amount of private information leaked as a function of the task assignment. We then design assignment mechanisms for three different assignment settings: PUSH, PULL and a new setting Tug Of War (TOW), which is an intermediate approach that balances flexibility for both workers and requesters. Our assignment algorithms have zero privacy loss for PUSH, and tight theoretical guarantees for PULL. For TOW, our assignment algorithm provably outperforms PULL; importantly the privacy loss is independent of the number of tasks, even when workers collude. We further analyze the performance and privacy tradeoffs empirically on simulated and real-world collusion networks and find that our algorithms outperform the theoretical guarantees.

Online Collaborative Prediction of Regional Vote Results
V. Etter, M. E. Khan, M. Grossglauser and P. Thiran
2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Montreal, QC, Canada, 17-19 October 2016.
[full text] [view at publisher]

We consider online predictions of vote results, where regions across a country vote on an issue under discussion. Such online predictions before and during the day of the vote are useful to media agencies, polling institutes, and political parties, e.g., to identify regions that are crucial in determining the national outcome of a vote. We analyze a unique dataset from Switzerland. The dataset contains 281 votes from 2352 regions over a period of 34 years. We make several contributions towards improving online predictions. First, we show that these votes exhibit a bi-clustering of the vote results, i.e., regions that are spatially close tend to vote similarly, and issues that discuss similar topics show similar global voting patterns. Second, we develop models that can exploit this bi-clustering, as well as the features associated with the votes and regions. Third, we show that, when combining vote results and features together, Bayesian methods are essential to obtaining good performance. Our results show that Bayesian methods give better estimates of the hyperparameters than non-Bayesian methods such as cross-validation. The resulting models generalize well to many different tasks, produce robust predictions, and are easily interpretable.

Uncovering Latent Behaviors in Ant Colonies
M. Kafsi, R. Braunschweig, D. Mersch, M. Grossglauser, L. Keller and P. Thiran
2016 SIAM International Conference on Data Mining, Miami, Florida, USA, May 5-7, 2016.
[full text] [view at publisher]

Many biological systems exhibit collective behaviors that strengthen their adaptability to their environment, compared to more solitary species. Describing these behaviors is challenging yet necessary in order to understand these biological systems. We propose a probabilistic model that enables us to uncover the collective behaviors observed in a colony of ants. This model is based on the assumption that the behavior of an individual ant is a time-dependent mixture of latent behaviors that are specific to the whole colony. We apply this model to a large-scale dataset obtained by observing the mobility of nearly 1000 Camponotus fellah ants from six different colonies. Our results indicate that a colony typically exhibits three classes of behaviors, each characterized by a specific spatial distribution and a level of activity. Moreover, these spatial distributions, which are uncovered automatically by our model, match well with the ground truth as manually annotated by domain experts. We further explore the evolution of the behavior of individual ants and show that it is well captured by a second order Markov chain that encodes the fact that the future behavior of an ant depends not only on its current behavior but also on its preceding one.
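The second-order Markov chain over behaviors mentioned above can be estimated by simple counting; a minimal sketch (the label alphabet and the add-alpha smoothing constant are assumptions):

    from collections import Counter, defaultdict

    def fit_second_order(chain, alpha=1.0):
        # chain: a list of latent-behavior labels for one ant over time.
        # Returns smoothed estimates of P(next | previous, current).
        states = sorted(set(chain))
        counts = defaultdict(Counter)
        for prev, cur, nxt in zip(chain, chain[1:], chain[2:]):
            counts[(prev, cur)][nxt] += 1
        probs = {}
        for ctx, ctr in counts.items():
            total = sum(ctr.values()) + alpha * len(states)
            probs[ctx] = {s: (ctr[s] + alpha) / total for s in states}
        return probs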

PROPER: global protein interaction network alignment through percolation matching
E. Kazemi, H. Hassani, M. Grossglauser and H. P. Modarres
BMC Bioinformatics, 2016.
[full text] [view at publisher]

Background: The alignment of protein-protein interaction (PPI) networks enables us to uncover the relationships between different species, which leads to a deeper understanding of biological systems. Network alignment can be used to transfer biological knowledge between species. Although different PPI-network alignment algorithms were introduced during the last decade, developing an accurate and scalable algorithm that can find alignments with high biological and structural similarities among PPI networks is still challenging. Results: In this paper, we introduce a new global network alignment algorithm for PPI networks called PROPER. Compared to other global network alignment methods, our algorithm shows higher accuracy and speed over real PPI datasets and synthetic networks. We show that the PROPER algorithm can detect large portions of conserved biological pathways between species. Also, using a simple parsimonious evolutionary model, we explain why PROPER performs well based on several different comparison criteria. Conclusions: We highlight that PROPER has high potential in further applications such as detecting biological pathways, finding protein complexes and PPI prediction. The PROPER algorithm is available at http://proper.epfl.ch.

EMPoWER Hybrid Networks: Exploiting Multiple Paths over Wireless and ElectRical Mediums
S. Henri, C. Vlachou, J. Herzen and P. Thiran
ACM Conference on emerging Networking EXperiments and Technologies (CoNEXT) 2016, Irvine, California, USA, December 12-15, 2016.
[full text] [view at publisher]

Several technologies, such as WiFi, Ethernet and power-line communications (PLC), can be used to build residential and enterprise networks. These technologies often co-exist; most networks use WiFi, and buildings are readily equipped with electrical wires that can offer a capacity up to 1 Gbps with PLC. Yet, current networks do not exploit this rich diversity and often operate far below the available capacity. We design, implement, and evaluate EMPoWER, a system that exploits simultaneously several potentially-interfering mediums. It operates at layer 2.5, between the MAC and IP layers, and combines routing (to find multiple concurrent routes) and congestion control (to efficiently balance traffic across the routes). To optimize resource utilization and robustness, both components exploit the heterogeneous nature of the network. They are fair and efficient, and they operate only within the local area network, without affecting remote Internet hosts. We demonstrate the performance gains of EMPoWER, by simulations and experiments on a 22-node testbed. We show that PLC/WiFi, benefiting from the diversity offered by wireless and electrical mediums, provides significant throughput gains (up to 10x) and improves coverage, compared to multi-channel WiFi.

Collaborative Recurrent Neural Networks for Dynamic Recommender Systems
Y. J. Ko, L. Maystre and M. Grossglauser
The 8th Asian Conference on Machine Learning, Hamilton, New Zealand, November 16-18, 2016.
[full text]

Modern technologies enable us to record sequences of online user activity at an unprecedented scale. Although such activity logs are abundantly available, most approaches to recommender systems are based on the rating-prediction paradigm, ignoring temporal and contextual aspects of user behavior revealed by temporal, recurrent patterns. In contrast to explicit ratings, such activity logs can be collected in a non-intrusive way and can offer richer insights into the dynamics of user preferences, which could potentially lead to more accurate user models. In this work we advocate studying this ubiquitous form of data and, by combining ideas from latent factor models for collaborative filtering and language modeling, propose a novel, flexible and expressive collaborative sequence model based on recurrent neural networks. The model is designed to capture a user's contextual state as a personalized hidden vector by summarizing cues from a data-driven, thus variable, number of past time steps, and represents items by a real-valued embedding. We found that, by exploiting the inherent structure in the data, our formulation leads to an efficient and practical method. Furthermore, we demonstrate the versatility of our model by applying it to two different tasks: music recommendation and mobility prediction, and we show empirically that our model consistently outperforms static and non-collaborative methods.
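A minimal forward pass conveying the idea of scoring the next item from a recurrent summary of past activity; this is only a sketch, not the paper's architecture (the collaborative, per-user structure and the training loop are omitted, and all parameter names are invented):

    import numpy as np

    def rnn_next_item_scores(item_seq, E, Wh, Wx, b, O):
        # E: item embeddings (n_items x d); Wh, Wx, b: recurrent parameters;
        # O: output projection (n_items x h). Returns one score per item.
        h = np.zeros(Wh.shape[0])
        for item in item_seq:
            h = np.tanh(Wh @ h + Wx @ E[item] + b)   # contextual user state
        return O @ h   # higher score = more likely next item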

Analysis and Enhancement of CSMA/CA With Deferral in Power-Line Communications
C. Vlachou, A. Banchs, P. Salvador, J. Herzen and P. Thiran
IEEE Journal on Selected Areas in Communications, 2016.
[view at publisher]

Power-line communications are employed in home networking to provide easy and high-throughput connectivity. IEEE 1901, the MAC protocol for power-line networks, employs a CSMA/CA protocol similar to that of 802.11, but is substantially more complex, which probably explains why little is known about its performance. One of the key differences between the two protocols is that whereas 802.11 only reacts upon collisions, 1901 also reacts upon several consecutive transmissions and thus can potentially achieve better performance by avoiding unnecessary collisions. In this paper, we propose a model for the 1901 MAC. Our analysis reveals that the default configuration of 1901 does not fully exploit its potential and that its performance degrades with the number of stations. Based on analytical reasoning, we derive a configuration for the parameters of 1901 that drastically improves throughput and achieves optimal performance without requiring the knowledge of the number of stations in the network. In contrast, 802.11 requires knowing the number of contending stations to provide a similar performance, which is unfeasible for realistic traffic patterns. We confirm our results and enhancement with testbed measurements, by implementing the 1901 MAC protocol on WiFi hardware.

Assembling a Network out of Ambiguous Patches
L. Yartseva, J. S. Elbert and M. Grossglauser
Allerton, Allerton Park and Retreat Center, Monticello, IL, USA, 2016.
[full text] [view at publisher]

Many graph mining and network analysis problems rely on the availability of the full network over a set of nodes. But inferring a full network is sometimes non-trivial if the raw data is in the form of many small "patches", or subgraphs, of the true network, and if there are ambiguities in the identities of nodes or edges in these patches. This may happen because of noise or because of the nature of the data; for instance, in social networks, names are typically not unique. "Graph assembly" refers to the problem of reconstructing a graph from these many, possibly noisy, partial observations. Prior work suggests that graph assembly is essentially impossible in regimes of interest when the true graph is Erdos-Renyi. The purpose of the present paper is to show that a modest amount of clustering is sufficient to assemble even very large graphs. We introduce the $G(n,p;q)$ random graph model, which is the random closure over all open triangles of a $G(n,p)$ Erdos-Renyi graph, and show that this model exhibits higher clustering than an equivalent Erdos-Renyi graph. We focus on an extreme case of graph assembly: the patches are small ($1$-hop egonets) and are unlabeled. We show that in realistic regimes, graph assembly is fundamentally feasible, because we can identify, for every edge $e$, a subgraph induced by its neighbors that is unique and present in every patch containing $e$. Using this result, we build a practical algorithm that uses canonical labeling to reconstruct the original graph from noiseless patches. We also provide an achievability result for noisy patches, which are obtained by edge-sampling the original egonets.

The Player Kernel: Learning Team Strengths Based on Implicit Player Contributions
L. Maystre, V. Kristof, A. J. González Ferrer and M. Grossglauser
Machine Learning and Data Mining for Sports Analytics 2016, Riva del Garda, Italy, September 19, 2016.
[full text]

In this work, we draw attention to a connection between skill-based models of game outcomes and Gaussian process classification models. The Gaussian process perspective enables a) a principled way of dealing with uncertainty and b) rich models, specified through kernel functions. Using this connection, we tackle the problem of predicting outcomes of football matches between national teams. We develop a player kernel that relates any two football matches through the players lined up on the field. This makes it possible to share knowledge gained from observing matches between clubs (available in large quantities) and matches between national teams (available only in limited quantities). We evaluate our approach on the Euro 2008, 2012 and 2016 final tournaments.
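The core of the player kernel can be sketched as the Gram matrix of signed player-indicator vectors, one per match; the normalization and the exact construction in the paper may differ, so treat this as an assumption-laden sketch:

    import numpy as np

    def player_kernel(lineups, n_players):
        # lineups: list of (home_players, away_players) index lists per match.
        X = np.zeros((len(lineups), n_players))
        for m, (home, away) in enumerate(lineups):
            X[m, home] = 1.0    # +1 for players on the home side
            X[m, away] = -1.0   # -1 for players on the away side
        return X @ X.T          # K[i, j]: signed count of shared players

Two matches that field many of the same players on the same sides get a large kernel value, which is what lets club matches inform predictions about national-team matches.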

Observer Placement for Source Localization: the Effect of Budgets and Transmission Variance
B. Spinelli, L. E. Celis and P. Thiran
54th Annual Allerton Conference on Communication, Control, and Computing, 2016.
[full text] [view at publisher]

When an epidemic spreads in a network, a key question is where its source is, i.e., which node started the epidemic. If we know the time at which various nodes were infected, we can attempt to use this information in order to identify the source. However, maintaining observer nodes that can provide their infection time may be costly, and we may have a budget k on the number of observer nodes we can maintain. Moreover, some nodes are more informative than others due to their location in the network. Hence, a pertinent question arises: Which nodes should we select as observers in order to maximize the probability that we can accurately identify the source? Inspired by the simple setting in which the node-to-node delays in the transmission of the epidemic are deterministic, we develop a principled approach for addressing the problem even when transmission delays are random. We show that the optimal observer placement differs depending on the variance of the transmission delays and propose approaches in both low- and high-variance settings. We validate our methods by comparing them against state-of-the-art observer placements and show that, in both settings, our approach identifies the source with higher accuracy.

Where You Are is Who You Are: User Identification by Matching Statistics
F. Movahedi Naini, J. Unnikrishnan, P. Thiran and M. Vetterli
IEEE Transactions on Information Forensics and Security, 2016.
[full text] [view at publisher]

Most users of online services have unique behavioral or usage patterns. These behavioral patterns can be used to identify and track users by using only the observed patterns in the behavior. We study the task of identifying users from statistics of their behavioral patterns. Specifically, we focus on the setting in which we are given histograms of users' data collected in two different experiments. In the first dataset, we assume that the users' identities are anonymized or hidden, and in the second dataset we assume that their identities are known. We study the task of identifying the users in the first dataset by matching the histograms of their data with the histograms from the second dataset. In recent work [1], [2], the optimal algorithm for this user identification task was introduced. In this paper, we evaluate the effectiveness of this method on a wide range of datasets with up to 50,000 users, and in a wide range of scenarios. Using datasets such as call data records, web browsing histories, and GPS trajectories, we demonstrate that a large fraction of users can be easily identified given only histograms of their data, and hence these histograms can act as users' fingerprints. We also show that simultaneous identification of users achieves better performance compared to one-by-one user identification. Furthermore, we show that using the optimal method for identification does indeed give higher identification accuracies than heuristics-based approaches in such practical scenarios. The accuracies obtained under this optimal method can thus be used to quantify the maximum level of user identification that is possible in such settings. We show that the key factors affecting the accuracy of the optimal identification algorithm are the duration of the data collection, the number of users in the anonymized dataset, and the resolution of the dataset. We also analyze the effectiveness of k-anonymization in resisting user identification attacks on these datasets.
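Simultaneous identification can be phrased as a minimum-cost bipartite matching between the two sets of histograms. The sketch below uses a KL-divergence cost as a stand-in for the paper's likelihood-based optimal cost:

    import numpy as np
    from scipy.optimize import linear_sum_assignment
    from scipy.stats import entropy

    def match_users(anon_hists, known_hists):
        # Row-normalize both histogram matrices, then solve the assignment
        # problem jointly (one-by-one greedy matching would do worse).
        A = anon_hists / anon_hists.sum(axis=1, keepdims=True)
        K = known_hists / known_hists.sum(axis=1, keepdims=True)
        cost = np.array([[entropy(a, k + 1e-12) for k in K] for a in A])
        rows, cols = linear_sum_assignment(cost)
        return cols   # cols[i] = known identity matched to anonymous user i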

2015

Traveling Salesman in Reverse: Conditional Markov Entropy for Trajectory Segmentation
M. Kafsi, M. Grossglauser and P. Thiran
2015 IEEE International Conference on Data Mining, Atlantic City, NJ, USA, November 14-17, 2015.
[full text] [view at publisher]

We are interested in inferring the set of waypoints (or intermediate destinations) of a mobility trajectory in the absence of timing information. We find that, by mining a dataset of real mobility traces, computing the entropy of conditional Markov trajectories enables us to uncover waypoints, even though neither timing information nor absolute geographic location is provided. We build on this observation and design an efficient algorithm for trajectory segmentation. Our empirical evaluation demonstrates that the entropy-based heuristic used by our segmentation algorithm outperforms alternative approaches, as it is 43% more accurate than a geometric approach and 20% more accurate than a path-stretch-based approach. We further explore the link between trajectory entropy, mobility predictability and the nature of intermediate locations using a route choice model on real city maps.

Budgeted sensor placement for source localization on trees
L. E. Celis, F. Pavetic, B. Spinelli and P. Thiran
Latin-American Algorithms, Graphs and Optimization Symposium, 2015.
[full text] [view at publisher]

We address the problem of choosing a fixed number of sensor vertices in a graph in order to detect the source of a partially-observed diffusion process on the graph itself. Building on the definition of double resolvability, we introduce a notion of vertex resolvability. For the case of tree graphs, we give polynomial-time algorithms both for finding the sensors that maximize the probability of correct detection of the source and for identifying the sensor set that minimizes the expected distance between the real source and the estimated one.

Kullback-Leibler Proximal Variational Inference
M. E. Khan, P. B. Baqué, F. Fleuret and P. Fua
Advances in Neural Information Processing Systems (NIPS), Montreal, Canada, December 9, 2015.
[full text]

We propose a new variational inference method based on a proximal framework that uses the Kullback-Leibler (KL) divergence as the proximal term. We make two contributions towards exploiting the geometry and structure of the variational bound. Firstly, we propose a KL proximal-point algorithm and show its equivalence to variational inference with natural gradients (e.g. stochastic variational inference). Secondly, we use the proximal framework to derive efficient variational algorithms for non-conjugate models. We propose a splitting procedure to separate non-conjugate terms from conjugate ones. We linearize the non-conjugate terms to obtain subproblems that admit a closed-form solution. Overall, our approach converts inference in a non-conjugate model to subproblems that involve inference in well-known conjugate models. We show that our method is applicable to a wide variety of models and can result in computationally efficient algorithms. Applications to real-world datasets show comparable performance to existing methods.

UAVs Using Bayesian Optimization to Locate WiFi Devices
M. Carpin, S. Rosati, B. Rimoldi and M. E. Khan
Bayesopt 2015, Montreal, Canada, December 12, 2015.
[full text]

We address the problem of localizing non-collaborative WiFi devices in a large region. Our main motive is to localize humans by localizing their WiFi devices, e.g. during search-and-rescue operations after a natural disaster. We use an active sensing approach that relies on Unmanned Aerial Vehicles (UAVs) to collect signal-strength measurements at informative locations. The problem is challenging because measurements are received at arbitrary times and only when the UAV is in close proximity to the device. For these reasons, it is extremely important to make prudent decisions with very few measurements. We use the Bayesian optimization approach based on Gaussian process (GP) regression. This approach works well for our application since GPs give reliable predictions with very few measurements while Bayesian optimization makes a judicious trade-off between exploration and exploitation. In field experiments conducted over a region of 1000 × 1000 m², we show that our approach reduces the search area to less than 100 meters around the WiFi device within only 5 minutes. Overall, our approach localizes the device in less than 15 minutes with an error of less than 20 meters.
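The explore/exploit step can be sketched with a GP surrogate and an upper-confidence-bound rule; the kernel, length scale and beta below are illustrative guesses, not the paper's settings:

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    def next_waypoint(X_obs, rssi_obs, candidates, beta=2.0):
        # X_obs: (n, 2) visited locations; rssi_obs: signal strengths there;
        # candidates: (m, 2) reachable locations for the next measurement.
        gp = GaussianProcessRegressor(kernel=RBF(length_scale=50.0),
                                      alpha=1.0, normalize_y=True)
        gp.fit(X_obs, rssi_obs)
        mu, sigma = gp.predict(candidates, return_std=True)
        return candidates[np.argmax(mu + beta * sigma)]   # GP-UCB choice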

Expectation Propagation for Rectified Linear Poisson Regression
Y. J. Ko and M. Seeger
7th Asian Conference on Machine Learning (ACML), Hong Kong, November 20-22, 2015.
[full text]

The Poisson likelihood with a rectified linear function as the non-linearity is a physically plausible model to describe the stochastic arrival process of photons or other particles at a detector. At low emission rates the discrete nature of this process leads to measurement noise that behaves very differently from additive white Gaussian noise. To address the intractable inference problem for such models, we present a novel efficient and robust Expectation Propagation algorithm entirely based on analytically tractable computations, operating reliably in regimes where quadrature-based implementations can fail. Full posterior inference therefore becomes an attractive alternative in areas generally dominated by methods of point estimation. Moreover, we discuss the rectified linear function in the context of other common non-linearities and identify situations where it can serve as a robust alternative.

Hierarchical Routing Over Dynamic Wireless Networks
D. Tschopp, S. Diggavi and M. Grossglauser
Random Structures & Algorithms, 2015.
[view at publisher]

The topology of a mobile wireless network changes over time. Maintaining routes between all nodes requires the continuous transmission of control information, which consumes precious power and bandwidth resources. Many routing protocols have been developed, trading off control overhead and route quality. In this paper, we ask whether there exist low-overhead schemes that produce low-stretch routes, even in large networks where all the nodes are mobile. We present a scheme that maintains a hierarchical structure within which constant-stretch routes can be efficiently computed between every pair of nodes. The scheme rebuilds each level of the hierarchy periodically, at a rate that decreases exponentially with the level of the hierarchy. We prove that this scheme achieves constant stretch under a mild smoothness condition on the nodal mobility processes. Furthermore, we prove tight bounds for the network-wide control overhead under the additional assumption of the connectivity graph forming a doubling metric space. Specifically, we show that for a connectivity model combining the random geometric graph with obstacles, constant-stretch routes can be maintained with a total overhead of n log^2(n) bits of control information per time unit.

Fast and Accurate Inference of Plackett-Luce Models
L. Maystre and M. Grossglauser
Neural Information Processing Systems (NIPS), Montreal, Quebec, Canada, December 7-12, 2015.
[full text]

We show that the maximum-likelihood (ML) estimate of models derived from Luce’s choice axiom (e.g., the Plackett–Luce model) can be expressed as the stationary distribution of a Markov chain. This conveys insight into several recently proposed spectral inference algorithms. We take advantage of this perspective and formulate a new spectral algorithm that is significantly more accurate than previous ones for the Plackett–Luce model. With a simple adaptation, this algorithm can be used iteratively, producing a sequence of estimates that converges to the ML estimate. The ML version runs faster than competing approaches on a benchmark of five datasets. Our algorithms are easy to implement, making them relevant for practitioners at large.
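The flavor of the spectral step can be sketched as follows: build a Markov chain in which probability mass flows from losers to winners and read the ranking off its stationary distribution. The transition rates below are simplified placeholders; the paper derives the exact rates and the iteration that reaches the ML estimate:

    import numpy as np

    def spectral_scores(n_items, comparisons, n_steps=1000):
        # comparisons: list of (winner, loser) pairs.
        C = np.zeros((n_items, n_items))
        for w, l in comparisons:
            C[l, w] += 1.0                      # mass flows loser -> winner
        P = C / max(C.sum(axis=1).max(), 1e-9)  # scale rows by max out-degree
        P += np.diag(1.0 - P.sum(axis=1))       # lazy self-loops: rows sum to 1
        pi = np.ones(n_items) / n_items
        for _ in range(n_steps):                # power iteration
            pi = pi @ P
        return pi / pi.sum()                    # stationary distribution = scores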

When Can Two Unlabeled Networks Be Aligned Under Partial Overlap?
E. Kazemi, L. Yartseva and M. Grossglauser
53rd IEEE Communication, Control, and Computing (Annual Allerton Conference), University of Illinois at Urbana-Champaign, September 30-October 2, 2015.
[full text] [view at publisher]

Network alignment refers to the problem of matching the vertex sets of two unlabeled graphs, which can be viewed as a generalization of the classic graph isomorphism problem. Network alignment has applications in several fields, including social network analysis, privacy, pattern recognition, computer vision, and computational biology. A number of heuristic algorithms have been proposed in these fields. Recent progress in the analysis of network alignment over stochastic models sheds light on the interplay between network parameters and matchability. In this paper, we consider the alignment problem when the two networks overlap only partially, i.e., there exist vertices in one network that have no counterpart in the other. We define a random bigraph model that generates two correlated graphs $G_{1,2}$; it is parameterized by the expected node overlap $t^2$ and by the expected edge overlap $s^2$. We define a cost function for structural mismatch under a particular alignment, and we identify a threshold for perfect matchability: if the average node degrees of $G_{1,2}$ grow as $\omega\left(s^{-2}t^{-1} \log(n)\right)$, then minimization of the proposed cost function results in an alignment that (i) is over exactly the set of shared nodes between $G_1$ and $G_2$, and (ii) agrees with the true matching between these shared nodes. Our result shows that network alignment is fundamentally robust to partial edge and node overlaps.

Electri-Fi Your Data: Measuring and Combining Power-Line Communications with WiFi
C. Vlachou, S. Henri and P. Thiran
ACM Internet Measurement Conference (IMC) 2015, Tokyo, Japan, October 28-30, 2015.
[full text] [view at publisher]

Power-line communication (PLC) is widely used as it offers high data-rates and forms a network over electrical wiring, an existing and ubiquitous infrastructure. PLC is increasingly being deployed in hybrid networks that combine multiple technologies, the most popular among which is WiFi. However, so far, it is not clear to what extent PLC can boost network performance or how hybrid implementations can exploit this technology to the fullest. We compare the spatial and temporal variations of WiFi and PLC: despite the potential of PLC and its vast deployment in commercial products, little is known about its performance. To route or load-balance traffic in hybrid networks, a solid understanding of PLC and its link metrics is required. We conduct experiments in a testbed of more than 140 links. We introduce link metrics that are crucial for studying PLC and that are required for quality-aware algorithms by recent standardizations of hybrid networks. We explore the spatial and temporal variation of PLC channels, showing that they are highly asymmetric and that link quality and link-metric temporal variability are strongly correlated. Based on our variation study, we propose and validate a capacity estimation technique via a metric that only uses the frame header. We also focus on retransmissions due to channel errors or to contention, a metric related to delay, and examine the sensitivity of metrics to background traffic. Our performance evaluation provides insight into the implementation of hybrid networks and untangles the intricacies of the performance characteristics of the PHY and MAC layers.

CSMA/CA in Time and Frequency Domains
J. Herzen, A. Banchs, V. Shneer and P. Thiran
IEEE ICNP, San Francisco, CA, November 10-13, 2015.
[full text] [view at publisher]

It has recently been shown that flexible channelization, whereby wireless stations adapt their spectrum bands on a per-frame basis, is feasible in practice. In this paper, we propose TF-CSMA/CA, an algorithm for flexible channelization that schedules packets in time and frequency domains. TF-CSMA/CA is a simple extension of the CSMA/CA protocol used by IEEE 802.11. Contrary to existing channelization schemes, it is entirely distributed and it reacts only to packet collisions, successful transmissions and carrier sensing. With TF-CSMA/CA, when a station is involved in a collision, it performs backoff in both time and frequency domains. Backing off also in the frequency domain allows the transmitters to be much more efficient and aggressive in the time domain, which significantly reduces the severe overheads present with recent 802.11 PHY layers. The main challenge, however, is that the stations need some level of self-organization in order to find spectrum bands of variable widths that minimize interference, while still efficiently using the available spectrum. Using analysis and simulations, we show that such an extension of CSMA/CA to the frequency domain drastically improves both throughput and fairness. Notably, it enables the stations to find interference-free spectrum bands of appropriate size using no communication, relying only on collisions and successes as implicit signals.

Learning Wi-Fi Performance
J. Herzen, H. Lundgren and N. Hegde
IEEE SECON, Seattle, WA, June 22-25, 2015.
[full text] [view at publisher]

Accurate prediction of wireless network performance is important when performing link adaptation or resource allocation. However, the complexity of interference interactions at the MAC and PHY layers, as well as the vast variety of possible wireless configurations, make it notoriously hard to design explicit performance models. In this paper, we advocate an approach of “learning by observation” that can remove the need for designing explicit and complex performance models. We use machine-learning techniques to learn implicit performance models from a limited number of real-world measurements. These models do not require knowledge of the internal mechanics of interfering Wi-Fi links. Yet, our results show that they improve accuracy by at least 49% compared to measurement-seeded models based on SINR. To demonstrate that learned models can be useful in practice, we build a new algorithm that uses such a model as an oracle to jointly allocate spectrum and transmit power. Our algorithm is utility-optimal, distributed, and it produces efficient allocations that significantly improve performance and fairness.

Growing a Graph Matching from a Handful of Seeds
E. Kazemi, S. H. Hassani and M. Grossglauser
Proceedings of the VLDB Endowment International Conference on Very Large Data Bases, 2015.
[full text] [view at publisher]

In many graph-mining problems, two networks from different domains have to be matched. In the absence of reliable node attributes, graph matching has to rely on only the link structures of the two networks, which amounts to a generalization of the classic graph isomorphism problem. Graph matching has applications in social-network reconciliation and de-anonymization, protein-network alignment in biology, and computer vision. The most scalable graph-matching approaches use ideas from percolation theory, where a matched node pair “infects” neighbouring pairs as additional potential matches. This class of matching algorithms requires an initial seed set of known matches to start the percolation. The size and correctness of the matching is very sensitive to the size of the seed set. In this paper, we give a new graph-matching algorithm that can operate with a much smaller seed set than previous approaches, with only a small increase in matching errors. We characterize a phase transition in matching performance as a function of the seed set size, using a random bigraph model and ideas from bootstrap percolation theory. We also show the excellent performance in matching several real large-scale social networks, using only a handful of seeds.

Virtually Moving Base Stations for Energy Efficiency in Wireless Sensor Networks
R. Zhang, P. Thiran and M. Vetterli
The Sixteenth ACM International Symposium on Mobile Ad Hoc Networking and Computing, Hangzhou, China, June 22-25, 2015.
[full text] [view at publisher]

Energy efficiency of wireless sensor networks (WSNs) can be improved by moving base stations (BSs), as this scheme evenly distributes the communication load in the network. However, physically moving BSs is complicated and costly. In this paper, we propose a new scheme: virtually moving BSs. We deploy an excessive number of BSs and adaptively re-select a subset of active BSs so as to emulate the physical movement. Beyond achieving high energy efficiency, this scheme obviates the difficulties associated with physically moving BSs. The challenges are that (i) the energy efficiency of the BSs should be considered as well, in addition to that of the sensor nodes, and (ii) the number of candidate subsets of active BSs is exponential in the number of BSs. We show that scheduling the virtual movement of BSs is NP-hard. Then, we propose a polynomial-time algorithm that is guaranteed, under mild conditions, to achieve a lifetime longer than 62% of the optimal one. In practice, as verified through extensive numerical simulations, the lifetime achieved by the proposed algorithm is always very close to the optimum.

The Beauty of the Commons: Optimal Load Sharing by Base Station Hopping in Wireless Sensor Networks
R. Zhang, F. Ingelrest, G. Barrenetxea, P. Thiran and M. Vetterli
IEEE Journal on Selected Areas in Communications, 2015.
[full text] [view at publisher]

In wireless sensor networks (WSNs), the base station (BS) is a critical sensor node whose failure causes severe data losses. Deploying multiple fixed BSs improves the robustness, yet requires all BSs to be installed with large batteries and large energy-harvesting devices due to the high energy consumption of BSs. In this paper, we propose a scheme to coordinate the multiple deployed BSs such that the energy supplies required by individual BSs can be substantially reduced. In this scheme, only one BS is selected to be active at a time and the other BSs act as regular sensor nodes. We first present the basic architecture of our system, including how we keep the network running with only one active BS and how we manage the handover of the role of the active BS. Then, we propose an algorithm for adaptively selecting the active BS under the spatial and temporal variations of energy resources. This algorithm is simple to implement but is also asymptotically optimal under mild conditions. Finally, by running simulations and real experiments on an outdoor testbed, we verify that the proposed scheme is energy-efficient, has low communication overhead and reacts rapidly to network changes.

Opportunistic Sampling for Joint Population Size and Density Estimation
F. Movahedi Naini, O. Dousse, P. Thiran and M. Vetterli
IEEE Transactions on Mobile Computing, 2015.
[code] [full text] [view at publisher]

Consider a set of probes, called “agents”, who sample, based on opportunistic contacts, a population moving between a set of discrete locations. An example of such agents are Bluetooth probes that sample the visible Bluetooth devices in a population. Based on the obtained measurements, we construct a parametric statistical model to jointly estimate the total population size (e.g., the number of visible Bluetooth devices) and their spatial density. We evaluate the performance of our estimators by using Bluetooth traces obtained during an open-air event and Wi-Fi traces obtained on a university campus.

2014

Variational Gaussian Inference for Bilinear Models of Count Data
Y. J. Ko and M. E. Khan
6th Asian Conference on Machine Learning, Nha Trang, Vietnam, 2014.
[full text]

Bilinear models of count data with Poisson distribution are popular in applications such as matrix factorization for recommendation systems, modeling of receptive fields of sensory neurons, and modeling of neural-spike trains. Bayesian inference in such models remains challenging due to the product term of two Gaussian random vectors. In this paper, we propose new algorithms for such models based on variational Gaussian (VG) inference. We make two contributions. First, we show that the VG lower bound for these models, previously known to be intractable, is available in closed form under certain non-trivial constraints on the form of the posterior. Second, we show that the lower bound is biconcave and can be efficiently optimized for mean-field approximations. We also show that biconcavity generalizes to the larger family of log-concave likelihoods, which subsumes the Poisson distribution. We present new inference algorithms based on these results and demonstrate better performance on real-world problems at the cost of a modest increase in computation. Our contributions in this paper, therefore, provide more choices for Bayesian inference in terms of a speed-vs-accuracy tradeoff.

Buy-It-Now or Take-a-Chance: Price Discrimination Through Randomized Auctions
L. E. Celis, G. Lewis, M. Mobius and H. Nazerzadeh
Management Science, 2014.
[view at publisher]

Increasingly detailed consumer information makes sophisticated price discrimination possible. At fine levels of aggregation, demand may not obey standard regularity conditions. We propose a new randomized sales mechanism for such environments. Bidders can "buy-it-now" at a posted price, or "take-a-chance" in an auction where the top d > 1 bidders are equally likely to win. The randomized allocation incentivizes high-valuation bidders to buy-it-now. We analyze equilibrium behavior and apply our analysis to advertiser bidding data from Microsoft Advertising Exchange. In counterfactual simulations, our mechanism increases revenue by 4.4% and consumer surplus by 14.5% compared to an optimal second-price auction.

Data-driven healthcare: from patterns to actions
M. Grossglauser and H. Saner
European Journal of Preventive Cardiology, 2014.
[view at publisher]

The era of big data opens up new opportunities in personalised medicine, preventive care, chronic disease management and in telemonitoring and managing of patients with implanted devices. The rich data accumulating within online services and internet companies provide a microscope to study human behaviour at scale, and to ask completely new questions about the interplay between behavioural patterns and health. In this paper, we shed light on a particular aspect of data-driven healthcare: autonomous decision-making. We first look at three examples where we can expect data-driven decisions to be taken autonomously by technology, with no or limited human intervention. We then discuss some of the technical and practical challenges that can be expected, and sketch the research agenda to address them.

Analyzing and Boosting the Performance of Power-Line Communication Networks
C. Vlachou, A. Banchs, J. Herzen and P. Thiran
ACM 10th International Conference on emerging Networking EXperiments and Technologies (CoNEXT), Sydney, Australia, December 2-5, 2014.
[full text] [view at publisher]

Power-line communications are employed in home networking to provide easy and high-throughput connectivity. IEEE 1901, the MAC protocol for power-line networks, employs a CSMA/CA protocol similar to that of 802.11, but is substantially more complex, which probably explains why little is known about its performance. One of the key differences between the two protocols is that whereas 802.11 only reacts upon collisions, 1901 also reacts upon several consecutive transmissions and thus can potentially achieve better performance by avoiding unnecessary collisions. In this paper, we propose a model for the 1901 MAC. Our analysis reveals that the default configuration of 1901 does not fully exploit its potential and that its performance degrades with the number of stations. We derive analytically the optimal configuration parameters for 1901; this drastically improves throughput and achieves optimal performance without requiring the knowledge of the number of stations in the network. In contrast, to provide a similar performance, 802.11 requires knowing the number of contending stations, which is unfeasible for realistic traffic patterns. Our solution can be readily implemented by vendors, as it only consists in modifying existing MAC parameters. We corroborate our results with testbed measurements, unveiling a significant signaling overhead in 1901 implementations.

On the MAC for Power-Line Communications: Modeling Assumptions and Performance Tradeoffs
C. Vlachou, A. Banchs, J. Herzen and P. Thiran
IEEE 22nd International Conference on Network Protocols (ICNP), The Research Triangle, North Carolina, 2014.
[full text] [view at publisher]

Power-line communications are becoming a key component in home networking. The dominant MAC protocol for high data-rate power-line communications, IEEE 1901, employs a CSMA/CA mechanism similar to the backoff process of 802.11. Existing performance evaluation studies of this protocol assume that the backoff processes of the stations are independent (the so-called decoupling assumption). However, in contrast to 802.11, 1901 stations can change their state after sensing the medium busy, which introduces strong coupling between the stations, and, as a result, makes existing analyses inaccurate. In this paper, we propose a new performance model for 1901, which does not rely on the decoupling assumption. We prove that our model admits a unique solution. We confirm the accuracy of our model using both testbed experiments and simulations, and we show that it surpasses current models based on the decoupling assumption. Furthermore, we study the tradeoff between delay and throughput existing with 1901. We show that this protocol can be configured to accommodate different throughput and jitter requirements, and give systematic guidelines for its configuration.

Mining Democracy
V. Etter, J. Herzen, M. Grossglauser and P. Thiran
ACM Conference on Online Social Networks (COSN'14), Dublin, Ireland, October 1-2, 2014.
[full text] [view at publisher]

Switzerland has a long tradition of direct democracy, which makes it an ideal laboratory for research on real-world politics. Similar to recent open government initiatives launched worldwide, the Swiss government regularly releases datasets related to state affairs and politics. In this paper, we propose an exploratory, data-driven study of the political landscape of Switzerland, in which we use opinions expressed by candidates and citizens on a web platform during the recent Swiss parliamentary elections, together with fine-grained vote results and parliament votes. Following this purely data-driven approach, we show that it is possible to uncover interesting patterns that would otherwise require both tedious manual analysis and domain knowledge. In particular, we show that traditional cultural and/or ideological idiosyncrasies can be highlighted and quantified by looking at vote results and pre-election opinions. We propose a technique for comparing the candidates' opinions expressed before the elections with their actual votes cast in the parliament after the elections. This technique spots politicians that do not vote consistently with the opinions that they expressed during the campaign. We also observe that it is possible to predict surprisingly precisely the outcome of nationwide votes, by looking at the outcome in a single, carefully selected municipality. Our work applies to any country where similar data is available; it points to some of the avenues created by user-generated data emerging from open government initiatives, which enable new data-mining approaches to political and social sciences.

Performance Analysis of MAC for Power-Line Communications
C. Vlachou, A. Banchs, J. Herzen and P. Thiran
ACM SIGMETRICS 2014 (poster session), Austin, Texas, USA, 16-20 June, 2014.
[full text] [view at publisher]

We investigate the IEEE 1901 MAC protocol, the dominant protocol for high data rate power-line communications. 1901 employs a CSMA/CA mechanism similar to – but much more complex than – the backoff mechanism of 802.11. Because of this extra complexity, and although this mechanism is the only widely used MAC layer for power-line networks, there are few analytical results on its performance. We propose a model for the 1901 MAC that comes in the form of a single fixed-point equation for the collision probability. We prove that this equation admits a unique solution, and we evaluate the accuracy of our model by using simulations.
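To show the form such a model takes, here is a damped fixed-point iteration in the style of 802.11 (Bianchi-type) analyses; the actual 1901 equation derived in the paper is different, and the toy tau below is purely illustrative:

    def solve_fixed_point(tau_of, n_stations, iters=500, damping=0.5):
        # Solve p = 1 - (1 - tau(p))**(n-1) for the collision probability p,
        # where tau_of maps p to a station's transmission probability.
        p = 0.5
        for _ in range(iters):
            p_new = 1.0 - (1.0 - tau_of(p)) ** (n_stations - 1)
            p = damping * p + (1.0 - damping) * p_new   # damped update
        return p

    # Toy example: a transmission probability that shrinks as collisions grow.
    p = solve_fixed_point(lambda p: 2.0 * (1.0 - p) / (32.0 + (1.0 - p)), 10)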

Privacy-Preserving Function Computation by Exploitation of Friendships in Social Networks
F. Movahedi Naini, J. Unnikrishnan, P. Thiran and M. Vetterli
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, Italy, May 4-9, 2014.
[full text] [view at publisher]

We study the problem of privacy-preserving computation of functions of data that belong to users in a social network, under the assumption that users are willing to share their private data with trusted friends in the network. We demonstrate that such trust relationships can be exploited to significantly improve the trade-off between the privacy of users' data and the accuracy of the computation. Under a one-hop trust model, we design an algorithm for partitioning the users into circles of trust and develop a differentially private scheme for computing the global function using results of local computations within each circle. We quantify the improvement in the privacy-accuracy trade-off of our scheme with respect to other mechanisms that do not exploit inter-user trust. We verify the efficiency of our algorithm by implementing it on social networks with up to one million nodes. Applications of our method include surveys, elections, and recommendation systems.
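The circle-level mechanism can be sketched with the Laplace mechanism: each circle releases one noised subtotal, so noise enters once per circle rather than once per user. This is a minimal sketch, assuming bounded values and a single flat epsilon; the paper's scheme and privacy accounting are more refined:

    import numpy as np

    def private_sum(values, circles, eps=1.0, clip=1.0, seed=0):
        # circles: list of index lists partitioning the users.
        rng = np.random.default_rng(seed)
        v = np.clip(np.asarray(values, dtype=float), -clip, clip)
        total = 0.0
        for circle in circles:
            sub = v[circle].sum()                             # trusted local sum
            total += sub + rng.laplace(scale=2 * clip / eps)  # sensitivity 2*clip
        return total

Changing one user's value affects a single circle's subtotal by at most 2*clip, which is why fewer, larger circles mean less total noise for the same privacy level.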

Hiding in the Mobile Crowd: Location Privacy through Collaboration
R. Shokri, G. Theodorakopoulos, P. Papadimitratos, E. Kazemi and J. Hubaux
IEEE Transactions on Dependable and Secure Computing, 2014.
[full text] [view at publisher]

Location-aware smartphones support various location-based services (LBSs): users query the LBS server and learn on the fly about their surroundings. However, such queries give away private information, enabling the LBS to track users. We address this problem by proposing a user-collaborative privacy-preserving approach for LBSs. Our solution does not require changing the LBS server architecture and does not assume third party servers; yet, it significantly improves users’ location privacy. The gain stems from the collaboration of mobile devices: they keep their context information in a buffer and pass it to others seeking such information. Thus, a user remains hidden from the server, unless all the collaborative peers in the vicinity lack the sought information. We evaluate our scheme against the Bayesian localization attacks that allow for strong adversaries who can incorporate prior knowledge in their attacks. We develop a novel epidemic model to capture the (possibly time-dependent) dynamics of information propagation among users. Used in the Bayesian inference framework, this model helps analyze the effects of various parameters, such as users’ querying rates and the lifetime of context information, on users’ location privacy. The results show that our scheme hides a high fraction of location-based queries, thus significantly enhancing users’ location privacy. Our simulations with real mobility traces corroborate our model-based findings. Finally, our implementation on mobile platforms indicates that it is lightweight and the cost of collaboration is negligible.

2013

A Bayesian Method for Matching Two Similar Graphs without Seeds
P. Pedarsani, D. R. Figueiredo and M. Grossglauser
Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, October 2013.
[full text] [view at publisher]

Approximate graph matching (AGM) refers to the problem of mapping the vertices of two structurally similar graphs, which has applications in social networks, computer vision, chemistry, and biology. Given its computational cost, AGM has mostly been limited to either small graphs (e.g., tens or hundreds of nodes), or to large graphs in combination with side information beyond the graph structure (e.g., a seed set of pre-mapped node pairs). In this paper, we cast AGM in a Bayesian framework based on a clean definition of the probability of correctly mapping two nodes, which leads to a polynomial time algorithm that does not require side information. Node features such as degree and distances to other nodes are used as fingerprints. The algorithm proceeds in rounds, such that the most likely pairs are mapped first; these pairs subsequently generate additional features in the fingerprints of other nodes. We evaluate our method over real social networks and show that it achieves a very low matching error provided the two graphs are sufficiently similar. We also evaluate our method on random graph models to characterize its behavior under various levels of node clustering.

Mitigating Epidemics through Mobile Micro-measures
M. Kafsi, E. Kazemi, L. Maystre, L. Yartseva, M. Grossglauser and P. Thiran
NetMob, Boston, Massachusetts, USA, May 2013.
[full text]

De-anonymizing Private Data by Matching Statistics
J. Unnikrishnan and F. Movahedi Naini
Allerton Conference on Communication, Control, and Computing, Monticello, Illinois, USA, October 2-4, 2013.
[full text] [view at publisher]

Where to go from here? Mobility prediction from instantaneous information
V. Etter, M. Kafsi, E. Kazemi, M. Grossglauser and P. Thiran
Pervasive and Mobile Computing, 2013.
[view at publisher]

We present the work that allowed us to win the Next-Place Prediction task of the Nokia Mobile Data Challenge. Using data collected from the smartphones of 80 users, we explore the characteristics of their mobility traces. We then develop three families of predictors, including tailored models and generic algorithms, to predict, based on instantaneous information only, the next place a user will visit. These predictors are enhanced with aging techniques that allow them to adapt quickly to the users' changes of habit. Finally, we devise various strategies to blend predictors together and take advantage of their diversity, leading to relative improvements of up to 4%.

Wireless Multi-hop Networks Beyond Capacity
A. Aziz, S. Shneer and P. Thiran
19th IEEE International Workshop on Local and Metropolitan Area Networks (LANMAN), 2013.
[view at publisher]

Wireless multi-hop local area networks generally use scheduling schemes that assume the network capacity to be known. Indeed, in most throughput-optimal algorithms, the sources are assumed to send at a rate within the capacity region. However, measurements made on real deployments show that the network capacity is usually difficult to characterize and also time-varying. It is therefore important to understand how the network behaves when the sources attempt to transmit at a rate above capacity. Toward this goal, we show a three-phase regime in the effect of the input rate $\lambda$ on the end-to-end throughput $\mu$ of a multi-hop network. First, when $\lambda$ is smaller than a threshold $\lambda_1$, $\mu$ is an increasing function of $\lambda$. Second, when $\lambda$ is larger than another threshold $\lambda_2 > \lambda_1$, $\mu$ is independent of $\lambda$. Third, when $\lambda_1 < \lambda < \lambda_2$, $\mu$ decreases with $\lambda$. To understand this phenomenon, we capture the relation between the end-to-end throughput and queue stability with a mathematical model that allows us to explain and derive the exact values of the transition points $\lambda_i$. We then validate our simulation results experimentally with measurements on a testbed composed of five wireless routers.

SAW: Spectrum Assignment for WLANs
J. Herzen, R. Merz and P. Thiran
ACM S3 2013, Miami, Florida, USA, September 30, 2013.
[full text]

We consider the problem of jointly allocating channel center-frequencies and bandwidths for IEEE 802.11 wireless LANs (WLANs). The bandwidth used on a link significantly affects both the capacity experienced on this link and the interference produced on neighboring links. Therefore, when jointly assigning both center frequencies and channel widths, a trade-off must be found between interference mitigation and the potential capacity offered on each link. We study this trade-off and present SAW (spectrum assignment for WLANs), a decentralized algorithm that finds efficient configurations. SAW is tailored for 802.11 home networks. It is distributed, online and transparent. It does not require a central coordinator and it constantly adapts the spectrum usage without disrupting network traffic. The algorithm is decentralized and self-organizing; it provably converges towards efficient spectrum allocations. We evaluate SAW using both simulation and a deployment on an indoor testbed composed of off-the-shelf 802.11 hardware. We observe that it dramatically increases the overall network efficiency and fairness, even when some Access Points (APs) do not behave socially.

On the Performance of Percolation Graph Matching
L. Yartseva and M. Grossglauser
COSN'13: Conference on Online Social Networks, Boston, Massachusetts, USA, October 7-8, 2013.
[full text] [view at publisher]

Graph matching is a generalization of the classic graph isomorphism problem. By using only their structures a graph-matching algorithm finds a map between the vertex sets of two similar graphs. This has applications in the de-anonymization of social and information networks and, more generally, in the merging of structural data from different domains. One class of graph-matching algorithms starts with a known seed set of matched node pairs. Despite the success of these algorithms in practical applications, their performance has been observed to be very sensitive to the size of the seed set. The lack of a rigorous understanding of parameters and performance makes it difficult to design systems and predict their behavior. In this paper, we propose and analyze a very simple percolation-based graph matching algorithm that incrementally maps every pair of nodes $(i,j)$ with at least $r$ neighboring mapped pairs. The simplicity of this algorithm makes possible a rigorous analysis that relies on recent advances in bootstrap percolation theory for the $G(n,p)$ random graph. We prove conditions on the model parameters in which percolation graph matching succeeds, and we establish a phase transition in the size of the seed set. We also confirm through experiments that the performance of percolation graph matching is surprisingly good, both for synthetic graphs and real social-network data.
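A minimal implementation of the r-threshold percolation rule described above (adjacency is assumed to be given as dicts of neighbor sets; the deferred variants and the analysis in the paper are omitted):

    from collections import Counter

    def percolation_match(G1, G2, seeds, r=2):
        # G1, G2: {node: set(neighbors)}; seeds: list of known pairs (i, j).
        matched = dict(seeds)
        used1, used2 = set(matched), set(matched.values())
        marks = Counter()
        frontier = list(matched.items())
        while frontier:
            i, j = frontier.pop()
            for u in G1[i]:                 # spread marks to neighboring pairs
                for v in G2[j]:
                    if u in used1 or v in used2:
                        continue
                    marks[(u, v)] += 1
                    if marks[(u, v)] >= r:  # percolation threshold reached
                        matched[u] = v
                        used1.add(u); used2.add(v)
                        frontier.append((u, v))
        return matched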

Launch Hard or Go Home! Predicting the Success of Kickstarter Campaigns
V. Etter, M. Grossglauser and P. Thiran
The first ACM conference on Online Social Networks (COSN'13), Boston, Massachusetts, USA, October 7-8, 2013.
[full text] [view at publisher]

Crowdfunding websites such as Kickstarter are becoming increasingly popular, allowing project creators to raise hundreds of millions of dollars every year. However, only one out of two Kickstarter campaigns reaches its funding goal and is successful. It is therefore of prime importance, both for project creators and backers, to be able to know which campaigns are likely to succeed. We propose a method for predicting the success of Kickstarter campaigns by using both direct information and social features. We introduce a first set of predictors that uses the time series of money pledges to classify campaigns as probable success or failure and a second set that uses information gathered from tweets and Kickstarter's projects/backers graph. We show that even though the predictors that are based solely on the amount of money pledged reach a high accuracy, combining them with predictors using social features enables us to improve the performance significantly. In particular, only 4 hours after the launch of a campaign, the combined predictor reaches an accuracy of more than 76% (a relative improvement of 4%).
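A sketch of the money-based predictor family: summarize the first hours of the pledge curve into features and train a classifier. The 4-hour cutoff mirrors the abstract, but the exact feature set here is invented for illustration (the paper additionally uses Twitter and projects/backers-graph features):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def fit_early_predictor(pledge_curves, goals, success, t_hours=4):
        # pledge_curves: (n_campaigns, n_hours) cumulative pledges;
        # goals: funding goals; success: 0/1 outcome labels.
        frac = pledge_curves[:, t_hours] / np.asarray(goals, dtype=float)
        slope = np.diff(pledge_curves[:, :t_hours + 1], axis=1).mean(axis=1)
        X = np.column_stack([np.log1p(frac), np.log1p(slope)])
        return LogisticRegression().fit(X, success)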

The Entropy of Conditional Markov Trajectories
M. Kafsi, M. Grossglauser and P. Thiran
IEEE Transactions on Information Theory, 2013.
[view at publisher]

To quantify the randomness of Markov trajectories with fixed initial and final states, Ekroot and Cover proposed a closed-form expression for the entropy of trajectories of an irreducible finite state Markov chain. Numerous applications, including the study of random walks on graphs, require the computation of the entropy of Markov trajectories conditional on a set of intermediate states. However, the expression of Ekroot and Cover does not allow for computing this quantity. In this paper, we propose a method to compute the entropy of conditional Markov trajectories through a transformation of the original Markov chain into a Markov chain that exhibits the desired conditional distribution of trajectories. Moreover, we express the entropy of Markov trajectories—a global quantity—as a linear combination of local entropies associated with the Markov chain states.
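For the unconditional case, the "linear combination of local entropies" view gives a compact computation: make the destination absorbing and solve a linear system. A sketch (base-2 entropy; the chain transformation needed to condition on intermediate states is not shown):

    import numpy as np

    def trajectory_entropy(P, dest):
        # P: row-stochastic transition matrix; dest: final state.
        # Returns H[s], the entropy of trajectories from s to dest,
        # via H = h + Q H  =>  H = (I - Q)^{-1} h on transient states.
        n = P.shape[0]
        Pa = P.copy()
        Pa[dest] = 0.0                       # make dest absorbing
        with np.errstate(divide="ignore", invalid="ignore"):
            h = np.where(Pa > 0, -Pa * np.log2(Pa), 0.0).sum(axis=1)
        keep = [i for i in range(n) if i != dest]
        Q = Pa[np.ix_(keep, keep)]
        H = np.zeros(n)
        H[keep] = np.linalg.solve(np.eye(len(keep)) - Q, h[keep])
        return H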

Nowhere to Hide: Navigating around Privacy in Online Social Networks
M. Humbert, T. Studer, M. Grossglauser and J. Hubaux
The 18th European Symposium on Research in Computer Security (ESORICS), Egham, United Kingdom, September 9-11, 2013.
[full text] [view at publisher]

In this paper, we introduce a navigation privacy attack, where an external adversary attempts to find a target user by exploiting publicly visible attributes of intermediate users. If such an attack is successful, it implies that a user cannot hide simply by excluding himself from a central directory or search function. The attack exploits the fact that most attributes (such as place of residence, age, or alma mater) tend to correlate with social proximity, which can be exploited as navigational cues while crawling the network. The problem is exacerbated by privacy policies where a user who keeps his profile private remains nevertheless visible in his friends' "friend lists"; such a user is still vulnerable to our navigation attack. Experiments with Facebook and Google+ show that the majority of users can be found efficiently using our attack, if a small set of attributes are known about the target as side information. Our results suggest that, in an online social network where many users reveal even a limited set of attributes, it is nearly impossible for a specific user to "hide in the crowd".

Distributed Spectrum Assignment for Home WLANs
J. Herzen, R. Merz and P. Thiran
IEEE Infocom 2013, Torino, Italy, April 15-19, 2013.
[full text] [view at publisher]

We consider the problem of jointly allocating chan- nel center frequencies and bandwidths for IEEE 802.11 wireless LANs (WLANs). The bandwidth used on a link affects sig- nificantly both the capacity experienced on this link and the interference produced on neighboring links. Therefore, when jointly assigning both center frequencies and channel widths, there is a trade-off between interference mitigation and the potential capacity offered on each link. We study this trade- off and we present SAW (spectrum assignment for WLANs), a decentralized algorithm that finds efficient configurations. SAW is tailored for 802.11 home networks. It is distributed, online and transparent. It does not require a central coordinator and it constantly adapts the spectrum usage without disrupting network traffic. A key feature of SAW is that the access points (APs) need only a few out-of-band measurements in order to make spectrum allocation decisions. Despite being completely decentralized, the algorithm is self-organizing and provably converges towards efficient spectrum allocations. We evaluate SAW using both simulation and a deployment on an indoor testbed composed of off-the-shelf 802.11 hardware. We observe that it dramatically increases the overall network efficiency and fairness.
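
As a rough sketch of how such a decentralized adaptation step could look (in the spirit of SAW, but not the paper's exact rule), each AP can occasionally re-sample a (center frequency, width) pair with a probability that decays with the cost it measures out-of-band; the cost function below is a placeholder assumption.

    # One local spectrum-adaptation step (illustrative Gibbs-like rule).
    # configs: candidate (centre_freq, width) pairs available to this AP.
    # measure_cost(c): local cost derived from a few out-of-band
    # measurements, e.g. interference minus expected capacity.
    import math, random

    def adapt_step(configs, measure_cost, temperature=1.0):
        costs = [measure_cost(c) for c in configs]
        # Lower measured cost -> exponentially higher selection probability.
        weights = [math.exp(-c / temperature) for c in costs]
        r, acc = random.uniform(0, sum(weights)), 0.0
        for c, w in zip(configs, weights):
            acc += w
            if r <= acc:
                return c  # the AP switches to this configuration
        return configs[-1]

Annealing the temperature towards zero makes randomized local rules of this kind concentrate on low-cost configurations, which is the usual route to the sort of convergence guarantee claimed above.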

Fairness of MAC protocols: IEEE 1901 vs. 802.11
C. Vlachou, J. Herzen and P. Thiran
IEEE 17th International Symposium on Power Line Communications and Its Applications (ISPLC), Johannesburg, South Africa, 2013.
[full text] [view at publisher]

The MAC layer of the IEEE 1901 standard for power line communications employs a CSMA/CA method similar to, but more complex than, that of IEEE 802.11 for wireless communications. The differences between these protocols raise questions such as which one performs better and under what conditions. We study the fairness of the 1901 MAC protocol in single contention domain networks, where all stations hear each other. We examine fairness at the packet level: a MAC layer protocol is fair if all stations share the medium equitably during a fixed time interval. We focus on short-term fairness, that is, fairness over short time intervals. Short-term fairness directly impacts the end-user experience, because unfair protocols are liable to introduce substantial packet delays. We evaluate short-term fairness with two metrics: Jain's fairness index and the number of inter-transmissions. We present test-bed results for both protocols and compare them with simulations. Both simulation and test-bed results indicate that 802.11 is fairer in the short term when the number of stations N is between 2 and 5. However, simulation results reveal that 1901 is fairer in the short term for N >= 15. Importantly, our test-bed measurements indicate that 1901 unfairness can cause significant additional delay when N = 2. Finally, we confirm these results by showing analytically that 1901 is short-term unfair for N = 2.
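
Of the two metrics, Jain's fairness index has a simple closed form: for per-station transmission counts $x_1, \dots, x_N$ in a window, $J = (\sum_i x_i)^2 / (N \sum_i x_i^2)$, ranging from $1/N$ (one station monopolizes the medium) to 1 (perfect fairness). A small self-contained illustration:

    # Jain's fairness index over per-station transmission counts.
    def jain_index(x):
        n, s = len(x), sum(x)
        return (s * s) / (n * sum(v * v for v in x)) if s > 0 else 1.0

    print(jain_index([5, 5]))  # 1.0: two stations share a window equally
    print(jain_index([9, 1]))  # ~0.61: short-term unfairness at N = 2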

2012

Locating the Source of Diffusion in Large-Scale Networks
P. Pinto, P. Thiran and M. Vetterli
Physical Review Letters, 2012.
[full text] [view at publisher]

How can we localize the source of diffusion in a complex network? Due to the tremendous size of many real networks (such as the Internet or the human social graph), it is usually infeasible to observe the state of all nodes in a network. We show that it is fundamentally possible to estimate the location of the source from measurements collected by sparsely placed observers. We present a strategy that is optimal for arbitrary trees, achieving maximum probability of correct localization. We describe efficient implementations with complexity $O(N^{\alpha})$, where $\alpha=1$ for arbitrary trees and $\alpha=3$ for arbitrary graphs. In the context of several case studies, we determine how localization accuracy is affected by various system parameters, including the structure of the network, the density of observers, and the number of observed cascades.
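
To make the estimation idea concrete, here is a deliberately simplified stand-in for the estimator (a brute-force scan, not the paper's optimal tree algorithm): if the diffusion spreads at roughly unit speed, the arrival time at each observer $o$ should be close to $t_0 + \mathrm{dist}(s, o)$ for the true source $s$, so we score candidates by the consistency of the implied start times. The networkx dependency and the unit-speed assumption are ours.

    # Simplified source estimation from sparsely placed observers.
    # arrivals: dict mapping observer node -> observed arrival time.
    import networkx as nx  # assumed helper for shortest-path distances

    def estimate_source(G, arrivals):
        best, best_score = None, float("inf")
        for s in G.nodes:
            dist = nx.single_source_shortest_path_length(G, s)
            # Each observer implies a start time t0 = t_o - dist(s, o);
            # for the true source these estimates should nearly agree.
            t0 = [t - dist[o] for o, t in arrivals.items()]
            mean = sum(t0) / len(t0)
            score = sum((x - mean) ** 2 for x in t0)
            if score < best_score:
                best, best_score = s, score
        return best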

Been There, Done That: What Your Mobility Traces Reveal about Your Behavior
V. Etter, M. Kafsi and E. Kazemi
Mobile Data Challenge by Nokia Workshop, in conjunction with Int. Conf. on Pervasive Computing, Newcastle, UK, June 18-19, 2012.
[full text]

Mobility is a central aspect of our life; the locations we visit reflect our tastes and lifestyle and shape our social relationships. The ability to foresee the places a user will visit is therefore beneficial to numerous applications, ranging from forecasting the dynamics of crowds to improving the relevance of location-based recommendations. To solve the Next Place Prediction task of the Nokia Mobile Data Challenge, we developed several mobility predictors based on graphical models, neural networks, and decision trees, and we explain some of the challenges that we faced. We then combine these predictors using different blending strategies, which improves the prediction accuracy over any individual predictor.
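
As a small illustration of blending (a generic accuracy-weighted vote, not necessarily the authors' exact scheme), each predictor's next-place guess can be weighted by its validation accuracy:

    # Accuracy-weighted vote over next-place predictions (illustrative).
    from collections import defaultdict

    def blend(predictions, weights):
        # predictions: one predicted place id per predictor;
        # weights: the matching validation accuracies.
        score = defaultdict(float)
        for place, w in zip(predictions, weights):
            score[place] += w
        return max(score, key=score.get)

    # Two of three predictors favour "home", so the blend does too.
    print(blend(["home", "work", "home"], [0.62, 0.55, 0.48]))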