ABSTRACT
We address the problem of predicting aggregate vote outcomes (e.g., national) from partial outcomes (e.g., regional) that are revealed sequentially. We combine matrix factorization techniques and generalized linear models (GLMs) to obtain a flexible, efficient, and accurate algorithm. This algorithm works in two stages: First, it learns representations of the regions from high-dimensional historical data. Second, it uses these representations to fit a GLM to the partially observed results and to predict unobserved results. We show experimentally that our algorithm is able to accurately predict the outcomes of Swiss referenda, U.S. presidential elections, and German legislative elections. We also explore the regional representations in terms of ideological and cultural patterns. Finally, we deploy an online Web platform (www.predikon.ch) to provide real-time vote predictions in Switzerland and a data visualization tool to explore voting behavior. A by-product is a dataset of sequential vote results for 330 referenda and 2196 Swiss municipalities.
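The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain truncated SVD for the representation step and a Gaussian GLM with identity link (ridge-regularized least squares) for the prediction step. The function names, the `ridge` parameter, the population `weights`, and the synthetic data are all hypothetical.

```python
import numpy as np


def learn_region_embeddings(history, dim):
    """Stage 1 (assumed: truncated SVD). `history` is (regions x past votes)."""
    centered = history - history.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :dim] * s[:dim]  # one `dim`-dimensional vector per region


def predict_vote(embeddings, observed_idx, observed_y, weights, ridge=1e-2):
    """Stage 2 (assumed: Gaussian GLM, identity link) fitted on observed regions."""
    observed_y = np.asarray(observed_y, dtype=float)
    n = embeddings.shape[0]
    X = np.hstack([embeddings, np.ones((n, 1))])  # append a bias column
    Xo = X[observed_idx]
    penalty = ridge * np.eye(X.shape[1])
    penalty[-1, -1] = 0.0  # leave the bias term unregularized
    coef = np.linalg.solve(Xo.T @ Xo + penalty, Xo.T @ observed_y)
    pred = np.clip(X @ coef, 0.0, 1.0)  # vote shares lie in [0, 1]
    pred[observed_idx] = observed_y  # keep results that are already known
    national = np.average(pred, weights=weights)  # weighted aggregate outcome
    return pred, national


# Hypothetical synthetic data: 50 regions, 40 past votes, 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 3))
history = 0.5 + latent @ (0.05 * rng.normal(size=(40, 3))).T
new_vote = 0.5 + latent @ (0.05 * rng.normal(size=3))
weights = rng.integers(100, 1000, size=50).astype(float)

embeddings = learn_region_embeddings(history, dim=3)
observed = np.arange(20)  # pretend the first 20 regions have reported
pred, national = predict_vote(embeddings, observed, new_vote[observed], weights)
print("predicted aggregate:", round(float(national), 4))
print("true aggregate:     ", round(float(np.average(new_vote, weights=weights)), 4))
```

As more regions report, `observed` grows and the second stage can be refitted cheaply, because the embeddings from the first stage stay fixed. A Bernoulli or binomial likelihood with a logit link would be a natural alternative for vote shares; the identity link is used here only to keep the sketch short.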
Index Terms
- Sub-Matrix Factorization for Real-Time Vote Prediction