Hi 👋 I’m Dimitris Spathis

Research Scientist
Google
         


I am a research scientist at Google and a visiting researcher at the University of Cambridge. My work enables AI to handle the messiness of the real world through human-centric, data-efficient, and robust machine learning. I am particularly interested in the following areas:

  • AI for Sequential & Multimodal Data: I develop new AI models that make the most of high-frequency person-generated data through self-supervised learning [CHIL'21], multimodal fusion [WSDM'24], forecasting [KDD'19 oral], and knowledge distillation [UbiComp'21].
  • Accessible Health Sensing: I build AI systems that detect vital health information without specialized equipment, with applications to disease monitoring [NeurIPS'21], cardio fitness [Nature Dig. Medicine'22], sleep disorders [Sci. Reports'22], and more.
  • Robust & Trustworthy AI: I develop reliable ML algorithms for high-stakes applications, focusing on out-of-distribution generalization [ML4H'22, ACLw'17], forgetting [WACV'24], fairness [KDD'24], and ethical considerations [JAMIA'21].

Previously, I was a senior research scientist at Nokia Bell Labs, leading efforts in AI for multimodal health. Before that, I completed a PhD in Computer Science at the University of Cambridge working with Prof. Cecilia Mascolo. During my studies, I was fortunate to work at Microsoft Research, Telefonica Research, and Ocado. I also helped start COVID-19 Sounds, one of the largest studies in audio AI for health.

My research has been published in top venues in artificial intelligence, AI for health, and human-centered signal processing. Recent projects have been featured in international media such as the BBC, CNN, Guardian, Washington Post, Forbes, and Financial Times (see more below).

CV

📖 Publications


2024

🦜PaPaGei: Open Foundation Models for Optical Physiological Signals

Arvind Pillai, Dimitris Spathis, Fahim Kawsar, Mohammad Malekzadeh
NeurIPS Workshop on Time Series in the Age of Large Models (TSALM @ NeurIPS'24), Vancouver, Canada
(long paper under review)
Oral presentation

The first step is the hardest: Pitfalls of Representing and Tokenizing Temporal Data for Large Language Models

Dimitris Spathis, Fahim Kawsar
Journal of the American Medical Informatics Association
also presented in: Generative AI for Pervasive Computing Symposium (GenAI4PC) at UbiComp 2023, Cancun, Mexico

Using Self-Supervised Learning Can Improve Model Fairness

Sofia Yfantidou, Dimitris Spathis, Marios Constantinides, Athena Vakali, Daniele Quercia, Fahim Kawsar
International Conference on Knowledge Discovery and Data Mining (KDD'24), Barcelona, Spain
also presented in: Human-centric Representation Learning workshop at AAAI 2024, Vancouver, Canada

CroSSL: Cross-modal Self-Supervised Learning for Time-series through Latent Masking

Shohreh Deldari, Dimitris Spathis, Mohammad Malekzadeh, Fahim Kawsar, Flora Salim, Akhil Mathur
ACM Conference on Web Search and Data Mining (WSDM'24), Merida, Mexico
also presented in: ICML Machine Learning for Multimodal Health Data workshop, Hawaii, USA

Kaizen: Practical self-supervised continual learning with continual fine-tuning

Chi Ian Tang, Lorena Qendro, Dimitris Spathis, Fahim Kawsar, Cecilia Mascolo, Akhil Mathur
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV'24), Hawaii, USA

StatioCL: Contrastive Learning for Time Series via Non-Stationary and Temporal Contrast

Yu Wu, Ting Dang, Dimitris Spathis, Hong Jia, Cecilia Mascolo
ACM International Conference on Information and Knowledge Management (CIKM'24), Boise, USA

OptiBreathe: An Earable-based PPG System for Continuous Respiration Rate, Breathing Phase, and Tidal Volume Monitoring

Julia Romero, Andrea Ferlini, Dimitris Spathis, Ting Dang, Katayoun Farrahi, Fahim Kawsar, Alessandro Montanari
Intl. Workshop on Mobile Computing Systems and Applications (HotMobile'24), San Diego, USA

Balancing Continual Learning and Fine-tuning for Human Activity Recognition

Chi Ian Tang, Lorena Qendro, Dimitris Spathis, Fahim Kawsar, Cecilia Mascolo, Akhil Mathur
AAAI Human-centric Representation Learning workshop (HCRL @ AAAI'24), Vancouver, Canada

2023

The State of Algorithmic Fairness in Mobile Human-Computer Interaction

Sofia Yfantidou, Marios Constantinides, Dimitris Spathis, Athena Vakali, Daniele Quercia, Fahim Kawsar
ACM International Conference on Mobile Human-Computer Interaction (MobileHCI'23), Athens, Greece

Human-centred artificial intelligence for mobile health sensing: challenges and opportunities

Ting Dang, Dimitris Spathis, Abhirup Ghosh, Cecilia Mascolo
Royal Society Open Science

UDAMA: Unsupervised Domain Adaptation through Multi-discriminator Adversarial Training with Noisy Labels Improves Cardio-fitness Prediction

Yu Wu, Dimitris Spathis, Hong Jia, Ignacio Perez-Pozuelo, Tomas I Gonzales, Soren Brage, Nicholas Wareham, Cecilia Mascolo
Machine Learning for Healthcare (MLHC'23), New York, USA

Conditional Neural ODE Processes for Individual Disease Progression Forecasting: A Case Study on COVID-19

Ting Dang, Jing Han, Tong Xia, Erika Bondareva, Chloë Siegele-Brown, Jagmohan Chauhan, Andreas Grammenos, Dimitris Spathis, Pietro Cicuta, Cecilia Mascolo
International Conference on Knowledge Discovery and Data Mining (KDD'23), Long Beach, USA

Recent Advances, Applications and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2022 Symposium

Stefan Hegselmann, Helen Zhou, Yuyin Zhou, Jennifer Chien, Sujay Nagaraj, Neha Hulkund, Shreyas Bhave, Michael Oberst ... Dimitris Spathis, Jun Seita, Bastiaan Quast, Megan Coffee, Collin Stultz, Irene Y Chen, Shalmali Joshi, Girmaw Abebe Tadesse
Technical report

Evaluating Listening Performance for COVID-19 Detection by Clinicians and Machine Learning: A Comparative Study

Jing Han, Marco Montagna, Andreas Grammenos, Tong Xia, Erika Bondareva, Chloë Siegele-Brown, Jagmohan Chauhan, Ting Dang, Dimitris Spathis, Andres Floto, Pietro Cicuta, Cecilia Mascolo
Journal of Medical Internet Research (JMIR), 25

A Summary of the ComParE COVID-19 Challenges

Alican Akman, Harry Coppock, Christian Bergler, Maurice Gerczuk, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Jing Han, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Panagiotis Tzirakis, Anton Batliner, Cecilia Mascolo, Björn Wolfgang Schuller
Frontiers in Digital Health

2022

Longitudinal cardio-respiratory fitness prediction through wearables in free-living environments

Dimitris Spathis*, Ignacio Perez-Pozuelo*, Tomas I. Gonzales, Yu Wu, Soren Brage, Nicholas Wareham, Cecilia Mascolo (*equal contribution)
Nature Digital Medicine, 5(176)
Altmetric Top 5% of all research outputs

Sounds of COVID-19: exploring realistic performance of audio-based digital testing

Jing Han*, Tong Xia*, Dimitris Spathis, Erika Bondareva, Chloë Brown, Jagmohan Chauhan, Ting Dang, Andreas Grammenos, Apinan Hasthanasombat, Andres Floto, Pietro Cicuta, Cecilia Mascolo
Nature Digital Medicine, 5(16)

Breaking away from labels: the promise of self-supervised machine learning in intelligent health

Dimitris Spathis, Ignacio Perez-Pozuelo, Laia Marques-Fernandez, Cecilia Mascolo
Cell Patterns, 3(2)

Detecting sleep outside the clinic using wearable heart rate devices

Ignacio Perez-Pozuelo, Marius Posa, Dimitris Spathis, Kate Westgate, Nicholas Wareham, Cecilia Mascolo, Soren Brage, Joao Palotti
Scientific Reports, 12(7956)

Exploring Longitudinal Cough, Breath, and Voice Data for COVID-19 Progression Prediction via Sequential Deep Learning: Model Development and Validation

Ting Dang, Jing Han, Tong Xia, Dimitris Spathis, Erika Bondareva, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Andres Floto, Pietro Cicuta, Cecilia Mascolo
Journal of Medical Internet Research (JMIR), 24(6)

Universals and variations in musical preferences: A study of preferential reactions to Western music in 53 countries

David Greenberg, Sebastian Wride, Daniel Snowden, Dimitris Spathis, Jeff Potter, Jason Rentfrow
Journal of Personality and Social Psychology, 122(2)
Altmetric Top 5% of all research outputs

Looking for Out-of-Distribution Environments in Multi-center Critical Care Data

Dimitris Spathis, Stephanie Hyland
Machine Learning for Health (ML4H'22), New Orleans, USA

Turning Silver into Gold: Domain Adaptation with Noisy Labels for Wearable Cardio-Respiratory Fitness Prediction

Yu Wu, Dimitris Spathis, Hong Jia, Ignacio Perez-Pozuelo, Tomas I Gonzales, Soren Brage, Nicholas Wareham, Cecilia Mascolo
Machine Learning for Health (ML4H'22), New Orleans, USA

Investigating Domain-agnostic Performance in Activity Recognition using Accelerometer Data

Apinan Hasthanasombat, Abhirup Ghosh, Dimitris Spathis, Cecilia Mascolo
UbiComp workshop on Human Activity Sensing Corpus & Applications (HASCA @ UbiComp'22), Cambridge, UK

2021

COVID-19 Sounds: A Large-Scale Audio Dataset for Digital Respiratory Screening

Tong Xia*, Dimitris Spathis*, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Jing Han, Apinan Hasthanasombat, Erika Bondareva, Ting Dang, Andres Floto, Pietro Cicuta, Cecilia Mascolo
Neural Information Processing Systems (NeurIPS'21), Datasets and Benchmarks Track

Self-supervised transfer learning of physiological representations from free-living wearable data

Dimitris Spathis, Ignacio Perez-Pozuelo, Soren Brage, Nicholas Wareham, Cecilia Mascolo
Conference on Health, Inference, and Learning (CHIL'21), Virtual event, USA

Exploring Automatic COVID-19 Diagnosis via voice and symptoms from Crowdsourced Data

Jing Han, Chloë Brown*, Jagmohan Chauhan*, Andreas Grammenos*, Apinan Hasthanasombat*, Dimitris Spathis*, Tong Xia*, Pietro Cicuta, Cecilia Mascolo
International Conference on Acoustics, Speech, & Signal Processing (ICASSP'21), Toronto, Canada

SelfHAR: Improving Human Activity Recognition through Self-training with Unlabeled Data

Chi Ian Tang, Ignacio Perez-Pozuelo*, Dimitris Spathis*, Soren Brage, Nicholas Wareham, Cecilia Mascolo
Proc. on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT/Ubicomp'21), 5(1)

The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates

Björn W. Schuller, ... Dimitris Spathis, Tong Xia, Pietro Cicuta, Leon J. M. Rothkrantz, Joeri Zwerts, Jelle Treep, Casper Kaandorp
Conference of the International Speech Communication Association (Interspeech'21), Brno, Czechia

Digital Phenotyping and Sensitive Health Data: Implications for Data Governance

Ignacio Perez-Pozuelo, Dimitris Spathis, Jordan Gifford-Moore, Jessica Morley, Josh Cowls
Journal of the American Medical Informatics Association, 28(9)

Anticipatory Detection of Compulsive Body-focused Repetitive Behaviors with Wearables

Benjamin Searle, Dimitris Spathis, Marios Constantinides, Daniele Quercia, Cecilia Mascolo
ACM International Conference on Mobile Human-Computer Interaction (MobileHCI'21), Toulouse, France

Evaluating Contrastive Learning on Wearable Timeseries for Downstream Clinical Outcomes

Kevalee Shah, Dimitris Spathis, Chi Ian Tang, Cecilia Mascolo
Machine Learning for Health (ML4H'21), Virtual event

Federated mobile sensing for activity recognition

Stefanos Laskaridis, Dimitris Spathis, Mario Almeida
ACM International Conference on Mobile Computing and Networking (MobiCom), New Orleans, USA (tutorial)

Wearables, smartphones and artificial intelligence for digital phenotyping and health

Ignacio Perez-Pozuelo, Dimitris Spathis, Emma Clifton, Cecilia Mascolo
Digital Health, Chapter 3

2020

Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data

Chloë Brown*, Jagmohan Chauhan*, Andreas Grammenos*, Jing Han*, Apinan Hasthanasombat*, Dimitris Spathis*, Tong Xia*, Pietro Cicuta, Cecilia Mascolo
International Conference on Knowledge Discovery and Data Mining (KDD'20), San Diego, USA
Oral presentation, Cambridge University Hall of Fame, Better Future Award

Learning Generalizable Physiological Representations from Large-scale Wearable Data

Dimitris Spathis, Ignacio Perez-Pozuelo, Soren Brage, Nicholas Wareham, Cecilia Mascolo
NeurIPS Machine Learning for Mobile Health workshop (ML4MH @ NeurIPS'20), Vancouver, Canada

Exploring Contrastive Learning in Human Activity Recognition for Healthcare

Chi Ian Tang, Ignacio Perez-Pozuelo, Dimitris Spathis, Cecilia Mascolo
NeurIPS Machine Learning for Mobile Health workshop (ML4MH @ NeurIPS'20), Vancouver, Canada

2019

Sequence Multi-task Learning to Forecast Mental Wellbeing from Sparse Self-reported Data

Dimitris Spathis, Sandra Servia, Katayoun Farrahi, Cecilia Mascolo, Jason Rentfrow
International Conference on Knowledge Discovery and Data Mining (KDD'19), Anchorage, USA
Oral presentation (Top 6%)

Passive mobile sensing and psychological traits for large scale mood prediction

Dimitris Spathis, Sandra Servia, Katayoun Farrahi, Cecilia Mascolo, Jason Rentfrow
International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth'19), Trento, Italy

Pre-PhD (2013-2018)

Interactive dimensionality reduction using similarity projections
Dimitris Spathis, Nikolaos Passalis, Anastasios Tefas
Knowledge-Based Systems, 165

Fast, Visual and Interactive Semi-supervised Dimensionality Reduction
Dimitris Spathis, Nikolaos Passalis, Anastasios Tefas
ECCV Efficient Feature Representation Learning workshop (CEFRL @ ECCV'18), Munich, Germany

Diagnosing Asthma and Chronic Obstructive Pulmonary Disease with Machine Learning
Dimitris Spathis, Panayiotis Vlamos
Health Informatics Journal, 25(3)

Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words
Joan Serra, Ilias Leontiadis, Dimitris Spathis, Gianluca Stringhini, Jeremy Blackburn, Athena Vakali
ACL Abusive Language Online workshop (ALW @ ACL'17), Vancouver, Canada

A comparison between semi-supervised and supervised text mining techniques on detecting irony in Greek political tweets
Basilis Charalampakis, Dimitris Spathis, Elias Kouslis, Katia Kermanidis
Engineering Applications of Artificial Intelligence, 51

Detecting Irony on Greek Political Tweets: A Text Mining Approach
Basilis Charalampakis, Dimitris Spathis, Elias Kouslis, Katia Kermanidis
International Conference on Engineering Applications of Neural Networks, Rhodes, Greece

Glocal News: An Attempt to Visualize the Discovery of Localized Top Local News, Globally
Dimitris Spathis, Theofilos Mouratidis, Spyros Sioutas, Athanasios Tsakalidis
International Conference on Conceptual Modeling, Hong Kong, China

Theses


Machine learning to model health with multimodal mobile sensor data
PhD thesis
University of Cambridge, 2021

Learning to interact with high-dimensional data
MSc thesis
Aristotle University, 2017

Patents


Apparatus & method for generating feature embeddings
Nokia, US20240273404A1 (filed 2023, published 2024)

Apparatus, method, and computer program for transfer learning
Nokia, US20240127057A1 (filed 2022, published 2024)

🧐 Academic service


Leadership & Organizer roles:

Program Committee Member: AAAI 2021-2024, IJCAI 2020, KDD 2020-2023, FAccT 2023, SIAM SDM 2022, Sensiblend @ Ubicomp 2021, Mobiquitous 2022.

Reviewer: NeurIPS, ICLR, ICML, AAAI, IJCAI, KDD, CHI, Ubicomp/IMWUT, CHIL, Nature Digital Medicine, WACV, Nature Scientific Reports, ICASSP, Expert Systems with Applications, Neurocomputing, WWW/The Web Conference, Engineering Applications of Artificial Intelligence, ICWSM, and more.

📢 Invited talks


Evidence from industry – what are you really using AI for? (panel)
Multimodal AI for Real-World Signals and the Role of Language
Multimodal, data-efficient, and robust AI for real-world biosignals & the role of generative models
Multimodal AI for real-world signals – does the key to specialized models lie in language?
Human-centric AI for health signals with applications in fitness and activity modeling
Self-Supervised Learning for Health Signals
Representation learning for cardio-fitness prediction in free-living environments
AI-powered Wearables Transforming Mobile Health
  • 📍 AI Summit, London Tech week, London, UK — June 16, 2022
Self-supervised learning for health signals
AI to model Human Behaviour and Health
Deep sequence learning for large-scale inference of human behaviour from mobile sensor data
  • 📍 MRC Epidemiology Unit, University of Cambridge, UK — March 5, 2019
  • 📍 Ocado, Barcelona, Spain & Hatfield, UK (remote) — July 10, 2019
Fast, Visual and Interactive Semi-supervised Dimensionality Reduction
  • 📍 Facebook PhD Open House, London, UK — October 25, 2018

🎒 Mentoring


I enjoy collaborating with PhD and thesis students, usually as part of an internship in our lab. Here are some recent research projects I supervised:

  • Chi Ian Tang (University of Cambridge): Self-supervised and continual learning
  • Benjamin Searle (University of Cambridge): Capturing compulsive behaviours w/ wearables
  • Kevalee Shah (University of Cambridge): Benchmarking contrastive learning algorithms
  • Chuen Low (University of Cambridge): Attention models for timeseries
  • Yu Yvonne Wu (University of Cambridge): Weakly-supervised and self-supervised learning
  • Shohreh Deldari (UNSW Sydney): Multimodal self-supervised learning
  • Sofia Yfantidou (Aristotle University): Machine learning fairness
  • Francesco Pase (University of Padova): Self-supervised federated learning
  • Aashish Kolluri (National University of Singapore): Multimodal adapters for large models
  • Ryuhaerang Choi (KAIST): Data-centric multi-task learning
  • Arvind Pillai (Dartmouth College): Foundation models for physiological signals

I have also been a teaching assistant for the following undergraduate courses:

🎠 Playground


“The next big thing in technology often starts off looking like a toy”

Quantifying name-dropping

Communitypoprefs.com is a data visualization website, where we present every pop-culture reference over the course of 5 seasons of the TV series Community.


Map out your music taste on Spotify

Visualizing my favourite songs on Spotify with dimensionality reduction and anomaly detection. Data essay published in Cuepoint Magazine, Medium's premier music publication.


Children's books and childish language?

Text mining Game of Thrones, Harry Potter, Hunger Games and Lord of the Rings books. Data essay featured in Medium's Editor Picks.


Anonymize kids' faces before posting online

Mobile app with face recognition, age estimation, & emotion recognition to blur kids or replace their face with emotion-based emoji. Developed during HackZurich 2018.


Discover top local news globally

Glocalne.ws was a mashup of Google News and Google Maps. Unfortunately, it is now defunct because the underlying APIs were discontinued.


Composing music and text with Recurrent Neural Networks

Training neural networks on massive amounts of musical notation and literature and letting them create their own art. The essay is in Greek, but you can still see and listen to the results.

🕳️🐇 Personal


Non-academic things about me: I love music, both playing and listening. I am mostly into art rock and indie folk, with the occasional exception of some well-crafted pop. Although I am an accordionist by training, over the last few years I've been playing mostly piano and ukulele. In a previous life, I performed with the critically acclaimed band The Children of the Oldness (aka Kore Ydro) and recorded the album "Consortium in Amato" (listen here).

I also enjoy street photography and in particular playing with light—photography comes from Greek φως (light) and γραφή (writing), or drawing with light. A sample of my shots is on Flickr and one of my landscapes was featured in the Huffington Post.

Lastly, and perhaps most importantly, I'm always on the lookout for ways to move items from the "non-academic list" to the "academic list"—let me know if you'd like to help!