MLCI: Machine-Learned Comorbidity Index accepted at ICML 2026
The paper proposes MLCI, which maps diagnosis codes to a single scalar by maximizing the normalized Hilbert-Schmidt Independence Criterion.
TL;DR
- The paper proposes MLCI, which maps diagnosis codes to a single scalar by maximizing the normalized Hilbert-Schmidt Independence Criterion.
- A Machine-Learned Comorbidity Index (MLCI) by Suleman Baloch, Kishlay Jha, Alberto M. Segre, Philip M. Polgreen and Bijaya Adhikari was submitted to arXiv on 16 Jun 2026 (arXiv:2606.17450).
- The paper was accepted at the 43rd International Conference on Machine Learning (ICML 2026) in Seoul, South Korea.
A Machine-Learned Comorbidity Index (MLCI) by Suleman Baloch, Kishlay Jha, Alberto M. Segre, Philip M. Polgreen and Bijaya Adhikari was submitted to arXiv on 16 Jun 2026 (arXiv:2606.17450) and accepted at the 43rd International Conference on Machine Learning (ICML 2026) in Seoul, South Korea. The 35-page paper proposes a single admission-level score that "maps diagnosis codes to a single scalar" by maximizing the normalized Hilbert-Schmidt Independence Criterion, abbreviated nHSIC, against multiple clinical outcomes (paper abstract).
What is the Machine-Learned Comorbidity Index?
MLCI is an admission-level comorbidity score that learns a scalar mapping from diagnosis codes to risk by maximizing nHSIC between the learned score and several clinical outcomes. The method is explicitly designed to capture nonlinear, outcome-specific relationships rather than relying on linear, rule-based weights. The authors supply a theoretical characterization of when a unified, informative ordering across outcomes can be achieved and implement MLCI to evaluate that theory on benchmark electronic health record datasets.
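To make the objective concrete, here is a minimal Python sketch of a normalized HSIC between a scalar score and several outcomes. It is an illustration under stated assumptions, not the authors' implementation: it assumes the biased empirical HSIC estimator with Gaussian kernels, normalized by the geometric mean of the two self-dependence terms, and it averages over outcomes with equal weight. The paper's exact definition, kernels and weighting may differ. The function names, the fixed bandwidth, the hand-picked weights and the synthetic data are all hypothetical.

```python
import numpy as np

def rbf_gram(x, bandwidth=1.0):
    """Gaussian (RBF) Gram matrix for n samples; x has shape (n,) or (n, d)."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(x.shape[0], -1)
    sq = np.sum(x * x, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def center(gram):
    """Double-center a Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = gram.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    return h @ gram @ h

def nhsic(x, y, bandwidth=1.0):
    """Assumed normalization: HSIC(x, y) / sqrt(HSIC(x, x) * HSIC(y, y))."""
    kc = center(rbf_gram(x, bandwidth))
    lc = center(rbf_gram(y, bandwidth))
    denom = np.linalg.norm(kc) * np.linalg.norm(lc)  # Frobenius norms
    return float(np.sum(kc * lc) / denom) if denom > 0 else 0.0

def multi_outcome_objective(score, outcomes):
    """Equal-weight average of nHSIC between one score and each outcome column."""
    outcomes = np.asarray(outcomes, dtype=float)
    values = [nhsic(score, outcomes[:, j]) for j in range(outcomes.shape[1])]
    return float(np.mean(values))

# Synthetic toy data, not taken from the paper.
rng = np.random.default_rng(0)
codes = rng.integers(0, 2, size=(200, 5)).astype(float)  # binary diagnosis flags
weights = np.array([1.0, 0.5, 0.0, 2.0, 0.0])            # hand-picked, not learned
score = codes @ weights                                   # one scalar per admission

outcomes = np.column_stack([
    score + 0.1 * rng.normal(size=200),               # roughly linear in the score
    (score - 1.5) ** 2 + 0.1 * rng.normal(size=200),  # nonlinear in the score
])

objective_informative = multi_outcome_objective(score, outcomes)
objective_noise = multi_outcome_objective(rng.normal(size=200), outcomes)
```

In this sketch the weights are fixed by hand; a learned index would instead adjust the mapping from codes to score so that the objective increases. Comparing `objective_informative` with `objective_noise` shows the quantity such a procedure would push upward.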
How does MLCI differ from Charlson and Elixhauser?
MLCI departs from traditional comorbidity indices by addressing two limitations the paper identifies: Charlson and Elixhauser are, the authors write, "largely mortality-centric and do not align well with other clinical outcomes," and their linear, rule-based structures cannot capture nonlinear risk-outcome dependence. In contrast, MLCI optimizes a dependence criterion (nHSIC) across multiple outcomes so the learned score directly reflects outcome-specific and nonlinear relationships embedded in diagnosis codes.
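The "linear, rule-based" limitation can be illustrated with a small Python sketch of an additive index. The condition names and weights below are hypothetical and are not the published Charlson or Elixhauser values; the point is only that a fixed-weight sum gives each condition the same contribution regardless of what else is present.

```python
# Hypothetical weights for illustration only; these are NOT the published
# Charlson or Elixhauser values.
HYPOTHETICAL_WEIGHTS = {"cond_a": 1, "cond_b": 2, "cond_c": 6}

def additive_index(conditions):
    """Rule-based score: sum the fixed weight of each distinct condition present."""
    return sum(HYPOTHETICAL_WEIGHTS.get(c, 0) for c in set(conditions))

def marginal_contribution(condition, others):
    """Score change from adding `condition` to a patient who already has `others`."""
    return additive_index(set(others) | {condition}) - additive_index(others)

# The contribution of cond_a is identical in every context.
alone = marginal_contribution("cond_a", [])
with_b = marginal_contribution("cond_a", ["cond_b"])
with_b_and_c = marginal_contribution("cond_a", ["cond_b", "cond_c"])
```

Because the three contributions are equal by construction, an index of this form cannot express a condition that matters more, or less, in combination with another. That is the kind of nonlinear dependence the paper says a learned score is meant to capture.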
What evidence do the authors provide?
The paper reports empirical results on multiple benchmark electronic health record (EHR) datasets showing that MLCI outperforms strong baselines across multiple evaluation metrics. The arXiv metadata page gives no numeric performance figures; the authors state only that MLCI captures nonlinear risk-outcome dependence and outperforms established baselines. The submission metadata includes the arXiv identifier arXiv:2606.17450 and notes the paper length as 35 pages.
Why it matters
Comorbidity scores are widely used for risk adjustment and patient stratification. Shifting from mortality-centric, linear indices to a learned score that aligns with multiple clinical outcomes could change how researchers and health systems compare patients and evaluate interventions. If MLCI’s nHSIC-based objective generalizes across datasets, analysts could produce a single admission-level ordering that is informative for several downstream decisions instead of patching together separate, outcome-specific adjustments.
What to watch
Look for the ICML 2026 presentation in Seoul and the paper’s conference materials, which will likely include the detailed numerical comparisons and code or data references that the arXiv page does not display. The next concrete signal will be the conference proceedings where the authors’ empirical tables and experimental settings should appear.
Written by The Brieftide · Source: arXiv