Algebraic Model Counting: Analysis of Optimal Decision Trees
Hiroki Arimura introduces ADTC, a semiring framework and dynamic program (O^*(n^{O(Δ)})) plus the emtrees tool for exhaustive decision‑tree.
TL;DR
- 01Hiroki Arimura introduces ADTC, a semiring framework and dynamic program (O^*(n^{O(Δ)})) plus the emtrees tool for exhaustive decision‑tree.
- 02Hiroki Arimura proposes Algebraic Decision Tree Counting, or ADTC, a semiring-based framework that exhaustively analyzes optimal and near-optimal decision trees.
- 03The paper, submitted 2 July 2026 to arXiv (arXiv:2607.02069), pairs an algorithmic result with a software implementation called emtrees and a plan to present at ECML-PKDD 2026.
Algebraic Model Counting for Global Analysis of Optimal Decision Trees
Hiroki Arimura proposes Algebraic Decision Tree Counting, or ADTC, a semiring-based framework that exhaustively analyzes optimal and near-optimal decision trees. The paper, submitted 2 July 2026 to arXiv (arXiv:2607.02069), pairs an algorithmic result with a software implementation called emtrees and a plan to present at ECML-PKDD 2026.
What is ADTC and what does it compute?
ADTC reframes optimization, counting, and sampling of decision-tree hypotheses as a unified sum-of-products computation over a semiring R. The framework aggregates model metrics into algebraic values and builds a global model profile that captures trade-offs across criteria such as accuracy, size, and fairness. ADTC introduces model behavior tensors to combine multiple metrics via convolution products over a tensor semiring.
ADTC turns diverse analysis tasks into algebraic computations. Instead of treating selection, counting and sampling as separate pipelines, the approach computes semiring summaries across the entire hypothesis space. The model behavior tensors give a structured way to record multi-criteria outcomes for each tree and to convolve those outcomes across subproblems.
How does the algorithm scale and what can it handle?
The paper shows the decision-tree hypothesis space is doubly exponential in the maximum depth Δ, yet presents a dynamic programming algorithm whose running time is O^(n^{O(Δ)}) in the number of features n, where O^ hides polynomial factors. The method handles complex constraints composed of multiple tree metrics by aggregating semiring values into tensors and combining them with convolution products.
Those two complexity statements set the technical bounds: the raw hypothesis space explodes with Δ, and the proposed algorithm pushes the cost into a polynomial-exponential dependence on n parameterized by Δ. The tensor-based aggregation supports multi-criteria constraints so practitioners can jointly profile accuracy, size, and fairness across all optimal and near-optimal trees rather than inspecting single solutions.
Why it matters
ADTC offers a route to global, evidence-based model selection for decision trees by turning an intractably large hypothesis space into structured algebraic objects that can be counted, optimized, and sampled. That matters in sensitive domains where designers need to understand trade-offs across many competing metrics instead of relying on one selected tree. The availability of a concrete complexity bound, O^*(n^{O(Δ)}), and a software artifact, emtrees, makes the approach actionable for datasets where n and Δ are in a workable regime.
How was ADTC validated?
Arimura demonstrates the utility of ADTC through the emtrees software applied to real-world datasets. The paper positions those experiments as illustrations of how ADTC constructs a model profile and supports evidence-based selection among optimal and near-optimal decision trees. The work is slated for the Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML-PKDD 2026, LNCS, Naples, Italy, 7-11 September 2026.
What to watch
Look for the ECML-PKDD 2026 proceedings and the emtrees software release notes to see concrete experimental results and reproducible artifacts. The next signals to check are scaling reports for larger n and Δ and any follow-up that expands the tensor semiring approach to other hypothesis classes beyond decision trees.
Paper: "Algebraic Model Counting for Global Analysis of Optimal Decision Trees", Hiroki Arimura, arXiv:2607.02069 (submitted 2 July 2026). Software: emtrees. Complexity: O^*(n^{O(Δ)}). Conference: ECML-PKDD 2026, 7-11 September 2026, Naples, Italy.
Written by The Brieftide · Source: arXiv
The Brieftide Daily · 06:00
Briefs like this one, in your inbox every morning.
Continue reading
More in Computational BiologyAlgorithm Co-occurrence Networks: Mapping Academic Influence
Large-scale NLP algorithm co-occurrence networks built from full-text papers reveal temporal.
TxBench-PP: 100 preclinical pharmacology tasks, top score 59.3%
TxBench-PP is a verifiable benchmark of 100 small-molecule preclinical decisions across 11 models and 4.
MLCI: Machine-Learned Comorbidity Index accepted at ICML 2026
The paper proposes MLCI, which maps diagnosis codes to a single scalar by maximizing the normalized Hilbert-Schmidt Independence Criterion.
Digital Twin Clinical Decision Support AI, EMBC 2026 paper
An online adaptive CDSAS combines treatment effect estimation, a patient digital twin and reinforcement learning; validated in simulation.