Economies of scope in data aggregation: Evidence from health data

Bruno Carballa-Smichowski et al.

Information Economics and Policy · 2025 · article
https://doi.org/10.1016/j.infoecopol.2025.101146
AJG 2 · ABDC A
Weight: 0.37

Abstract

Economies of scope in data aggregation (ESDA) are generated by the combination of complementary datasets involving the same observations. We estimate ESDA by progressively and randomly adding health and socioeconomic variables (predictors) to the machine-learning models we use to predict health outcomes. We find a positive effect of the number of variables on prediction quality, while holding the number of observations constant. We observe a positive relationship between variable complementarity and ESDA. ESDA show signs of increasing returns followed by decreasing returns. We further observe a long tail of highly contributing predictors in our data. These findings indicate that the nature of returns to scope in data aggregation may depend on the distribution of the predictors' information content. This underscores the importance of variable characteristics in determining ESDA's potential to create data barriers to entry. These results can help policymakers in designing data sharing initiatives such as the European Union's Common European Data Spaces.

• Economies of scope in data aggregation (ESDA) exist if adding variables increases the quality of a prediction.
• We corroborate their presence by merging health and socio-economic data to predict health outcomes.
• We observe a positive relation between variable complementarity and ESDA.
• ESDA do not exhibit decreasing returns.
• The nature of returns to ESDA seems to depend on the distribution of the information content of the predictors.
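
The estimation idea in the abstract (add predictors in random order, keep the number of observations fixed, track prediction quality) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the data are synthetic, ordinary least squares stands in for the paper's machine-learning models, held-out R² stands in for its prediction-quality measure, and the name `scope_curve` is hypothetical.

```python
import numpy as np

def scope_curve(X, y, n_orderings=20, seed=0):
    """Mean held-out R^2 as predictors are added in random order.

    The number of observations is held constant; only the number of
    variables available to the model changes.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    split = n // 2  # fixed train/test split, same rows for every model
    X_tr, X_te = X[:split], X[split:]
    y_tr, y_te = y[:split], y[split:]
    ss_tot = np.sum((y_te - y_te.mean()) ** 2)

    scores = np.zeros((n_orderings, p))
    for r in range(n_orderings):
        order = rng.permutation(p)  # one random order of variable addition
        for k in range(1, p + 1):
            cols = order[:k]  # the first k variables in this ordering
            A_tr = np.column_stack([np.ones(split), X_tr[:, cols]])
            A_te = np.column_stack([np.ones(n - split), X_te[:, cols]])
            # OLS fit (illustrative stand-in for an ML model)
            beta, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
            resid = y_te - A_te @ beta
            scores[r, k - 1] = 1.0 - np.sum(resid ** 2) / ss_tot
    # Average over orderings: entry k-1 is the mean quality with k variables
    return scores.mean(axis=0)

# Synthetic data (NOT the paper's data): 12 predictors whose
# information content decays with their index.
rng = np.random.default_rng(42)
n, p = 2000, 12
X = rng.normal(size=(n, p))
coefs = 1.0 / np.arange(1, p + 1)
y = X @ coefs + rng.normal(scale=0.5, size=n)

curve = scope_curve(X, y)
for k, r2 in enumerate(curve, start=1):
    print(f"{k:2d} variables: mean held-out R^2 = {r2:.3f}")
```

Averaging over many random orderings separates the effect of the number of variables from the effect of which variables happen to be added first. The shape of the resulting curve depends on how information content is distributed across predictors, which is the dependence the abstract points to.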

1 citation


Cite this paper

https://doi.org/10.1016/j.infoecopol.2025.101146

Or copy a formatted citation

@article{bruno2025,
  title        = {{Economies of scope in data aggregation: Evidence from health data}},
  author       = {Carballa-Smichowski, Bruno and others},
  journal      = {Information Economics and Policy},
  year         = {2025},
  doi          = {10.1016/j.infoecopol.2025.101146},
}



Evidence weight

0.37

Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40

F · citation impact: 0.16 × 0.40 = 0.06
M · momentum: 0.53 × 0.15 = 0.08
V · venue signal: 0.50 × 0.05 = 0.03
R · text relevance †: 0.50 × 0.40 = 0.20

† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.
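
The breakdown above is consistent with a weighted sum of the four component scores. The short sketch below reproduces the arithmetic under that assumption; the page does not state the aggregation rule explicitly, and the dictionary names are illustrative only.

```python
# Balanced-mode weights and component scores as listed on this page
weights = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}
scores = {"F": 0.16, "M": 0.53, "V": 0.50, "R": 0.50}

# Assumed aggregation: weighted sum of the component scores
total = sum(weights[k] * scores[k] for k in weights)
print(round(total, 2))
```

The unrounded products are 0.064, 0.0795, 0.025, and 0.20, which sum to 0.3685 and round to the displayed weight of 0.37.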