A Total Error Framework with a Special Focus on Digital Data
Ingegerd Jansson & Lilli Japec
Abstract
A changing survey landscape with increasing nonresponse rates and survey costs has caused organizations to explore new data sources for statistics production. There is great potential to use new types of data for statistics production, especially when blending them with existing data. We present a total error framework that covers all types of data sources, but our examples focus on digital data. We define digital data to include data from social networks, traditional business systems and the Internet of Things. We review and build on existing frameworks for survey data, administrative data, found data and digital trace data. The framework describes the steps, concepts and error sources involved when single-source or multiple-source statistics are produced from digital or survey data. Blending data sources is a vital step in the framework. Furthermore, the unified framework offers terminology to describe and document errors in digital data, aligned with the terminology used in the classical Total Survey Error framework. We also provide indicators for evaluating the quality of statistics produced from single or multiple data sources.