Challenging the gold standard: a methodological study of the quality and errors of web tracking data

Bosch Jover, Oriol (2023) Challenging the gold standard: a methodological study of the quality and errors of web tracking data. PhD thesis, London School of Economics and Political Science.

Identification Number: 10.21953/lse.00004556

Abstract

Measuring what people consume and do online is crucial across the social sciences. In the last few years, web tracking data has gained popularity and is considered by most to be the gold standard for measuring online behaviours. This thesis studies whether this prevailing notion holds true. Specifically, through a combination of traditional survey and computational methods, I assess the quality of web tracking data, its associated errors, and the consequences of these errors. The thesis comprises three distinct papers. In the first paper, inspired by the Total Survey Error framework, I present a Total Error framework for digital traces collected with Meters (TEM). The TEM framework describes the data generation and analysis process for web tracking data and documents the sources of bias and variance that may arise at each step of this process. The framework suggests that metered data might indeed be affected by these error sources and, to some extent, biased. The second paper adopts an empirical approach to address a key error identified in the TEM framework: researchers' failure to capture data from all the devices and browsers that individuals use to go online. The paper shows that tracking undercoverage is highly prevalent when using commercial panels. Additionally, through a simulation study, it demonstrates that web tracking estimates, both univariate and multivariate, are often substantially biased due to tracking undercoverage. The third paper explores the validity and reliability of web tracking data when used to measure media exposure. Merging traditional psychometric and computational techniques, I conduct a multiverse analysis to assess the predictive validity and true-score reliability of thousands of web tracking measures of media exposure. The findings show that web tracking measures have overall low validity but remarkably high reliability. Additionally, the results suggest that the decisions researchers make when designing web tracking measurements can have a substantial impact on their measurement properties. Collectively, this thesis challenges the prevailing belief in web tracking data as the gold standard for measuring online behaviours. Methodologically, it illustrates how computational methods can be used to adapt survey methodology techniques to assess the quality of digital trace data.

Item Type: Thesis (PhD)
Additional Information: © 2023 Oriol J. Bosch
Library of Congress subject classification: H Social Sciences > H Social Sciences (General)
Sets: Departments > Methodology
Supervisor: Sturgis, Patrick and Kuha, Jouni
URI: http://etheses.lse.ac.uk/id/eprint/4556
