
File-based localization of numerical perturbations in data analysis pipelines.

Authors: Salari A, Kiar G, Lewis L, Evans AC, Glatard T


Affiliations

1 Department of Computer Science and Software Engineering, Concordia University, Montreal, QC, Canada.
2 Department of Biomedical Engineering, McGill University, Montreal, QC, Canada.
3 Montreal Neurological Institute, McGill University, Montreal, QC, Canada.

Description

Gigascience. 2020 Dec 02; 9(12):

Authors: Salari A, Kiar G, Lewis L, Evans AC, Glatard T

Abstract

BACKGROUND: Data analysis pipelines are known to be affected by computational conditions, presumably owing to the creation and propagation of numerical errors. While this process could play a major role in the current reproducibility crisis, the precise causes of such instabilities and the path along which they propagate in pipelines are unclear.

METHOD: We present Spot, a tool to identify which processes in a pipeline create numerical differences when executed in different computational conditions. Spot leverages system-call interception through ReproZip to reconstruct and compare provenance graphs without pipeline instrumentation.
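The localization idea described above can be illustrated with a minimal sketch. This is not Spot's implementation and does not use ReproZip's API; it assumes the provenance graph has already been reduced to a list of process records (files read, files written) and that per-file checksums are available for each computational condition. All names below (`ProcessRecord`, `differing_files`, `localize`) and the example file names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class ProcessRecord:
    """Hypothetical provenance record: one process and the files it touched."""
    name: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]


def differing_files(a: Dict[str, str], b: Dict[str, str]) -> Set[str]:
    """Return paths whose checksum differs (or is missing) between conditions."""
    return {path for path in a.keys() | b.keys() if a.get(path) != b.get(path)}


def localize(
    processes: List[ProcessRecord],
    checksums_a: Dict[str, str],
    checksums_b: Dict[str, str],
) -> Tuple[List[str], List[str]]:
    """Split processes with differing outputs into two groups.

    created:    all inputs identical across conditions, some output differs.
    propagated: at least one input already differs, and some output differs.
    """
    diff = differing_files(checksums_a, checksums_b)
    created, propagated = [], []
    for proc in processes:
        if not (proc.outputs & diff):
            continue  # outputs identical in both conditions
        if proc.inputs & diff:
            propagated.append(proc.name)
        else:
            created.append(proc.name)
    return created, propagated


# Toy pipeline with made-up file names and checksums (assumptions for illustration).
processes = [
    ProcessRecord("brain_extraction", frozenset({"t1.nii"}), frozenset({"brain.nii"})),
    ProcessRecord("linear_registration", frozenset({"brain.nii", "template.nii"}),
                  frozenset({"reg.nii"})),
    ProcessRecord("segmentation", frozenset({"reg.nii"}), frozenset({"seg.nii"})),
]
condition_a = {"t1.nii": "aa", "template.nii": "bb", "brain.nii": "cc",
               "reg.nii": "d1", "seg.nii": "e1"}
condition_b = {"t1.nii": "aa", "template.nii": "bb", "brain.nii": "cc",
               "reg.nii": "d2", "seg.nii": "e2"}

created, propagated = localize(processes, condition_a, condition_b)
print("created differences:", created)
print("propagated differences:", propagated)
```

In this toy example the registration step is flagged as creating differences because its inputs match across conditions while its output does not, and the downstream step is flagged as only propagating them.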

RESULTS: By applying Spot to the structural pre-processing pipelines of the Human Connectome Project, we found that linear and non-linear registration are the cause of most numerical instabilities in these pipelines, which confirms previous findings.

PMID: 33269388 [PubMed - in process]


Keywords: Neuroimaging, Operating Systems, Pipelines, Reproducibility


Links

PubMed: https://www.ncbi.nlm.nih.gov/pubmed/33269388

DOI: 10.1093/gigascience/giaa106