ReproduciblE MODELing in the Geosciences
Software development has become an integral part of the geosciences as models and data processing grow more sophisticated. Paradoxically, it poses a threat to scientific progress, because reproducibility, the pillar of science, is seldom achieved. Software code tends to be either poorly written and documented or not shared at all; proper software licenses are rarely assigned. This is especially worrisome because scientific results can have controversial implications for stakeholders and policymakers and may influence public opinion for a long time.
In recent years, progress towards open science has led to more publishers demanding access to data and source code alongside peer-reviewed manuscripts. Still, recent studies find that results can rarely be reproduced.
In this project, we conduct a poll among the geoscience community, advertised via scientific blogs (AGU, EGU), research networks (researchgate.net and mailing lists), and social media. With it, we investigate the causes of this lack of reproducibility. We take a peek behind the curtain and unveil how the community develops and maintains complex code and what that entails for reproducibility. Our survey covers background knowledge, community opinion, and behavioural practices regarding reproducible software development.
We postulate that this lack of reproducibility might be rooted in insufficient reward within the scientific community, uncertainty regarding proper licensing of software and other parts of the research compendium, and scientists’ unawareness of how to make software available in a way that allows for proper attribution of their work. We question putative causes such as unclear guidelines of research institutions, or software that has been developed over decades by cohorts of researchers without a proper software engineering process and transparent licensing.
To this end, we also summarize solutions such as the adaptation of modern project management methods from the computer engineering community, which will eventually reduce costs while increasing the reproducibility of scientific research.
Preliminary results and data will be available mid-August 2021.