The importance of software for research
At the 2015 SIAM Conference on Computational Science and Engineering (CSE), I was invited to speak on a panel titled “The Future of CSE as a Discipline.” My remarks there ended with one provocative statement: We’re not a discipline, until we value software. Until we value it enough to share it, to demand it as part of our publication process. Until we recognize it as a core instrument and method of our endeavor, and offer viable careers within academia to scientific software developers. Until we seek quality assurance in software by using version control and testing, and we have training in place for the younger generation of computational and data scientists to produce reliable claims to new knowledge.
Software is far more important in research than common wisdom would have it. According to a 2014 survey of researchers in the UK (417 respondents), 92% of academics use research software, 69% say that their research would not be practical without it, and 56% develop their own software \citep{hettrick}. In a similar survey targeting members of the US National Postdoctoral Association, 95% of the 209 respondents said they use research software, and 63% stated that it would not be practical to conduct their work without software \citep*{katz2017}. The fact is that, today, software written specifically for research purposes is ubiquitous: in data acquisition (from instruments), data analysis, data visualization, modeling and simulation, transforming or “cleaning” data, and automating various digital tasks. Just as it includes training in academic writing, researchers’ professional development ought to include training in research computing.
As I narrated in my short piece in Science, “A Hard Road to Reproducibility” \citep*{Barba_2016}, I struggled as a doctoral student when working with code left over by another student in our lab. I received no training in programming, and likely neither did the previous student, although his ability was surely superior to mine. I was and still am a second-rate programmer. But I have come to understand that a key challenge in raising the standards of reproducibility in science lies in how we develop, curate, use and publish research software.
Open-source software
Reproducible research is vitally connected to open-source software, open data and open science. \citet*{Claerbout_1992}, pioneers of the reproducible-research movement, advocated for the merging of a research publication with the underlying computational analysis, and proposed that the ideal in this context is to use a public license that allows others to use, copy and redistribute the software.