Research Software Development at UCL: Where we're at and how to do it better

9 July 2020

In our recent survey “IT and your research at UCL” we asked researchers who said they wrote their own code a few questions about the extent to which they felt they were following certain best practices.

The reliability and openness of software written by researchers is a subject that has received considerable attention over the past few months, following publication of the code behind some of the key COVID-19 epidemiological models, so it seems like a good time to look at what our survey tells us about the state of research software development at UCL. In this article we summarise each of the best practices mentioned in the survey and suggest some resources to help with learning the basics or taking things to the next level.

Version control

I keep track of different versions of my code as I am developing it and can easily switch between versions as needed: 53% of UCL researchers agree

If software development is a journey, it’s a good idea to lay a trail of breadcrumbs behind you so you can find your way back if you come to a dead end. That’s not entirely what version control is, but it’s one of the key features. Version control tools like Git make it easier to branch out in different directions, write notes to future you about what you’ve done and why, tag versions so you know exactly which code you generated your results with, merge different branches together, and much more. Version control is really the first step you should take towards adopting good software development practices.

As a minimum, to use a version control tool like Git effectively, we recommend learning how to set up a local Git repository, how to commit changes, how to create branches, and how to switch between branches or return to earlier versions as needed.
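The minimum steps listed above can be sketched as a short script. This is an illustration only: it assumes Git is installed and on your PATH, and the file name analysis.py, the tag v1-results, the branch name experiment, and the author details are all made up for the example. In day-to-day work you would type the same git commands directly in a terminal.

```python
import subprocess
from pathlib import Path


def run_git(args, cwd):
    """Run one git command inside `cwd` and return what it printed."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def demo_workflow(repo_dir):
    """Walk through init, commit, tag, branch, and return to an earlier version."""
    repo = Path(repo_dir)
    script = repo / "analysis.py"  # hypothetical analysis script

    # 1. Set up a local repository (identity is set locally for the demo).
    run_git(["init"], repo)
    run_git(["config", "user.name", "Example Researcher"], repo)
    run_git(["config", "user.email", "researcher@example.org"], repo)

    # 2. Commit a first version and tag it as the one used for results.
    script.write_text("print('version 1')\n")
    run_git(["add", "analysis.py"], repo)
    run_git(["commit", "-m", "Add first version of analysis"], repo)
    run_git(["tag", "v1-results"], repo)

    # 3. Branch out to try something new, and commit the change there.
    run_git(["checkout", "-b", "experiment"], repo)
    script.write_text("print('version 2')\n")
    run_git(["commit", "-am", "Try a different approach"], repo)

    # 4. Return to the tagged version; the file reverts with it.
    run_git(["checkout", "v1-results"], repo)
    return script.read_text()
```

Calling demo_workflow on an empty directory leaves you on the tagged first version, with the experimental change still safely stored on its own branch.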

Learning resources

Read the section of The Turing Way guide on version control
Git Essential Training: The Basics (LinkedIn Learning, 2h 55m)
Learn the basics at a Software Carpentry Workshop or request a Git-only training session from the research software development group.

Collaboration

I share my code with other people, and they are able to suggest improvements and contribute directly to development: 47% of UCL researchers agree

The relatively low percentage of researchers who agreed with the statement above most likely reflects the fact that, for many people, programming never really extends beyond developing code for personal use. There are many benefits to involving other people in your code development, though. It forces you to think more carefully about how easy your code is to understand and use (variable names, comments, documentation and, of course, version control). More eyes on the code also means more helpful criticism: more bugs found, tests written, issues anticipated and solutions proposed. Why not agree with a colleague to review and contribute to each other’s code?

Learning resources

Read the section of The Turing Way guide on collaborating on GitHub/GitLab
The Turing Way also has a section with recommendations for carrying out collaborative code reviews
Writing readable source code (Software Sustainability Institute guide)

Automated testing

I can repeatedly and easily test the reliability of my code each time I make changes to it: 67% of UCL researchers agree

How do you know the improvement you just made to your code hasn’t broken it? How do you know your code reliably produces correct output in every situation? You test it, of course, but as with any research hypothesis, you should be looking for evidence that your code is wrong rather than confirmation that it is right.

It’s notable from the survey that more researchers agreed with the statement above (that they were able to repeatedly and easily test the reliability of their code) than with the statement about version control. This suggests that, in many cases, researchers are taking a manual approach to testing, which means the tests they run probably lack the coverage needed to check all the ways in which the code might fail.

To learn more about testing, familiarise yourself with concepts like unit testing, automated build and test functions, continuous integration, and test-driven development.
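As a first taste of unit testing, here is a minimal sketch using Python’s built-in unittest module. The mean function and its tests are invented for illustration; the point is that each test checks one small behaviour, and that some tests deliberately go looking for ways the code could be wrong (an empty input) rather than only confirming the typical case.

```python
import unittest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if len(values) == 0:
        # Fail loudly instead of dividing by zero somewhere downstream.
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


class TestMean(unittest.TestCase):
    def test_typical_input(self):
        self.assertEqual(mean([1, 2, 3]), 2)

    def test_single_value(self):
        self.assertEqual(mean([5]), 5)

    def test_negative_values(self):
        # assertAlmostEqual tolerates tiny floating-point differences.
        self.assertAlmostEqual(mean([-1.5, 0.5]), -0.5)

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            mean([])
```

Saved in a file (say test_stats.py, a hypothetical name), these tests can be run after every change with python -m unittest test_stats.py. Running the same command automatically whenever changes are pushed to a shared repository is the essence of continuous integration.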

Learning resources

Read the section of The Turing Way guide on code testing
Testing your software (SSI guide)
Adopting automated testing (SSI guide)
Unit Testing in Python (LinkedIn Learning, 1h 29m)

Usability

Another researcher in my field could run and use my code using the documentation and examples provided: 66% of UCL researchers agree

Usability is very much a sliding scale: the more effort you put into things like documentation, tutorials, build automation and cross-platform support, the less effort it will take for someone else to use your software. If reuse is something you really want to promote, you should think carefully about these things, but they are also worth bearing in mind if you want to make sure your research is reproducible and replicable.
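One low-effort way to provide documentation and examples together is to put short usage examples in a function’s docstring. The sketch below uses a made-up conversion function; the lines beginning with >>> show another researcher how to call it, and Python’s built-in doctest module can check that the examples still give the stated results.

```python
def celsius_to_kelvin(temp_c):
    """Convert a temperature from degrees Celsius to kelvin.

    Examples of use, which double as simple tests:

    >>> celsius_to_kelvin(0)
    273.15
    >>> celsius_to_kelvin(-273.15)
    0.0
    """
    return temp_c + 273.15
```

If this were saved as convert.py (a hypothetical file name), running python -m doctest -v convert.py would execute the examples and report any that no longer match, so the documentation cannot silently drift out of date.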

Learning resources

Read about the concepts of documentation-driven development and tutorial-driven development

Publishing code

I feel confident that I could publish my code and cite it in a paper: 47% of UCL researchers agree

It can be hard to know when the right time is to publish software. The reality is that it will never be in a perfect state, so it’s often easier to get your software out in the open early if you can. Publishing code goes beyond storing it in a public repository, though. Submitting your software to a repository such as Zenodo or UCL’s Research Data Repository is something you should do when you’d like to release an “official” version of your code, i.e., the version you used to generate published results. This allows you to mint a DOI (a persistent identifier that links to the record in the repository), which you and others can use to cite your code in a paper. You may want to consider generating a new DOI for each later release when you reach major milestones in development.

It’s also possible to go a step further and publish software in a peer-reviewed journal, which will enable you to receive academic credit for the software itself. Explore the learning resources to find out how.

Learning resources

Read the section of The Turing Way guide on credit for reproducible research
Find a journal where you could publish your software
See the GitHub guide to Making Your Code Citable

Reproducibility

I am confident that another researcher in my field would be able to reproduce my results given the information I have published: 67% of UCL researchers agree

Publishing software and making it open access is only part of the solution when it comes to reproducibility: each of the practices mentioned above contributes towards improving the transparency of your work by making it easier for people to understand, install, run and test your software.

UCL’s Statement on Transparency in Research (agreed unanimously by the Academic Committee last year) sets out an expectation that researchers will take action to make their research open and to support reproducibility where appropriate in the context of their domain. Good software development practices can help towards meeting those expectations. The extra work involved in adopting these practices might seem a daunting prospect, but you can make incremental progress by improving continuously and building good habits through practice.
