DiSSCo futures keynote - Digital infrastructures: context & future

Annotated slide deck for the keynote talk “Digital infrastructures: context & future” from the DiSSCo Futures event held on 7-9 February at the Royal Library of Belgium, Brussels. Slides are available as a Google Docs slide deck.


Digital infrastructures: context & future

Title slide


Aim

We’re aiming to discuss the current status of global biodiversity data infrastructure and its future directions, with a focus on work towards FAIR and the digital extended specimen. I really like the description of this digital infrastructure session from the program: “(research infrastructures are) not only about bringing data together but also about transforming the data and the ways scientists work with it”.


Context: personal and institutional

My career has been in biodiversity informatics, but I transitioned mid-career from software development into research. I’m interested in open science (particularly its take-up) and in how we design and build for participation. In moving over to research, I’ve been keen to explore how we can use software development practices to facilitate research, especially to make our work more explicit and reusable. These range from quite technical practices (reuse, automation, version control, dependency management, continuous integration etc.) to processes around communication, design and inclusion. (Image credit: Jesse Orrico on Unsplash)


Where we are today: botanical information online

I’m going to focus on progress with botanical collections. We’ve digitised a huge amount of data and made it available online - 88 million metadata records and 38 million images (figures from gbif.org). We also have comprehensive taxonomies and distributional data that we can use to manage and explore these data, and digitised and born-digital literature provides context for how they have been used as evidence. We also have metadata records about collections institutes and their staff (see e.g. the Global Registry of Scientific Collections). Building on this data and expertise, we’ve been investigating how we can apply newer techniques like machine learning to botanical data and images. In terms of where we do computational work, we have compute available for researchers to use in environments that are pre-populated with data - all the researcher needs to do is bring their idea. Finally, we now have a much more explicit understanding of the different activities that are involved in specimen research and curation - see for example the Bionomia project, which crowdsources the attribution of specimens to the researchers who have collected or identified them. (Image credit: RBG Kew)


Wider context: evolving research culture

These advances sit in a wider context of an evolving research culture. Across disciplines we are developing training resources which equip researchers to manipulate data efficiently and to work in this online research environment. Just as we have a better understanding of the different roles in taxonomic research, we have a better understanding of the roles that are required for research to be conducted efficiently. Many of us here will either identify as, employ or work with research software engineers - people who facilitate research but who perhaps didn’t get traditional academic career credit through authorship and grants. (Image credit: created by Scriberia with The Turing Way community. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807)


We're still learning how to move activities online

(You can boo at this slide if you like!) We were progressing nicely with an evolving research culture… and then COVID made us move everything online. We all learned a lot in this process about how we think, work and interact. Personally I found out that I often think quite spatially, so shrinking my working life to the size of a laptop screen was very difficult for the first few weeks. (Image credit: World Health Organization)


Too many tabs

Just another tab? As we deliver more data and working environments online, we should try to make sure that we’re not just overloading researchers with yet another tab in their crowded browser window.


Working in Kew's herbarium: FAIR in a physical resource

When our “work from home” order ended, it was a real pleasure to come back to Kew and explore the physical working environment that we have built for researchers to interact with the specimens. I looked at this from a new perspective. I think a better awareness of working environments can set us a new challenge: how do we maximise the use of digital working environments, and are there different tools and interfaces that can enable research and collaboration? (Image credit: RBG Kew)


Institutional scale: comprehensive digitisation

Some work in progress at Kew - at two very different scales. We are running a comprehensive digitisation project to mobilise specimen metadata and images. This also involves implementing a new collections management system (EarthCape).


Researcher scale: prototyping tools

We are also exploring what we need to deliver at research time to facilitate best use of the data (specimens, literature, descriptions) that is already available, without enforcing a particular way of working.


Specimen comparison and grouping interface

One part of this research tool is to enable a user to retrieve specimens and lay them out on a virtual working board where similar specimens can be grouped, and the researcher can make small notes or display relevant pages from digitised literature.
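
To make the idea of a working board a little more concrete, here is a minimal sketch of one way such a board could be modelled. It is not the tool's actual implementation: the class names (Card, Board), the field names and the specimen identifiers in the usage note are all hypothetical, chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Card:
    """One specimen placed on the board (hypothetical model)."""
    specimen_id: str  # e.g. a catalogue number or persistent identifier
    notes: list[str] = field(default_factory=list)  # short researcher notes
    literature_pages: list[str] = field(default_factory=list)  # links to digitised pages


@dataclass
class Board:
    """A virtual working board: named groups of specimen cards."""
    groups: dict[str, list[Card]] = field(default_factory=dict)

    def place(self, group: str, specimen_id: str) -> Card:
        """Add a specimen to a group, creating the group if needed."""
        card = Card(specimen_id)
        self.groups.setdefault(group, []).append(card)
        return card

    def move(self, specimen_id: str, target: str) -> None:
        """Move a specimen from whichever group holds it into `target`."""
        # Iterate over a snapshot so that creating `target` is safe.
        for cards in list(self.groups.values()):
            for card in cards:
                if card.specimen_id == specimen_id:
                    cards.remove(card)
                    self.groups.setdefault(target, []).append(card)
                    return
        raise KeyError(specimen_id)
```

A researcher's session might then look like `board = Board()`, `board.place("group A", "K000001")`, and later `board.move("K000001", "group B")` once a closer look suggests the specimen belongs elsewhere. Keeping notes and literature links on the card, rather than on the group, means they travel with the specimen when it is regrouped.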


An aim: the digital extended specimen (DES)

In biodiversity informatics, we have been discussing how we could build a system that supports the “Digital Extended Specimen”. Work on research-time tooling can help us make these discussions a little more concrete and fit them to activities that researchers are already undertaking. (Image credit: Jagoba Malumbres-Olarte in M. E. Mabry et al., “Monographs as a nexus for building extended specimen networks using persistent identifiers,” Bulletin of the Society of Systematic Biologists, vol. 1, no. 1, Jan. 2022, doi: 10.18061/bssb.v1i1.8323.)


Getting to our DEStination

We won’t make a fundamental shift in a single step, but we can safely transition in small steps. (Image credit: created by Scriberia with The Turing Way community. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807)


Conclusions

When we’re thinking about infrastructure, let’s remember it’s also about people and their working environments. When we’re planning future directions, let’s think about how we do research and who is involved. Longer term aims like the digital extended specimen can be discussed more meaningfully if we include people where they are now, speak a common language and show a relevant path to the destination.


Contacts and links

I’d be pleased to discuss this further - drop in a comment below, mail me at n.nicolson@kew.org or find me on Mastodon (@nickynicolson@mastodon.social) or Twitter (@nickynicolson). You can find out more about Kew’s scientific work at www.kew.org/science and the home of the echinopscis tool is echinopscis.github.io.