Digitization workflows

This module briefly covers digitization workflows and community resources.

Digitization Workflows

This video (07:20) from iDigBio on Digitization Workflows identifies five clusters (or stages) in the process of digitizing natural history collection objects using digital images. These stages can be easily adapted to other biodiversity data sources.

If you are unable to watch the embedded video, you can download it locally. (MP4 - 26.8 MB)

As the video highlights, digitization protocols vary from institution to institution, but it is essential that the chosen protocol is agreed upon, documented, and followed.

We do not teach digitization, per se, during the Biodiversity Data Mobilization course, as it could easily stand as a week-long course on its own. Instead, we focus on a basic introduction to biodiversity data capture. However, we want to provide you with resources on digitization, as we know many participants are interested in the topic.

There are many ways to organize digitization efforts, so digitization can seem daunting. It is important to remember that, in most cases, someone else has already tried to digitize the same types of specimens and objects that you are planning to digitize.

Some steps in the process may include:

  • Pre-digitization curation and staging: This includes the preparation of the data source for the digitization process, including the assignment of unique identifiers that will help to refer to the source without error and to keep all derived information together.

  • Image capture: This includes a fair amount of planning, not only for the image capture itself (e.g. definition of the work sequence, selection of adequate hardware), but also for how and where the images will be stored and handled.

  • Image processing: This includes quality control, file conversion, etc.

  • Electronic data capture: This is the core of the digitization process and includes capturing key information in a database. The video highlights that the most common method of entering the information is through a keyboard, but more and more institutions are turning to advanced data entry technologies.

  • Georeferencing: Geographical information is very important for biodiversity analysis, so digitization projects should seek to extract the most accurate geographical information possible.
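The role of the unique identifier across these steps can be illustrated with a minimal sketch. It is not taken from the video or from any institution's protocol: the `SpecimenRecord` class, its field names, and the file-naming pattern are all hypothetical, and a random UUID is only one possible identifier scheme. The point is that naming every derived file after the identifier keeps images, transcribed data, and coordinates together.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpecimenRecord:
    """Hypothetical record that ties all derived information to one identifier."""

    label_text: str
    # Assumption: a random UUID serves as the unique identifier.
    identifier: str = field(default_factory=lambda: str(uuid.uuid4()))
    image_files: List[str] = field(default_factory=list)
    latitude: Optional[float] = None
    longitude: Optional[float] = None

    def add_image(self, extension: str = "tif") -> str:
        """Name a new image after the identifier so it stays linked to the specimen."""
        sequence = len(self.image_files) + 1
        filename = f"{self.identifier}_{sequence:03d}.{extension}"
        self.image_files.append(filename)
        return filename

    def set_coordinates(self, latitude: float, longitude: float) -> None:
        """Store decimal-degree coordinates, rejecting values outside valid ranges."""
        if not -90 <= latitude <= 90:
            raise ValueError(f"latitude out of range: {latitude}")
        if not -180 <= longitude <= 180:
            raise ValueError(f"longitude out of range: {longitude}")
        self.latitude = latitude
        self.longitude = longitude


# Usage: one specimen, two images, one georeference (all values are made up).
record = SpecimenRecord(label_text="Example label text")
first_image = record.add_image()
second_image = record.add_image("jpg")
record.set_coordinates(55.7, 12.6)
```

The range check in `set_coordinates` is a very basic form of quality control. A real project would typically also record the coordinate uncertainty and the georeferencing method used.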

DiSSCo

DiSSCo (Distributed System of Scientific Collections) maintains a series of digitization guides on its community hub. There is information on how best to prepare collections for digitization, how to digitize collections, and how to then publish the associated data.

iDigBio

iDigBio (Integrated Digitized Biocollections) has produced several videos that discuss the digitization process: