Labs is back and more
Way back in time, like 2019 (!), we had the idea to do something different with Notes from Nature expeditions. Rather than transcribing or even annotating specimens that have flowers, we simply wanted your help in finding labels on herbarium sheets. Why? So we can start building a toolkit for training an algorithm to automatically find the different types of labels on a specimen and the type of text each label contains. This means that your contributions help to create a training set that will serve as a key basis for the machine learning approaches we’ll be employing. So we prototyped an expedition called Label Babel, and long story short, it works! We can use this approach for image segmentation. Even better news – we got funding to continue this line of thinking, and this new Label Babel 2.0 expedition is a successor that is smarter and better. While a smidge more work than Label Babel 1.0, the new version will help us get ALL THE LABELS.
The longer-term goal is to make better use of automated tools in order to make transcription more efficient. For example, if we can automatically identify the label and the text it contains, then we can try to have another algorithm read the text and interpret it. The hope is that we may be able to automatically transcribe certain specimens. Those specimens would then not need to be put through the same process at Notes from Nature. We would instead focus on specimens that truly need humans to see them.
All this might make you wonder if these algorithms might ever replace the need for community scientists like yourself. We don’t think that will be the case for a very, very long time, since there are so many specimens in the world that still need to be transcribed. In addition, there are lots of other kinds of tasks that community scientists can help with, such as measurements and counts, just to name a few. The fact is that we still have so much data to digitize and mobilize for a variety of uses. We are looking for ways to make that process more efficient so the data is available for anyone who wants to use it to help solve critical problems related to biodiversity.
If this sounds interesting to you, then please check out our new expedition Label Babel 2 in the Labs Project.
— The Notes from Nature Team