Label Babel 2 complete!
Label Babel 2 is complete! It actually wrapped up a few weeks ago, so we wanted to give you an update about next steps for this project. Before we do that, we want to thank each and every one of you who helped with this expedition. It was a large one and very different from our typical transcription-based tasks. We appreciate everyone’s willingness to try something new and contribute to our new project.
Label Babel focused on automatic label detection in specimen images. More specifically, the data generated from those expeditions will be used to train an algorithm to detect labels and extract them from images. We are still working through the data, so we’ll be able to say more about that part of the project and the outcomes from Label Babel 2 in the coming months.
Our next step in the larger project is to automatically pull the text out of the labels using a method called OCR (optical character recognition). OCR has been around for a long time, and this is certainly not the first attempt to apply it to biodiversity specimens. What we are striving to do is build on what has been done in the past and ultimately develop a human-in-the-loop workflow. This means we anticipate that some specimens can be transcribed automatically, but many will still require human interaction. That is where you and Notes from Nature fit into the picture. However good our new algorithm turns out to be, there are some tasks that only humans can do.
Next up will be an expedition where we ask you to look over the OCR results and tell us how well the OCR did. Please keep an eye out for that announcement.
— The Notes from Nature (and Digi-Leap) Team.