The Notes from Nature team is very excited about WeDigBio 2021. The event will take place October 14–17. We hope you join us! In the meantime, if you are interested in hosting an expedition on Notes from Nature for the event, please fill out our interest form as soon as you are able.
More about WeDigBio:
Worldwide Engagement for Digitizing Biocollections (WeDigBio) is a global data campaign, virtual science festival, and local outreach opportunity, all rolled into one. This 4-day, twice-a-year event mobilizes participants to create digital data about biodiversity specimens, including specimen slides, plants on sheets, insects on pins and more. This year you can expect lots of online events and webinars that you can join as your schedule and interest allow.
— The Notes from Nature Team
With an extensive collection of plants from Alberta and around the world, the University of Calgary Herbarium is dedicated to the collection, preservation, and documentation of plant specimens. We are excited to announce our first expedition with Notes from Nature for transcribing specimens. This expedition focuses on specimens from the Rose and Legume families collected in Alberta. From the clovers growing in the cracks of the sidewalk, to the wild strawberries found in the mountains, there is a wide range of specimens to see in this expedition.
With the help of volunteers such as yourself, data about these specimens will swiftly become available to all who share an interest in botanical specimens. We hope you enjoy your time getting to know the Roses and Legumes of Alberta and we thank you for your participation!
You can try out the new expedition on the Notes from Nature Herbarium Project.
— The University of Calgary Herbarium Team
Our next step in the larger project is to automatically pull the text out of the labels using a method called OCR (optical character recognition). OCR has been around for a long time and this is certainly not the first attempt to do this for biodiversity specimens. There are many challenges to OCR of museum specimens (e.g. different handwriting and fonts) and no single solution has emerged to resolve them. What we are striving to do is build on what has been done in the past and develop a human-in-the-loop workflow. This means that we anticipate that some specimens can be transcribed automatically, but many will still require human eyes. This is where you and Notes from Nature can be a huge help!
Next up will be an expedition where we ask volunteers to look over the OCR results and tell us how it did. Hence the name, OC – Are They Good or Not? Get it? We are all about the puns. We’ll present images of the original label and OCR output and ask volunteers to tell us what errors, if any, they see. We don’t need to know every single error letter by letter. We just need to know if there are errors and what kind they are. For example, if a word or letter is present in the original label but not in the OCR output, that is called a deletion. You may notice that some images are side by side while some are presented one on top of the other. We did this to make the images fit as well as we could within the image viewer.
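For readers curious how error kinds like these could be labeled by a program, here is a minimal sketch. It is not the workflow used in the expedition, where volunteers compare the images by eye. It assumes we already have the true label text to compare against, it compares whole words only, and the function name `classify_ocr_errors` and the terms "insertion" and "substitution" are our own additions alongside the "deletion" described above.

```python
from difflib import SequenceMatcher

# Map difflib's opcode names onto error kinds. Only "deletion" comes from
# the post; "insertion" and "substitution" are assumed companion terms.
KINDS = {"delete": "deletion", "insert": "insertion", "replace": "substitution"}


def classify_ocr_errors(label_text, ocr_text):
    """Return a list of (kind, label_words, ocr_words) for each mismatch."""
    label_words = label_text.split()
    ocr_words = ocr_text.split()
    # autojunk=False keeps difflib from ignoring frequently repeated words.
    matcher = SequenceMatcher(a=label_words, b=ocr_words, autojunk=False)
    errors = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # words that match are not errors
        errors.append((KINDS[tag], label_words[i1:i2], ocr_words[j1:j2]))
    return errors


if __name__ == "__main__":
    # A word present on the label but missing from the OCR output.
    print(classify_ocr_errors("Rosa acicularis Lindl. Alberta",
                              "Rosa acicularis Alberta"))
```

A word-level comparison like this only says that an error exists and what kind it is, which matches what the expedition asks for; it does not try to locate errors letter by letter.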
If this sounds fun to you, please head over to the Labs Project and give it a try.
— The Notes from Nature Team
The pausing of the MI-Bug project gives its creators, Erika Tucker (formerly of the University of Michigan Museum of Zoology Insect Division) and Justin Schell (Director of the Shapiro Design Lab), a chance to reflect on what’s been accomplished so far with the generous support of Notes From Nature and of the many volunteers who make such work possible!
Like so many other collections around the world, the University of Michigan Museum of Zoology (UMMZ) Insect Collection houses millions of specimens (about 4.5 million), less than 10% of which are digitized and accessible. These undigitized, and usually uncataloged, specimens represent “dark data”. This means that in order to use the millions of specimens and their associated ecological data, someone has to physically go into the collection and search for them, and know to go look for them in the first place. Often no small feat! By digitizing the specimens and their associated data, we are making the data available in a usable format and accessible to researchers around the world.
When Tucker joined the UMMZ Insect Collection, new cataloging protocols were implemented that incorporated imaging the specimens with all their labels, not just directly transcribing the specimen data into the museum database. While this takes some additional upfront time processing the specimens, it also allows the UMMZ Insect Collection to participate in Notes from Nature. By drawing on the amazing community of volunteers on Notes from Nature, an incredible amount of time is saved in the long run digitizing specimens. This has allowed the museum to mobilize many, many more important specimens and ecological records much faster than it would otherwise have been able to.
This setup of in-person specimen imaging combined with volunteer transcription has also provided a unique opportunity for volunteers to get a sneak peek behind the scenes at the museum. The Talk Board and feedback from so many talented volunteers have also made it easier for Tucker to engage with volunteers and share her knowledge of, and enthusiasm for, museum and insect related topics. In a time when the museum had to shut its doors to many of its normal visitors and volunteers due to the pandemic, this project and the interactions with its volunteers have really been essential to continuing museum productivity, as well as keeping Tucker sane.
In terms of numbers, since the launch of MI-Bug in April of 2020, more than 1400 different volunteers from across the world have contributed more than 82,000 transcriptions. Their efforts resulted in nearly 27,000 transcribed specimen labels from grasshoppers, crickets, and wasps from the Insect Division’s collection. Data from these specimens can now be included in projects like the Global Biodiversity Information Facility and contribute to research projects in Michigan and around the world!
There are additional people to thank for the success of MI-Bug so far:
- Michael Denslow of Notes From Nature, who was excited to include MI-Bug in Notes From Nature and shared his technical knowledge and experience as we developed and implemented MI-Bug
- am.zooni for her expert moderation and assistance on the Zooniverse Talk Boards
- Alexandria Rayburn, Mark Ramirez, and Tony Sexton, former School of Information students at the University of Michigan who helped with the initial development and build of MI-Bug
- Robert McIntyre, Lauren Havens, and Kat Hagedorn at the University of Michigan Library, for assistance with loading images into Zooniverse
- Max Ansorge and Amber Ma, Shapiro Design Lab Residents, for helping with data cleaning
- Peregrine Ke-Lind, Alan Ching, Chloe Weise, Yeaeun Park, Ellen James, Siena McKim, Tom Hayek, Henry Smith, Neha Bhomia, Troyer Wallance-Evan, Elizabeth Postema, and Andrea Lin, for producing the many images used in this project
While the project is temporarily paused as Erika moves on from the University of Michigan, we hope to be back with more specimens soon, so we can continue engaging with the wonderful Notes From Nature community!
— Justin Schell and Erika Tucker
Today is our 8th anniversary!
We are thankful every day for your contributions and for making this project possible, but today we are especially thankful. We wouldn’t still be here without our amazing volunteers, science partners, data providers, the Zooniverse team and of course our sponsor, the National Science Foundation, who keep the project going day after day and year after year.
Please help us celebrate 8 years of Notes from Nature by doing a few transcriptions today! We also encourage you to celebrate by taking a walk outside in your local area and seeing what kinds of plants and critters you might find!
— The Notes from Nature Team
We wanted to present some preliminary results of the Label Babel 2 expedition. The results from you all look spectacular. The vast majority of the data looks like the image below and will make excellent training data for the models.
The blue boxes in the image above are the outlines that you drew. The red boxes are the final crops for the labels, where we merged the blue boxes into a single “best” interpretation of each label. This is beautiful! There is close agreement on where and what the labels are. The only things we wanted identified as labels are outlined in blue. The tag, stamp, and ruler/color guide are not outlined, which is correct. The majority of the data looks this good.
The data from this expedition was generally great, but there were some wrinkles in the output that make it harder to find the best interpretation of the labels. Some people outlined the wrong things or nothing at all, but by far the most common problem (unique to this expedition) was the lumping of several labels into a single outline (see image below). Here, we have added a new color, green, to mark outlines that lump several labels together. Unfortunately these kinds of entries can’t be used for training the models in the next step of our process.
Next, we move on to automating the label-finding process by using the outlines you provided to train a model. After that, we will automate the classifications using the answers you provided about whether text was typewritten. We plan on using an artificial neural network that does both in one swell foop. We will use these annotations as the training data for this neural net.
We are eager to see the results and to use this data as well. We’ll give another update on the progress in a few weeks. In the meantime, we want to thank all of the participants in this expedition and to say how impressed we are with the results.
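To give a feel for how several volunteer outlines can be merged into one box, here is a minimal sketch. It is one simple possibility, not necessarily the method we use. It assumes the outlines have already been grouped so that every box in the list refers to the same label, that each box is a `(left, top, right, bottom)` tuple in pixels, and it takes the median of each edge. The function name `merge_boxes` is hypothetical.

```python
from statistics import median


def merge_boxes(boxes):
    """Merge (left, top, right, bottom) outlines of ONE label into one box.

    Assumption: the median of each edge is used, so a single stray
    outline does not drag the merged box far from the consensus.
    """
    if not boxes:
        raise ValueError("need at least one outline to merge")
    lefts, tops, rights, bottoms = zip(*boxes)
    return (median(lefts), median(tops), median(rights), median(bottoms))


if __name__ == "__main__":
    # Three made-up outlines of the same label; the third is far too wide.
    outlines = [(10, 20, 110, 80), (12, 18, 108, 82), (11, 21, 400, 79)]
    print(merge_boxes(outlines))
```

Using the median rather than the mean is a design choice in this sketch: when most volunteers agree closely, one outline that strays has little effect on the result.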
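A lumped outline can be spotted with a simple check, sketched below. This is only an illustration under stated assumptions, not a description of our actual processing: it assumes we already have a consensus box for each label on the sheet, with boxes given as `(left, top, right, bottom)` tuples, and it flags any outline that covers the center of two or more of them. The names `contains_center` and `is_lumped` are hypothetical.

```python
def contains_center(outline, label):
    """True if the center point of `label` lies inside `outline`."""
    center_x = (label[0] + label[2]) / 2
    center_y = (label[1] + label[3]) / 2
    return (outline[0] <= center_x <= outline[2]
            and outline[1] <= center_y <= outline[3])


def is_lumped(outline, consensus_labels):
    """Flag an outline that covers two or more consensus label boxes."""
    covered = sum(contains_center(outline, label) for label in consensus_labels)
    return covered >= 2


if __name__ == "__main__":
    # Two made-up labels stacked vertically on a sheet.
    labels = [(0, 0, 100, 50), (0, 60, 100, 110)]
    print(is_lumped((0, 0, 100, 110), labels))  # one outline around both
    print(is_lumped((0, 0, 100, 50), labels))   # outline around the first only
```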
— The Notes from Nature Team
Label Babel 2 is complete! It actually finished a few weeks ago, so we wanted to give you an update about next steps for this project. Before we do that, we want to thank each and every one of you who helped with this expedition. It was a large one and very different from our typical transcription-based tasks. We appreciate everyone’s willingness to try something new and make a contribution to our new project.
Label Babel was focused on automatic label detection in the specimen images. More specifically, the data generated from those expeditions will be used to train an algorithm to detect labels and pull them out of the image in the future. We are still working through the data so we’ll be able to say more about that specific part of the project and the outcomes from Label Babel 2 in the coming months.
Our next step in the larger project is to automatically pull the text out of the labels using a method called OCR (optical character recognition). OCR has been around for a long time and this is certainly not the first attempt to do this for biodiversity specimens. What we are striving to do is build on what has been done in the past and ultimately develop a human-in-the-loop workflow. This means that we anticipate that some specimens can be transcribed automatically, but many will still require human interaction. Human interaction is where you and Notes from Nature fit into the picture. However good our new algorithm turns out to be, there are some tasks that only humans can do.
Next up will be an expedition where we ask you to look over the OCR results and tell us how it did. Please keep an eye out for that announcement.
— The Notes from Nature (and Digi-Leap) Team.
The specimens you are transcribing in this expedition are a portion of the Milwaukee Public Museum’s slide collection. They were all collected by Dr. Omar Amin, a professor at University of Wisconsin – Parkside, and, along with other slides, form the basis of his work on the internal and external parasites of animals in the region.
You will notice there’s a fair amount of repetition in the collection, with a very limited pool of hosts and parasites, and perhaps wonder: why so many? A dog flea is a dog flea, after all. But there’s a lot more to unlock in these slides. First, there’s host specificity. Fleas have some host fidelity (they’re called dog fleas for a reason, after all), but collections like this can give us quantifiable information about how often those fleas pop up on other hosts. In this collection, you’ll notice several instances of squirrel fleas (Orchopeas howardi) collected from the opossum (Didelphis virginiana). Collections like this one, taken in conjunction with collections of squirrel fleas from all over the country, can help scientists work out how common it is to find squirrel fleas on opossums, or if these fleas are just freaks. We can glean additional data, too, such as whether there’s a seasonality to flea abundance (e.g., infestations are more common at certain times of year), whether male or female hosts are more likely to have parasites, and what the ratio of male to female parasites is on a given species during a given year.
Having a collection of parasite slides is essential to documenting the natural world both of today and of the past. Your digitization efforts on these slides, or on any other community science transcription project, help unlock this material for scientists, veterinarians, and public health officials.
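As a rough illustration of the kind of question transcribed records make answerable, the sketch below computes the share of a parasite's records that come from each host. The record layout (dictionaries with `parasite` and `host` keys), the function name `host_shares`, and the four sample records are all made up for the example; they are not data from this collection.

```python
from collections import Counter


def host_shares(records, parasite):
    """Return {host: fraction of this parasite's records from that host}."""
    hosts = Counter(r["host"] for r in records if r["parasite"] == parasite)
    total = sum(hosts.values())
    if total == 0:
        return {}  # no records for this parasite
    return {host: count / total for host, count in hosts.items()}


if __name__ == "__main__":
    # Hypothetical transcribed records, for illustration only.
    records = [
        {"parasite": "Orchopeas howardi", "host": "squirrel"},
        {"parasite": "Orchopeas howardi", "host": "squirrel"},
        {"parasite": "Orchopeas howardi", "host": "squirrel"},
        {"parasite": "Orchopeas howardi", "host": "opossum"},
    ]
    print(host_shares(records, "Orchopeas howardi"))
```

The same counting approach would extend to seasonality or sex ratios by grouping on a collection month or sex field instead of host, assuming those fields were transcribed.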
Visit the Terrestrial Parasite Tracker (TPT) project today to give this expedition a try.
— Julia Colby
Vertebrate & Invertebrate Collections Manager, Milwaukee Public Museum
Happy Earth Day everyone!
On this Earth Day we’d like to highlight a few expeditions from our home institution, the Florida Museum of Natural History. The museum has world-class collections, innovative research and so much more. We have three University of Florida expeditions running today that we’d love to have you try out.
- The first is Label Babel 2, which we have written about before. This expedition is currently over 70% complete and it would be great to push it over the finish line soon so we can move onto the next phase of the project.
- The second is in the ever-popular butterflies and moths project. This A Lotta Catocala expedition contains various species of underwing moths. The McGuire Center for Lepidoptera & Biodiversity contains not only an extensive scientific collection, but also a 6,400-square-foot living butterfly rainforest.
- The third is a special WeDigFLPlants expedition that features Dogwoods and relatives from the University of Florida herbarium. This one is relatively small with just over 180 images to transcribe. It would be fantastic to complete it in a single day!
On this Earth Day we’d also like to honor the NfN community for partnering with us to conserve and make available knowledge about the natural world. The NfN project gives you the opportunity to make a scientifically important contribution towards that goal every single day.
Happy Earth Day to all.
– The Notes from Nature Team
We closed out the last day of WeDigBio with over 5,400 classifications. That puts Notes from Nature at 22,067 for the entire event. We are so very thankful for your contributions and wonderful discoveries over the last several days. WeDigBio 2021 was another success and Notes from Nature is thrilled to be involved in this ongoing event.
We want to express our appreciation to everyone who contributed. Thanks to all the data providers, scientists, moderators, presenters and the Zooniverse team for keeping the system running behind the scenes. Most of all, our appreciation goes out to all the volunteers. Your contributions are sincerely appreciated and every classification that is completed brings us closer to filling gaps in our knowledge of global biodiversity and our natural heritage.
There are still lots of expeditions covering a wide variety of organisms available on our site. We hope you found the event rewarding and that you will return soon. In case you missed it, we posted a new video about Notes from Nature. Check it out and let us know what you think. It’s also on Facebook and Twitter if you want to share it and help spread the word.
— The Notes from Nature Team