Training the Machines I

Artificial intelligence. Machine learning. I’m sure you’ve heard these buzzwords; they’re all the rage in technology lately. As it turns out, they are all the rage in biodiversity informatics too. The newest Notes from Nature expedition for plant specimens from the genus Prunus – which includes many plants you’re familiar with like cherries, almonds, and peaches – is part of a larger project on machine learning led by scientists at the Florida Museum of Natural History. Over the past year, we have scored thousands of images of digitized herbarium specimens from the genera Prunus and Acer for different character states – the presence of fruits, flowers, and unfolded leaves. We used these manually scored images to train and test a machine learning algorithm to see how well it is able to identify these characters on its own.

To find out whether our machine learning algorithm scores these images better than a human can, we recruited volunteers to score images of Prunus and Acer herbarium specimens using our criteria. On average, the volunteers correctly identified flowers, fruits, and unfolded leaves more than 95% of the time. That is pretty good! This led us to wonder whether scoring by citizen scientists could create a training set comparable to the one we made by spending hours poring over thousands of specimen images.
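For readers curious how a figure like "more than 95%" might be computed, here is a minimal sketch in Python. It is not the project's actual analysis code: the data layout, the specimen names, the function names, and the use of a simple majority vote to combine several volunteers' scores are all assumptions made for illustration.

```python
# Hypothetical sketch: compare volunteer scores against expert scores.
# The trait names follow the post; everything else is an assumption.

TRAITS = ("flowers", "fruits", "unfolded_leaves")


def majority_vote(scores):
    """Combine several True/False scores; a tie counts as 'absent' (False)."""
    present = sum(scores)  # True counts as 1, False as 0
    return present > len(scores) - present


def volunteer_accuracy(expert, volunteer):
    """Per trait, the fraction of specimens where the volunteer
    consensus agrees with the expert score."""
    accuracy = {}
    for trait in TRAITS:
        matches = 0
        for specimen_id, expert_scores in expert.items():
            consensus = majority_vote(volunteer[specimen_id][trait])
            if consensus == expert_scores[trait]:
                matches += 1
        accuracy[trait] = matches / len(expert)
    return accuracy


# Made-up example data: two specimens, three volunteers each.
expert = {
    "specimen_1": {"flowers": True, "fruits": False, "unfolded_leaves": True},
    "specimen_2": {"flowers": False, "fruits": True, "unfolded_leaves": True},
}
volunteer = {
    "specimen_1": {
        "flowers": [True, True, False],
        "fruits": [False, False, False],
        "unfolded_leaves": [True, True, True],
    },
    "specimen_2": {
        "flowers": [False, True, True],
        "fruits": [True, True, False],
        "unfolded_leaves": [True, False, True],
    },
}

print(volunteer_accuracy(expert, volunteer))
```

In this made-up example the volunteers disagree with the expert on the flowers of one specimen out of two, so the flower accuracy comes out at 0.5 while the other two traits score 1.0.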


A flowering Prunus virginiana specimen from the University of Wisconsin Madison herbarium.

If this effort proves to be a success, crowdsourcing could be a great way to extend machine learning to new groups of plants! This could expand datasets about plant phenology – the study of the timing of life cycle changes in plants – at a more rapid pace than is possible right now. The phenology of plants is known to be closely linked to the environmental conditions that plants experience. As the Earth’s climate changes in new ways, these changes may affect plant species differently, depending on where they are found and on the characteristics of each species. To fully understand how plant phenology is changing under current climate change, though, we need a stronger understanding of plant phenology in the past. Herbarium specimens inherently carry phenological information, but it is not easily accessible in a usable format for researchers. Help us learn more about plant phenology and machine learning methods by participating in this Prunus phenology expedition!


If you are unsure what criteria we are using to score flowers, fruits, or unfolded leaves, click the “need some help with this task?” link (outlined here in red) to view the volunteer handbook.

For Notes from Nature volunteers who have participated in plant phenology expeditions in the past, it is important to note that our criteria for scoring the presence of flowers, fruits, or unfolded leaves may differ from previous expeditions. We encourage all volunteers for this expedition to view the help materials for these scorings by clicking the “need more help with this task?” link on the scoring page. This will undoubtedly help you make more accurate decisions! Also, you may come across some specimens where it is very hard to tell whether a specific trait is present or absent. Don’t stress, and just make your best possible guess! Happy scoring!

Laura Brenskelle (naturalista), University of Florida

