Taxonomy and Notes From Nature

A few volunteers have recently asked some questions about taxonomy in Notes From Nature. This seems to be a big question that comes up as part of the herbarium interface since this is one of the two collections where volunteers are asked to transcribe the scientific name that is present on the label.

Most of the questions and comments are about accepted versus unaccepted names. Before we get into that issue, let’s touch on the task that is being completed. Volunteers are asked to transcribe the scientific name (usually just genus and specific epithet) as it appears on the label, without the authorship of the name. The genus and specific epithet are transcribed into the “Scientific Name” field. Here is an example:

Saccharum giganteum (Walter) Pers.

Saccharum is the genus.

giganteum is the specific epithet, sometimes called the species name.

(Walter) Pers. is the authorship for the name.

The authorship can be left off, since this saves time and it can usually be looked up automatically by querying existing databases. Another potential complication is that authorship is sometimes abbreviated and sometimes not. For example, “L.” is the same as “Linnaeus.”
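As a rough illustration of the split described above, here is a minimal Python sketch. It assumes a plain two-word name followed by optional authorship; the function name is hypothetical and is not part of any Notes From Nature tool.

```python
def split_scientific_name(label_name):
    """Split a label name into (genus, specific epithet, authorship).

    Assumption: the first word is the genus, the second is the specific
    epithet, and anything left over is the authorship.
    """
    parts = label_name.strip().split(maxsplit=2)
    genus = parts[0] if len(parts) > 0 else ""
    epithet = parts[1] if len(parts) > 1 else ""
    authorship = parts[2] if len(parts) > 2 else ""
    return genus, epithet, authorship


genus, epithet, authorship = split_scientific_name("Saccharum giganteum (Walter) Pers.")
# Only the genus and specific epithet go into the "Scientific Name" field.
scientific_name = f"{genus} {epithet}"
```

Here `scientific_name` holds only the genus and specific epithet, and the authorship is kept separately in case it is needed later.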

Databases such as ITIS list plant names as being ‘accepted’ or ‘not accepted’. This terminology is a bit confusing for a few reasons. First, accepted names are neither static nor absolute; they are open to different opinions by different experts, and what is accepted today may be different from what is accepted a year from now. These differences usually reflect new studies or information about the relationships among different taxa. The other issue is that one source may accept one name while another accepts a different one. Here is an example:

Saccharum giganteum is accepted by ITIS, but not by Weakley’s Flora. Weakley considers Saccharum giganteum to be a synonym of Erianthus giganteus, while the opposite is true for ITIS. Both sources agree that the two names exist, but they have different opinions about which is currently accepted.

Saccharum giganteum, Sunnybell Prairie, Coosa Valley Prairies, Floyd County, Georgia

We are very excited to see folks doing some research about these names. At Notes From Nature we strongly encourage our volunteers to learn more about the work that we do and hope that everyone learns something about museums and biodiversity as part of the process. Below are a few links where volunteers can look up more information about the different taxa that they encounter, but there is no need to include that information in the transcription or Talk page. However, there are at least two exceptions to this. The first is the discovery of a misspelling or typo. Any scientific names that you discover to be misspelled should be corrected in the transcription. The other is if you have a question or concern, have noticed some other oddity on the label, or just want to chat about something you have seen.


Flora of North America:

The Integrated Taxonomic Information System (ITIS):

Encyclopedia of Life:

The PLANTS Database:

Notes From Nature volunteer Mr Kevvy has generated a very useful set of custom dictionaries. They can be found here:

Grow your piece of the patchwork! Every little bit helps.

Each square of this colorful patchwork represents a Notes From Nature volunteer who has contributed transcriptions to the Herbarium project. The size of the square corresponds to the number of transcriptions done by the individual. Some folks, like those in the top left corner, have transcribed thousands of herbarium specimens and those in the lower right have completed a few. The figure represents 188,184 transcriptions and it would not be complete without the efforts of each of the 3,805 volunteers. How many transcriptions does it take to get to the largest box, you might be wondering? 18,782!
NFN Tree Map
Visualizations like this can highlight the fact that some people get really into transcribing! What keeps you coming back to do more?
Grow your piece of the patchwork and transcribe a few herbarium specimens today!
This is a guest post by Libby Ellwood, a Postdoctoral Fellow at Florida State University in Tallahassee, Florida, U.S.A.
Special thanks to Jessica Luo for the R code to create the treemap.

We are very excited to announce our next set of herbarium images!

These specimens come from Southeastern Louisiana University located in the southeastern United States. Wow, that’s really southeastern! Southeastern Louisiana University is a medium sized university which has an herbarium housed in its Biology department. These kinds of small to medium sized collections are fairly common around the southeastern United States. The SERNEC project has estimated there to be over 230 others! We are actively working to get them all digitized so that the data can be made available to anyone that wishes to use it.

One thing that is unique about this set of images is that it contains almost every specimen housed in this collection. That means that once this set of images is transcribed the whole collection will be digitized.

Small collections such as this one play a critical role in the documentation of biodiversity. Most small to medium sized museum collections house specimens primarily from their local area, since this is the most common place that curators and students go to collect. This means that each of these small collections makes a unique contribution to our knowledge of the local biodiversity by filling in important gaps.

The region where this collection is located is also one of the top biodiversity hotspots in the United States and is home to nearly 3,000 species of native plants.

Notes from Nature Profile: New Team Member Raphael LaFrance

Notes from Nature is SUPER EXCITED to introduce Raphael (Rafe) LaFrance, who is working on Notes from Nature in a part time role to help out with some needed improvement to the NFN interfaces and general usability.  YAY! More about Rafe below.  Also, again, thanks to our volunteers for sticking with Notes from Nature, and hoping that you’ll soon see improvements with Rafe now on board.


Name: Rafe LaFrance

Title: Informatics Specialist

Where do you work primarily?  I work at the University of Florida’s Museum of Natural History (Prof. Guralnick’s lab) in the field of biodiversity informatics. We are currently using computers, field data, and museum data to track where organisms are and have been in the environment.  We also track how organisms respond to changes in their environment over time.  All of this is done with an eye towards the value of data for decisions in a policy and management framework.

What do you do in your day job?  I have a few roles. I help design and program a couple of web sites related to the Museum’s research.  I also help with the preparation and analysis of research data.  And I also provide general IT and programming support when that is needed.

What’s your role with NfN and what do you hope to gain from it or accomplish?  If relevant, how will your research benefit?  Notes from Nature is one of the web sites that I help develop.  One goal I have for NfN is to continue the tradition of listening to the needs of the citizen and research scientists and make improvements to the Notes from Nature web site based upon those needs.  I have already heard several ideas that will make NfN more fun and easier to use.  I hope to get them to you ASAP.  Another goal is to streamline the process of getting the images and data to the citizen scientists so they can continue to make the highest quality contributions to science at their typical brisk pace.  And finally, I want to find new and useful ways to present the citizen scientists’ results back to the research scientists.  All of which is a long-winded way of saying that I want to help push the envelope of what the collaboration between citizen and research scientists can accomplish.

What’s the most exciting aspect of citizen science work from your point-of-view?  Like most of us here in the Zooniverse, I have a keen interest in science and would love to help in the research.  Well, here it is!  Research scientists need this data; it contains vital details needed for their research.  Not only are we making real contributions, we’re doing it at a rate that couldn’t be matched by the researchers alone.  We get to (and are encouraged to) comb through the hidden archives of museums.  We see things that most museum goers don’t get a chance to see, and we get to talk with top-notch researchers about their data.  I started off being curious about biology, but doing citizen science has not only increased my enthusiasm for science in general, it has sharpened my appreciation for the process of science.  I now have a better understanding of what questions research scientists actually ask.  How do they go about answering them?  What data do they need to arrive at the answers?  From my point of view, it has made me appreciate science even more than I already did.

A great shout-out to our volunteers!

An article was just published which highlights the wonderful work that our volunteers have been doing.

It is called Citizen Volunteers Pitch in on Digitization Backlog and is in the journal BioScience. Once again a sincere thanks for the enormous efforts of our volunteers!

Updated FAQ and Useful tools: Herbarium Interface

The following is an updated FAQ that includes the topics covered in our first Notes From Nature FAQ post. We are most thankful to our dedicated volunteers who made great suggestions to improve and clarify some important issues. The discussion and suggestions can be found here:

Note that this FAQ only covers issues related to the herbarium interface (SERNEC). We will be developing specific FAQs for all the Notes From Nature interfaces over the coming months.

1.) Interpretation: In general, you should minimize interpretation of open-ended fields and enter information verbatim. This way, we can better achieve consensus when checking multiple records against one another (see below, on that process). However, some discretion would be nice. Here are examples:

Interpretation that you should make: Simple spacing errors (e.g. “3miN. of Oakland” should be “3 mi N. of Oakland”)

Interpretation you should leave to us: Don’t interpret abbreviations; we’ll sort that out (e.g. “Convict Lk.” should remain “Convict Lk.”).

2.) Not in English: Transcribe exactly as written. Match label content to transcription fields as best as you can. Non-English labels should be rarely encountered in the herbarium interface, but may occasionally occur.

3.) Abbreviations: Transcribe exactly as written.

4.) Spelling mistakes: Transcribe exactly as written, unless you have looked it up and are absolutely certain of a simple spelling mistake. In this case, you can enter the correct spelling.

5.) Problem records: If you come across a problem record that may need to be addressed by a scientist, like a faulty image or a record with illegible handwriting, you can flag the record by commenting on it (e.g. with the hashtag #error) and indicate what is in error. Note that the hash tag #scientist is also frequently used for this purpose.

6.) Provinces: Geographic provinces (e.g. Coastal Plain, Piedmont) should go into the Location field.

7.) Capitalization: Sometimes information may be in all capital letters on the labels. Unless this is an abbreviation, you should capitalize only the first letter of every word in your transcription (e.g. “COASTAL PLAIN PROVINCE” should be “Coastal Plain Province”).
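The capitalization rule above can be sketched in a few lines of Python. This is only an illustration with a hypothetical function name; it cannot tell an abbreviation from an ordinary word, so that judgment still rests with the transcriber.

```python
def normalize_all_caps(text):
    """Convert all-capital label text so only the first letter of each word is capitalized.

    Assumption: the text is not an abbreviation. Deciding that is left
    to the person transcribing the label.
    """
    if not text.isupper():
        return text  # mixed-case text is left exactly as written
    return " ".join(word.capitalize() for word in text.split())


province = normalize_all_caps("COASTAL PLAIN PROVINCE")
```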

8.) Many collectors: In many cases, collectors may be listed on different lines of the label with no punctuation separating them. In your transcription, please separate the collectors with commas.
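For anyone preparing text outside the interface, the same idea can be expressed as a short Python sketch. The function name and the sample names are illustrative assumptions, not taken from a particular label.

```python
def join_collectors(label_lines):
    """Join collector names that appear on separate label lines with commas."""
    names = [line.strip() for line in label_lines.splitlines() if line.strip()]
    return ", ".join(names)


# Illustrative input: two collectors printed on separate lines of a label.
collectors = join_collectors("R. Kral\nR. Carter\n")
```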

9.) Missing information: What should you do when there is no information available for a field? When information is not given on the label, you should leave the field blank (in the case of open-ended fields) or select “Unknown” or “Not Shown” in the drop-down lists.

10.) Inconsistent collector names: You will often find several variations of the same collector name (e.g. “R. Kral” or “R.Kral”, “RWG” or “R.W.Garrison”). We are asking for the collector names to be typed as written. This is a somewhat complicated issue, since some collector names might appear to be very similar but don’t always refer to the same person. It can take knowing a lot about the collector and where they deposited specimens to be able to make a definitive decision.

Interpretation that you should make: Simple spacing errors (e.g. “R.Kral” should be “R. Kral”)

Interpretation you should leave to us: Don’t interpret abbreviations, we’ll sort that out. (e.g. “RWG” should remain “RWG”)
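The difference between a spacing fix and an interpretation can be sketched in Python as follows. This is a deliberately conservative sketch under our own assumptions: it only adds a space where a period is directly followed by what looks like the start of a surname, and it never expands initials.

```python
import re


def fix_collector_spacing(name):
    """Insert a missing space after a period that directly precedes a surname.

    Assumption: a surname starts with a capital letter followed by a
    lowercase letter. Initials such as "RWG" are never expanded.
    """
    return re.sub(r"\.(?=[A-Z][a-z])", ". ", name)


fixed = fix_collector_spacing("R.Kral")
unchanged = fix_collector_spacing("RWG")
```

Under the same assumption, “R.W.Garrison” would become “R.W. Garrison”, with the initials themselves left as written.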

11.) Many scientific names: For SERNEC Herbarium specimens, copy only the most recent name. This can be determined based on the date that appears on the ‘annotation label.’ If you do not see a date then enter the name that appears on the primary label.

When the latest determination uses an abbreviation for the genus name, because the genus is the same as the previous/original determination, the genus name should be written out in full.

The “determination label” or later added determination information should have everything spelled out, however this is not always the case. If the first letter is the same it is safe to assume the same genus is being used. For example, J. marginatus would = Juncus marginatus and “Juncus” would be written out.
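The genus rule above can be written as a small Python sketch. The function and parameter names are hypothetical, and the sketch only expands a one-letter abbreviation when it matches the first letter of the earlier genus.

```python
import re


def expand_genus(latest_name, previous_genus):
    """Write out an abbreviated genus using the genus from the earlier determination.

    Assumption: the abbreviation is a single capital letter plus a period,
    and it is only expanded when it matches the earlier genus's first letter.
    """
    tokens = latest_name.split()
    if (tokens and re.fullmatch(r"[A-Z]\.", tokens[0])
            and previous_genus.startswith(tokens[0][0])):
        tokens[0] = previous_genus
    return " ".join(tokens)


full_name = expand_genus("J. marginatus", "Juncus")
```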

12.) Varieties and subspecies: Record the variety or subspecies, but omit the scientific author’s name. So “Cyperus odoratus var. squarrosus (Britton) Jones, Wipff & Carter” becomes “Cyperus odoratus var. squarrosus”. “Echinodorus cordifolius (Linnaeus) Grisebach ssp. cordifolius” becomes “Echinodorus cordifolius ssp. cordifolius”.
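The two examples above can be reproduced with a short Python sketch. It assumes the name starts with the genus and specific epithet, and that the variety or subspecies is introduced by one of a few rank markers; “subsp.” is included as an assumed alternative spelling that does not appear in the examples.

```python
# Rank markers that introduce a variety or subspecies.
# "subsp." is an assumed alternative to the "ssp." used in the examples.
RANK_MARKERS = {"var.", "ssp.", "subsp."}


def strip_authorship(name):
    """Keep genus, specific epithet, and any variety/subspecies; drop author names."""
    tokens = name.split()
    kept = tokens[:2]  # genus and specific epithet
    # Look for a rank marker that still has a word after it.
    for i, token in enumerate(tokens[2:-1], start=2):
        if token in RANK_MARKERS:
            kept += [token, tokens[i + 1]]
            break
    return " ".join(kept)


variety = strip_authorship("Cyperus odoratus var. squarrosus (Britton) Jones, Wipff & Carter")
subspecies = strip_authorship("Echinodorus cordifolius (Linnaeus) Grisebach ssp. cordifolius")
```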

13.) Scientific name: Provide the most recent name, whether it is a species name (a two-word combination of the genus and what is called the “specific epithet” in botanical nomenclature) or a one-word name that is at a higher taxonomic rank (e.g., just the genus or family name). Names at higher taxonomic ranks than species are used when a more precise identification has not been made.  The species name should typically take the form of a genus name that begins with a capital letter and a specific epithet that begins with a lowercase letter.  If any of the names are given in all capitals, such as “CYPERUS ODORATUS”, the name should be entered using the typical convention, “Cyperus odoratus” in this case.
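The capitalization convention for names can be sketched in Python like this. The function name is hypothetical, and the sketch assumes the authorship has already been left off, since it would otherwise lowercase author names too.

```python
def normalize_name_case(name):
    """Capitalize the genus and lowercase everything after it.

    Assumption: the input holds only the name itself (no authorship).
    """
    tokens = name.split()
    if not tokens:
        return ""
    return " ".join([tokens[0].capitalize()] + [t.lower() for t in tokens[1:]])


typical = normalize_name_case("CYPERUS ODORATUS")
```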

14.) Latitude and Longitude: How do you enter latitude and longitude values, and where do these values go? Enter them exactly as written (e.g. 33° 62’ 22” N  116° 41’ 42” W). You can find symbols in Word or by searching online. You can also produce the degree symbol ° using key combinations (Option + Shift + 8 on a Mac; Alt + 0176 on a PC, using the number pad on the right side of your keyboard). This information should go into the “Location” field.

15.) Special Characters: What should you type when there is a special character in a text string, such as a degree symbol or language-specific characters? You can do a Google search for the symbol or copy and paste it from Microsoft Word symbols. There are also key combinations for common symbols. As mentioned above, you can produce the degree symbol ° using key combinations (Option + Shift + 8 on a Mac; Alt + 0176 on a PC, using the number pad on the right side of your keyboard).

16.) Elevation: Enter elevation verbatim into the “Habitat and Description” field.

17.) County: If the county is not given on the label, please find the appropriate county using google search or other tools highlighted below. However, if there are multiple potential counties for a locality, please leave the county field blank.

18.) Checking your transcription: You can use the link to the left of the “Finish Record” button (e.g. “1/9” or “9/9”) to check the information that you entered. Just click on any of the fields to make any necessary edits to your transcription.

19.) When is a record finished?: These blog posts describe the data checking process that uses 4 transcriptions of the same record to derive a consensus.
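The blog posts mentioned above describe the actual process. Purely as an illustration of the general idea of deriving a consensus from several transcriptions of one field, here is a hypothetical plurality-vote sketch in Python. It is our own simplification and is not the method Notes From Nature uses.

```python
from collections import Counter


def consensus_value(transcriptions):
    """Return the most common non-blank value, or None when there is no clear winner.

    Assumption: a simple plurality vote after trimming whitespace. Ties
    are left unresolved so that a person can review them.
    """
    counts = Counter(t.strip() for t in transcriptions if t.strip())
    if not counts:
        return None
    (value, votes), *rest = counts.most_common(2)
    if rest and rest[0][1] == votes:
        return None  # tie between the two most common values
    return value


# Four hypothetical transcriptions of a county field for the same record.
county = consensus_value(["Floyd", "Floyd", "Floyd ", "Flloyd"])
```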

20.) Question: Should powerlines go in the location (because it helps you find a place), or habitat (because they imply a more open space and different microclimate)? Example:


This should go in the Habitat field. It could help narrow down a location, but it says more about habitat where the plant was growing.

21.) Question: What do you enter when a record has two different counties? Example:


This doesn’t happen very often. It usually indicates that the collector wasn’t entirely sure which county they were in e.g. at the boundary between the two. When you encounter this, I would suggest going with the first county listed.

I did do a bit of sleuthing and in this case I think the collectors were trying to indicate that they were on the county line. The Flint River does have a road crossing near the Spalding / Fayette County line.

22.) Question: What do you enter when a record has two different dates?


You should enter the first date only. This is also very uncommon on herbarium labels, so we chose to collect only one date.

23.) Question: On this record, would you rather have the scientific name as ‘unidentified’ or as the supposition?


This is a tough one! I can tell that the original collector (Carter) and the annotator (Kral) agree that it is in the genus Rhynchospora, but they just can’t get any further than that. Ideally you would just enter “Rhynchospora”, but leaving it blank (skipping it) would be acceptable. If the scientific name is blank or can’t be figured out then it should be skipped.

24.) Question: If “s.n.” (sine numero = no number) is listed as the Collector Number, is it better to leave the field empty or actually put “s.n.” in it?


It is recommended to leave it blank, since ideally we would just have actual numbers in that field. Also, many people, experts and non-experts alike, don’t know what s.n. refers to.

25.) Question: Should “floodplain” be in Habitat? I’m inclined to put it there as it describes a growing condition as floodplains are fertilized when flooded, other plants drowned, etc.


Yes, please put it into the habitat field.

26.) Question: What is the convention for transcribing a date range as opposed to one specific day? (ie first, last or midway through the range)


Enter the first date only. See also #22 above. It is worth noting that the convention in other collecting disciplines (e.g. insects and CalBug) is to take a range of dates, but this is not the case for herbarium specimens.

27.) Question: If a specimen is cultivated at one location from cuttings/seeds/rhizomes collected at a second location, which should be the transcribed country/state/county/location, the first or second?


Enter the place where it was actually collected. In this case the cultivated place. I haven’t seen the label, but it is likely a good idea to indicate the cultivated information in the habitat field.

28.) Question: Although we transcribe only the latest determination if there are multiple, should we also transcribe multiple synonyms in the same determination if they are listed, or just the first? (ie “Cyperus echinatus [=C. ovularis]”)


No. There is no need to add the synonyms, just enter the first or primary name. In this case “Cyperus echinatus.”

29.) Question: Should we also transcribe multiple collector numbers as written? ie “123 & 4567” (Probably an obvious “yes” but isn’t formally in the Standards.)


This could indicate that each collector gave the specimen a number in the field. This is an uncommon practice and even when it happens it doesn’t go on the same label. In this case, I suggest entering it exactly as is.

30.) Question: Should we transcribe location information that is printed into the template of the label rather than being added? (such as “Plants of the Great Dismal Swamp” or “Flora of Fort…” etc.)


This is a bit of a judgment call, but in general the answer is yes if it is not indicated elsewhere. For example you often see “Plants of North Carolina” and the state is also indicated as North Carolina. In this case, the template really doesn’t give us any new information and it should not be entered. One should also be careful of institutional templates. For example, “Herbarium of Florida State University, Tallahassee.” Labels could have the name of a museum in Florida, but the specimen could be collected in Virginia.

31.) Question: Should we transcribe “Collected as part of a survey…” and other info that doesn’t relate to this specimen per se?


No. We do not expect you to transcribe this information. While it is interesting and potentially important we are also interested in keeping the process efficient and not overly time consuming.

32.) Question: Should we transcribe “sheet # of #” or other information indicating that this specimen is part of a set, but again is not just about this one per se?


No. We do not expect you to transcribe this information.

33.) Question: Should we transcribe re-examination? ie “This specimen was examined as part of a study of…” that occurs years after the original label.


No. This is part of a series of information that relates to annotations of the specimens. It is not considered to be core information that we are trying to collect.

34.) Question: Should we transcribe personal comments that clearly have nothing to do with the specimen? (Thinking Philip E. Hyatt here for some reason).


No. See #33 above which covers a similar issue. But if you find something awesome, interesting, etc. please post it in the talk forum!

35.) Question: If a word is hyphenated across two lines, do we remove the hyphen and join it? (Not including hyphenated word pairs of course. This is probably also an obvious “yes” but should be in the Standards formally.)


Yes, please remove the hyphen.
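The hyphen rule can be sketched in Python as below. This is a minimal sketch with a hypothetical function name. It cannot tell a word split across lines from a genuine hyphenated pair that happens to break at its hyphen, so those cases still need a human eye.

```python
import re


def join_hyphenated_line_breaks(label_text):
    """Rejoin words that were split with a hyphen at the end of a line.

    Assumption: any hyphen immediately before a line break marks a split
    word. Hyphens in the middle of a line are left untouched.
    """
    return re.sub(r"(\w)-[ \t]*\n[ \t]*(\w)", r"\1\2", label_text)


habitat = join_hyphenated_line_breaks("flood-\nplain forest")
```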

36.) Question: Should we transcribe Habitat/Description (or other specimen-relevant) info in later, separate determinations? (sometimes the person who made it adds a comment with further info about the specimen, i.e. its condition or maturity.)


Yes. If the annotation clearly contains information added by the collector that fits into one of the fields then add it.


Some Useful Tools (discovered or developed by Notes From Nature users)

Counties and Cities: Lists on Wikipedia are good tools for finding counties and similar information; there are lists of municipalities for each state of the USA (there are also similar lists for others). For example, (via the linkbox you can also change the state).


Uncertain Localities: Geographic Names Information System, U.S. Geological Survey.

Mapping tool with topo quads: To find uncertain counties or localities

Collector Names: Harvard University Herbarium maintains a database of collectors. Note that many collectors that are encountered may not be in this database.

Hard-to-read text: Use “Sheen”, the visual webpage filter, for some hard-to-read handwriting written in pencil. (Tip was from the War Diary Zooniverse project)

Special symbols: You should be able to find symbols in Word, or by doing a Google search and copying and pasting. Here are a few:

– degree symbol for coordinates:  °

– plus minus: ±

– fractions: ⅛ ¼ ⅓ ⅜ ½ ⅝ ⅔ ¾ ⅞

– non-English symbols: Ä ä å Å ð ë ğ Ñ ñ õ Ö ö Ü ü Ž ž

The Plant List: Search for scientific names of plants –

List of Trees:

Integrated Taxonomic Information System (ITIS):

Mr Kevvy has generated a very useful set of custom dictionaries. They can be found here:

These dictionaries are a wonderful resource. It should be noted that scientific names can have gender-based differences. You will see the specific epithet (commonly called the “species name”) spelled differently depending on whether the genus is masculine or feminine. For example, albiflora is feminine and albiflorus is masculine. The Carolina-poppy is Argemone albiflora (not albiflorus). Both albiflora and albiflorus are correctly spelled, but in this case albiflorus should never be used with the genus Argemone.

Simultaneous Transcription Blitzes a Success!

This is a cross-post with iDigBio and authored by Austin Mast, Richard Carter, and Libby Ellwood.

On Saturday, March 28, from 8–noon, 31 volunteers gathered in computer labs at Valdosta State University (participants shown above) and Florida State University (participants shown to left) for a transcription blitz benefiting the VSU Herbarium and FSU’s Robert K. Godfrey Herbarium.  The blitz used two online transcription platforms: Zooniverse’s Notes from Nature and the Botanical Society of Britain and Ireland’s Herbaria@home.  In the end, 1748 transcriptions were completed, an average of 56 per person for the 4-hour event. This was the second in what is planned to be a series of digitization blitzes coordinated by iDigBio, the Southeastern Regional Network of Expertise and Collections Thematic Collections Network, and FSU’s Robert K. Godfrey Herbarium.  The FSU images that were transcribed had been generated during the first of these, an imaging blitz in September 2014.  Prizes and event-branded thank-you gifts for the blitz were paid for by the contributors to the successful December 2014 crowdfunding campaign by FSU’s herbarium. Richard Carter (VSU), Austin Mast (FSU) and Libby Ellwood (FSU) served as local organizers.

Each location began the blitz with a 30-minute introduction to the local herbarium and the importance of specimens to research and education, as well as an orientation to the transcription platform.  After participants gained familiarity with the procedure, the organizers introduced one of four games that they used to prompt the participants to think more deeply about what they were seeing in the specimen images.  These games included Habitat Bingo (shown on right), Morphology Bingo, Timeline Tracker, and Geo Locator, which required participants to use habitat terms, morphology, collection dates, and collection localities, respectively.  The organizers introduced a new game every hour, and small prizes were given out to multiple winners for each game.  FSU also ran a raffle for a native blueberry plant, and transcribers received one raffle ticket for every five specimens transcribed.  The remaining prizes were raffled off as well at the end of the event.  Organizers provided a coffee break midway through the event.

In FSU’s formal post-event survey, 60% of participants reported a greater familiarity with biodiversity research specimens and their use in research and education; 100% agreed or strongly agreed that biodiversity research collections merit public funding; and 73% strongly agreed that they enjoyed the transcription blitz and would participate again.  The most popular game was Geo Locator, with 87% finding it enjoyable or very enjoyable; each of the other games was enjoyed by 67% or more of the participants.  Half of the FSU participants had not visited FSU’s Robert K. Godfrey Herbarium in the past; 67% had not transcribed specimen labels previously; and 67% had not previously participated in a citizen science project.  VSU ran an informal post-event survey, and VSU’s participants particularly liked the bingo games. We do not yet have long-term data on the “stickiness” of the experience, but we know that at least two participants continued transcribing specimens online in the first 24 hours after the blitz.

Organizers learned a few lessons with this first transcription blitz.  First, the event produced progress in digitization for the herbaria and built community support for the collections.  Second, the event was easier to host than FSU’s 2014 imaging blitz, since it only required access to computers and browser software to do the digitization on the day of the event, rather than relatively complicated imaging stations.  Third, both transcription platforms were dependable—we engaged two in case of technical difficulties with one during the event (e.g., if high traffic led to a server crash).  However, orienting the group to the second one mid-morning took time away from transcribing.  It would be most efficient to just use one platform and hold the other as a backup.  Fourth, the games were a high point for many of the participants and did not result in any drop in the rate of transcription relative to online, distributed participants.  In fact, onsite participants had a higher rate of transcription.

Herbarium transcriptions on Notes from Nature average 3.65 minutes per specimen when done by distributed participants, and, if one takes into account the 45 minutes used for orientation and break, onsite participants averaged 3.48 minutes per specimen.  Fifth, orienting participants to each new game took time away from transcribing.  Two games played over a longer period would have fit more comfortably into the schedule.  Sixth, a few subjects are critical for the training: the parts of a scientific name and what should be entered (e.g., should the author or varietal name be included), what an annotation label represents, how to identify the extent of a habitat description and a locality description, and the meaning of “s.n.”  And finally, there is room for development of more structured interactions between simultaneous events.  The two computer labs were projected onto the screen (shown to right) and participants waved to each other, but otherwise activities at the two sites were independent of one another.

Development of these cross-site interactions will become more important with the Worldwide Engagement for Digitization of Biocollections (WeDigBio) event planned for October 2015.  WeDigBio has the goal of recruiting organizers for dozens of these events during a four-day period.  The next transcription blitz at FSU will be in conjunction with the Florida Native Plant Society annual meeting on Friday, May 29, from 8–noon.

