Updated FAQ and Useful tools: Herbarium Interface (version 3)
The following is an updated FAQ that includes the topics covered in our previous Notes from Nature FAQ post.
We are most thankful to our dedicated volunteers, who not only made great suggestions to improve and clarify some important issues, but have also completed over 490,000 transcriptions, the majority of which are on the Herbarium interface. These transcriptions are now being added to the SERNEC project portal on a regular basis. After that, they may also be picked up by “aggregators” such as GBIF and iDigBio.
Note that this FAQ only covers issues related to the Herbarium interface. While this FAQ will cover the majority of Herbarium expeditions, we always recommend reading the tutorial and help text when starting a new expedition.
1.) Interpretation: In general, you should minimize interpretation of open-ended fields and enter information verbatim. This way, we can better achieve consensus when checking multiple records against one another (see below, on that process). However, some discretion is welcome. Here are examples:
Interpretation that you should make: Simple spacing and capitalization errors (e.g. “3miN. of oakland” should be “3 mi N. of Oakland”).
Interpretation you should leave to us: Don’t expand abbreviations; we’ll sort those out (e.g. leave “Convict Lk.” as written).
2.) Non-English text: While we are currently focused on English language labels, on occasion you may encounter labels in other languages. Transcribe these exactly as written (do not translate to English). Match label content to transcription fields as best as you can. There is a helpful list of common accent marks later in this document.
3.) Spelling mistakes: Transcribe exactly as written, unless you have looked it up and are absolutely certain of a simple spelling mistake. In this case, you can enter the correct spelling. When you make a correction, please use the Done&Talk button to add a comment describing the change; it’s also recommended that you provide a reliable web citation for the change if it’s anything other than a spelling correction of a common word. You can include #error or another relevant hashtag in your comment to flag the type of correction you made.
4.) Problem records: If you come across a problem record that may need to be addressed by a researcher or another member of the project team (for example, a faulty image), you can flag the record by commenting on it with #error or another relevant hashtag.
5.) Capitalization: Sometimes information may be in all capital letters on the labels. Unless this is an abbreviation, you should capitalize only the first letter of every word in your transcription (e.g. “COASTAL PLAIN PROVINCE” should be transcribed as “Coastal Plain Province”).
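For anyone who wants to check their own entries with a script, the capitalization rule above can be sketched roughly as follows. This is only an illustration, not part of the Notes from Nature tools: the function name and the keep_upper set of abbreviations are assumptions, and deciding what counts as an abbreviation still needs a human eye.

```python
def normalize_all_caps(text, keep_upper=frozenset()):
    """Convert an ALL-CAPS label string to first-letter capitals.

    keep_upper is a hypothetical, caller-supplied set of abbreviations
    (for example {"USA"}) that should stay in capitals.
    """
    # Mixed-case or lowercase text is left alone: it is entered verbatim.
    if not text.isupper():
        return text
    words = []
    for word in text.split(" "):
        if word in keep_upper:
            words.append(word)
        else:
            # capitalize() uppercases the first character of the word
            # and lowercases the rest.
            words.append(word.capitalize())
    return " ".join(words)


print(normalize_all_caps("COASTAL PLAIN PROVINCE"))  # Coastal Plain Province
```

Text that is not entirely in capitals is returned unchanged, which matches the general advice to transcribe verbatim.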
6.) Multiple/conflicting information: Some labels may have more than one instance of a piece of information, such as:
- Scientific names: For Herbarium specimens, transcribe only the most recent name. This can be determined based on the date that appears on the “annotation label”. If you do not see a date, then enter the name that appears on the primary label. The “determination label” or later added determination information should have everything spelled out; however, this is not always the case. If the first letter is the same, it is safe to assume the same genus is being used. Here is an example. In the case of the linked image, “D.” abbreviates “Dryopteris”, so you would enter “Dryopteris intermedia”.
- Collectors: In some cases, collectors may be listed on different lines of the label with no punctuation separating them. In your transcription, separate the names with commas. Transcribe the collector names as shown on the label, including honorifics (Mrs., Dr.). It isn’t uncommon for museums to have individual ways of entering collectors’ names so it is always best to review the help text for this specific field.
- Collector numbers: e.g. “123 & 4567”. This could indicate that each collector gave the specimen a different number in the field. This is an uncommon practice, and even when it happens the numbers usually don’t go on the same label, but if you find one it should be entered exactly as is.
- Dates or date/day ranges: You should enter the earliest date listed only. Multiple dates are uncommon on herbarium labels, so in most expeditions we choose to collect only one date. It is worth noting that the convention in some other collecting disciplines (e.g. insects in CalBug) is to record a range of dates, but this is not the convention for herbarium specimens.
- Locations: If a specimen is cultivated at one location from cuttings/seeds/rhizomes collected at a different location, enter the place where the specimen was cultivated in the Location field and enter the place where the seeds were collected in the Habitat and Description field.
7.) Missing information: When information in a Herbarium field is not given on the specimen label, you should leave the field blank (in the case of text entry fields) or select “Unknown” or “Not Shown” in the drop-down lists. If information on the specimen label has been verified to be missing from a Herbarium field dropdown list, please advise with a Talk post. There are two well-known caveats for this:
- Dade County (Florida, United States) appears to be missing from the County list, but it is present and should be transcribed as “Miami-Dade” (it was renamed in 1997).
- Ivory Coast appears to be missing from the Country list, but it is present and should be transcribed as its French name “Cote d’Ivoire”.
8.) Inconsistent collector names: You may see several variations of the same collector name (e.g. “R. Kral” or “R.Kral”, “RWG” or “R.W.Garrison”) on different labels. We are asking for the collector name(s) to be transcribed as written on the label. This is a somewhat complicated issue, since names that look very similar don’t always belong to the same collector. It can take a lot of knowledge about the collector and where they deposited specimens to be able to make a definitive decision.
Interpretation that you should make: Simple spacing or capitalization errors (e.g. “R.kral” should be “R. Kral”)
Interpretation you should leave to us: Don’t expand abbreviations; we’ll sort those out (e.g. “RWG” should remain “RWG”).
9.) Scientific name: Provide the most recent name, whether it is a species name (a two-word combination of the genus and what is called the “specific epithet” in botanical nomenclature) or a one-word name that is at a higher taxonomic rank (e.g., just the genus or family name). Names at higher taxonomic ranks than species are used when a more precise identification has not been made. A species name should typically take the form of a genus name that begins with a capital letter and a specific epithet that begins with a lowercase letter. If any of the names are given in all capitals, such as “CYPERUS ODORATUS”, the name should be entered using the typical convention, “Cyperus odoratus” in this case.
Varieties and subspecies: Record the variety or subspecies, but omit the names of the scientific authors. So “Cyperus odoratus var. squarrosus (Britton) Jones, Wipff & Carter” should be transcribed as “Cyperus odoratus var. squarrosus”. “Echinodorus cordifolius (Linnaeus) Grisebach ssp. cordifolius” should be transcribed as “Echinodorus cordifolius ssp. cordifolius”.
Be sure to reference #6 above for information related to annotation labels.
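The scientific name conventions in #9 can be illustrated with a short script. This is a rough sketch under stated assumptions, not a tool used by the project: the function name and the RANK_MARKERS list are made up for illustration, and it only handles the simple patterns shown in the examples above (it does not pick the most recent determination for you).

```python
# Assumed list of rank abbreviations; labels may use others.
RANK_MARKERS = {"var.", "ssp.", "subsp."}


def clean_scientific_name(raw):
    """Return genus, epithet and any var./ssp. name, without author names."""
    tokens = raw.split()
    if not tokens:
        return ""
    all_caps = raw.isupper()
    # The genus (or other one-word name) starts with a capital letter.
    parts = [tokens[0].capitalize()]
    # In mixed-case labels the specific epithet is lowercase, while author
    # names are not. In all-caps labels we assume the second word is the
    # epithet. Hyphens are allowed inside an epithet.
    if len(tokens) > 1:
        second = tokens[1]
        if second.replace("-", "").isalpha() and (all_caps or second.islower()):
            parts.append(second.lower())
            # Keep "var."/"ssp." plus the word after it; skip author names.
            for i, tok in enumerate(tokens[2:-1], start=2):
                if tok.lower() in RANK_MARKERS:
                    parts += [tok.lower(), tokens[i + 1].lower()]
                    break
    return " ".join(parts)


print(clean_scientific_name("CYPERUS ODORATUS"))  # Cyperus odoratus
```

Run on the two examples from #9, this sketch returns “Cyperus odoratus var. squarrosus” and “Echinodorus cordifolius ssp. cordifolius”. Anything unusual, such as a hybrid name, should still be handled by eye.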
10.) Special Characters: What should you type when there is a special character in a text string, such as a degree symbol or language-specific characters? You can do an online search for the symbol or copy and paste it from your word processor’s symbols menu. Some commonly encountered symbols are included at the end of this document.
11.) County: If the county is not stated on the label, please find the appropriate county using an online search or other tools highlighted below. However, if there are multiple potential counties for a locality and it can’t be determined which is correct, please choose the Unknown County option from the County dropdown for U.S. locations; otherwise leave County blank.
12.) Splitting Location and Habitat: Often location and habitat terms will be mixed together, even being interleaved in the same sentence. Here are some simple guidelines for splitting them apart into separate fields, to try to ensure consensus:
- Most times, general/non-specific locales are Habitat, and specific ones are Location, as only very rarely is a species found in the one place the specimen was obtained from (examples: “along road” would be Habitat as it describes the environment the plant grows in, but “along Smith Road” would be location as it describes the specific road where this specimen was found. “Bank of Smith River” would be split into “Bank” Habitat and “Smith River” Location.) In general, there is no need to repeat information in the two fields.
- Don’t introduce punctuation if possible; instead, use what is there. Sentences need not end with terminal punctuation (i.e., a period or exclamation point) if there is nothing after. There may be occasions when leaving it out would change the meaning of the text; in those cases it’s OK to make an addition.
- Drop unnecessary dangling non-terminal punctuation as needed. For example, “Dry roadside, east of Smithville” would result in “Dry roadside” Habitat, dropping the dangling comma as it doesn’t terminate a sentence properly, but “Dry roadside. East of Smithville.” would keep the period, giving “Dry roadside.” Habitat, as it does terminate the sentence.
- Capitalize new sentences (as in the example above) caused by the split.
- Data that goes into Habitat/Description:
- Added information in later labels: occasionally in a later determination the scientist will add information about the specimen, i.e., its condition or maturity; this should be included after the primary label’s data (this also applies to other fields as well though it is far less likely to find additional info for them)
- Floodplain describes a habitat. This often occurs with a river name, so for “Mississippi River floodplain”, include all of the text in the Habitat field, since in this case it wouldn’t be accurate to just have “Mississippi River” in the Location field.
- Power lines: These may help narrow a location, but they say more about the habitat in which the plant grows, as power line corridors are usually cleared of larger shrubs and trees.
- “n=” followed by a number; this is the number of chromosomes.
- Elevation/Altitude information should be entered into the Location field, if there isn’t a separate field for Elevation. Enter elevation verbatim in the units stated on the label.
- Data that goes into Location:
- Latitude and Longitude: Enter exactly as written. See special characters below for how to generate the degree symbol ° (or you can copy it right from here).
- Public Land Survey System: This is the T (township), R (range) and S (section) data used to establish location. For example, SW1/4 NW1/4 S13, T1SR20E refers to the southwest quarter of the northwest quarter of Section 13 of Township 1 South Range 20 East. Quarter sections “1/4” should be written as 3 characters, not one (¼).
- Provinces: Geographic provinces (e.g., Coastal Plain, Piedmont) go into the Location field but administrative provinces of countries (e.g., Alberta in Canada) go in the State/Province field.
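If the township, range and section notation is unfamiliar, the following sketch shows how the example string above breaks down. It is meant only as a reading aid under stated assumptions: the function name and regular expressions are illustrative, they cover only the compact form shown in the example, and your transcription should still be entered as written on the label.

```python
import re


def parse_plss(text):
    """Split a compact PLSS string into quarters, section, township, range."""
    # Write quarter sections as three characters, as the FAQ asks.
    text = text.replace("¼", "1/4")
    # Quarter designations such as "SW1/4" or "NW1/4", in label order.
    quarters = re.findall(r"\b([NS][EW])1/4", text)
    # Section: "S" followed directly by digits, e.g. "S13".
    section = re.search(r"\bS(\d+)\b", text)
    # Township and range: e.g. "T1SR20E" or "T1S R20E".
    tr = re.search(r"\bT(\d+)([NS])\s*R(\d+)([EW])\b", text)
    return {
        "quarters": quarters,
        "section": int(section.group(1)) if section else None,
        "township": (int(tr.group(1)), tr.group(2)) if tr else None,
        "range": (int(tr.group(3)), tr.group(4)) if tr else None,
    }


print(parse_plss("SW1/4 NW1/4 S13, T1SR20E"))
```

For the example string, the quarters come out as SW then NW, the section as 13, the township as 1 South and the range as 20 East.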
14.) Information to Omit/Skip: The following data should not be transcribed (unfortunately, for the sake of consensus, even if you want to). However if you do find something interesting, feel free to use Done&Talk to post a comment about it.
- Synonyms listed adjacent to the primary determination (example: for “Cyperus echinatus [=C. ovularis]” only transcribe “Cyperus echinatus”)
- Common names of species, as many species have multiple common names, some of which are only locally used.
- Information printed into the label/template and not added by the collector, unless it both isn’t present in the data the collector added, and would be transcribed if it was (for example, a “Plants of Florida” label title wouldn’t be transcribed as the data would already indicate Florida in the State field, but “Plants of Fort Smith” title should be entered as “Fort Smith” in Location if this wasn’t present elsewhere).
- Information already entered into one of the dropdown fields. For example, if the label indicates “collected in Smithville, Jones Co.”, then because the county “Jones” will already be chosen in the dropdown, “Jones Co.” shouldn’t also be transcribed into the Location text, as this would be redundant. However, if it has “found in northern Jones Co.” then this should be transcribed verbatim into Location as well, as it is new information and would be meaningless if “Jones Co.” was removed.
- “Collected as part of a survey…” and similar “This specimen was examined as part of a study of…” entries, as these are part of a series of information that relates to annotations of the specimens and are not considered to be core information that we are trying to collect.
- “Sheet # of #” entries or other information indicating that this specimen is part of a set
- Hyphens that break a word across two lines. For example “speci-” at the end of a line and “men” at the beginning of the next line would be transcribed as “specimen” without the hyphen.
- Personal comments by the collector that do not relate to the specimen.
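The rule about hyphens that break a word across two lines can be sketched as below. This is a minimal illustration with a made-up function name, and it has one known limitation: a word that is genuinely hyphenated (such as “north-facing”) and happens to break at the hyphen would lose it, so check those by eye.

```python
def join_label_lines(lines):
    """Join label lines into one string, removing line-break hyphens."""
    joined = ""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if joined.endswith("-"):
            # The previous line ended mid-word: drop the hyphen and
            # attach this line directly.
            joined = joined[:-1] + line
        elif joined:
            joined += " " + line
        else:
            joined = line
    return joined


print(join_label_lines(["This speci-", "men was found"]))  # This specimen was found
```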
15.) “s.n.” as the collector number: This stands for the Latin sine numero, meaning “without number”. In this case you should enter “s.n.” in the Collector Number field.
Some Useful Tools (discovered or developed by Notes From Nature users)
Counties and Cities: Good tools for finding counties etc. are the lists on Wikipedia. There are lists of municipalities in each state of the U.S.A. (there are also similar lists for other countries). For example, https://en.wikipedia.org/wiki/List_of_municipalities_in_Florida (via the linkbox you can also change the state).
Uncertain Localities: Geographic Names Information System, U.S. Geological Survey.
For locations outside the U.S.: Geonames.org http://www.geonames.org/
Mapping tool with topo quads: To find uncertain counties or localities http://mapper.acme.com
Collector Names: Harvard University Herbarium maintains a database of collectors (http://kiki.huh.harvard.edu/databases/botanist_index.html). Note that many collectors that are encountered may not be in this database.
Hard-to-read text: Use “Sheen”, the visual webpage filter, for some hard-to-read handwriting written in pencil. (Tip was from the War Diary Zooniverse project) https://chrome.google.com/webstore/detail/sheen/mopkplcglehjfbedbngcglkmajhflnjk?hl=en-GB
Special symbols: You should be able to find symbols in Word, or by doing an online search and then copying and pasting. Here are a few:
– degree symbol for coordinates: °
– plus minus: ±
– fractions: ⅛ ¼ ⅓ ⅜ ½ ⅝ ⅔ ¾ ⅞
– non-English symbols: Ä ä å Å ð ë ğ Ñ ñ õ Ö ö Ü ü Ž ž
Other symbols may be found on Penn State’s Symbol Codes: Accents, Symbols and Foreign Scripts page: http://sites.psu.edu/symbolcodes/codehtml/
ClipX: Freeware Windows clipboard enhancer that saves the last 1,024 items copied to the clipboard and allows them to be pasted through its icon in the system tray. Nothing short of a lifesaver for Ornithology but quite helpful in Herbarium too: http://clipx.en.softonic.com/
The Plant List: Search for scientific names of plants – http://www.theplantlist.org/
Integrated Taxonomic Information System (ITIS): Along with The Plant List, another recognized resource for plant scientific names (as well as animals, fungi, bacteria and more) http://www.itis.gov/
Dates: If all parts of the date are written with numerals and it’s unclear which part is the day and which is the month (for example, 2-4-91), https://en.wikipedia.org/wiki/Date_format_by_country identifies which date format (day-month-year or month-day-year) is commonly used in each country.
Mr Kevvy has generated a very useful set of custom dictionaries. They can be found here:
These dictionaries are a wonderful resource. It should be noted that scientific names can have gender-based differences. You will see the specific epithet (commonly called the “species name”) spelled in masculine and feminine forms to agree with the gender of the genus. For example, albiflora is feminine and albiflorus is masculine. The Carolina-poppy is Argemone albiflora (not albiflorus). Both albiflora and albiflorus are correctly spelled, but in this case albiflorus should never be used with the genus Argemone.