Simultaneous Transcription Blitzes a Success!
This is a cross-post with iDigBio, authored by Austin Mast, Richard Carter, and Libby Ellwood.
On Saturday, March 28, from 8 a.m. to noon, 31 volunteers gathered in computer labs at Valdosta State University (participants shown above) and Florida State University (participants shown to left) for a transcription blitz benefiting the VSU Herbarium and FSU’s Robert K. Godfrey Herbarium.
The blitz used two online transcription platforms: Zooniverse’s Notes from Nature and the Botanical Society of Britain and Ireland’s Herbaria@home. In the end, 1748 transcriptions were completed, an average of 56 per person for the 4-hour event. This was the second in what is planned to be a series of digitization blitzes coordinated by iDigBio, the Southeastern Regional Network of Expertise and Collections Thematic Collections Network, and FSU’s Robert K. Godfrey Herbarium. The FSU images that were transcribed had been generated during the first of these, an imaging blitz in September 2014. Prizes and event-branded thank-you gifts for the blitz were paid for by the contributors to the successful December 2014 crowdfunding campaign by FSU’s herbarium. Richard Carter (VSU), Austin Mast (FSU) and Libby Ellwood (FSU) served as local organizers.
Each location began the blitz with a 30-minute introduction to the local herbarium and the importance of specimens to research and education, as well as an orientation to the transcription platform. After
participants gained familiarity with the procedure, the organizers introduced one of four games that they used to prompt the participants to think more deeply about what they were seeing in the specimen images. These games included Habitat Bingo (shown on right), Morphology Bingo, Timeline Tracker, and Geo Locator, which required participants to use habitat terms, morphology, collection dates, and collection localities, respectively. The organizers introduced a new game every hour, and small prizes were given out to multiple winners for each game. FSU also ran a raffle for a native blueberry plant, and transcribers received one raffle ticket for every five specimens transcribed. The remaining prizes were raffled off as well at the end of the event. Organizers provided a coffee break midway through the event.
In FSU’s formal post-event survey, 60% of participants reported a greater familiarity with biodiversity research specimens and their use in research and education; 100% agreed or strongly agreed that
biodiversity research collections merit public funding; and 73% strongly agreed that they enjoyed the transcription blitz and would participate again. The most popular game was Geo Locator, with 87% finding it enjoyable or very enjoyable; each of the other games was enjoyed by 67% or more of the participants. Half of the FSU participants had not visited FSU’s Robert K. Godfrey Herbarium in the past; 67% had not transcribed specimen labels previously; and 67% had not previously participated in a citizen science project. VSU ran an informal post-event survey, and VSU’s participants particularly liked the bingo games. We do not yet have long-term data on the “stickiness” of the experience, but we know that at least two participants continued transcribing specimens online in the first 24 hours after the blitz.
Organizers learned a few lessons with this first transcription blitz. First, the event produced progress in digitization for the herbaria and built community support for the collections. Second, the event was easier to host than FSU’s 2014 imaging blitz, since it only required access to computers and browser software to do the digitization on the day of the event, rather than relatively complicated imaging stations. Third, both transcription platforms were dependable—we engaged two in case of technical difficulties with one during the event (e.g., if high traffic led to a server crash). However, orienting the group to the second one mid-morning took time away from transcribing. It would be most efficient to just use one platform and hold the other as a backup. Fourth, the games were a high point for many of the participants and did not result in any drop in the rate of transcription relative to online, distributed participants. In fact, onsite participants had a higher rate of transcription.
Herbarium transcriptions on Notes from Nature average 3.65 minutes per specimen when done by distributed participants, and, if one takes into account the 45 minutes used for orientation and break, onsite participants averaged 3.48 minutes per specimen. Fifth, orienting participants to each new game took time away from transcribing. Two games played over a longer period would have fit more comfortably into the schedule. Sixth, a few subjects are critical for the training: the parts of a scientific name and what should be entered (e.g., should the author or varietal name be included), what an annotation label represents, how to identify the extent of a habitat description and a locality description, and the meaning of “s.n.” And finally, there is room for development of more structured interactions between simultaneous events. The two computer labs were projected onto the screen (shown to right) and participants waved to each other, but otherwise activities at the two sites were independent of one another.
Development of these cross-site interactions will become more important with the Worldwide Engagement for Digitization of Biocollections (WeDigBio) event planned for October 2015. WeDigBio has the goal of recruiting organizers for dozens of these events during a four-day period. The next transcription blitz at FSU will be in conjunction with the Florida Native Plant Society annual meeting on Friday, May 29, from 8 a.m. to noon.
Success!
Lemur success kid
This week Notes From Nature achieved an amazing milestone. Our volunteers transcribed a huge set of herbarium images. This set contained over 50,000 specimen images from two museum collections in the southeast United States. The collections are from Florida State University and Valdosta State University. Both of these collections are part of a broader initiative to digitize museum specimens from the biologically rich southeast United States (sernec.org). The resulting dataset will represent a valuable resource for research. This research will inform areas such as the response of vegetation to global change, human development, and rapid migrations of introduced species, just to name a few!
We want to acknowledge this amazing feat and thank our dedicated volunteers for their efforts. We are truly humbled and impressed by your contributions. You rock!
We will have more images coming soon and look forward to future successes. As one of our volunteers recently stated, transcribing these specimens will have a lasting impact on our knowledge of biodiversity.
Thank you!
This one is for the birds
The bird ledgers at the Natural History Museum London stretch back over 250 years, and those of you who have helped do over 319,000 (and counting) transcriptions of items from those ledgers can probably tell how old the specimens are just from the handwriting. The science team at Notes from Nature has also puzzled over those ledger entries, even some with more careful penmanship. We also know how amazing many of you are at solving the sometimes challenging puzzles found in these ledgers. It’s an enormous collection and a tremendous job that you are helping to accomplish.
The bird ledgers also have proven to be an adventure in citizen science development — it has required a new type of interface, and some different thinking about how to measure effort. At the end of 2014, we finally got counting working per row of the ledger, which is somewhat equivalent to a specimen record for other collections in Notes From Nature. Now we have set it up so we can count your individual row-by-row effort and offer you some rewards for completing work on these ledgers.
We are therefore pleased to announce three new bird badges, which you can acquire when you complete transcription work on the Ornithology collections. While working on an individual ledger page, you won’t see your badges earned until you hit “Next Ledger” — at that point, you should see the badge you earned show up both in the transcription interface and in your profile page.

You get this badge for transcribing one Ornithological record

You get this badge for transcribing 25 Ornithological records

You get this badge for transcribing 200 Ornithological records
This data really is for the birds – for understanding their past distributions and diversity, all in service of better understanding the future of this amazing, diverse and often inspiring group of animals with whom we share this planet.
A special collection of bird ledgers now available on Notes from Nature
Notes from Nature is excited to make available a special new set of Ornithology ledgers! They are available now and are from one of the true pioneers in our understanding of birds in India, Nepal, Pakistan and other nearby regions. We are also fortunate to have an expert on that pioneer, Allan Octavian Hume, to tell us more about him. Below, Robert Prys-Jones has written more about this amazing early naturalist, whose collections ledger records at the Natural History Museum London are now available to you to help digitize!
Allan Octavian Hume
Allan Octavian Hume (1829-1912) was responsible for presenting to the NHM the largest bird collection it has ever received, approaching 65,000 bird skins and 20,000 eggs, very largely from the former British India (i.e. modern India, Nepal, Pakistan, Bangladesh, Sri Lanka and Myanmar). However, he was a most extraordinary person in many other ways also. Starting out as a young administrator in the East India Company in 1849, he was officiating magistrate in Etawah, N.W. Provinces, at the time of the Indian Mutiny in 1857, playing a heroic but humane role in pacifying his area of responsibility, in recognition of which he was awarded the Companion of the Order of the Bath (CB). During his subsequent rise through the ranks of the British Raj to hold one of its most senior posts from 1871 to 1879, as Secretary of the Department of Revenue, Agriculture and Commerce, he dazzled with his efficiency but upset some of his superiors through his conviction that the core of his role was not to raise revenue but to improve the lives of ordinary Indian people. Eventually his outspoken views resulted in his demotion, followed by his subsequent retirement at the end of 1881.
These 20 years from the early 1860s to the early 1880s, ones of massive and increasing work responsibility, were when almost the entirety of Hume’s ornithological contribution was made, strictly as a consuming hobby. Gradually building up a network of correspondents and specimen contributors (his “coadjutors”) spanning British India, he made it his aim to transform knowledge of the region’s avifauna. Facilitating this, he published relentlessly, notably in a journal Stray Feathers that he founded and produced, employed a professional bird curator and collector (William Davison) from the early 1870s and mounted major expeditions to poorly-known areas.
Hume’s resignation from the British Raj coincided with his own diminishing interest in ornithology, something exacerbated by his deep but relatively brief involvement with Theosophy at this time, during which he became both a vegetarian and increasingly unwilling to procure specimens in the name of science. At the same time, now unconstrained by his job, his involvement with radical politics blossomed: he played a seminal role in the founding of the Congress Party, which would eventually become the main vehicle of the Indian independence movement and was the party of government in independent India as recently as a couple of years ago. Following the tragic theft of most of his text for his long-planned magnum opus on the Birds of British India, in 1885 Hume donated his entire bird collection to the Natural History Museum and withdrew from ornithology.
Finally returning to live permanently in Britain in 1894, he threw himself into liberal politics, as well as taking up British botany with the documentary zeal he had once devoted to Indian ornithology. Dissatisfied with the convenience of access to the Natural History Museum botany collections for the interested working man, his final major act, in 1910, was to set up and endow the South London Botanical Institute (SLBI): over one hundred years later this is still going strong. Meanwhile, his ornithological collection remains the bedrock for all future south Asian bird research, as is fulsomely acknowledged by the most recent handbook for the area (Rasmussen & Anderton 2012 Birds of South Asia. 2 vols. 2nd ed.). In 2012, the centenary year since his death, the NHM and SLBI jointly acknowledged his significance by devoting a one-day conference to him.
Those interested to learn more of Hume may wish to look at a recent review of his life, focused on his ornithology – http://orientalbirdclub.org/wp-content/uploads/2012/11/Allan-Octavian-Hume.pdf – and at the account of him on Wikipedia – http://en.wikipedia.org/wiki/Allan_Octavian_Hume . The current effort to crowd-source his bird registers will feed directly into on-going research into understanding his development as an ornithologist and into interpreting the science in some of his diaries now held by the NHM.

6 Asian Fairy-bluebirds Irena puella from the Hume collection
_________________________________________________________________
The content about Allan Octavian Hume was written by Robert Prys-Jones; the introduction and posting are by Robert Guralnick.
The Sitch with the Stitch—The CITStitch Hackathon
This is being cross-posted with iDigBio, and co-authored with Libby Ellwood and Austin Mast.
Internet-scale public engagement in the digitization of biodiversity research specimens, such as can be seen at Notes from Nature, DigiVol, the Smithsonian’s Transcription Center, and FromthePage.com, offers clear win-wins insofar as motoring through our 100’s-of-millions-of-specimens digitization backlog and advancing science literacy. However, developing that level of engagement presents some large cyberinfrastructure challenges, given that the community of public engagement tools has yet to interoperate “seamlessly” amongst themselves and with the more established biodiversity data platforms, such as Symbiota and iDigBio. This observation was first widely discussed at iDigBio’s 2012 Public Participation in Digitization of Biodiversity Research Specimens Workshop, which led to iDigBio’s development of Biospex, a prototype public participation project management system.
For the second year in a row, iDigBio and Notes from Nature co-organized a citizen science (CITSCI) hackathon focused on the cyberinfrastructure gaps. You can read about last year’s CITSCribe Hackathon here. The main goal of this year’s CITStitch Hackathon (Dec 3–5) was to build interoperability among projects that enable public participation in digitization in useful and exciting ways for both the public participation project managers and the public participants. Among this year’s 24 participants were developers, data managers and publishers from public participation tools as well as those who work with cool tools for data visualization (e.g., CartoDB and Zooniverse), data cleaning (e.g., VertNet, Encyclopedia of Life, Global Names, FilteredPush, Kurator, SALIX), and georeferencing (GeoLocate and CoGE).
On Day 1, Austin Mast (Florida State Univ.) and Rob Guralnick (Univ. of Florida) welcomed everyone and provided brief introductions to iDigBio, Biospex, and Notes from Nature before lightning intros from each participant. Next, Libby Ellwood (Florida State Univ.), Austin, and Rob provided an introduction to proposed activities at the hackathon. These had been developed by iDigBio’s Interoperability for Public Participation in Digitization working group, which served as the hackathon’s organizing committee (including those three plus Ed Gilbert, Nelson Rios, Ben Brumfield, Paul Flemons, and Greg Newman). These activities were presented as occurring in one of two tracks. The first track focused on innovative cross-platform ways to deploy and manage public participation projects, visualize and analyze progress for the project managers, and ingest data and provenance back into data management systems; this group would later be dubbed “Team Tardigrade.” The second track focused on development of novel ways to engage citizen scientists (e.g., via visualizations of individual and collective contributions); this would become “Team Honey Badger.” After this, Cody Meche gave an engaging talk on Agile development best practices and we split into our teams to develop priorities, goals for deliverables, and a road map.
What followed in Days 2 and 3 was a series of code sprints interspersed with animated stand-ups, all fueled by a lot of coffee, hot tea, and food. In the end, subsets of each team’s members produced several deliverables that far exceeded expectations. Content regarding the deliverables can be found at the CITStitch wiki page. We have summarized the work briefly below. But as was emphasized at the start of the hackathon, successes will also be measured by the number of long-term collaborations initiated over dinner at the Reggae Shack, The Top or Andaz Indian Restaurant.
Team Tardigrade
Stuart Lynn (Zooniverse) produced a broadly useful web service and data explorer for the (now 1 million!) transcriptions in Notes from Nature.
Ed Gilbert (Symbiota) and Daryl Lafferty (SALIX) produced a SALIX web service that will take an OCR text string and direct it to the correct SALIX-enabled Symbiota portal for processing; this SALIX-parsed data then can be sent to a transcription tool for proofreading.
John Wieczorek (UC–Berkeley), David Lowery (Harvard), and Dmitry Mozzherin (Marine Biological Lab, Woods Hole) produced web services for assessing the fitness for use of data and doing data cleaning, including validators for scientific name, year collected, collection locality coordinates, and measure of coordinate uncertainty.
Ben Brumfield (FromthePage.com), Greg Riccardi (Florida State Univ.), and Robert Bruhn (iDigBio’s Biospex) expanded the Biospex data model to include ledger and field book pages in anticipation of adding FromthePage as an actor in Biospex project workflows.
Finally, Greg Riccardi, Ed Gilbert, Nelson Rios (Tulane Univ.), Ben Brumfield, Robert Bruhn, and Austin Mast established a manifest file example in JSON that enables tools and project management systems to communicate about the public participation projects.
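To make the fitness-for-use validators listed above more concrete, here is a minimal Python sketch of what such checks might look like. It is not the code produced at the hackathon, which this post does not show. The function names, the 1700 earliest-year cutoff, and the assumption that records use Darwin Core field names (scientificName, year, decimalLatitude, decimalLongitude, coordinateUncertaintyInMeters) are all illustrative choices.

```python
from datetime import date


def valid_scientific_name(name):
    """Rough shape check only: a capitalized genus followed by a lowercase epithet."""
    parts = name.split() if isinstance(name, str) else []
    if len(parts) < 2:
        return False
    genus, epithet = parts[0], parts[1]
    genus_ok = genus.isalpha() and genus[0].isupper() and genus[1:].islower()
    epithet_ok = epithet.replace("-", "").isalpha() and epithet.islower()
    return genus_ok and epithet_ok


def valid_year(year, earliest=1700):
    """Year collected must fall between an assumed cutoff and the current year."""
    return isinstance(year, int) and earliest <= year <= date.today().year


def valid_coordinates(lat, lon):
    """Latitude and longitude must be numeric and within their legal ranges."""
    numeric = isinstance(lat, (int, float)) and isinstance(lon, (int, float))
    return numeric and -90 <= lat <= 90 and -180 <= lon <= 180


def valid_uncertainty(meters):
    """Coordinate uncertainty must be a positive number of meters."""
    return isinstance(meters, (int, float)) and meters > 0


def check_record(record):
    """Run every validator on one record (a dict keyed by Darwin Core terms)."""
    return {
        "scientificName": valid_scientific_name(record.get("scientificName")),
        "year": valid_year(record.get("year")),
        "coordinates": valid_coordinates(
            record.get("decimalLatitude"), record.get("decimalLongitude")
        ),
        "coordinateUncertaintyInMeters": valid_uncertainty(
            record.get("coordinateUncertaintyInMeters")
        ),
    }
```

Calling check_record on a transcribed record returns one pass/fail flag per check, so a project manager could see which fields need a second look rather than rejecting the whole record. A real service would go further, for example by resolving names against an authority instead of checking their shape.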
Team Honey Badger
Chris Snyder (Zooniverse), with help from Libby Ellwood (Florida State Univ.) and Rob Guralnick (University of Florida), created functionality in Notes from Nature that compares entered taxonomic names against the GBIF Name API and gives feedback to inform the citizen scientist as to whether the name exists in GBIF and the number of records associated with that name; this gives the participant a sense of the significance of their contribution.
Julie Allen (Illinois Natural History Survey), Charlotte Germain (Univ. of Florida), Sophia B Liu (USGS iCoast) and Andrew Hill (CartoDB) created dynamic maps to visualize citizen science contributions; for example, the participant could upload a dataset to cartodb.com and select subsets of the data to display the country of origin for specimens that have been transcribed. Click here for a map of countries with specimens that have been digitized in Notes from Nature.
Paul Kimberly (Smithsonian), Paul Flemons (Australian Museum), Deb Paul (iDigBio), Libby, Rob, and Austin fleshed out a proposal for a 4-day global transcription blitz organized by the major transcription centers, including its timing, name, goals, funding, and organizational structure, and scoped the functionality for its website in years 1 and later (more to come on that in future blog posts!).
And, especially relevant to the global transcription blitz website, Alex Thompson (iDigBio) produced a prototype that integrates results across different transcription platforms and generates summary results and means for further exploration using Elasticsearch.
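The name check that Team Honey Badger added to Notes from Nature (comparing an entered name against GBIF and reporting how many records it has) could be sketched roughly as follows. This is not the hackathon code. It assumes GBIF's public v1 REST API, specifically the species match and occurrence search endpoints; the function names and the wording of the feedback messages are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GBIF_API = "https://api.gbif.org/v1"  # assumed base URL of GBIF's public API


def match_url(name):
    """URL that asks GBIF to match a scientific name."""
    query = urlencode({"name": name})
    return f"{GBIF_API}/species/match?{query}"


def count_url(taxon_key):
    """URL that asks GBIF for the occurrence count of a taxon (no records returned)."""
    query = urlencode({"taxonKey": taxon_key, "limit": 0})
    return f"{GBIF_API}/occurrence/search?{query}"


def feedback(name, match, record_count):
    """Turn a match response and a record count into a message for the transcriber."""
    if match.get("matchType", "NONE") == "NONE" or "usageKey" not in match:
        return f"'{name}' was not found in GBIF; please double-check the spelling."
    return (
        f"'{name}' matches {match['scientificName']} in GBIF, "
        f"which has {record_count:,} records."
    )


def lookup(name):
    """Query GBIF over the network and build the feedback message."""
    with urlopen(match_url(name)) as resp:
        match = json.load(resp)
    count = 0
    if "usageKey" in match:
        with urlopen(count_url(match["usageKey"])) as resp:
            count = json.load(resp)["count"]
    return feedback(name, match, count)
```

Keeping the URL building and message formatting separate from the network call means the feedback wording can be tested without contacting GBIF; only lookup needs a live connection.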
Thank you to all of our participants. What a great experience! You can check out more photos in the iDigBio CitStitch Hackathon album on Facebook.
Thanks A Million!
Last Wednesday, while many Notes from Nature folks were in a citizen science hackathon, appropriately enough, we passed the one million transcription mark. It is a big milestone, and all of us involved in the project from the science and developer team appreciate the hard work. That work helps scientists like UC Berkeley’s Joanie Ball-Damerow examine changes in dragonfly and damselfly communities over the past 100 years in California based on museum records. Her work on that will soon be published in the journal ZooKeys, and your help with unlocking data makes those kinds of projects possible.
In case you are curious, our Notes from Nature user snowysky transcribed data from this specimen image for our millionth record, a sweat bee (Halictus ligatus) that was hanging out on some sunflowers (Helianthus annuus) when collected in 1975. We hope everyone involved in the project continues to discover all kinds of hidden treasures locked away in collections, just like this one. Now on to 2 million!
Wanted: Feedback on Zooming in our Interface
We have a lot of great stuff happening on Notes from Nature, and a big thanks to everyone for passing the million transcription mark (more on that soon). But for now we’d really like some advice from transcribers on what we think is a major upgrade to our interfaces. On the Herbarium interface, we have implemented ZOOMING. On both herbarium sheets and macrofungi labels, you previously would select a region, and then get a zoomed-in version of that section. However, there were some issues with this, well noted in the Notes from Nature talk forums. The new interface uses very simple pan and zoom controls to get the label in the location and size you want.
Please try it out on the Herbarium sheet records and, assuming there aren’t any major bugs or other issues, we’ll be implementing that on the other “specimen” interfaces (e.g. the Macrofungi and Calbug collections) as soon as possible! If you do find any problems or issues, you can let us know right here, on the blog, as comments or you can log into the Notes from Nature talk forums and post there. We hope this removes a lot of frustration!

ZOOM!
A post-Thanksgiving thank you and update
A Notes From Nature Thanksgiving thank you to all of you helping us to get biodiversity data transcribed and thus more easily used for new kinds of science. Whether it’s bees and what they are pollinating, as described by Carl Zimmer in the New York Times, or examining the genomes of ancient turkeys from the Smithsonian Natural History Museum, museum collections are an amazing resource that you are helping to unlock. Thank you for the amazing effort.
As some of our devoted Ornithology ledger transcribers have likely noticed, we have finished a major effort of just shy of 3000 ledger pages, and appropriately for a bird-themed holiday in the United States, on Thanksgiving no less. There are many more ledgers at the Natural History Museum London that are waiting in the wings (sorry! I couldn’t resist) and we’ll be getting those up as soon as possible, along with some more information about those ledgers and their contents.
Notes from Nature and the Push To A Million
Some quick Notes from Nature updates! As you might have noticed, our numbers of transcriptions have occasionally grown by leaps and bounds over the last months. What gives? The shortest possible answer is “Ornithology Ledgers” and the truly impressive effort happening there. Back in late June, we had 149,537 Ornithology transcriptions. As of last Sunday, Oct. 12, 2014, that number is 294,973. Wow. That is a lot of work in 4 months. None of those Ornithology records had been included in our total counts (that show up on our homepage), because we had originally been focusing on counting ledger “pages” not transcriptions. And we still have to generate those ledger record counts separately from our logs every few weeks and add them into the total. We hope to solve the problem with manual additions to the total in the near future. So if you see the count “jump” every few weeks, that’s why.
Speaking of counting, many weeks ago we tried to better report the “per collection” statistics, in particular the amount of effort needed to complete work on a collection. We have recently refined those numbers YET AGAIN, and we hope the current reporting is (at least) less confusing. Long story short, each record is transcribed multiple times, usually 4. We have plans to make this more efficient in the future, but until then, this is a workable number of replicate transcriptions. However, occasionally that number is 5 or 6 (for reasons that have to do with both history and some technology glitches). When you add all this up, it was hard to give an exact number of total transcriptions needed.
Now, if you look at any introduction to collections page – take the Herbarium project (http://www.notesfromnature.org/#/archives/herbarium) – you will see the total number of images, the total number of active images, the total number of complete images, and a count of transcriptions completed. The active plus complete image numbers should add up to the total number. And the total number of transcriptions gives an overall assessment of effort by our citizen science volunteers. The percent completed is now calculated in terms of images not transcriptions (i.e., completed images divided by the total images).
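For the curious, the per-collection arithmetic described above can be written out in a few lines of Python. This is only a sketch of the bookkeeping, not the code behind the site; the class and field names are made up for illustration, and any numbers you plug in are your own.

```python
from dataclasses import dataclass


@dataclass
class CollectionStats:
    total_images: int      # every image in the collection
    complete_images: int   # images that have received enough transcriptions
    transcriptions: int    # all transcriptions so far, a measure of volunteer effort

    @property
    def active_images(self):
        """Active plus complete images should add up to the total."""
        return self.total_images - self.complete_images

    @property
    def percent_complete(self):
        """Completed images divided by total images, not based on transcriptions."""
        if self.total_images == 0:
            return 0.0
        return 100.0 * self.complete_images / self.total_images
```

Because each image is usually transcribed 4 times but sometimes 5 or 6, the transcription count is kept as a separate measure of effort and deliberately left out of the percentage.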
We mention all this because Notes from Nature is closing in on a HUGE MILESTONE — one million transcriptions! We only have 113,000 transcriptions or so to go. We will mention this more in upcoming blog posts as we make what we hope is a BIG PUSH to 1 million!

ONE MILLION TRANSCRIPTIONS





