Release of Beth Mardutho’s Qoruyo Project for Syriac OCR & HTR 

FOR IMMEDIATE RELEASE: September 18, 2019

Beth Mardutho: The Syriac Institute, Piscataway, NJ

 

Beth Mardutho: The Syriac Institute (www.bethmardutho.org) is pleased to release Qoruyo, its handwritten-text recognition (HTR) models that permit scholars to convert images of ancient, medieval, and modern manuscripts into searchable texts.

The work was carried out by two of Beth Mardutho’s summer Fellows in the Digital Humanities who, in less than three months, created recognition models for the three Syriac scripts using the Transkribus software. The Estrangelo, Serto (West Syriac), and East Syriac models regularly achieved accuracies between 96% and 98%.

The project was conceived by Abigail Pearson (University of Exeter), who held a Work-Study Fellowship at Beth Mardutho for two consecutive years. Last summer, Abigail worked with Digital Humanities Fellows Emily Chesley (Princeton University) and Jillian Marcantonio (Duke University) to evaluate Tesseract 4.0, Google’s OCR engine, for Syriac. She remained interested in the subject and returned to Beth Mardutho this summer with the idea of creating models for handwritten manuscripts. Abigail concentrated on building a model for the East Syriac hand.

Kyle Brunner (New York University, Institute for the Study of the Ancient World) was the recipient of the Dr. Talal and Mrs. Wesal Findakly Fellowship in the Digital Humanities for 2019. He joined the project and built two models: one for Estrangelo and the other for Serto. Kyle gathered manuscript images of various hands and enhanced the models so that they can recognize texts written in various periods, beginning as early as the sixth century. The models were tested on both printed texts and handwritten manuscripts.

 

Beth Mardutho is now able to share these models with scholars who wish to apply OCR to printed texts or HTR to handwritten manuscripts. The models are available on our website: http://bethmardutho.org/qoruyo/.

Beth Mardutho: The Syriac Institute (www.bethmardutho.org) is a non-profit educational institution dedicated to the promotion of the Syriac language and its heritage, especially through the digital humanities. The Institute holds annual intensive courses in the Digital Humanities (not Syriac-specific) in January and Syriac language courses in July-August.

Beth Mardutho is supported primarily by annual memberships; membership at various levels is available at http://bethmardutho.org/membership/. Those interested in supporting a named fellowship may inquire at http://bethmardutho.org/fellowships/.

Transkribus is developed by the READ (Recognition and Enrichment of Archival Documents) Project. Beth Mardutho greatly appreciates the project's outstanding efforts in making this tool available.
