Simtho: The Syriac Thesaurus is a medium-size database of Syriac literary texts. The Beta version was launched at AAR/SBL in San Diego in 2019 and consists of 7.3 million tokens (ca. 6.5 million words). Users can search the corpus using different methods: simple word and phrase search, regular expressions, and a Corpus Query Language. Search operations can be filtered by a rich set of metadata fields such as author, composition date periods, genre, poetic meter (when applicable), and much more. In addition to concordance results, users can find collocations and frequencies of occurrence. Search results can be saved or exported in text and XML formats. Simtho is freely available online. Volunteers who are interested in helping can read the Call for Submissions and Call for Participation sections below.
The Thesaurus by the Numbers…
George A. Kiraz (Beth Mardutho and Institute for Advanced Study, Princeton), Simtho Editor-in-Chief
Sebastian P. Brock (University of Oxford, Emeritus), Senior Advisor
Slavomír Čéplö (Austrian Academy of Sciences / Slovak Academy of Sciences), Corpus Building and Management Specialist
Jack Tannous (Princeton University), Area Editor (Early Syriac to Renaissance)
Ephrem Ishac (Karl-Franzens-Universität Graz), Area Editor (Post-Renaissance to Kthobonoy Literature; Liturgical Texts)
Shelby Loster (Beth Mardutho: The Syriac Institute), Simtho Projects Coordinator, Seibel Digital Humanities Fellow, 2019–2020
Joshua Hood (Catholic University of America/Catholic Distance University), Seibel Digital Humanities Fellow, 2021
Jacob Margason (Vanderbilt University), Data Engineer
Daniel Stoekl (École Pratique des Hautes Études – Université PSL), eScriptorium Model Training
Benjamin Kiessling (Université PSL), Kraken Advisor
Fadi Homsi (Hama University), Dr. Khalid and Mrs. Amira Dinno Fellow, Summer 2021
Anton Fleissner, Edward Y. Hannoush Memorial Fellow (sponsored by Dr. Peter and Dr. Gretchen Hannoush), Summer 2021
Maroun El Houkayem (Duke University), Dr. Suhail and Mrs. Luna Zavaro Fellow, Summer 2021
Yanir Marmor (Tel Aviv University), Dr. Nebil and Mrs. Jennifer Aydin Fellow, Summer 2021
Justinian Mândrilă (University of Göttingen), Dr. Jack Jallo and Mrs. Gage Johnston Fellow, Summer 2021
Ian Ollila (Princeton Theological Seminary), PTS Field Ed Intern, Summer 2021
William Clocksin (University of Hertfordshire), OCR Specialist
Johan M. V. Lundberg (University of Cambridge), Seibel Digital Humanities Fellow, 2019
Brandon Allen (Princeton Theological Seminary/University of Oxford), Seibel Digital Humanities Fellow, 2020
William Bunce (University of Oxford), Dr. Khalid and Mrs. Amira Dinno Digital Humanities Fellow, Summer 2019
Patrick Conlin (Marquette University), Beth Mardutho Work-Study Fellow, Summer 2019
Kyle Brunner (New York University), Dr. Talal and Mrs. Wesal Findakly Fellow, 2019
Abigail Pearson (University of Exeter), Beth Mardutho Work-Study Fellow, 2018–2019
Emily Chesley (Princeton Theological Seminary), Dr. Jack Jallo and Mrs. Gage Johnston Fellow, 2018
Jillian Marcantonio (Duke University), Beth Mardutho Digital Humanities Fellow, Summer 2018
Muhannad Maher (St. Ephrem Seminary), Mr. Malak Yunan and Dr. Evelyne Yunan Fellow, Summer 2019
Tony Wardeh, Beth Mardutho Digital Humanities Corresponding Fellow, 2020
Jonathan Warner (Cornell University), Mr. Malak Yunan and Dr. Evelyne Yunan Fellow, Summer 2020
Joss Childs (University of Chicago), Dr. Nebil and Mrs. Jennifer Aydin Fellow, Summer 2020
Omri Matarasso (Princeton University), Dr. Khalid and Mrs. Amira Dinno Fellow, Summer 2020
Srecko Koralija (University of Cambridge), Beth Mardutho Digital Humanities Corresponding Fellow, 2020
Yevgeniy Safronov (Princeton Theological Seminary), PTS Field Ed Intern, Summer 2020
David Michael Felsch (Princeton Theological Seminary), PTS Field Ed Intern, Summer 2020
Briana Grenert (Princeton Theological Seminary), PTS Field Ed Intern, Summer 2020
Elias Jallo, Beth Mardutho Digital Humanities Intern, Summer 2020
Ethan Laster (Saint Louis University), Beth Mardutho Digital Humanities Volunteer, 2020
Fr. Mathew Jacob, Beth Mardutho Volunteer, 2020
Tetiana Shyshkina (National University of Kyiv-Mohyla Academy), Beth Mardutho Digital Humanities Volunteer, 2020
Jacob Mathew, Beth Mardutho Digital Humanities Volunteer, Summer 2020
Patrick Robert Kiernan (Princeton Theological Seminary), PTS Non-Field Ed Intern, Summer 2021
Call for Submissions
We welcome contributions of typed texts. Scholars who have published critical editions of texts (in book or article format) are encouraged to send us their texts for inclusion. Because Simtho is concordance software, it does not violate the copyright of published material. Please send submissions to email@example.com.
Call for Participation
We welcome volunteers who know Syriac at any level. There are tasks for those who can only recognize Syriac letters and tasks for experts on Syriac literature—and everything in between. Please contact us at firstname.lastname@example.org.
Users are encouraged to read the SketchEngine user guide. As of now, the software is not able to ignore diacritics during search. As such, the Simtho project had to compromise as follows:
- All vowel marks, most diacritical dots, and non-Latin punctuation marks were removed. If future releases of the software support searching with diacritical marks, these will be reinstated.
- The plural syome double-dot was moved by convention to the end of the string (otherwise, the user has to know where it is to perform a search); e.g., singular ܟܬܒܐ, plural ܟܬܒܐ̈.
- The feminine dot on ܗ̇ was retained. In the case of plurals, the syome comes after it, like this: ܗ̇̈. This order matters when performing search operations.
- The single disambiguation dot, when present, was retained; e.g. ܡ̇ܢ vs. ܡ̣ܢ. We judged that this would help disambiguate the text.
Users can use regular expressions to perform searches that ignore these marks.
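As a rough illustration of the conventions above, the following Python sketch moves the syome to the end of a word and builds a regular expression that matches a word with or without the retained marks. The code points used here (U+0308 for the syome, U+0307 and U+0323 for the dots above and below) and both function names are assumptions made for this example; the page does not state which code points the corpus actually uses. The pattern is tested with Python's `re` module, and SketchEngine's regular expression dialect may differ in details.

```python
import re

# Assumed code points for the marks kept in the corpus (not confirmed by the source).
SYAME = "\u0308"      # combining diaeresis, assumed for the plural syome
DOT_ABOVE = "\u0307"  # combining dot above, e.g. the feminine dot on He
DOT_BELOW = "\u0323"  # combining dot below
MARKS = SYAME + DOT_ABOVE + DOT_BELOW


def move_syame_to_end(word: str) -> str:
    """Return `word` with any syome moved to the very end of the string."""
    if SYAME not in word:
        return word
    return word.replace(SYAME, "") + SYAME


def mark_tolerant_pattern(word: str) -> str:
    """Build a regex matching `word` whether or not the retained marks are present."""
    letters = [ch for ch in word if ch not in MARKS]
    optional_marks = f"[{MARKS}]*"
    return "".join(re.escape(ch) + optional_marks for ch in letters)


ktb = "\u071F\u072C\u0712\u0710"  # ܟܬܒܐ (Kaph, Taw, Beth, Alaph)
pattern = re.compile(mark_tolerant_pattern(ktb))

print(bool(pattern.fullmatch(ktb)))          # singular form
print(bool(pattern.fullmatch(ktb + SYAME)))  # plural form, syome at the end
```

Both checks print `True`: each letter is followed by an optional run of marks, so the same pattern covers the singular and the plural.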
SketchEngine implements a Corpus Query Language (CQL) and regular expression searching. Users are encouraged to learn this query language to make the most of Simtho. This will permit one, for example, to search a word form with or without ܒܕܘܠ prefixes or suffixes.
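A prefix-tolerant query of this kind might be assembled as in the minimal Python sketch below. It assumes the corpus exposes the standard `word` attribute and that a single-token CQL query has the form `[word="regex"]`; the helper names are hypothetical, and only prefixes are handled, not suffixes.

```python
import re

# The four prefix letters named above: Beth, Dalath, Waw, Lamadh.
PREFIX_LETTERS = "\u0712\u0715\u0718\u0720"


def prefixed_form_regex(word: str) -> str:
    """Regex matching `word` alone or preceded by any run of prefix letters."""
    return f"[{PREFIX_LETTERS}]*{word}"


def cql_for_word(word: str, attribute: str = "word") -> str:
    """Wrap the regex in a single-token CQL query (attribute name is an assumption)."""
    return f'[{attribute}="{prefixed_form_regex(word)}"]'


ktb = "\u071F\u072C\u0712\u0710"  # ܟܬܒܐ
print(cql_for_word(ktb))

# Local sanity check with Python's re module: Waw + Beth + word is accepted.
print(bool(re.fullmatch(prefixed_form_regex(ktb), "\u0718\u0712" + ktb)))
```

The pattern is deliberately permissive: it accepts any sequence of the four letters before the word, so it may also match forms in which those letters belong to the word itself rather than to a prefix.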
The metadata is self-explanatory. The tag .CompositionYear is a zero-padded estimate of the date of the text. Early nth century is encoded as 0m25, where m = n-1; e.g., early 6th century is encoded as 0525. Similarly, mid-6th century (or simply 6th century) is 0550, and late 6th century is 0575. When the date is a range spanning more than one century (e.g. 6th or 7th century), the later date is encoded. The zero-padding lets users sort concordance results chronologically, since the sorting algorithms compare values lexically. In the case of Greek texts translated into Syriac, when the Syriac translator is known, the year is given for the translator because that reflects the Syriac composition; otherwise, the date of the Greek author is given. The translator abbreviation is given next to the author; e.g. “SevAnt(JacEdes)” for Severus of Antioch (tr. Jacob of Edessa).
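The date encoding can be illustrated with a short Python sketch. The function name and the `part` labels are hypothetical, chosen for this example only, and multi-century ranges are not handled here; the arithmetic follows the rule stated above.

```python
# Offsets within a century, following the early / mid / late convention.
OFFSETS = {"early": 25, "mid": 50, "late": 75}


def composition_year(century: int, part: str = "mid") -> str:
    """Encode e.g. (6, "early") as "0525": (century - 1) * 100 + offset, zero-padded."""
    if century < 1:
        raise ValueError("century must be 1 or greater")
    year = (century - 1) * 100 + OFFSETS[part]
    return f"{year:04d}"


codes = [
    composition_year(12, "early"),
    composition_year(6, "late"),
    composition_year(6, "early"),
]
print(sorted(codes))                    # ['0525', '0575', '1125']

# Without zero-padding, lexical sorting puts the 12th century first.
print(sorted(["1125", "575", "525"]))   # ['1125', '525', '575']
```

The second call shows why the padding matters: as plain strings, "1125" sorts before "525" because the comparison proceeds character by character.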
The tag .DocumentType explains the method by which the electronic version of the text was created. The value OCR stands for optical character recognition, an automatic way to convert images of texts into texts. While we used highly reliable software, OCR is never 100% accurate. Before citing texts in a research paper, check the text against the print publication or the manuscript from which it was taken. This source is given in the .Reference tag. One should also double-check the page (page.nr) and line number (line.nr) references.
Please send feedback to email@example.com. We welcome reports of any errors we have made!