Electronic lexicon of Old Irish

Disciplines History, Linguistics, Literature and Language - Irish
Temporal Terms Middle Ages (4th c. to 15th c.), Early Modern (16th c. to 18th c.)
Methods and Techniques Data Structuring and enhancement, Data publishing and dissemination, Data Capture, Data Analysis, Text Encoding, Data modelling, Web technologies, Manual transcription, Textual analysis
Contact Julianne Nyhan (nyhanatuni-trier [dot] de)
Website http://epu.ucc.ie/lexicon/entry
Start/End date September 2006 - (open-ended)

Context and research questions
The electronic Dictionary of the Irish Language (eDIL, http://www.dil.ie/) has made a searchable, scholarly electronic edition of DIL available in its entirety and to an extremely high standard. This project had a different emphasis and did not aim to create an electronic edition of DIL. Rather, my focus was on developing an XML vocabulary that supports the semantic description of the historical lexicography of Old, Middle and Early Modern Irish, and on applying that vocabulary to a subset of DIL. The prototype electronic lexicon of medieval Irish described here is one output of my doctoral research (degree awarded in December 2006). My main research question was how XML could be used to enable inflected forms of medieval Irish words to be identified and returned from an electronic dictionary or lexicon with a high degree of precision; at present, many retro-digitised historical dictionaries do not support this deeper level of inquiry. Other questions focused on the new insights into the dictionary that emerged when information could not be appropriately modelled in XML, on the history of DIL, and on information ordering in Irish glossaries and early print dictionaries.

Methodology
Having determined the essential information to be included in the Lexicon (headword, grammatical information, etymology, syntactical information, definitions, and word forms), I devised a basic XML matrix to describe this information, one that could be expanded in a number of ways to describe more detailed information or to express caveats. I then worked through the hardcopy version of DIL to extract the necessary subset of information, which I typed up. This subset was enhanced with additional information such as part of speech (absent, for the most part, from DIL) and was also categorised, for example into ‘sense types’ that allow for thematic searches of definitions. Inflected forms and their variant spellings were grouped together and encoded as simple forms. The same process was followed for compound forms, which were grouped together and encoded as compound nouns, verbs or adjectives, and likewise for forms with, for example, emphasising suffixes or infixed pronouns. Further modelling and XML encoding of this data was an iterative process that involved many layers of refinement and revision, and the resulting prototype is deeply encoded in XML.
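The grouping of inflected forms under a headword can be illustrated with a minimal sketch. It assumes a hypothetical markup: the element and attribute names (lexicon, entry, headword, pos, sense, forms, form, group, type) and the two sample entries are invented for illustration and are not taken from the project DTD or the Lexicon's files. It shows only the general idea of returning a headword from an inflected form, and of filtering entries by sense type.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical markup: names and sample entries are illustrative only,
# not the project's actual XML vocabulary or data.
SAMPLE = """<lexicon>
  <entry>
    <headword>fer</headword>
    <pos>noun</pos>
    <sense type="person"><def>man</def></sense>
    <forms group="simple">
      <form>fer</form>
      <form>fir</form>
      <form>fiur</form>
      <form>firu</form>
    </forms>
  </entry>
  <entry>
    <headword>ben</headword>
    <pos>noun</pos>
    <sense type="person"><def>woman</def></sense>
    <forms group="simple">
      <form>ben</form>
      <form>mná</form>
      <form>mnaí</form>
    </forms>
  </entry>
</lexicon>"""


def build_form_index(root):
    """Map every encoded word form to the headword(s) it is grouped under."""
    index = defaultdict(list)
    for entry in root.iter("entry"):
        headword = entry.findtext("headword")
        for form in entry.iter("form"):
            index[form.text.strip().lower()].append(headword)
    return index


def headwords_for(index, form):
    """Return the headwords under which an inflected form is recorded."""
    return index.get(form.strip().lower(), [])


def headwords_by_sense_type(root, sense_type):
    """Return the headwords that have at least one sense of the given type."""
    return [
        entry.findtext("headword")
        for entry in root.iter("entry")
        if entry.find(f"sense[@type='{sense_type}']") is not None
    ]


root = ET.fromstring(SAMPLE)
index = build_form_index(root)
print(headwords_for(index, "mná"))
print(headwords_by_sense_type(root, "person"))
```

Because each form is encoded explicitly rather than left inside running dictionary text, the lookup is an exact match against the index and does not depend on free-text searching.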

Present status
Seven letters of the Lexicon have been prepared, but it has been possible to make only a small portion of this material available, with an experimental search mechanism that must be further refined and adapted. The search mechanism does not allow the detailed XML encoding contained in the Lexicon to be fully explored or utilised, but it does demonstrate some of the contributions that the Lexicon has made to the search and retrieval of medieval Irish, especially regarding inflected forms. In the shorter term, I am planning to make the underlying XML files and project DTD freely available so that those with the requisite expertise can write their own queries. In the longer term, I hope that funding can be secured that will allow the project to be taken forward. There are clear possibilities for developing the Lexicon in conjunction with eDIL, for seeking to fuse aspects of the different models of the two works, and for preparing them for use with the CELT corpus. However, there is also much potential for using the Lexicon at an experimental level, as a tool for investigating and imagining the historical lexicography contained in DIL and for testing hypotheses about that data within the context of the way it has been modelled. The XML framework that has been devised for the Lexicon can be transformed with XSLT to make it TEI conformant. It makes available a complex and specialised domain-specific application and extension of the encoding that is available for dictionaries in the TEI Guidelines.
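The kind of mapping involved in a transformation towards TEI can be sketched as follows. The project itself names XSLT for this step; Python is used here only as a stand-in. The source element names are hypothetical and not taken from the project DTD, and the target uses element names from the TEI dictionaries module (entry, form, orth, gramGrp, pos, sense, def) without namespaces and without validation against any TEI schema, so the output should be read as TEI-style rather than TEI conformant.

```python
import xml.etree.ElementTree as ET

# Hypothetical source entry: names are illustrative, not the project DTD.
SOURCE = """<entry>
  <headword>fer</headword>
  <pos>noun</pos>
  <sense type="person"><def>man</def></sense>
  <forms group="simple"><form>fir</form><form>fiur</form></forms>
</entry>"""


def to_tei_style(entry):
    """Rebuild a source entry using TEI-style dictionary element names."""
    tei = ET.Element("entry")

    # The headword becomes the lemma form.
    lemma = ET.SubElement(tei, "form", {"type": "lemma"})
    ET.SubElement(lemma, "orth").text = entry.findtext("headword")

    # Each grouped word form becomes an inflected form.
    for form in entry.iter("form"):
        inflected = ET.SubElement(tei, "form", {"type": "inflected"})
        ET.SubElement(inflected, "orth").text = form.text

    # Part of speech goes into a grammatical information group.
    gram = ET.SubElement(tei, "gramGrp")
    ET.SubElement(gram, "pos").text = entry.findtext("pos")

    # Each sense keeps its definition text.
    for sense in entry.iter("sense"):
        tei_sense = ET.SubElement(tei, "sense")
        ET.SubElement(tei_sense, "def").text = sense.findtext("def")

    return tei


tei_entry = to_tei_style(ET.fromstring(SOURCE))
print(ET.tostring(tei_entry, encoding="unicode"))
```

A real conversion would also need to carry over the Lexicon's more detailed encoding, such as sense types, caveats and compound groupings, which this sketch discards.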
