By Antonis Symvonis, Technical Coordinator of iRead

iRead Linguistic Infrastructure

The iRead project’s Linguistic Infrastructure is a set of language resources and services which enable SMEs, publishers and education providers to design and deliver content to primary school children learning to read. It contains a series of learning resources and services which pertain to specific language models (English, German, Greek and Spanish) and also includes an ‘English as a Foreign Language’ component. The Linguistic Infrastructure consists of:

Language models: capture the journey of a student learning to read. They contain the language features that need to be mastered, each feature's difficulty relative to other features in the same category, and the prerequisites that should be mastered before a new language feature is introduced.

Dictionaries: provide linguistic information about words, including phonics features (e.g. individual sounds, digraphs, sight words), chunking and syllabification, and grammar (e.g. parts of speech, negations, modal verbs, suffixes).

The models, resources and services are available through a set of APIs and downloadable resources, allowing SMEs, publishers and education providers to integrate these capabilities into their own reading applications.
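To make the idea of a language model more concrete, here is a minimal Python sketch of features with a relative difficulty and prerequisites, plus a helper that lists the features a learner is ready to meet next. The class name, field names, feature identifiers and difficulty values are all hypothetical and chosen for illustration; they are not the iRead API or its actual data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageFeature:
    # All field names here are illustrative assumptions, not the iRead schema.
    feature_id: str
    category: str          # e.g. "phonics" or "morphology"
    difficulty: int        # relative to other features in the same category
    prerequisites: frozenset = frozenset()


def next_features(model, mastered):
    """Return unmastered features whose prerequisites are all mastered,
    ordered by category and then by difficulty within the category."""
    ready = [
        f for f in model
        if f.feature_id not in mastered and f.prerequisites <= mastered
    ]
    return sorted(ready, key=lambda f: (f.category, f.difficulty))


# A tiny, made-up model with three features.
MODEL = [
    LanguageFeature("letter-sound-i", "phonics", 1),
    LanguageFeature("split-digraph-i-e", "phonics", 3,
                    frozenset({"letter-sound-i"})),
    LanguageFeature("suffix-ed", "morphology", 2),
]

if __name__ == "__main__":
    for feature in next_features(MODEL, {"letter-sound-i"}):
        print(feature.feature_id)
```

In this sketch the split digraph only becomes available once its prerequisite has been mastered, which mirrors the prerequisite idea described above.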

Amigo – using the dictionaries and language models

Amigo is an e-reader designed in collaboration with Dolphin. Its instructional features teach children word-decoding skills (specifically phonological and morphological skills) both implicitly and explicitly. A teacher can choose a text or book and then activate a language feature from a drop-down list for the child to learn. The child first receives a “pre-reading activity”, which is a short explicit instruction presenting the language rule related to the language feature, with word examples. Once they enter the book, instances of the language feature appearing in the text are highlighted to support implicit learning. For example, if a teacher wants their students to learn the “i-e” split digraph or the “-ed” suffix, the targeted features will be highlighted in words such as “time” and “looked”, respectively.

Left: pre-reading activity providing explicit instruction on language rule, Right: text highlight providing implicit learning opportunity.

To implement this functionality, the Amigo Reader employs iRead’s Linguistic Infrastructure. Firstly, the Amigo Reader is aware of iRead’s language models and presents the word-level linguistic phenomena from their language feature lists. Secondly, when a user instructs Amigo to load a specific text, the Amigo Reader analyses the text file, then creates and loads an annotated version of it. Based on the annotations, the teacher is told which of the available language features are present in the text, which helps them decide which feature to select.
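The step of telling the teacher which features occur in a text can be sketched as follows. This is a minimal Python illustration that assumes a hypothetical annotation shape (a list of word records, each carrying a list of feature tags); the real annotated file format and feature identifiers are not described here and may differ.

```python
def features_present(annotated_words, model_feature_ids):
    """Return the model's features that occur at least once in the text,
    keeping the order used in the model's feature list."""
    found = {
        tag["feature"]
        for word in annotated_words
        for tag in word["features"]
    }
    return [fid for fid in model_feature_ids if fid in found]


# Hypothetical annotated text: "time looked cat".
ANNOTATED_TEXT = [
    {"text": "time", "features": [{"feature": "split-digraph-i-e"}]},
    {"text": "looked", "features": [{"feature": "suffix-ed"}]},
    {"text": "cat", "features": []},
]

# Hypothetical feature list taken from a language model.
FEATURE_LIST = ["split-digraph-a-e", "split-digraph-i-e", "suffix-ed"]

if __name__ == "__main__":
    print(features_present(ANNOTATED_TEXT, FEATURE_LIST))
```

A drop-down list could then offer only the returned features, or mark the others as absent from the chosen text.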

The annotated text file contains syntactic and lexical information, with extensive detail about each word. Some of this information is general, such as word length, syllabification and grapheme-to-phoneme correspondence. In addition, each word is tagged with the language features it includes (according to the specific language model), along with the position of each feature within the word. This positional information is what makes the “highlights” possible: the Reader uses it to highlight exactly where the feature appears in the word.
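As an illustration of how positional information can drive highlighting, the Python sketch below marks the characters that belong to a feature. It assumes, hypothetically, that a feature's position is given as a list of zero-based character indices, which allows non-adjacent letters such as the “i” and “e” of a split digraph. Square brackets stand in for whatever visual highlight a reader application would apply.

```python
def highlight(word, indices):
    """Wrap each run of highlighted characters in square brackets."""
    marked = set(indices)
    out = []
    inside = False
    for i, ch in enumerate(word):
        if i in marked and not inside:
            out.append("[")      # a highlighted run starts here
            inside = True
        elif i not in marked and inside:
            out.append("]")      # the run ended before this character
            inside = False
        out.append(ch)
    if inside:
        out.append("]")          # close a run that reaches the end of the word
    return "".join(out)


if __name__ == "__main__":
    print(highlight("time", [1, 3]))    # split digraph "i-e"
    print(highlight("looked", [4, 5]))  # suffix "-ed"
```

Under these assumptions the two calls print `t[i]m[e]` and `look[ed]`.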

Amigo uses the word information included in the annotated text file to support a second instructional feature, the “tricky word list” feature. A child can tap and add a single word to their own tricky word list. This personalised bank of words can be ordered chronologically or alphabetically. Using the annotated text and the syllabification information, the child is shown where the syllable breaks are in multisyllabic words to support their decoding of the word.
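A tricky word list of this kind could be modelled roughly as below. This is a hypothetical Python sketch, not Amigo's implementation: the class and method names are invented, the syllable splits are illustrative sample data standing in for what the annotated text would supply, and ignoring a word that is already on the list is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class TrickyWordList:
    # Entries are (word, syllables) pairs kept in the order they were added.
    entries: list = field(default_factory=list)

    def add(self, word, syllables):
        """Add a word once; repeated taps on the same word are ignored."""
        if all(existing != word for existing, _ in self.entries):
            self.entries.append((word, list(syllables)))

    def chronological(self):
        return [word for word, _ in self.entries]

    def alphabetical(self):
        return sorted((word for word, _ in self.entries), key=str.lower)

    def with_breaks(self, word, sep="·"):
        """Show the syllable breaks of a word that is on the list."""
        for existing, syllables in self.entries:
            if existing == word:
                return sep.join(syllables)
        raise KeyError(word)


if __name__ == "__main__":
    words = TrickyWordList()
    words.add("window", ["win", "dow"])
    words.add("rabbit", ["rab", "bit"])
    print(words.chronological())
    print(words.alphabetical())
    print(words.with_breaks("rabbit"))
```

Keeping the entries in insertion order makes the chronological view free, while the alphabetical view is derived on demand.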

Example of a “tricky word list” comprising a personalised bank of words selected by the child.

Do you want more information?

To find out more about the Linguistic Infrastructure and how to license it, contact us at iread@edia.nl
If you are interested in building or improving an EdTech product for literacy, see our industry page
For more information about the content and development of the linguistic infrastructure, access our white paper
See also our recent blog post on Using the Linguistic Infrastructure to Develop the Navigo Literacy Game