
Corpora

As part of the TALES project on the automatic processing of the Ladin language, launched in 1999 in collaboration with the Institute for Scientific and Technological Research (ISTR) in Trento, organized collections of Ladin texts were created, both in standard Ladin and in the individual Ladin languages.

The corpora collected here (in the Fassa, Gardena, Badia and Ampezzo Ladin languages) contain a total of approximately 6,500,000 words. The selected texts cover a period extending from 1800 to the present day, with a prevalence of texts from the second half of the 20th century. To keep the various text types reasonably balanced, both literary texts (prose, poetry, theatre, memoirs, texts on folklore and traditions, prayer books) and non-literary texts (legal and administrative texts, forms, journalistic and practical informational texts, popular science and cultural texts, and educational texts) were included.

Currently, the Fassa text corpus is at the most advanced stage of processing. Its structure records relevant information for every text (date, place of origin, type of text, author), which allows you to refine your search according to a series of predetermined criteria.
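
To illustrate how per-text information of this kind can narrow a search, here is a minimal Python sketch. It is not the project's actual data model: the TextRecord class, its field names, the filter_texts function and the sample records are all hypothetical, and the date is simplified to a year.

```python
from dataclasses import dataclass


@dataclass
class TextRecord:
    # Hypothetical record; fields mirror the metadata named in the text.
    title: str
    year: int        # assumption: the date is reduced to a year
    place: str
    text_type: str
    author: str


def filter_texts(records, *, year_from=None, year_to=None,
                 place=None, text_type=None, author=None):
    """Return the records that match every criterion that was supplied."""
    selected = []
    for rec in records:
        if year_from is not None and rec.year < year_from:
            continue
        if year_to is not None and rec.year > year_to:
            continue
        if place is not None and rec.place != place:
            continue
        if text_type is not None and rec.text_type != text_type:
            continue
        if author is not None and rec.author != author:
            continue
        selected.append(rec)
    return selected


# Placeholder records, invented purely for illustration.
SAMPLE = [
    TextRecord("Text A", 1955, "Place P", "prose", "Author X"),
    TextRecord("Text B", 1987, "Place Q", "poetry", "Author Y"),
    TextRecord("Text C", 1992, "Place P", "legal", "Author X"),
]

if __name__ == "__main__":
    for rec in filter_texts(SAMPLE, year_from=1950, place="Place P"):
        print(rec.title, rec.year, rec.text_type)
```

Criteria left as None are ignored, so the same function covers a search by a single field or by any combination of fields.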

The corpora can be consulted through the concordancer, a purpose-built tool aimed above all at linguists and scholars of the Ladin language. It lets you analyse the texts by retrieving concordances, classifications and frequencies using the KWIC (Keyword In Context) method, in which the word being sought is displayed together with its context.
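
The KWIC idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not the concordancer's actual implementation: the function names kwic and word_frequencies, the character-based context window and the English sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def kwic(text, keyword, width=30):
    """List each occurrence of keyword with up to `width` characters
    of context on either side (whole-word, case-insensitive match)."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


def word_frequencies(text):
    """Count lower-cased word forms in the text."""
    return Counter(word.lower() for word in re.findall(r"\w+", text))


if __name__ == "__main__":
    sample = "The cat sat on the mat"  # placeholder text, not corpus data
    for line in kwic(sample, "the", width=10):
        print(line)
    print(word_frequencies(sample).most_common(1))
```

Aligning the keyword in a fixed column is what makes a KWIC listing easy to scan: recurring patterns to the left and right of the word become visible at a glance.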

 
