Online resources


CORPORART – PT/IT specialized comparable corpora of Public Art
CORPORART – PT/IT is a bilingual comparable corpus of the Public Art domain. It comprises subcorpora for contemporary European Portuguese and Italian, from 2000 to 2018, covering text types and subdomains representative of specialized text production in this highly interdisciplinary domain.

Portuguese Literature Corpus for Distant Reading
The Portuguese Literature Corpus for Distant Reading is a literary corpus of non-canonical novels by Portuguese authors, from the period 1840-1920.

MIGRANTE.PT, a result of the EXPRIMI project, is a European Portuguese corpus for specific purposes with around 1.5 million tokens. It consists of institutional texts about the integration of migrants in Portugal, addressed to those migrants and collected from websites and materials freely available online.

Parallel sense-annotated corpus ELEXIS-WSD 1.0
ELEXIS-WSD is a parallel sense-annotated corpus in which content words (nouns, adjectives, verbs, and adverbs) have been assigned senses. Version 1.0 contains sentences for 10 languages: Bulgarian, Danish, English, Spanish, Estonian, Hungarian, Italian, Dutch, Portuguese, and Slovene.

Lexicons, Dictionaries, Glossaries

Basic terms in speech and language pathology diagnosis
Termes de base du diagnostic orthophonique is a list of 45 lexical units considered fundamental for the diagnosis of speech and language pathologies. It is a non-exhaustive extract of the vocabulary in use in 2020.

BDTT-AR – Textual and Terminological Database for the Portuguese Parliament
BDTT-AR is a multilingual database (Portuguese, English and French) that contains the terminology used within the Portuguese Parliament.

COVID-19 Collaborative Glossary
The COVID-19 Collaborative Glossary comprises the terminology used by official healthcare agencies, healthcare professionals and scientists, as well as the media and social media.

Multilingual Terminological Glossaries for specific purposes within the Community of Portuguese Language Countries
At the request of Instituto Camões, the Research Group “Lexicology, Lexicography and Terminology” designed and created multilingual terminological glossaries (Portuguese, English and French) associated with textual databases in the domains of Agronomy, Health Sciences, Law and Economics.
Link: available soon on the Instituto Camões website


OntoAndalus is an ontology of pottery artefacts of al-Andalus. The purpose of OntoAndalus is to further knowledge in the domain and to facilitate the development of a multilingual terminological resource based on formal descriptions or definitions of concepts and other units of knowledge.

OntoCork is a micro domain-ontology of cork stoppers. Its purpose is to organise the concepts and terms relating to cork stoppers in a systematic way.
OntoCork is being developed in OWL. The formal definitions are inferred from CorkCorpus, a specialised corpus built from scratch.

OntoDomLab-Med is an ontology of domain labels focused on Medicine. Its purpose is to provide a solid conceptual foundation that facilitates interoperability with TEI Lex-0 and improves (i) the consistency of domain label assignment and (ii) the efficiency of information retrieval.