Online resources


CORPORART – PT/IT specialized comparable corpora of Public Art
CORPORART – PT/IT is a bilingual comparable corpus of the Public Art domain. It comprises subcorpora for contemporary European Portuguese and Italian, from 2000 to 2018, covering text types and subdomains representative of the specialized texts produced in this highly interdisciplinary domain.

Portuguese Literature Corpus for Distant Reading
The Portuguese Literature Corpus for Distant Reading is a literary corpus of non-canonical novels by Portuguese authors from the period 1840-1920.

Resulting from the EXPRIMI project, MIGRANTE.PT is a European Portuguese corpus for specific purposes of around 1.5 million tokens. It consists of institutional texts concerning the integration of migrants in Portugal and addressed to those migrants, collected from websites and materials freely available online.

Parallel sense-annotated corpus ELEXIS-WSD 1.0
ELEXIS-WSD is a parallel sense-annotated corpus in which content words (nouns, adjectives, verbs, and adverbs) have been assigned senses. Version 1.0 contains sentences for 10 languages: Bulgarian, Danish, English, Spanish, Estonian, Hungarian, Italian, Dutch, Portuguese, and Slovene.

Lexicons, Dictionaries, Glossaries

Basic terms in speech and language pathology diagnosis
Termes de base du diagnostic orthophonique is a resource consisting of a list of 45 lexical units considered fundamental for the diagnosis of speech and language pathologies, constituting a non-exhaustive extract of the vocabulary in use in 2020.

BDTT-AR – Textual and Terminological Database for the Portuguese Parliament
BDTT-AR is a multilingual database (Portuguese, English and French) that contains the terminology used within the Portuguese Parliament.

COVID-19 Collaborative Glossary
The COVID-19 Collaborative Glossary comprises the terminology used by official healthcare agencies, healthcare professionals and scientists, as well as by the media and on social media.

Dicionário de Abreviaturas Digitais (Dictionary of Digital Abbreviations)
The Dicionário de Abreviaturas Digitais covers abbreviations used by younger generations of European Portuguese speakers in written communication on social networks.

DLP – Portuguese Language Dictionary
The Portuguese Language Dictionary of the Academy of Sciences of Lisbon, based on the Contemporary Portuguese Language Dictionary, was published in April 2023 and is constantly being updated.

Multilingual Multidomain Dictionary
The Multilingual Multidomain Dictionary is a database created from work carried out by students of the Terminology course unit (NOVA FCSH) during the 1st semester of the academic year 2022-2023 (1st cycle of studies). The terminological data that make up the records were extracted from specialised corpora compiled by the students.

Multilingual Terminological Glossaries for specific purposes within the Community of Portuguese Language Countries
At the request of Instituto Camões, the Research Group “Lexicology, Lexicography and Terminology” designed and created multilingual terminological glossaries (Portuguese, English and French) associated with textual databases in the domains of Agronomy, Health Sciences, Law and Economics.
Link: available soon on the Instituto Camões website


CDMR – MorDigital Domain Classification
The CDMOR covers the areas of knowledge used, in the form of domain labels, in the lexicographic articles of António de Morais Silva’s “Diccionario da Lingua Portugueza” (1789; 1813; 1823).

OntoAndalus is an ontology of pottery artefacts of al-Andalus. The purpose of OntoAndalus is to further knowledge in the domain and to facilitate the development of a multilingual terminological resource based on formal descriptions or definitions of concepts and other units of knowledge.

OntoCork is a micro domain-ontology of cork stoppers. The purpose of this domain-ontology is to organise cork stoppers – concepts and terms – in a systematic way.
OntoCork is being developed in OWL. The formal definitions are inferred from CorkCorpus – a specialised corpus built from scratch.

OntoDomLab-Med is an ontology of domain labels focused on Medicine. Its purpose is to provide a solid conceptual foundation that facilitates interoperability with TEI Lex-0 and to improve (i) the consistency of domain label assignment and (ii) the efficiency of information retrieval.