Online resources


Corpora

CORPORART – PT/IT specialized comparable corpora of Public Art
CORPORART – PT/IT is a bilingual comparable corpus of the Public Art domain. It comprises subcorpora for contemporary European Portuguese and Italian, from 2000 to 2018, covering text types and subdomains representative of the production of specialized texts in this highly interdisciplinary domain.
Link: https://clunl.fcsh.unl.pt/en/online-resources/corpora/corporart-corpus-comparavel-pt-it-de-especialidade-no-dominio-da-arte-publica/

Portuguese Literature Corpus for Distant Reading
The Portuguese Literature Corpus for Distant Reading is a literary corpus of non-canonical novels by Portuguese authors from the period 1840-1920.
Link: https://github.com/COST-ELTeC/ELTeC-por

MIGRANTE.PT
MIGRANTE.PT, a result of the EXPRIMI project, is a European Portuguese corpus for specific purposes with around 1.5 million tokens. It consists of institutional texts about the integration of migrants in Portugal, addressed to those migrants and collected from websites and materials freely available online.
Link: https://clunl.fcsh.unl.pt/en/online-resources/corpora/migrante-pt/

Parallel sense-annotated corpus ELEXIS-WSD 1.0
ELEXIS-WSD is a parallel sense-annotated corpus in which content words (nouns, adjectives, verbs, and adverbs) have been assigned senses. Version 1.0 contains sentences for 10 languages: Bulgarian, Danish, English, Spanish, Estonian, Hungarian, Italian, Dutch, Portuguese, and Slovene.
Link: http://hdl.handle.net/11356/1674

Lexicons, Dictionaries, Glossaries

Basic terms in speech and language pathology diagnosis
Termes de base du diagnostic orthophonique is a resource consisting of a list of 45 lexical units considered fundamental for the diagnosis of speech and language pathologies, constituting a non-exhaustive extract of the vocabulary in use in 2020.
Link: https://www.ortolang.fr/market/lexicons/termes-diag-orthophonique

BDTT-AR – Textual and Terminological Database for the Portuguese Parliament
BDTT-AR is a multilingual database (Portuguese, English and French) that contains the terminology used within the Portuguese Parliament.
Link: http://terminologia.parlamento.pt/pls/ter/terwinter.home

COVID-19 Collaborative Glossary
The COVID-19 Collaborative Glossary comprises the terminology used by official healthcare agencies, healthcare professionals and scientists, as well as the media and social media.
Link: https://www.lexonomy.eu/ec25mm79/

Dicionário de Abreviaturas Digitais (Dictionary of Digital Abbreviations)
The Dicionário de Abreviaturas Digitais covers abbreviations used by new generations of European Portuguese speakers in written communication on social networks.
Link: https://www.lexonomy.eu/#/dig2023

DLP – Portuguese Language Dictionary
The Portuguese Language Dictionary of the Academy of Sciences of Lisbon, based on the Contemporary Portuguese Language Dictionary, was published in April 2023 and is constantly being updated.
Link: https://dicionario.acad-ciencias.pt/

Multilingual Multidomain Dictionary
The Multilingual Multidomain Dictionary is a database created from work carried out by students of the Terminology course unit (NOVA FCSH) during the 1st semester of the academic year 2022-2023 (1st cycle of studies). The terminological data that make up the records were extracted from specialised corpora compiled by the students.
Link: https://doi.org/10.34619/30qc-jzno

Multilingual Terminological Glossaries for specific purposes within the Community of Portuguese Language Countries
At the request of Instituto Camões, the Research Group “Lexicology, Lexicography and Terminology” has designed and created multilingual terminological glossaries (Portuguese, English and French) associated with textual databases in the domains of Agronomy, Health Sciences, Law and Economics.
Link: available soon on the Instituto Camões website

Ontologies

CDMOR – MorDigital Domain Classification
The CDMOR covers the areas of knowledge used, in the form of domain labels, in the lexicographic articles of António de Morais Silva’s “Diccionario da Lingua Portugueza” (1789; 1813; 1823).
Link: http://vocabs.rossio.fcsh.unl.pt/morais_domains/

OntoAndalus
OntoAndalus is an ontology of pottery artefacts of al-Andalus. The purpose of OntoAndalus is to further knowledge in the domain and to facilitate the development of a multilingual terminological resource based on formal descriptions or definitions of concepts and other units of knowledge.
Link: https://doi.org/10.34619/3W3T-HJ8S

OntoCork
OntoCork is a micro domain-ontology of cork stoppers. The purpose of this domain-ontology is to organise cork stoppers – concepts and terms – in a systematic way.
OntoCork is being developed in OWL. The formal definitions are inferred from CorkCorpus – a specialised corpus built from scratch.
Link: https://doi.org/10.34619/a27q-1ryd

OntoDomLab-Med
OntoDomLab-Med is an ontology of domain labels focused on Medicine. Its purpose is to provide a solid conceptual foundation that facilitates interoperability with TEI Lex-0 and to improve (i) the consistency of domain label assignment and (ii) the efficiency of information retrieval.
Link: https://doi.org/10.34619/emw4-ax6o

Training material

Introduction to dictionaries
The aim of this course is to present a brief history of dictionaries as tools for organizing knowledge about words and their meanings, and to analyze different ways of understanding and classifying the dictionary genre.
Link: https://elexis.humanistika.org/resource/posts/introduction-to-dictionaries

Standards for representing lexicographical data: an overview
This course focuses on the importance of standards in facilitating cooperation between lexicographers in a multilingual and multicultural context.
Link: https://campus.dariah.eu/resource/posts/standards-for-representing-lexicographic-data-an-overview

Others

CORPORART_GRAMM_IT_1.0
CORPORART_GRAMM_IT_1.0 is a Semantic Word Sketch Grammar for Italian, written in the Corpus Query Language (CQL) for Sketch Engine but adaptable to other systems that use CQL.
Link: https://doi.org/10.34619/0bjc-vhc4

CORPORART_GRAMM_PT_1.0
CORPORART_GRAMM_PT_1.0 is a Semantic Word Sketch Grammar for European Portuguese, written in the Corpus Query Language (CQL) for Sketch Engine but adaptable to other systems that use CQL.
Link: https://doi.org/10.34619/j2xi-fxfo