Online resources
Corpora
CORPORART – PT/IT specialized comparable corpora of Public Art
CORPORART – PT/IT is a bilingual comparable corpus of the Public Art domain. It comprises subcorpora for contemporary European Portuguese and Italian, from 2000 to 2018, covering text types and subdomains representative of the production of specialized texts in this highly interdisciplinary domain.
Link: https://clunl.fcsh.unl.pt/en/online-resources/corpora/corporart-corpus-comparavel-pt-it-de-especialidade-no-dominio-da-arte-publica/
Portuguese Literature Corpus for Distant Reading
The Portuguese Literature Corpus for Distant Reading is a literary corpus of non-canonical novels by Portuguese authors from the period 1840-1920.
Link: https://github.com/COST-ELTeC/ELTeC-por
MIGRANTE.PT
Resulting from the EXPRIMI project, MIGRANTE.PT is a European Portuguese corpus for specific purposes of around 1.5 million tokens. It consists of institutional texts that concern the integration of migrants in Portugal and are addressed to those migrants, collected from websites and materials freely available online.
Link: https://clunl.fcsh.unl.pt/en/online-resources/corpora/migrante-pt/
Parallel sense-annotated corpus ELEXIS-WSD 1.0
ELEXIS-WSD is a parallel sense-annotated corpus in which content words (nouns, adjectives, verbs, and adverbs) have been assigned senses. Version 1.0 contains sentences for 10 languages: Bulgarian, Danish, English, Spanish, Estonian, Hungarian, Italian, Dutch, Portuguese, and Slovene.
Link: http://hdl.handle.net/11356/1674
Lexicons, Dictionaries, Glossaries
Basic terms in speech and language pathology diagnosis
Termes de base du diagnostic orthophonique is a resource consisting of a list of 45 lexical units considered fundamental for the diagnosis of speech and language pathologies, constituting a non-exhaustive extract of the vocabulary in use in 2020.
Link: https://www.ortolang.fr/market/lexicons/termes-diag-orthophonique
BDTT-AR – Textual and Terminological Database for the Portuguese Parliament
BDTT-AR is a multilingual database (Portuguese, English and French) that contains the terminology used within the Portuguese Parliament.
Link: http://terminologia.parlamento.pt/pls/ter/terwinter.home
COVID-19 Collaborative Glossary
The COVID-19 Collaborative Glossary comprises the terminology used by official healthcare agencies, healthcare professionals and scientists, as well as the media and social media.
Link: https://www.lexonomy.eu/ec25mm79/
Dicionário de Abreviaturas Digitais (Dictionary of Digital Abbreviations)
The Dicionário de Abreviaturas Digitais covers abbreviations used by new generations of European Portuguese speakers in written communication on social networks.
Link: https://www.lexonomy.eu/#/dig2023
DLP – Portuguese Language Dictionary
The Portuguese Language Dictionary of the Academy of Sciences of Lisbon, based on the Contemporary Portuguese Language Dictionary, was published in April 2023 and is constantly being updated.
Link: https://dicionario.acad-ciencias.pt/
Multilingual Multidomain Dictionary
The Multilingual Multidomain Dictionary is a database built from work carried out by students of the Terminology course unit (NOVA FCSH) during the 1st semester of the academic year 2022-2023 (1st cycle of studies). The terminological data that make up the records were extracted from specialised corpora compiled by the students.
Link: https://www.lexonomy.eu/#/2j4ucmiq
Multilingual Terminological Glossaries for specific purposes within the Community of Portuguese Language Countries
At the request of Instituto Camões, the Research Group “Lexicology, Lexicography and Terminology” has designed and created multilingual terminological glossaries (Portuguese, English and French) associated with textual databases in the domains of Agronomy, Health Sciences, Law and Economics.
Link: available soon on the Instituto Camões website
Ontologies
CDMOR – MorDigital Domain Classification
The CDMOR covers the areas of knowledge used, in the form of domain labels, in the lexicographic articles of António de Morais Silva’s “Diccionario da Lingua Portugueza” (1789; 1813; 1823).
Link: http://vocabs.rossio.fcsh.unl.pt/morais_domains/
OntoAndalus
OntoAndalus is an ontology of pottery artefacts of al-Andalus. The purpose of OntoAndalus is to further knowledge in the domain and to facilitate the development of a multilingual terminological resource based on formal descriptions or definitions of concepts and other units of knowledge.
Link: https://doi.org/10.34619/3W3T-HJ8S
OntoCork
OntoCork is a micro domain-ontology of cork stoppers. The purpose of this domain-ontology is to organise cork stoppers – concepts and terms – in a systematic way.
OntoCork is being developed in OWL. The formal definitions are inferred from CorkCorpus – a specialised corpus built from scratch.
Link: https://doi.org/10.34619/a27q-1ryd
OntoDomLab-Med
OntoDomLab-Med is an ontology of domain labels focused on Medicine. Its purpose is to provide a solid conceptual foundation that facilitates interoperability with TEI Lex-0 and improves (i) the consistency of domain label assignment and (ii) the efficiency of information retrieval.
Link: https://doi.org/10.34619/emw4-ax6o