Portuguese Literature Corpus for Distant Reading
Identification
- Project identification: Portuguese Literature Corpus for Distant Reading
- Group: Lexicology, Lexicography and Terminology
- Principal Investigator: Raquel Amaro
- Duration: Dec. 2018 – June 2020
- Funding entity: co-funded by COST Action CA 16204; with the support of Biblioteca Nacional de Portugal (Portuguese National Library)
- Keywords: Corpus linguistics; Literature; Distant Reading.
Description
The project “Portuguese Literature Corpus for Distant Reading” aims to build a literary corpus of novels by Portuguese authors for inclusion in the European Literary Text Collection (ELTeC). This secures the Portuguese contribution both to the development of good practices and computational methods of textual analysis adapted to the European literary traditions, and to the study and analysis of fundamental concepts of Portuguese and European literary theory and history, within the COST Action CA 16204 – Distant Reading for European Literary History (http://www.cost.eu/COST_Actions/ca/CA16204).
Integrating the Portuguese language and culture into this European corpus will support the joint development of new methods and new ways of conceiving European literary history in the context of international cooperation projects. It will also enable the creation of new theoretical and practical frameworks and the contrastive analysis of languages and cultures through innovative data-based computational methods.
Results
The ELTeC (European Literary Text Collection) corpora are openly available. Statistics and human-readable versions of each text in the corpora, including the Portuguese Literature Corpus for Distant Reading, can be accessed at https://distantreading.github.io/ELTeC/.
The original source files are stored in a GitHub repository and can be downloaded freely at https://github.com/COST-ELTeC.
Team
Raquel Amaro (principal investigator)
Paulo Pereira (Portuguese Literature Centre, FLUC)
Isabel Araújo Branco (CHAM – Humanities Centre, NOVA FCSH)
Adeliana Silva (CLUNL, research grant holder)
Diana Santos (Universidade de Oslo)