CORPORART_GRAMM_PT_1.0
CORPORART Semantic Word Sketch Grammar for European Portuguese – Nouns, verbs and adjectives CQL expressions for semantic relations extraction

Description

CORPORART_GRAMM_PT_1.0 is a Semantic Word Sketch Grammar for European Portuguese, written in the Corpus Query Language (CQL) for Sketch Engine and adaptable to other systems that use CQL. It consists of 42 lexico-syntactic patterns for extracting semantically related items. The patterns cover several semantic relations, such as hyponymy/hyperonymy and holonymy/meronymy, as well as relations crucial for event description (e.g., agent, result, instrument, location) and adjective description (e.g., characterization).
Word Sketch grammars are usually used to extract collocations and categorize them by grammatical relation. The innovative aspect of this CQL grammar is that it is designed for semantic information retrieval. The grammar is applied to a keyword, and the result is displayed in the form of a Word Sketch. Although CORPORART_GRAMM_PT_1.0 was initially designed for CORPORART, it can be applied to any European Portuguese corpus.
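
As a rough illustration of how a lexico-syntactic pattern can yield a semantic relation for a keyword, the following Python sketch matches one hand-written hyponymy pattern ("X é um tipo de Y") against toy tagged sentences and groups the hits by relation. It is not taken from CORPORART_GRAMM_PT_1.0: the pattern, the coarse tags (N, V, DET, PREP), the example sentences, and the function names are all assumptions made for this example, and Sketch Engine itself evaluates CQL rather than Python.

```python
from collections import defaultdict

# Toy tagged sentences: (word form, lemma, coarse POS tag).
# The tags are hypothetical and do not reflect the CORPORART tagset.
SENTENCES = [
    [("O", "o", "DET"), ("cravo", "cravo", "N"), ("é", "ser", "V"),
     ("um", "um", "DET"), ("tipo", "tipo", "N"), ("de", "de", "PREP"),
     ("instrumento", "instrumento", "N")],
    [("A", "o", "DET"), ("harpa", "harpa", "N"), ("é", "ser", "V"),
     ("um", "um", "DET"), ("tipo", "tipo", "N"), ("de", "de", "PREP"),
     ("instrumento", "instrumento", "N")],
]

# Hypothetical pattern for "X é um tipo de Y". In CQL-like notation this
# could be written roughly as:
#   1:[tag="N"] [lemma="ser"] [lemma="um"] [lemma="tipo"] [lemma="de"] 2:[tag="N"]
# (an illustrative rendering, not one of the 42 patterns of the grammar).
HYPONYMY = [
    {"tag": "N", "label": "hyponym"},
    {"lemma": "ser"},
    {"lemma": "um"},
    {"lemma": "tipo"},
    {"lemma": "de"},
    {"tag": "N", "label": "hypernym"},
]


def token_matches(token, constraint):
    """Check one token against one positional constraint."""
    _, lemma, tag = token
    if "lemma" in constraint and lemma != constraint["lemma"]:
        return False
    if "tag" in constraint and tag != constraint["tag"]:
        return False
    return True


def find_relations(sentence, pattern):
    """Return (hyponym, hypernym) lemma pairs for every match of the pattern."""
    hits = []
    for start in range(len(sentence) - len(pattern) + 1):
        window = sentence[start:start + len(pattern)]
        if all(token_matches(t, c) for t, c in zip(window, pattern)):
            captured = {c["label"]: t[1]
                        for t, c in zip(window, pattern) if "label" in c}
            hits.append((captured["hyponym"], captured["hypernym"]))
    return hits


def build_sketch(sentences, keyword):
    """Group the items related to the keyword by relation name."""
    sketch = defaultdict(list)
    for sentence in sentences:
        for hypo, hyper in find_relations(sentence, HYPONYMY):
            if hypo == keyword:
                sketch["hypernym"].append(hyper)
            if hyper == keyword:
                sketch["hyponym"].append(hypo)
    return dict(sketch)


print(build_sketch(SENTENCES, "instrumento"))
print(build_sketch(SENTENCES, "cravo"))
```

For the keyword "instrumento" the sketch lists "cravo" and "harpa" as hyponyms, while for "cravo" it lists "instrumento" as a hypernym. This mirrors, in a very reduced form, the way a Word Sketch presents related items per relation for a given keyword.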

More on this CQL Grammar
Barbero, C. (2022). CQL Grammars for Lexical and Semantic Information Extraction for Portuguese and Italian. In: Pinheiro, V., et al. Computational Processing of the Portuguese Language. PROPOR 2022. Lecture Notes in Computer Science, vol 13208. Springer, Cham. https://doi.org/10.1007/978-3-030-98305-5_35

Identifier
DOI: https://doi.org/10.34619/j2xi-fxfo

Access
CORPORART_GRAMM_PT_1.0 is freely available under Open Access policies and can be found on the Sketch Engine platform.

Authors and Affiliation
CORPORART_GRAMM_PT_1.0 was designed by Chiara Barbero and Raquel Amaro as part of Chiara Barbero’s PhD program in Linguistics (scholarship: PD/BD/128131/2016). This research was also supported by Portuguese national funds through FCT – Portuguese Foundation for Science and Technology, I.P., as part of the projects UIDB/LIN/03213/2020 (10.54499/UIDB/03213/2020) and UIDP/LIN/03213/2020 (10.54499/UIDP/03213/2020) of the Linguistics Research Centre of NOVA University Lisbon.