Pre-Conference Workshop: Quantitative answers to qualitative questions? The challenge of ambiguity in corpus data

NB! The 2020 EAAL Conference is cancelled.
The conference will take place in 2021.

When analysing large amounts of linguistic data, we cannot manage without specifying parts of speech (PoS). In this workshop, we would like to discuss the role of PoS in corpus analysis and in other digital tools that process language. The questions we address include, among others:

  • What kinds of problems arise in connection with PoS when creating automatic systems? How do languages differ with respect to the role of PoS in corpus analysis?
  • As PoS combine syntactic, morphological and semantic information, what is the best way to unite these levels when dealing with PoS in corpus analysis and lexicographic work?
  • How can corpus data be preprocessed to facilitate specification of the lexical category of a word?

We invite researchers working on large-corpus processing, parsing of morphologically rich languages, word sense disambiguation, e-lexicography, and related areas to participate in the workshop and to contribute to the discussions on the role of PoS in language technology.

The working language of the workshop is English.

Invited speakers:
Miloš Jakubíček, the CEO of Lexical Computing Ltd, a software developer working on Sketch Engine, and a fellow of the NLP Centre at Masaryk University. He is a researcher in computer lexicography and corpus linguistics, with experience in corpus building and morphosyntactic analysis.

Date: 22 April 2020

Venue: Institute of the Estonian Language, Roosikrantsi 6, Tallinn

Project No PSG227, Personal research funding of the Estonian Research Council, 2019–2022; partner: Center of Estonian Language Resources

Contact

Geda Paulsen (Institute of the Estonian Language), geda.paulsen[at]


Center of Estonian Language Resources, Institute of the Estonian Language