When analysing large amounts of linguistic data, we cannot do without specifying parts of speech (PoSs). In this workshop on PoSs, we would like to discuss topics related to the role of PoSs in corpus analysis and in other digital tools that process language. The questions we address include, among others:
- What kinds of problems arise in connection with PoSs when creating automatic systems? How do languages differ with respect to the role of PoSs in corpus analysis?
- As PoSs combine syntactic, morphological and semantic information, what is the best way to unite these levels when dealing with PoSs in corpus analysis and lexicographic work?
- What are the ways to preprocess the corpus data that would facilitate specification of the lexical category of a word?
We invite researchers working on the processing of large corpora, the parsing of morphologically rich languages, word sense disambiguation, e-lexicography and related areas to participate in the workshop and to contribute to the discussions on the role of PoSs in language technology.
The working language of the workshop is English.
Miloš Jakubíček is the CEO of Lexical Computing Ltd, a software developer working on Sketch Engine, and a fellow of the NLP Centre at Masaryk University. He is a researcher in computer lexicography and corpus linguistics, with experience in corpus building and morphosyntactic analysis.
Date: 22 April 2020
Venue: Institute of the Estonian Language, Roosikrantsi 6, Tallinn
Contact
Geda Paulsen (Institute of the Estonian Language), geda.paulsen[at]eki.ee
Registration for the conference (opens in February)