Leibniz MMS Days 2024 - Abstract
Fankhauser, Peter
Language adapts to the communicative needs of its context, along dimensions such as domain, register, and time. This is reflected in preferential choices of vocabulary, grammar, meaning, and style, which differentiate language use by context. This visualization explores domain-specific word use based on three related but complementary measures:
- Typicality of a word measures how characteristic the word is of a domain compared to other domains. Informally, if a word is significantly more frequent in one domain than in the others, it is typical for that domain.
- (Paradigmatic) Productivity of a word measures how many (paradigmatically) similar words are used in the domain. A word w with a paradigmatic neighbourhood (semantic field) consisting of many words is considered highly productive.
- Ambiguity of a word measures how much its use in one domain differs from its use in another. Formally, this is defined as the cosine distance between the domain-specific embeddings of the word.
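
Of the three measures, ambiguity has the most explicit definition, so it is the easiest to illustrate. The following is a minimal sketch in plain Python, not the implementation behind the visualization. The function name `cosine_distance` and the toy vectors are hypothetical, and the sketch assumes that the two domain-specific embeddings of a word live in a shared vector space, so that comparing them directly is meaningful.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical toy embeddings of the same word in two domains
# (made-up values for illustration, not taken from any corpus).
embedding_in_domain_a = [1.0, 0.0]
embedding_in_domain_b = [0.0, 1.0]

# Orthogonal vectors: the word is used very differently in the two domains.
ambiguity = cosine_distance(embedding_in_domain_a, embedding_in_domain_b)
```

A distance near 0 means the word is used similarly in both domains. Larger values indicate more divergent use, up to a maximum of 2 for vectors that point in opposite directions.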