Expertise of our group

The Language Technology group focuses on Natural Language Processing (NLP) research. Specifically, we are interested in statistical methods that make use of large unannotated text corpora, an approach now commonly associated with the Big Data paradigm.

We collaborate with a range of partners in academia and industry. Please also look at our software projects.

Structure Discovery

One focus of our group is on unsupervised, knowledge-free methods such as clustering of lexical graphs or topic models. These methods neither presuppose training data nor assume the existence of knowledge resources. Following the structure discovery paradigm, they identify regularities in large text collections and annotate the data with them. This markup is entirely data-driven and therefore independent of domain and language. It then serves as a source of features for supervised machine learning applications, so the usefulness of structure discovery processes is assessed in an application-based manner.
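
As a rough illustration of this idea, the following sketch clusters a small lexical graph by simple label propagation and then exposes the cluster label as a feature for a supervised learner. It is not the group's implementation: the function names `cluster_graph` and `token_features`, the toy edge weights, and the deterministic update order are all assumptions made for the example.

```python
from collections import defaultdict

def cluster_graph(edges, max_iter=20):
    """Cluster an undirected weighted graph by label propagation.

    Hypothetical helper: nodes are visited in sorted order and ties are
    broken alphabetically, so the result is deterministic.
    """
    neighbors = defaultdict(dict)
    for a, b, weight in edges:
        neighbors[a][b] = weight
        neighbors[b][a] = weight

    # Every node starts in its own cluster.
    labels = {node: node for node in neighbors}

    for _ in range(max_iter):
        changed = False
        for node in sorted(neighbors):
            # Sum edge weights per label among the node's neighbors.
            scores = defaultdict(float)
            for other, weight in neighbors[node].items():
                scores[labels[other]] += weight
            # Highest score wins; alphabetical order breaks ties.
            best = min(scores.items(), key=lambda kv: (-kv[1], kv[0]))[0]
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels

def token_features(token, labels):
    """Turn the induced cluster label into a feature (hypothetical format)."""
    return {"word": token, "cluster": labels.get(token, "NONE")}

# Toy lexical graph (invented weights): two dense groups, one weak bridge.
edges = [
    ("cat", "dog", 3), ("cat", "mouse", 3), ("dog", "mouse", 3),
    ("car", "bus", 3), ("car", "train", 3), ("bus", "train", 3),
    ("mouse", "car", 1),
]
labels = cluster_graph(edges)
features = token_features("mouse", labels)
```

No labeled data or lexical resource enters the clustering step. The cluster label only becomes useful once a downstream supervised component consumes it as a feature, which is where its value would be measured.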

Statistical Semantics

We examine statistical methods that reflect natural-language semantics. Specifically, we compute semantic similarities and semantic relations between lexical items by analyzing large text collections, and make these available within texts in a contextualized fashion. These relations are used in applications such as semantic indexing, paraphrasing, and the identification of lexical chains.
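
One common way to compute such similarities is distributional: words that occur in similar contexts receive similar vectors. The sketch below counts co-occurrences in a small window and compares words by cosine similarity. It is a minimal illustration under assumed settings (a toy corpus, a window of two tokens, raw counts without weighting), and the helper names are hypothetical rather than taken from the group's software.

```python
from collections import Counter, defaultdict
import math

def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Toy corpus (invented for illustration only).
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
]
vectors = context_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_milk = cosine(vectors["cat"], vectors["milk"])
```

On this toy corpus, "cat" and "dog" share most of their contexts and therefore score higher than "cat" and "milk". With realistic corpora, raw counts would normally be replaced by a significance or association weighting.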


To obtain the markup needed to train supervised language technology components, the group advocates the use of crowdsourcing techniques. Here, unskilled workers are paid small sums to perform small annotation tasks. The advantage is a virtually unlimited number of annotators, which makes the creation of training data quick and scalable. Quality is ensured by redundancy and by using qualification tests or test items. A major challenge lies in breaking down complex annotation tasks into simple subtasks suitable for the crowd.
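
The two quality mechanisms mentioned above, redundancy and test items, can be sketched as follows: each worker is scored on items with known answers, unreliable workers are filtered out, and the remaining redundant judgments are merged by majority vote. The data layout, the function names, and the 0.5 accuracy threshold are assumptions made for this example, not a description of any particular crowdsourcing platform.

```python
from collections import Counter, defaultdict

def worker_accuracy(judgments, gold):
    """Fraction of test items (items with a known answer) each worker got right."""
    correct, total = Counter(), Counter()
    for worker, item, label in judgments:
        if item in gold:
            total[worker] += 1
            correct[worker] += int(label == gold[item])
    return {worker: correct[worker] / total[worker] for worker in total}

def aggregate(judgments, gold, min_accuracy=0.5):
    """Majority vote per item, using only workers who pass the test items.

    Workers who answered no test item are excluded (an assumption of
    this sketch).
    """
    accuracy = worker_accuracy(judgments, gold)
    votes = defaultdict(Counter)
    for worker, item, label in judgments:
        if item in gold:
            continue  # test items are for quality control only
        if accuracy.get(worker, 0.0) >= min_accuracy:
            votes[item][label] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}

# Invented example: (worker, item, label); "g1" is a test item.
gold = {"g1": "POS"}
judgments = [
    ("w1", "g1", "POS"), ("w1", "t1", "POS"), ("w1", "t2", "NEG"),
    ("w2", "g1", "POS"), ("w2", "t1", "POS"), ("w2", "t2", "NEG"),
    ("w3", "g1", "NEG"), ("w3", "t1", "NEG"), ("w3", "t2", "POS"),
    ("w4", "g1", "POS"), ("w4", "t1", "NEG"), ("w4", "t2", "NEG"),
]
accuracy = worker_accuracy(judgments, gold)
final_labels = aggregate(judgments, gold)
```

Here worker `w3` fails the test item and is ignored, so the remaining three judgments per item decide the final label. A real setup would use several test items per worker and an explicit rule for ties.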
