2017HS: 32010 Natural Language Processing
The main objective of this course is to introduce students to the underlying problems encountered when working with natural language data.
This course has the following objectives:
1) to understand the problems related to representation and manipulation of text data;
2) to understand the statistical properties underlying all text data;
3) to understand the main approaches in NLP.
Practical exercises will complement the theoretical presentation.
Description
Introduction to linguistics (morphology, syntax, semantics); Spelling error detection and correction; Statistical models (counting words, bigrams, entropy); Parsing; Markov chains; Hidden Markov chains; Text categorization, sentiment analysis, authorship attribution; Cryptography; Question answering; Text summarization.
The final mark is based on both a final written exam and the results of the practical exercises.
References
Christopher D. Manning, Hinrich Schütze: Foundations of Statistical Natural Language Processing. The MIT Press, Cambridge (MA).