CS3TM20-Text Mining and Natural Language Processing
Module Provider: Computer Science
Number of credits: 10 [5 ECTS credits]
Level: 6
Terms in which taught: Spring term module
Pre-requisites:
Non-modular pre-requisites:
Co-requisites:
Modules excluded:
Current from: 2021/2
Module Convenor: Prof Xia Hong
Email: x.hong@reading.ac.uk
Type of module:
Summary module description:
This module introduces both the theory and practice of Text Mining and Natural Language Processing (NLP).
Aims:
The aim of this module is to introduce the field of text mining and natural language processing. The module focuses on the theory and practice of processing text data at the lexical, syntactic, and semantic levels.
This module also encourages students to develop a set of professional skills, such as problem solving, creativity, technical report writing, organization and time management, self-reflection, software design and development, end-user awareness, action planning and decision making, commercial awareness, critical analysis of published literature, and an appreciation of the value of diversity.
Assessable learning outcomes:
By the end of this module, students should be able to
- Understand and apply the fundamental principles of text mining and natural language processing;
- Apply methods and algorithms to process different types of textual data;
- Empirically evaluate the performances of methods and algorithms by using accuracy and efficiency metrics;
- Apply analytical and programming skills using existing NLP methods and tools such as NLTK and scikit-learn (Python).
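As an illustration of the empirical evaluation named in the outcomes above, the following is a minimal sketch in plain Python, not part of the official module material. The toy `keyword_classifier`, the example texts, and the label names are hypothetical and chosen only to show how accuracy-style metrics and a simple timing measurement could be computed.

```python
import time


def classification_metrics(gold, predicted, positive="pos"):
    """Return accuracy, precision, recall and F1 for one positive label."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    correct = sum(1 for g, p in pairs if g == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def keyword_classifier(text):
    """Hypothetical toy classifier: 'pos' if the text contains 'good'."""
    return "pos" if "good" in text.lower() else "neg"


# Hypothetical evaluation data (illustrative only).
texts = ["Good film", "Bad plot", "Really good acting",
         "Not good at all", "Dull"]
gold = ["pos", "neg", "pos", "neg", "neg"]

# Efficiency: wall-clock time taken to label the whole test set.
start = time.perf_counter()
predicted = [keyword_classifier(t) for t in texts]
elapsed = time.perf_counter() - start

metrics = classification_metrics(gold, predicted)
print(metrics, f"{elapsed:.6f}s")
```

In practice, library routines such as those in scikit-learn would normally replace the hand-written metric function; the manual version is shown only to make the definitions explicit.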
Additional outcomes:
This module will provide an overview of the field of Text Mining and NLP and its sub-areas, and will introduce and explain its key techniques, including their applicability and limitations. Topics covered will include:
- Regular expressions and text normalization
- N-grams and language models, part-of-speech tagging
- Lexical semantics, word senses and WordNet
- Syntactic and semantic parsing
- Text classification, sentiment analysis
- Information extraction, including named entity recognition and relation extraction
- Advanced topics: machine learning for NLP, word embeddings, hidden Markov models and the Viterbi algorithm
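To give a flavour of the first two topics, the following is a minimal sketch in plain Python, not taken from the module material. It combines regular-expression tokenization, lowercasing as a simple form of normalization, and a bigram language model with add-one (Laplace) smoothing. The two-sentence corpus and all function names are assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens with a regular expression."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())


def train_bigram_model(sentences):
    """Count unigrams and bigrams over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + tokenize(sentence) + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_probability(prev, word, unigrams, bigrams):
    """Estimate P(word | prev) with add-one (Laplace) smoothing."""
    vocabulary_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocabulary_size)


# Hypothetical toy corpus (illustrative only).
corpus = ["The cat sat on the mat.", "The dog sat on the log."]
unigrams, bigrams = train_bigram_model(corpus)

# A bigram seen in the corpus gets a higher estimate than an unseen one.
p_seen = bigram_probability("the", "cat", unigrams, bigrams)
p_unseen = bigram_probability("the", "sat", unigrams, bigrams)
print(p_seen, p_unseen)
```

Add-one smoothing is used here only because it is the simplest way to avoid zero probabilities for unseen bigrams; other smoothing methods are generally preferred on realistic corpora.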
Outline content:
Brief description of teaching and learning methods:
The course material will be introduced through lectures and practicals. The lecture material will be applied during lab practical sessions. The lab work will support students in developing high-fidelity prototypes from the concepts and storyboards, and in planning their evaluation.
 | Autumn | Spring | Summer |
Lectures | 16 | ||
Practical classes and workshops | 4 | ||
Guided independent study: | | | |
    Wider reading (independent) | 5 | ||
    Wider reading (directed) | 5 | ||
    Exam revis |