Course Overview
More than 70% of the data on the internet is unstructured. Of this, text is the most common form and appears in almost all data sources. For example, emails, online reviews, tweets, news and reports hold valuable information and insight for most research and applications. Text analytics, which usually involves techniques from text mining or natural language processing (NLP), can automatically uncover patterns and extract meaning and context from these unstructured texts.
This course assumes that you have basic Python programming knowledge or have previously attended "Introduction to Python for Data Science" from Stats Central. It will provide you with the foundation to process and analyze text.
In this course, we will cover some useful Python features and libraries for text processing and analysis. We will also touch on some advanced topics such as sentiment analysis, text classification, and/or topic extraction.
Course Outline
This course will cover topics including:
• Jupyter Notebook
• Basic text operations in Python
• Text analytics and NLP
• Tokenization, stopwords, lexicon normalization, POS tagging
• Sentiment analysis and text classification
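To give a flavour of the first steps in the outline, the short sketch below tokenizes a sentence, removes stopwords and counts the remaining words using only the Python standard library. It is an illustration rather than course material: the stopword list, the sample review and the function names are assumptions made for this example, and the course itself may use dedicated NLP libraries instead.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only;
# real projects would normally use a fuller list from an NLP library.
STOPWORDS = {"the", "a", "an", "is", "and", "of", "to", "in", "it", "this", "was"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stopwords(tokens):
    """Keep only the tokens that are not in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


# Made-up sample review used as input.
review = "The course was great, and the examples in it were great too."

tokens = tokenize(review)
content_tokens = remove_stopwords(tokens)
counts = Counter(content_tokens)

print(content_tokens)
print(counts.most_common(1))
```

Lowercasing before matching is a very basic form of normalization; stemming, lemmatization and POS tagging need more than the standard library offers.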
Presenter and Expertise: A/Professor Raymond Wong (Stats Central and UNSW School of Computer Science and Engineering)
Course Requirements: You will need to use a computer during the course.
Date: Tuesday 8 to Wednesday 9 September 2020 (two morning sessions)
Duration: 10.00am - 1.00pm each day
Location: Online (access to be advised)
You will receive a certificate of completion for the course.