Natural Language Processing BCS714B
Course Code: BCS714B
Credits: 04
CIE Marks: 50
SEE Marks: 50
Total Marks: 100
Exam Hours: 03
Total Hours of Pedagogy: 40H
Teaching Hours/Week: [L:T:P:S] 3:0:0:0
Introduction: What is Natural Language Processing? Origins of NLP, Language and Knowledge,
The Challenges of NLP, Language and Grammar, Processing Indian Languages, NLP Applications.
Language Modeling: Statistical Language Model – N-gram model (unigram, bigram), Paninian
Framework, Karaka theory.
Word Level Analysis: Regular Expressions, Finite-State Automata, Morphological Parsing, Spelling
Error Detection and Correction, Words and Word Classes, Part-of-Speech Tagging.
Syntactic Analysis: Context-Free Grammar, Constituency, Top-down and Bottom-up Parsing, CYK
Parsing.
Naive Bayes, Text Classification and Sentiment: Naive Bayes Classifiers, Training the Naive Bayes Classifier, Worked Example, Optimizing for Sentiment Analysis, Naive Bayes for Other Text Classification Tasks, Naive Bayes as a Language Model.
Information Retrieval: Design Features of Information Retrieval Systems, Information Retrieval
Models – Classical, Non-classical, Alternative Models of Information Retrieval – Cluster model, Fuzzy
model, LSTM model, Major Issues in Information Retrieval.
Lexical Resources: WordNet, FrameNet, Stemmers, Part-of-Speech Tagger, Research Corpora.
Machine Translation: Language Divergences and Typology, Machine Translation using Encoder-Decoder, Details of the Encoder-Decoder Model, Translating in Low-Resource Situations, MT Evaluation, Bias and Ethical Issues.