NLP 8505

Arabic Natural Language Processing


Lectures

Monday: 3:00pm - 4:30pm, Classroom 6

Wednesday: 3:00pm - 4:30pm, Lecture Hall 2


Course Description

This course offers an in-depth introduction to Arabic Natural Language Processing (NLP), focusing on the unique challenges that Arabic presents as an object of computational study. Students will learn the core enabling technologies for NLP, with a strong focus on the Arabic language and its dialects. Topics include text normalization, morphological analysis, syntactic parsing, and semantic analysis. The course integrates theory with hands-on experience, including applied deep learning techniques and practical applications such as machine translation and sentiment analysis. By the end of the course, students will be equipped to contribute to Arabic NLP research and development.


Topics

This course combines theoretical foundations with applied practice, organized around the core components of Arabic NLP. The main topics include:

  • Arabic Script and Orthography: Principles of the writing system and orthographic variation.
  • Tokenization: Fundamentals of word segmentation for Arabic.
  • Arabic Morphology: Morphological structure and processes; computational analysis, generation, and disambiguation methods.
  • Dialect Modeling: Representing and processing dialectal Arabic.
  • Arabic Resources: Corpora and tools.
  • Applications in Arabic NLP: Readability modeling, grammatical error correction, and text rewriting as case studies of end-to-end systems.

Supplemental Material


Grading

Percentage    Assessment Component
25%           Assignment 1
25%           Assignment 2
50%           Course Project:
              – Team Declaration (5%)
              – Proposal Abstract (5%)
              – Preliminary Report (10%)
              – Presentation + Final Report (30%)

Schedule

Week 1

March 2: Introduction to Arabic NLP, history, challenges
March 4: Arabic Script and Orthography

Week 2

March 9: Morphological Structure, Analysis and Generation
March 11: Morphological Disambiguation

Week 3

March 23: Arabic Dialect Modeling 1
March 25: Arabic Dialect Modeling 2

Week 4

March 30: Dialectal Arabic Evaluation – Guest Lecture (Dr. Amr Keleg)
April 1: Arabic Syntactic Analysis – Guest Lecture (Prof. Nizar Habash)

Week 5

April 6: Educational Arabic NLP
April 8: Team 1 - Presentation / Reading Group
  • Slides

Week 6

April 13: Team 2 - Presentation / Reading Group
  • Slides
April 15: Team 3 - Presentation / Reading Group
  • Slides

Week 7

April 20: Bias and Ethics
April 22: Controlled Natural Language Generation for Morphologically Rich Languages: The Case of Arabic
  • Slides

Week 8

April 27: Final Presentations and Projects Due