
Information Retrieval and Natural Language Processing

Course Description

Information Retrieval (IR) is the field concerned with answering users' information needs with the most relevant information. In this course we study how search applications, such as Google, compute relevant search results from a repository of Web information.

This course starts by dissecting a search engine, and discusses the fundamental techniques currently used in information retrieval. Afterwards, the most relevant algorithms and retrieval models are discussed in detail.

The current demand for intuitive search processes and language comprehension has been aligning Natural Language Processing (NLP) and IR. In this course you will learn fundamental techniques that are used to encode syntax, grammar, and semantics in machines.

This course includes extensive hands-on laboratories where key retrieval and NLP algorithms are examined. The goal is to strengthen students’ experimental analysis and critical thinking skills concerning search performance metrics and experimental results.



Exam (40%) + Lab work (60% with three submissions)

Online Lectures and Discussion forum

All lectures and labs are taught over Zoom. Please contact the instructors to obtain the meeting ID and password.

A discussion forum (Discord) is set up to let students and lecturers discuss course and project issues. Please ask the instructors for access to join.

Discussion Forum Rules

When registering for the discussion forum, please follow the username schema “FirstName Surname-StudentNr”, e.g. “Gustavo Gonçalves-40000”.



Project guidelines for milestone 2:


Exercises Sheet


Joao Magalhaes ([email protected] - remove the ‘x’s to mail us)

Gustavo Gonçalves (ggoncalv cs.cmu.edu)