Prof. Franco Maria Nardini, Salvatore Trani, ISTI-CNR - Italy - "Challenges in Modern Web Search", 5-8 July 2021

Hours:
16 hours (4 credits)

Room:

Remote, via Microsoft Teams. The link will be sent in due time to all students who registered for the seminar.

To register for the course, click here

 

Short Abstract:
This PhD course focuses on Web search and discusses the challenges in its three main areas: i) crawling, ii) indexing, and iii) query processing. The course introduces each area by discussing the state of the art in the field and by presenting the open research questions. The emphasis of the course is on query processing, an area where machine learning makes an important contribution to advancing the state of the art. After introducing the different query processing techniques, the course i) introduces supervised techniques that explicitly target the ranking problem, ii) discusses several efficiency/effectiveness trade-offs in query processing, and iii) analyzes several related optimization techniques. The course will also provide an overview of query processing techniques that employ deep neural networks. Two hands-on sessions will cover indexing and query processing of public Web collections.

Course Contents in brief:

  1. Modern Web Search (4 hours)
    1. The Web: history, peculiarities, and the importance of search.
    2. Anatomy of a modern Web search engine: crawling, indexing, query processing.
    3. Crawling: definition and application. Architecture of a modern crawler.
    4. Challenges in crawling the Web
  2. Fast Indexes for Web search (4 hours)
    1. Data structures for indexing Web documents
    2. Modern techniques for efficient text retrieval
    3. Challenges in indexing the Web
    4. Hands-on: Indexing and basic query processing on a public Web collection
  3. Machine learning in modern query processors (8 hours)
    1. Machine learning approaches for IR: Learning to Rank
    2. Efficiency/Effectiveness Trade-offs, Cascading Architectures
    3. Neural information retrieval
    4. Hands-on: Learning to Rank and Deep Neural Networks for efficient Web search

Schedule:

  1. 05/07/2021 - 9:00 - 13:00
  2. 06/07/2021 - 9:00 - 13:00
  3. 07/07/2021 - 9:00 - 13:00
  4. 08/07/2021 - 9:00 - 13:00