
Web Search and Mining (351.028)


a.Univ.-Prof. Dr. Birgit Pröll (birgit.proell(at)



Schedule and Preliminary Meeting

The course comprises 6 lectures and a final exam (see KUSSS for the schedule). Exercises are to be worked on in groups of up to 3 students and presented in the first 1.5 hours of each lecture. Except for the preliminary meeting and the final exam, which start at 12:00 s.t., all lectures start at 12:00 c.t.

Depending on room availability, the lecture scheduled for May 18 might be moved to April 27. In case of rescheduling, students will be informed by email.

Attendance is obligatory at the beginning of the first lecture (the preliminary meeting), which covers organisational issues, the decision on course participants, and the formation of groups for the exercises. If you cannot attend the preliminary meeting in person for a justifiable reason, please ask a colleague to confirm your participation in the course, assign you to a group, and pass on the information presented in the preliminary meeting; otherwise your course registration will be cancelled.

Attendance is also obligatory whenever exercise presentations are scheduled. Absence (even if excused) will result in a reduction of points for the exercises due.

Lecture Contents

- Lecture I
       Preliminary Meeting, Attendance Obligatory
       Information Retrieval and Extraction "in a Nutshell": this first lecture introduces the fundamentals of traditional IR and IE. It summarizes the contents of the lecture "Information Retrieval and Extraction". Students who attended that lecture already know this material, but it gives the other students the basic knowledge on which the rest of the course builds.
- Lecture II
       Web Search: Search Engine Architecture and Components, Web Crawler, Web Indexing/Weighting, Exclusion Concepts, PageRank
- Lecture III
       Web Site Search, Search Engine Optimization (SEO)
- Lecture IV
       Web Information Extraction: Screen Scraping, Rule-based approach
- Lecture V
       Natural Language Processing on the Web
- Lecture VI
       Opinion Mining, Question-Answering & Dialogue Systems on the Web

Course Description
- Students have competence in the fundamentals and technologies of

  • Web Search (Web Information Retrieval) and its application in search engines
  • Web Mining with an emphasis on Web Information Extraction, focusing on a knowledge-based approach

- They are able to implement and evaluate applications in these fields and have knowledge about related fields and current research topics.


- Web Search (Web Information Retrieval)

  • Information Retrieval "in a nutshell"
  • Web Search Fundamentals
  • Web Crawling
  • Search Engines
  • Weighting and Ranking (PageRank etc.)
  • Web Search Evaluation
  • Site Search
  • Search Engine Optimization (SEO)
  • Search User Interfaces (advanced query concepts etc.)
  • Web 2.0 & Social Media Search/Monitoring

- Web Mining (Web Information Extraction)

  • Information Extraction "in a nutshell"
  • Web information extraction (WebIE) fundamentals
  • WebIE approaches (knowledge-based WebIE, screen-scraping/wrapping, etc.)
  • Web link/structure analysis
  • WebIE tools and applications (focusing on a knowledge-based approach)

- Current concepts and applications

  • Deep Web Search
  • Spam Detection
  • Question Answering Systems (focused search)
  • Web 2.0 & Social Media Search/Monitoring
  • Opinion Mining / Sentiment Analysis
  • Crowd Knowledge Extraction
  • Web-based Ontology Learning
  • Web Data Quality, etc.


  • Exercises will be handed out and are to be presented in the following lesson.
  • Exercises can be worked on in groups of up to three persons.
  • The first part of each lesson is dedicated to the presentation of exercises (see the course schedule above).
  • Attendance during this first part of each lesson is obligatory to achieve full points for the exercises. Absence results in a reduction of the exercise points.


  • Grading
  • Exercises (45%), Final Exam (45%), In-class Contribution (10%)
  • The content of the final exam comprises the lecturer's in-class presentations as well as the students' topic presentations.
  • Both the exercise work and the final exam must be passed.
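
As a rough illustration of how the weights above combine, the following Python sketch computes an overall score from the three components. The function names, the 0-100 scale for each component, and the pass mark of 50 are assumptions made for this example only; the course page does not state them.

```python
def weighted_score(exercises: float, exam: float, contribution: float) -> float:
    """Combine the three components using the stated weights (45/45/10).

    Assumption: each component is expressed on a 0-100 scale.
    """
    return (45 * exercises + 45 * exam + 10 * contribution) / 100


def passes_both(exercises: float, exam: float, pass_mark: float = 50.0) -> bool:
    """Check that the exercises and the final exam are each passed.

    Assumption: pass_mark=50 is hypothetical; the actual threshold
    is not given on the course page.
    """
    return exercises >= pass_mark and exam >= pass_mark


if __name__ == "__main__":
    score = weighted_score(exercises=80, exam=60, contribution=100)
    print(f"weighted score: {score}, both parts passed: {passes_both(80, 60)}")
```

Note that a high weighted score alone would not be sufficient under this reading: the exercises and the final exam each have to be passed on their own.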

Course Material & Literature

  • Slides for each lecture will be made available the day before.
  • Search Engines (B. Croft; Pearson 2010)