UCI- Information Retrieval: Assignment 3 Solved

GOAL: To implement a search engine.

Introduction
This assignment is to be done in groups of 1, 2, 3 or 4. You can work in the same groups that were in place for the crawler project if you wish. Although it is presented here as a single project, it is internally organized into 3 separate milestones, each with a specific deadline, deliverables and score.

In doing milestones #1 and #2, make sure to consider the evaluation criteria not just of those milestones but also of milestone 3 — part of the milestones’ evaluation will be delayed until the final meeting with the TAs.

You can use code that you or any classmate wrote for the previous projects. You cannot use code written for this project by non-group-member classmates. You are allowed to use any languages and libraries you want for text processing, including nltk. However, you are not allowed to use text indexing libraries such as Lucene, PyLucene, or ElasticSearch.

To accommodate the various skill levels of students in this course, this assignment comes in two flavors:

1. Information Analyst. In this flavor there is some programming involved, but it is not much more advanced than what you have done so far. It is a mixture of the Text Processing project and stitching things together. You will be using a small subset of the crawled pages. Groups in which ALL students are neither CS nor SE can choose this option.

2. Algorithms and Data Structures Developer. In this flavor, not only is there programming to be done, but your code also needs to perform well on the entire collection of crawled pages, under the required constraints. This option is available to everyone, but groups that have at least one CS or SE student are required to choose it.

Milestones overview

MS      Goal                Due date   Deliverable        Score
MS #1   Initial index       24/02      Short report       2.5
MS #2   Boolean retrieval   02/03      Short report       2.5
MS #3   Complete search     11/03      Code + Live Demo   55.0

General specifications
You will develop two separate programs: an indexer and a search component.
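
As a rough illustration of how the two components relate, the sketch below builds a tiny in-memory inverted index and answers a Boolean AND query against it. It is not the required design: the function names (`tokenize`, `build_index`, `boolean_and`), the regex tokenizer, and the toy documents are all assumptions made for this example, and a real submission would need to read the crawled pages and store the index on disk.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Indexer: map each token to the sorted list of doc ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            postings[token].add(doc_id)
    return {token: sorted(ids) for token, ids in postings.items()}


def boolean_and(index, query):
    """Search component: return the doc ids that contain every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    result = set(index.get(tokens[0], []))
    for token in tokens[1:]:
        result &= set(index.get(token, []))
    return sorted(result)


if __name__ == "__main__":
    # Hypothetical toy collection standing in for the crawled pages.
    docs = {
        1: "Information retrieval course at UCI",
        2: "Machine learning course",
        3: "Retrieval of crawled pages",
    }
    index = build_index(docs)
    print(boolean_and(index, "retrieval course"))  # only doc 1 has both terms
```

Keeping the index as a plain mapping from token to sorted doc ids means the search side only needs the index itself, not the original documents, which mirrors the split into two separate programs.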

Indexer
