
Description

Beginner Python assignment

In this project you are going to index a set of documents in an open-source Python search engine called tinysearch, devise a set of test queries, and evaluate the system on those queries.

Indexing the Documents

• Download the search engine and the corpus from OWL->Resources->Assignment2_Files onto your machine and index the documents (file name: tinysearch.zip).
o Note that this corpus contains dumped Wikipedia documents and is a few years old.
o There are instructions concerning these steps further down in this document.

Provided Corpus

This corpus contains some of the Wikipedia documents that can be used for mirroring, personal use, informal backups, offline use or database queries. All text content is licensed under the GNU Free Documentation License (GFDL).

Topics and Questions

In this part of the assignment, you are going to think of an application domain (i.e. a subject) that is of interest to you. For example, you could choose health, politics, sport, geography, music (any kind), etc. Now create twenty queries in your chosen domain. For example:

Q1: species and dogs
Q2: Akita dogs
Q3: wolfdog

Note: These are just examples chosen by a student who was interested in dogs. You can choose any subject, provided that:
• it is covered by the documents you are using;
• you can think of some quite difficult queries on it.

Retrieval Experiments

Test the performance of the provided search engine using TF-IDF by applying the following steps:

1. Run the queries, as prepared above, through the system and collect the first ten files (or so) returned for each.
2. Compute precision and recall at the following levels of n (where n is the number of documents considered): n=5 and n=10.
3. To do this, for each query you need to look at (for example) the first ten results (i.e. files) returned and decide for each file whether it is Relevant or Not Relevant. A file is relevant if it contains the answer to your query. It does not matter where in the file the answer occurs as long as it is present somewhere. Note that this is not as easy as it sounds, since there will be occasions when you are not sure. In cases of doubt, make a note of the rationale for your final decision.

Computing recall poses a problem: for each query we need to know all the correct answers in the collection. Strictly, we cannot know that without inspecting every document in the collection. At TREC they use a pooling method, as discussed in the lecture. To get around the problem here, simply check the first n documents (n = 20) returned for each query. Count the number of correct responses there and assume that these are all the correct responses in the collection. Then use this count as the denominator when computing recall at n=5 and n=10 as above.

Assignment Report

Write up your results in a short report USING THE TEMPLATE SUPPLIED, with the following headings exactly as shown in the template:

1. Cover page
• Include your (formal) name, ID and the program you are currently enrolled in.
2. Topic and Queries
• What topic you chose; how the queries were devised.
3. Indexing the Documents
• How was this done?
• What problems were encountered (if any) and how were they solved?
4. TF-IDF Performance
• Method: short text outlining what you did.
• Results: a table summarizing the numerical results as above (review Assignment 2, Appendix 1).
• Discussion: a short description of what the results show (was TF-IDF always better, always worse or sometimes better/worse?), any interesting problem cases, any technical problems encountered and so on.

Report Appendix 1

• Include the queries you used for your TF-IDF evaluation and the IDs of the right answers found for each (if any).
• Example (this is just a sample):

Num   Query         IDs of Answers
1     hot chicken   5003
…
20    chicken       0
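The precision and recall calculation described under Retrieval Experiments can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it does not call tinysearch at all, the function names `precision_at` and `recall_at` are hypothetical, and the document IDs and relevance judgments in the example are made up. It assumes you have already written down, for one query, the ranked list of returned document IDs and the set of IDs you judged Relevant.

```python
def precision_at(ranked_ids, relevant_ids, n):
    """Fraction of the first n returned documents that were judged relevant."""
    hits = sum(1 for doc_id in ranked_ids[:n] if doc_id in relevant_ids)
    # Divide by n even if fewer than n documents came back (a common convention).
    return hits / n


def recall_at(ranked_ids, relevant_ids, n, pool_depth=20):
    """Relevant documents in the first n, divided by all relevant documents
    found in the first pool_depth results (the assignment's stand-in for
    'all correct answers in the collection')."""
    total_relevant = sum(
        1 for doc_id in ranked_ids[:pool_depth] if doc_id in relevant_ids
    )
    if total_relevant == 0:
        return 0.0  # no relevant documents in the pool, so recall is reported as 0
    hits = sum(1 for doc_id in ranked_ids[:n] if doc_id in relevant_ids)
    return hits / total_relevant


# Made-up example: 20 returned document IDs for one query, in rank order.
ranked = list(range(101, 121))
# Made-up manual judgments: the IDs marked Relevant among those 20.
relevant = {101, 103, 104, 108, 115}

for n in (5, 10):
    print(n, precision_at(ranked, relevant, n), recall_at(ranked, relevant, n))
# prints:
# 5 0.6 0.6
# 10 0.4 0.8
```

Running the same two functions over all twenty queries gives the per-query figures for the results table. Averaging them across queries is one possible summary, although the assignment text does not prescribe it.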



