It must prompt the user to enter a query as a ‘bag of words’ where multiple terms can be entered separated by a space

computer science

Description

In the previous two development assignments, we have created a process to generate an inverted index based upon the CS 3308 corpus. In the index part 1 assignment we created a process that scans a document corpus and creates an inverted index. In the index part 2 assignment we extended the functionality of the index process by incorporating functionality to:

Ignore (not include in the index) a list of stop words
Edit tokens as follows:

Terms under 2 characters in length are not indexed
Terms that contain only numbers are not indexed
Terms that begin with a punctuation character are not indexed

Integrate the Porter Stemmer code into our index
Calculate the tf-idf_{t,d} value for each unique combination of document and term
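The token rules listed above can be sketched roughly as follows. This is a minimal illustration, not the course's indexer: the `STOP_WORDS` set and the function names `keep_token` and `filter_tokens` are hypothetical, and Porter stemming is left out because the stemmer code is not shown here.

```python
import string

# Hypothetical stop word list; the list used in the assignment is not shown here.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in"}


def keep_token(token):
    """Return True if the token passes the filtering rules listed above."""
    if token in STOP_WORDS:
        return False  # stop words are not indexed
    if len(token) < 2:
        return False  # terms under 2 characters are not indexed
    if token.isdigit():
        return False  # terms made up only of digits are not indexed
    if token[0] in string.punctuation:
        return False  # terms starting with punctuation are not indexed
    return True


def filter_tokens(text):
    """Lowercase the text, split on whitespace, and keep only indexable tokens."""
    return [t for t in text.lower().split() if keep_token(t)]
```

In a full indexer, each surviving token would then be passed through the stemmer before being added to the index.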

In this assignment, we will be using the inverted index that was created by the process developed as part of the index part 2 assignment. In unit 4 we learned about the weighting of terms to improve the relevance of documents found when searching the inverted index. First we learned about calculating the inverse document frequency and from this the tf-idf_{t,d} weight. Using the tf-idf_{t,d} weight we were able to calculate both document and query vectors and evaluate them using the formula for cosine similarity.
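For reference, the weighting and similarity measures mentioned above are commonly written as shown here. This assumes the usual textbook definitions, with N the number of documents in the corpus, df_t the number of documents containing term t, and tf_{t,d} the count of t in document d. The base-10 logarithm is an assumption; the Unit 4 material may use a different variant.

```latex
\begin{align*}
\mathrm{idf}_t &= \log_{10}\frac{N}{\mathrm{df}_t} \\
\text{tf-idf}_{t,d} &= \mathrm{tf}_{t,d} \times \mathrm{idf}_t \\
\cos(\vec{q}, \vec{d}) &= \frac{\vec{q}\cdot\vec{d}}{\lVert\vec{q}\rVert\,\lVert\vec{d}\rVert}
  = \frac{\sum_{t} q_t\, d_t}{\sqrt{\sum_{t} q_t^2}\,\sqrt{\sum_{t} d_t^2}}
\end{align*}
```

Here q_t and d_t are the tf-idf weights of term t in the query vector and the document vector.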

In the development assignment for this unit, we will implement these concepts to create a search engine. Your assignment will be to create a search engine that will allow the user to enter a query of terms that will be processed as a ‘bag of words’ query.

Your search engine must meet the following requirements:

It must prompt the user to enter a query as a ‘bag of words’ where multiple terms can be entered separated by a space
For each query term entered, your process must determine the tf-idf_{t,d} weight as described in Unit 4
Using the query terms, your process must search for each document that contains each of the query terms
For each document that contains all of the search terms, your process must calculate the cosine similarity between the query and the document
The list of cosine similarity scores must be sorted in descending order from the most similar to the least similar
Finally your search process must print out the top 20 documents (or as many as are returned by the search if there are fewer than 20) listing the following statistics for each:

The document file name
The cosine similarity score for the document
The total number of items that were retrieved as candidates (you will only print out the top 20 documents)
An example of the output of a search for the terms ‘home mortgage’ is provided
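The requirements above can be sketched end to end as follows. This is an illustrative outline under stated assumptions, not the provided example code: `SAMPLE_DOCS` and `SAMPLE_IDF` are made-up in-memory stand-ins for the inverted index built in part 2, and the names `cosine`, `search`, and `main` are hypothetical. Query terms are used as typed; a complete solution would apply the same stop word filtering and stemming used when building the index.

```python
import math
from collections import Counter

# Made-up document vectors: file name -> {term: tf-idf weight}.
SAMPLE_DOCS = {
    "a.txt": {"home": 1.0, "mortgage": 1.0},
    "b.txt": {"home": 1.0, "loan": 1.0},
    "c.txt": {"home": 3.0, "mortgage": 1.0},
}
# Made-up idf values for each indexed term.
SAMPLE_IDF = {"home": 1.0, "mortgage": 1.0, "loan": 1.0}


def cosine(q_vec, d_vec):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * d_vec.get(t, 0.0) for t, w in q_vec.items())
    q_len = math.sqrt(sum(w * w for w in q_vec.values()))
    d_len = math.sqrt(sum(w * w for w in d_vec.values()))
    return dot / (q_len * d_len) if q_len and d_len else 0.0


def search(query, doc_vectors, idf, top_k=20):
    """Return (top results as (score, file name) pairs, total candidate count)."""
    tf = Counter(query.lower().split())
    # Query weight per term: term frequency in the query times idf.
    q_vec = {t: count * idf.get(t, 0.0) for t, count in tf.items()}
    # Candidates are documents that contain every query term.
    candidates = [d for d, vec in doc_vectors.items() if all(t in vec for t in tf)]
    scored = [(cosine(q_vec, doc_vectors[d]), d) for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    return scored[:top_k], len(candidates)


def main():
    """Prompt for a bag-of-words query and print the ranked results."""
    query = input("Enter search terms separated by spaces: ")
    results, total = search(query, SAMPLE_DOCS, SAMPLE_IDF)
    for score, name in results:
        print(f"{name}\t{score:.4f}")
    print(f"Total candidate documents retrieved: {total}")
```

Calling `main()` runs the interactive prompt. With the sample data, the query `home mortgage` selects `a.txt` and `c.txt` as candidates, since `b.txt` does not contain `mortgage`.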

A Source Code document containing code for a search engine that meets many of these requirements is provided for you as an example. This code does NOT meet all of the requirements of this assignment. Further, there are key areas of the code that are missing. You are welcome to use this example code as a baseline; however, you must complete any missing functionality as required by the assignment.
