Socket programming in order to implement a real-world application and gain some basic understanding of HTTP


Michigan State University


Computer Science & Engineering Department

CSE422 Computer Networks, Spring 2017

Laboratory 2: Web Proxy Server

Due: 23:59 Thursday, March 23, 2017


1 Outline

Note that you are neither required to follow these steps nor required to use the skeleton code

[link]. In the skeleton code, the missing parts that require your implementation are marked

with a comment /********TO BE IMPLEMENTED********/. Please feel free to modify the

helper classes if you prefer.

1. Run the client program (client.h/cc) and understand the usage of the helper classes.

Refer to .h files for function definitions and their purposes. Refer to .cc files for

implementations. More information can be found in Section 4.

2. Complete the missing part of proxy.cc. Create a TCP socket to accept a connection.

For TCPSocket class, please use try/catch to capture exceptions. You can find samples

in the client program. Refer to Section 12 for logs.

3. Complete the five functions that have a missing part in ProxyWorker.cc, in the following

order: getRequest, check request, forwardRequest, getResponse, and returnResponse.

More details can be found in Section 8.


4. In getResponse, complete the identity (default) transfer encoding before working on chunked transfer encoding (Section 4). Refer to the client program for more details.


5. Work on keyword filtering. Note that one requirement is filtering the hostname and

the other is filtering the path (Section 8).

6. Work on subliminal messages (Section 5).

7. Try your proxy with real browsers (not required).
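
Step 4 above distinguishes the identity (default) transfer encoding from chunked transfer encoding. With identity encoding the body length is normally given by the Content-Length header; with chunked encoding the body arrives as a series of chunks, each prefixed by its size in hexadecimal and terminated by CRLF, ending with a chunk of size 0. The following is a minimal sketch of decoding a complete chunked body, assuming the whole body is already in memory. The function name decodeChunked is hypothetical and is not part of the skeleton code or its helper classes; the size parsing is lenient and trailer headers are ignored.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not from the skeleton code): decodes a complete
// chunked message body. Returns false if the input is malformed or if
// the final zero-size chunk has not been received yet.
bool decodeChunked(const std::string& encoded, std::string& decoded) {
    decoded.clear();
    std::size_t pos = 0;
    while (true) {
        // Each chunk starts with a line holding the chunk size in hex.
        const std::size_t line_end = encoded.find("\r\n", pos);
        if (line_end == std::string::npos) return false;
        const std::string size_line = encoded.substr(pos, line_end - pos);

        char* stop = nullptr;
        const unsigned long size = std::strtoul(size_line.c_str(), &stop, 16);
        if (stop == size_line.c_str()) return false;  // no hex digits found
        pos = line_end + 2;

        // A zero-size chunk marks the end of the body (trailers ignored).
        if (size == 0) return true;

        // The chunk data must be fully present and followed by CRLF.
        const std::size_t remaining = encoded.size() - pos;
        if (size > remaining || remaining - size < 2) return false;
        decoded.append(encoded, pos, size);
        if (encoded.compare(pos + size, 2, "\r\n") != 0) return false;
        pos += size + 2;
    }
}
```

A proxy that reads from a socket would typically call something like this only after it has buffered the data up to the terminating zero-size chunk, or adapt the loop to request more data whenever the function reports an incomplete body.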


2 Goals

Apply your knowledge of socket programming in order to implement a real-world application

and gain some basic understanding of HTTP.



3 Overview

In this lab, you will implement a simple proxy server for HTTP that forwards requests from

clients to end servers and returns responses from end servers to the clients. You will also

implement a special function that inserts subliminal messages between HTTP web pages.

This lab is worth 12% of the final grade and is composed of 120 points. This lab

is due no later than 23:59 (11:59 PM) on Thursday, March 23, 2017. No late

submission will be accepted. You will submit your lab using the CSE handin utility.



3.1 The HyperText Transfer Protocol, HTTP

The HyperText Transfer Protocol (HTTP) is the World Wide Web’s application-layer

protocol. HTTP operates by having a client (usually a browser) initiate a connection to

a server, send an HTTP request, and then read and display the server’s response. HTTP

defines the structure of these messages and how the clients and servers exchange messages.

A web object is simply a file, such as an HTML file, a JPEG image, or a video clip. A

web page usually consists of one HTML file with several referenced objects. A page or an

object is addressed by a single Uniform Resource Locator (URL). When one wants to access

an HTML page, the web browser initiates a request to the server and asks for the HTML

file. If the request is successful, the server replies to the web browser with a response that

contains the HTML file. The web browser examines the HTML file, identifies the referenced

objects, and for each referenced object, initiates a request to retrieve the object.

An example of an HTTP request/response is shown in Figure 1. The request has only a

message header and does not have a message body. The response consists of a message header

followed by a message body, which is the HTML file requested. The header is composed of

several lines, separated by a carriage return and line feed (CRLF, “\r\n”). For each message,

the first line of the header indicates the type of the message. Zero or more header lines follow

the first line; these lines specify additional information about this message. The end of the

header is indicated by an empty line. The message body may contain text, binary data, or

even nothing at all.
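
The message layout described above (a first line, zero or more header lines, an empty line, then an optional body) can be sketched in C++ as follows. This is only an illustration: the HttpMessage struct and the parseMessage function are hypothetical names, not part of the skeleton code, and the sketch assumes each header line has the usual "Name: value" form.

```cpp
#include <map>
#include <string>

// Hypothetical container for a parsed message (not from the skeleton code).
struct HttpMessage {
    std::string start_line;                      // first line of the header
    std::map<std::string, std::string> headers;  // header name -> value
    std::string body;                            // text, binary data, or empty
};

// Splits a raw message into its first line, header lines, and body.
// Returns false if the empty line that ends the header is not present yet.
bool parseMessage(const std::string& raw, HttpMessage& out) {
    const std::string crlf = "\r\n";
    const std::size_t header_end = raw.find(crlf + crlf);
    if (header_end == std::string::npos) return false;

    const std::string header = raw.substr(0, header_end);
    out.body = raw.substr(header_end + 4);  // skip the "\r\n\r\n" separator
    out.headers.clear();

    // The first line indicates the type of the message.
    std::size_t pos = header.find(crlf);
    out.start_line = header.substr(0, pos);

    // Every remaining line is expected to look like "Name: value".
    while (pos != std::string::npos) {
        const std::size_t begin = pos + 2;
        pos = header.find(crlf, begin);
        const std::string line = (pos == std::string::npos)
                                     ? header.substr(begin)
                                     : header.substr(begin, pos - begin);
        const std::size_t colon = line.find(':');
        if (colon == std::string::npos) continue;  // skip malformed lines
        const std::size_t value = line.find_first_not_of(" \t", colon + 1);
        out.headers[line.substr(0, colon)] =
            (value == std::string::npos) ? "" : line.substr(value);
    }
    return true;
}
```

Note that HTTP header names are case-insensitive, so a more careful version would normalize the names (for example, to lowercase) before storing them.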

There are actually 8 different HTTP request methods; however, in this lab, we consider only

the GET method, which is used to request objects from the server. The GET request must

include the path to the object the client wishes to download and the HTTP version. In the

above example, the path is /~liuchinj/cse422ss17/index.html and the HTTP version is

HTTP/1.1. Some request methods, such as POST, transmit data to the server in a message

body. However, the GET method does not have a message body.
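
As a rough illustration of the GET request described above, the sketch below parses the first line of a request into its method, path, and HTTP version, and builds a minimal GET request with no message body. The names RequestLine, parseRequestLine, isSupported, and buildGetRequest are hypothetical and do not come from the skeleton code. The Host header is included because HTTP/1.1 requires it; the host name used in the usage note is a placeholder.

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the three fields of a request's first line.
struct RequestLine {
    std::string method;   // e.g. "GET"
    std::string path;     // e.g. "/~liuchinj/cse422ss17/index.html"
    std::string version;  // e.g. "HTTP/1.1"
};

// Parses "METHOD PATH VERSION". Returns false unless there are exactly
// three whitespace-separated fields and the last one starts with "HTTP/".
bool parseRequestLine(const std::string& line, RequestLine& out) {
    std::istringstream in(line);
    std::string extra;
    if (!(in >> out.method >> out.path >> out.version)) return false;
    if (in >> extra) return false;  // unexpected fourth field
    return out.version.compare(0, 5, "HTTP/") == 0;
}

// In this lab only the GET method is considered.
bool isSupported(const RequestLine& request) {
    return request.method == "GET";
}

// Builds a GET request: a header only, ended by an empty line, no body.
std::string buildGetRequest(const std::string& host, const std::string& path) {
    return "GET " + path + " HTTP/1.1\r\n"
           "Host: " + host + "\r\n"
           "\r\n";
}
```

For example, buildGetRequest("example.com", "/index.html") yields a request line, a Host header line, and the empty line that ends the header. A proxy could use a check like isSupported to reject methods other than GET before forwarding.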

