Michigan State University
Computer Science & Engineering Department
CSE422 Computer Networks, Spring 2017
Laboratory 2: Web Proxy Server
Due: 23:59 Thursday, March 23, 2017
Note that you are neither required to follow these steps nor required to use the skeleton code
[link]. In the skeleton code, the missing parts that require your implementation are marked
with a comment /********TO BE IMPLEMENTED********/. Please feel free to modify the
helper classes if you prefer.
1. Run the client program (client.h/cc) and understand the usage of the helper classes.
Refer to .h files for function definitions and their purposes. Refer to .cc files for
implementations. More information can be found in Section 4.
2. Complete the missing part of proxy.cc: create a TCP socket to accept a connection.
For the TCPSocket class, please use try/catch to capture exceptions. You can find samples
in the client program. Refer to Section 12 for logs.
3. Complete the five functions that have a missing part in ProxyWorker.cc, in the following
order: getRequest, checkRequest, forwardRequest, getResponse, and returnResponse.
More details can be found in Section 8.
4. In getResponse, complete the identity (default) transfer encoding before working on
chunked transfer encoding (Section 4). Refer to the client program for more details.
5. Work on keyword filtering. Note that one requirement is filtering the hostname and
the other is filtering the path (Section 8).
6. Work on subliminal messages (Section 5).
7. Try your proxy with real browsers (not required).
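For step 4, it may help to see what decoding a chunked body involves. In chunked transfer encoding, the body is a sequence of chunks, each consisting of a hexadecimal size line, CRLF, that many bytes of data, and another CRLF; a chunk of size 0 ends the body. The following is a minimal sketch only, assuming the whole body has already been read into a string. The function name decodeChunked is hypothetical and is not part of the skeleton code; the helper classes may already offer something similar.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (NOT part of the skeleton code): decodes a complete
// chunked message body that is already in memory. Trailer fields after the
// last chunk are ignored in this sketch.
std::string decodeChunked(const std::string& body) {
    std::string decoded;
    std::size_t pos = 0;
    while (true) {
        // Size line: hex digits, optional ";extension", then CRLF.
        std::size_t lineEnd = body.find("\r\n", pos);
        if (lineEnd == std::string::npos)
            throw std::runtime_error("missing CRLF after chunk size");
        // std::stoul stops at the first non-hex character (e.g. ';')
        // and throws std::invalid_argument if there are no digits.
        std::size_t chunkSize =
            std::stoul(body.substr(pos, lineEnd - pos), nullptr, 16);
        pos = lineEnd + 2;
        if (chunkSize == 0) break;  // last chunk
        if (chunkSize > body.size() - pos)
            throw std::runtime_error("truncated chunk data");
        if (body.compare(pos + chunkSize, 2, "\r\n") != 0)
            throw std::runtime_error("missing CRLF after chunk data");
        decoded.append(body, pos, chunkSize);
        pos += chunkSize + 2;  // skip the data and its trailing CRLF
    }
    return decoded;
}
```

In a real proxy the chunks arrive over the socket piece by piece, so the same logic would have to be driven by repeated reads rather than a single in-memory string.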
Apply your knowledge of socket programming in order to implement a real-world application
and gain some basic understanding of HTTP.
In this lab, you will implement a simple proxy server for HTTP that forwards requests from
clients to end servers and returns responses from end servers to the clients. You will also
implement a special function that inserts subliminal messages between HTTP web pages.
This lab is worth 12% of the final grade and is composed of 120 points. This lab
is due no later than 23:59 (11:59 PM) on Thursday, March 23, 2017. No late
submission will be accepted. You will submit your lab using the CSE handin utility
3.1 The HyperText Transfer Protocol, HTTP
The HyperText Transfer Protocol (HTTP) is the World Wide Web’s application-layer
protocol. HTTP operates by having a client (usually a browser) initiate a connection to
a server, send an HTTP request, and then read and display the server’s response. HTTP
defines the structure of these messages and how the clients and servers exchange messages.
A web object is simply a file, such as an HTML file, a JPEG image, or a video clip. A
web page usually consists of one HTML file with several referenced objects. A page or an
object is addressed by a single Uniform Resource Locator (URL). When one wants to access
an HTML page, the web browser initiates a request to the server and asks for the HTML
file. If the request is successful, the server replies to the web browser with a response that
contains the HTML file. The web browser examines the HTML file, identifies the referenced
objects, and for each referenced object, initiates a request to retrieve the object.
An example of an HTTP request/response is shown in Figure 1. The request has only a
message header and does not have a message body. The response consists of a message header
followed by a message body, which is the HTML file requested. The header is composed of
several lines, separated by a carriage return and line feed (CRLF, “\r\n”). For each message,
the first line of the header indicates the type of the message. Zero or more header lines follow
the first line; these lines specify additional information about this message. The end of the
header is indicated by an empty line. The message body may contain text, binary data, or
even nothing at all.
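The message layout described above (a first line, zero or more header lines, an empty line, then an optional body) can be sketched in C++ as follows. This is an illustration under stated assumptions, not the skeleton code's approach: the struct ParsedMessage and the function splitMessage are hypothetical names, and the lab's helper classes may organize this differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for illustration; the lab's helper classes may differ.
struct ParsedMessage {
    std::string firstLine;                 // request line or status line
    std::vector<std::string> headerLines;  // zero or more header lines
    std::string body;                      // may be empty
    bool complete = false;                 // true once the empty line was found
};

ParsedMessage splitMessage(const std::string& raw) {
    ParsedMessage msg;
    // The header ends at the first empty line, i.e. two consecutive CRLFs.
    std::size_t headerEnd = raw.find("\r\n\r\n");
    if (headerEnd == std::string::npos) return msg;  // header not complete yet
    msg.complete = true;
    msg.body = raw.substr(headerEnd + 4);

    // Split the header into lines on CRLF; the first one is kept separately.
    std::string header = raw.substr(0, headerEnd);
    std::size_t start = 0;
    bool first = true;
    while (start <= header.size()) {
        std::size_t end = header.find("\r\n", start);
        if (end == std::string::npos) end = header.size();
        std::string line = header.substr(start, end - start);
        if (first) {
            msg.firstLine = line;
            first = false;
        } else {
            msg.headerLines.push_back(line);
        }
        start = end + 2;
    }
    return msg;
}
```

Searching for "\r\n\r\n" is also a simple way for a proxy to decide when it has received the whole header and can stop reading header bytes from the socket.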
There are eight different HTTP request methods; in this lab, however, we consider only
the GET method, which is used to request objects from the server. The GET request must
include the path to the object the client wishes to download and the HTTP version. In the
example in Figure 1, the path is /~liuchinj/cse422ss17/index.html and the HTTP version is
HTTP/1.1. Some request methods, such as POST, transmit data to the server in a message
body. A GET request, by contrast, does not have a message body.
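As a rough illustration of the GET request line, the sketch below splits the first line of a request into its method, path, and HTTP version, and rejects anything that is not a GET. The names RequestLine and parseGetRequestLine are hypothetical and do not come from the skeleton code; the validation shown is deliberately minimal.

```cpp
#include <sstream>
#include <string>

// Hypothetical type for illustration; not part of the skeleton code.
struct RequestLine {
    std::string method;
    std::string path;
    std::string version;
};

// Returns true only if `line` has exactly three space-separated tokens,
// the method is GET, and the version starts with "HTTP/".
bool parseGetRequestLine(const std::string& line, RequestLine& out) {
    std::istringstream in(line);
    std::string extra;
    if (!(in >> out.method >> out.path >> out.version)) return false;
    if (in >> extra) return false;           // more than three tokens
    if (out.method != "GET") return false;   // only GET is handled in this lab
    return out.version.compare(0, 5, "HTTP/") == 0;
}
```

For the request line "GET /~liuchinj/cse422ss17/index.html HTTP/1.1", this yields the method GET, the path /~liuchinj/cse422ss17/index.html, and the version HTTP/1.1. A request line using another method, such as POST, is rejected.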