Clarification added below the initial assignment description.



You have been assigned to propose a new network architecture for a new Internet startup, Games4All (G4A). G4A's product is an online game similar to PUBG/Fortnite, but with 1000 players per map instead of 100, and with a somewhat larger map. To handle the larger map and the higher player count, G4A uses virtualized servers, so that servers can be spun up and down on demand. However, they will only run one map per server. Each player needs around 256 Kbps of up/down speed to the server to have a good gaming experience.
Consider a network architecture that enables the company to support 200 simultaneous games (200*1000 players) while minimizing the number of physical servers. Assume that the server is limited by network capacity, not CPU/memory. Furthermore, assume a geographically distributed environment with servers in Europe (Paris, Helsinki), Asia (Beijing in China, Hyderabad in India, and Seoul in South Korea), and North America (San Jose, New York). The games are distributed as 50% in Europe, 25% in Asia, and 25% in North America. Within each region, the load is divided equally (50/50 or 33/33/33 per city). The focus here is on the path between the servers and the ISP in each city.
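As a rough sketch of the arithmetic this paragraph implies, the snippet below splits the 200 games over the seven cities and converts each city's share into player-facing traffic. It assumes that 256 Kbps means 256 × 10^3 bit/s and keeps the fractional game counts in Asia exact; how to round them is a design decision for the report. All function and constant names are illustrative.

```python
from fractions import Fraction

TOTAL_GAMES = 200
PLAYERS_PER_GAME = 1000
KBPS_PER_PLAYER = 256  # assumption: decimal kilobits, i.e. 256 * 10^3 bit/s

# Regional share of the games and the cities that split that share equally.
REGIONS = {
    "Europe": (Fraction(1, 2), ["Paris", "Helsinki"]),
    "Asia": (Fraction(1, 4), ["Beijing", "Hyderabad", "Seoul"]),
    "North America": (Fraction(1, 4), ["San Jose", "New York"]),
}


def games_per_city():
    """Return {city: number of games} as exact fractions."""
    result = {}
    for share, cities in REGIONS.values():
        for city in cities:
            result[city] = TOTAL_GAMES * share / len(cities)
    return result


def player_mbps(games):
    """Player-facing load in Mbit/s (10^6 bit/s) for a number of games."""
    return games * PLAYERS_PER_GAME * KBPS_PER_PLAYER / 1000


for city, games in games_per_city().items():
    print(f"{city}: {float(games):.2f} games, {float(player_mbps(games)):.1f} Mbit/s")
```

Because the links are full-duplex, the same figure applies in each direction; it does not include the traffic to the main site.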
All cities are connected to the main site, to coordinate software updates, player information, etc. The traffic is proportional to the number of players in a city and is approximately 100 Kbytes per game a player participates in. I.e., for every game that starts, 100 Kbytes of data per player must be downloaded to the server in the city where the game will execute. The average game duration is 30 minutes; the maximum is 60 minutes. Hence, a player will cause (on average) a download of 100 Kbytes every 30 minutes. Consider a network architecture between the cities and the main site that can support all the anticipated games.
Please propose a network architecture (draw) that meets both design considerations. From the map, it should be clear to see the general architecture (routers/networks/switches) and what capacities have been selected for the links (at deployment time). Furthermore, explain how growth would be handled if more servers were installed to handle more simultaneous games.
Assume that we need to add 25 games to each city in Europe and 9 games to each NA and Asia city, in total 50 + 18 + 27 = 95 additional games to run. [UPDATED]
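To make the averaging argument concrete, here is a minimal sketch of the long-run download rate from the main site. It assumes that 1 Kbyte is 1024 bytes and that, in steady state, each running game is replaced by a new launch once per average game duration. The function name and parameters are hypothetical.

```python
KIB = 1024  # assumption: 1 Kbyte = 1024 bytes


def average_sync_mbps(simultaneous_games,
                      players_per_game=1000,
                      kib_per_player=100,
                      avg_game_seconds=30 * 60):
    """Long-run average download rate in Mbit/s (10^6 bit/s)."""
    bits_per_launch = players_per_game * kib_per_player * KIB * 8
    # Steady state: every running game is relaunched once per average duration.
    launches_per_second = simultaneous_games / avg_game_seconds
    return bits_per_launch * launches_per_second / 1e6


print(f"All 200 games: {average_sync_mbps(200):.2f} Mbit/s on average")
print(f"50 games in one city: {average_sync_mbps(50):.2f} Mbit/s on average")
```

This is only an average. A requirement that each launch download finishes within a fixed time window leads to a much higher peak rate, so the average alone should not be used to size the links.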
Reason around challenges on how to monitor the operations of this large and complex network. [REMOVED]
In your report, you need to provide clear reasoning for your choices, briefly discuss alternatives. Elaborate briefly on the challenges of how to monitor, operate, and maintain such a network. How does a router failure impact the network/gameplay? Where is the potential single point of failure? What is the cost of resiliency/redundancy? [REMOVED]


I want to read a report that ANSWERS the questions and motivates the answers. I do not want to read any teaching or sales pitches.
In the report I want the following information:
Overview of the architecture: all seven cities are connected to [1,2…m] ISPs (depending on your architecture), and these are in turn connected to the Internet. Somewhere on the Internet, the main site is also located, again with [1,2…m] ISPs. If you have a regional relay/data center (RDC), it is also connected to the Internet via [1,2…m] ISPs. However, the cities are still connected to their own ISPs, so the RDC might be seen as a router on a stick, although hopefully it does not do any routing. In this overview figure, place the main site, the cities, the RDC (if used), and the players.
You need to calculate the required LINK capacities to the ISPs, for each entity. Please note, the main-site DOES not run any games. It's only used to download data at the launch of a GAME. But all cities download from it.
If you use multiple ISPs, calculate first the theoretical requirement, then figure out how you distribute this between the ISPs in the normal case, and then in … less normal cases.
The link capacities to/from ISPs that you can buy are: 100Mbps, 200Mbps, 400Mbps, 800Mbps, 1Gbps, 2Gbps, 4Gbps, or 8Gbps. You can buy any combination of these; hence, to get 10Gbps you buy 8+2Gbps, and to get 5Gbps you buy 4Gbps+800+200Mbps. The links are full-duplex, so the stated speed applies in both directions, to and from the Internet.
In addition to the overview, I want either a generic city architecture where you explain your principles, or three separate city architectures, one for each region. As the cities are identical within a region, showing one is enough. From this architecture, I want to see how many routers, switches, and servers you will use.
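Choosing a link combination for a calculated requirement can be sketched as a small coin-change problem. The hypothetical helper below rounds the requirement up to the next multiple of 100 Mbps (every purchasable size is a multiple of 100) and then picks the fewest links that add up to that value. Minimizing the number of links is an assumption made here, not a rule from the assignment.

```python
import math

LINK_SIZES_MBPS = [100, 200, 400, 800, 1000, 2000, 4000, 8000]


def pick_links(required_mbps):
    """Fewest links whose sum is the smallest purchasable capacity
    that is >= required_mbps. Returns sizes in Mbps, largest first."""
    units = math.ceil(required_mbps / 100)  # work in units of 100 Mbps
    sizes = [s // 100 for s in LINK_SIZES_MBPS]
    # best[u] = shortest list of link sizes (in units) that sums to u
    best = [[]] + [None] * units
    for u in range(1, units + 1):
        for s in sizes:
            if s <= u and best[u - s] is not None:
                candidate = best[u - s] + [s]
                if best[u] is None or len(candidate) < len(best[u]):
                    best[u] = candidate
    return sorted((s * 100 for s in best[units]), reverse=True)


print(pick_links(10000))  # 8 Gbps + 2 Gbps
print(pick_links(5000))   # 4 Gbps + 1 Gbps
```

For 5 Gbps the helper returns 4 Gbps + 1 Gbps, which is as valid as the 4 Gbps + 800 Mbps + 200 Mbps combination. Spreading the same total over more, smaller links can still be preferable when the capacity is divided between several ISPs.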

For the routers and switches, I want to see their port count and speeds. Note that a device can have ports with different speeds, e.g. a device can have 2x 1GE ports and 24x 100BaseT ports. I also need to see your servers, their speed, and the number of network ports they have.
You are NOT limited to using devices in PacketTracer. Use Google to find suitable, real devices. However, there is no need to go to extremes and get huge core devices, e.g. Juniper/Cisco/etc. chassis devices.
I.e., it's fine to use a 24-port GE switch with 2-port 10GE uplinks, or you can get a 24-port 10GE copper switch with 4 SFP+ ports. [UPDATED]

Here, you can have 10GE ports on the server, but the game will not support more than three concurrent games running on a server. One game has to handle 1000 concurrent connections, and having more than 3000 on a single server would probably not be realistic. For each ISP the city uses, the ISP will provide a router of its own. However, the routers of different ISPs will not cooperate; you need to handle that. Please note that if you set up a VPN between two regions/cities, that VPN will have to be carried across the Internet, thus requiring your ISP links to have the capacity to handle it.
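The three-games-per-server limit translates directly into a server count and a per-server traffic figure. The sketch below illustrates this, assuming 256 Mbit/s of player traffic per game (1000 players at 256 × 10^3 bit/s) and ignoring any spare servers kept for redundancy. The names are illustrative.

```python
import math

MAX_GAMES_PER_SERVER = 3
MBPS_PER_GAME = 256  # assumption: 1000 players * 256 * 10^3 bit/s


def servers_needed(games):
    """Minimum number of physical servers for a number of games."""
    return math.ceil(games / MAX_GAMES_PER_SERVER)


def server_load_mbps(games_on_server):
    """Player traffic in Mbit/s carried by one server."""
    if games_on_server > MAX_GAMES_PER_SERVER:
        raise ValueError("a server supports at most three concurrent games")
    return games_on_server * MBPS_PER_GAME


print(servers_needed(50), "servers for 50 games")
print(server_load_mbps(3), "Mbit/s on a fully loaded server")
```

A fully loaded server carries 768 Mbit/s of player traffic in each direction, before any download traffic from the main site is added. Whether that leaves enough headroom on a 1GE port is a choice to motivate in the report.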
From the growth perspective, we will not scale in multiples of integers. I want you to discuss how your architecture would work if you need to add a few servers in different cities. Please identify, when the growth will push your architecture over a threshold, thus requiring additional resources to be added.
From the monitoring perspective; I want to know what parameters, metrics, etc. you will monitor, and why. It’s not enough to say that we will install <pick your favorite> and it monitors our systems. The focus is WHAT you will monitor and WHY. I also want to know how your monitoring helps with the operations and maintenance of the network(s).
If your architecture is built to handle failures, what is the cost (in the number of devices/ports) of this resiliency/redundancy? Irrespective, I want you to explain whether your architecture has any single points of failure. If it has, motivate why you kept them; if it does not, explain the additional cost (number of devices/ports) that this carried.
 Clarification v2
As I've seen so many confusing architectures, in Fig 1. there is an example of an acceptable 'main' architecture, and in Fig 2 there is a template for a City architecture. 

[Fig 1: Overview]
[Fig 2: City Example]

From Fig 1. you need to calculate the capacities:  C_main, C_b, C_s, C_hy, C_ny, C_sj, C_he, and C_p. Their capacity should be expressed in Gigabit/s or Megabit/s [10^9 bit/s or 10^6 bit/s]. I want to be able to trace how and why you got to the capacity values. 

Note again, NO GAMES are run on the main site. Furthermore, the main site is only used to DOWNLOAD player data to the game servers on the launch of one game. 

The game 'sequence' is as follows. A player decides to start a game (they have already downloaded what they need to their computer; that is outside of the assignment). Then the application on their device contacts some entity that directs them to a game server in a city. Normally that would be the city closest to them, but if that city does not have any available servers, the player is directed to some other city with capacity, which MAY or MAY not be within the same region. HOW is outside the assignment scope. We assume that the game intensities are equally distributed across the day, so that there will be enough players to launch games. Once 1000 players have signed up, the game server will download 100Kbytes of data for each player, i.e. 1000*100Kbytes=97.66 mebibytes [MiB, 1024^2 bytes]. This download should be completed within 10-20 seconds; this way, the game launch time can be identical in all cities and does not depend on the number of games or players in the city.

Note KBytes, or KiBytes is a volume value, NOT a rate value, cf kbps (kilo bit/s). You cannot just go from one to the other, i.e. 100Kibytes != 800Kibps. 
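The step from a volume to a rate always needs a time window. The sketch below works this through for one game launch: it converts the per-launch volume to bits and divides by the 10 and 20 second limits. It assumes 1 Kbyte = 1024 bytes, consistent with the 97.66 MiB figure, and the function names are illustrative.

```python
KIB = 1024          # bytes per Kbyte (binary interpretation)
MIB = 1024 ** 2     # bytes per MiB


def launch_volume_bytes(players=1000, kib_per_player=100):
    """Data volume downloaded from the main site for one game launch."""
    return players * kib_per_player * KIB


def rate_mbps(volume_bytes, seconds):
    """Rate in Mbit/s (10^6 bit/s) needed to move a volume in a given time."""
    return volume_bytes * 8 / seconds / 1e6


volume = launch_volume_bytes()
print(f"Volume per launch: {volume / MIB:.2f} MiB")
print(f"Within 10 s: {rate_mbps(volume, 10):.2f} Mbit/s")
print(f"Within 20 s: {rate_mbps(volume, 20):.2f} Mbit/s")
```

This is the rate for a single launch. How many launches can overlap in one city, and at the main site, is an assumption that has to be stated and motivated separately.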

Looking at Fig 2, the city architecture is a bit more complicated. At the bottom of the figure, we have the physical servers running the game servers (3 games/server). You are free to decide whether they connect using 1GE or 10GE, and whether they have 1 or 2 ports. Each server needs to be connected to one or more network devices in the first 'row' above it. These are labeled Network Device (ND) {row}.{device}, i.e. ND 1.x for the first row. Here you are free to choose the number of NDs, as this depends on 1) the number of servers, 2) the device you selected to use, and 3) the architecture you use wrt. capacity and redundancy. What I want to read in the report is a) what device/s you will use as ND, b) how many ports will connect to servers, and at what speed, and c) how many ports will act as uplink, and at what speed.

Depending on your architecture and the device you selected to use in the first row, you may need a second row, ND 2.x, or even a third row, ND 3.x (not shown here). Just like for the first row, you need to state a) what device/s you will use as ND {row} device, b) how many ports will connect to ND {row-1}, and at what speed, and c) how many ports will act as uplink, and at what speed.
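One way to reason about how many NDs a row needs, and whether its uplinks are a bottleneck, is sketched below. The function and its parameters are hypothetical. It counts devices purely by port demand and compares the server-facing port capacity with the uplink capacity; it does not model redundancy rules such as connecting each server to two different NDs.

```python
import math


def access_row(servers, ports_per_server, server_ports_per_nd,
               uplinks_per_nd, server_port_gbps, uplink_gbps):
    """Return (number of NDs, oversubscription ratio of one ND).

    The ratio is server-facing port capacity divided by uplink capacity;
    a value above 1 means the uplinks cannot carry all ports at line rate.
    """
    nd_count = math.ceil(servers * ports_per_server / server_ports_per_nd)
    downstream_gbps = server_ports_per_nd * server_port_gbps
    upstream_gbps = uplinks_per_nd * uplink_gbps
    return nd_count, downstream_gbps / upstream_gbps


# Example: 17 single-homed servers on 24-port GE switches with 2x 10GE uplinks.
count, ratio = access_row(17, 1, 24, 2, 1, 10)
print(count, "ND(s), oversubscription", round(ratio, 2))
```

The ratio uses port line rates, not the actual game traffic, so it is an upper bound on contention. The same function can be reused for ND 2.x by treating the ND 1.x uplinks as the 'servers' of the next row.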

As we move further up, we reach the City Gateway/Routers and ISPs. Here it's sufficient that you identify a) how many routers you will use, b) how many ports will connect to the ND row below, and at what speed, and c) how many ports will act as uplink, and at what speed. IF you planned to have a VPN connection from a router, DO not use it. If a router fails, it is useless, and cannot forward any traffic across the VPN. Such a catastrophic failure has to be handled externally, probably involving the players launching a new game.

Wrt. ISPs, you need to specify how many you use and what link capacities you buy from each of them. Note that the sum of their capacities should match or exceed the capacity that you calculated before.
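A candidate ISP plan can be checked against the calculated requirement in both the normal case and a failure case. The sketch below assumes that the 'less normal' case is the loss of the single largest ISP link; other failure models are possible. The function name and the returned keys are illustrative.

```python
def isp_plan_check(required_mbps, isp_capacities_mbps):
    """Check a list of per-ISP capacities (Mbps) against a requirement."""
    if not isp_capacities_mbps:
        raise ValueError("at least one ISP link is needed")
    total = sum(isp_capacities_mbps)
    # Assumed worst case: the ISP with the largest link fails completely.
    after_failure = total - max(isp_capacities_mbps)
    return {
        "normal_ok": total >= required_mbps,
        "single_failure_ok": after_failure >= required_mbps,
        "normal_utilisation": required_mbps / total,
    }


# Example: a 12.8 Gbps requirement split over two ISPs in two different ways.
print(isp_plan_check(12800, [6400, 6400]))    # enough only while both ISPs work
print(isp_plan_check(12800, [12800, 12800]))  # survives the loss of one ISP
```

The difference between the two plans is the cost of redundancy, expressed here in purchased capacity. The corresponding device and port counts follow from the city architecture.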

The last item that I want to see described in your report is regarding monitoring. Please follow the KISS principle;  I want to know a) what parameters, metrics, etc. you will monitor, and b) why. It’s not enough to say that we will install <pick your favorite> and it monitors our systems. 
Of course,  the parameters/metrics should have a direct connection or relationship to the game service that we are trying to deliver. I.e. monitoring the link utilization or bandwidth is  NOT sufficient. 

So, to summarize what I want in your report:

  1. No repetition of the assignment text, there is no need for you to re-explain, re-iterate, or repeat the task. 
  2. Two architecture maps, in concept similar to the examples in Fig 1 and Fig 2. 

    1. Here Fig 2 can be multiple figures, one per city. 
  3. Traceable calculations for the cities and the main site,  C_main, C_b, C_s, C_hy, C_ny, C_sj, C_he, and C_p, expressed in Gigabit/s or Megabit/s. 
  4. For each city and the main site

    1. traceable calculations on how many physical servers will be used,
    2. Depending on the architecture,

      1. the number of ND's, 
      2. what device you use

        1. how many ports connect to servers and at what speed
        2. how many ports act as uplink and at what speed
        3. how many ports (if any) act as something else, and if so, as what and at what speed
      3. The last two (4.2.1 and 4.2.2) may be repeated depending on your architecture (this is 4.2.3). 
    3. Which link capacities are purchased from one or multiple ISPs. Note that you can only purchase links at the following speeds: 100Mbps, 200Mbps, 400Mbps, 800Mbps, 1Gbps, 2Gbps, 4Gbps, or 8Gbps.
  5. Wrt. Monitoring, what will you monitor and why?
  6. Growth: explain how and where growth would be handled if we need to add 25 games to each city in Europe and 9 games to each of the other cities. I.e. we add 50 games to Europe, 18 games to NA, and 27 games to Asia. 
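The growth scenario can be tabulated per city as a starting point for finding thresholds. The sketch below assumes the baseline split from the assignment (50 games per European city, 25 per North American city, and 50/3 per Asian city), 3 games per server, and 256 Mbit/s of player traffic per game. The names are illustrative.

```python
import math
from fractions import Fraction

# Baseline games per city (assumed from the 50/25/25 regional split of 200 games).
BASE_GAMES = {
    "Paris": Fraction(50), "Helsinki": Fraction(50),
    "Beijing": Fraction(50, 3), "Hyderabad": Fraction(50, 3), "Seoul": Fraction(50, 3),
    "San Jose": Fraction(25), "New York": Fraction(25),
}
ADDED_GAMES = {
    "Paris": 25, "Helsinki": 25,
    "Beijing": 9, "Hyderabad": 9, "Seoul": 9,
    "San Jose": 9, "New York": 9,
}
GAMES_PER_SERVER = 3
MBPS_PER_GAME = 256  # assumption: 1000 players * 256 * 10^3 bit/s


def growth_table():
    """Per-city games, server counts and player traffic after growth."""
    rows = {}
    for city, base in BASE_GAMES.items():
        new_games = base + ADDED_GAMES[city]
        rows[city] = {
            "games_after": new_games,
            "servers_before": math.ceil(base / GAMES_PER_SERVER),
            "servers_after": math.ceil(new_games / GAMES_PER_SERVER),
            "player_mbps_after": float(new_games * MBPS_PER_GAME),
        }
    return rows


for city, row in growth_table().items():
    print(city, row["servers_before"], "->", row["servers_after"], "servers,",
          f"{row['player_mbps_after']:.0f} Mbit/s")
```

Comparing the server counts after growth with the free ports on each ND row, and the traffic after growth with the purchased ISP capacity, shows where a threshold is crossed and extra devices or links are needed.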

To speed up the review and feedback process, the next submission will require each of you to have a 15-minute chat with me, where we discuss your report. I.e. the procedure will be that you upload your report, and then you book a slot for discussions. There is no need to book a slot if you haven't uploaded the report. Reviews will occur after the 15th of December but before the Christmas break (details to come later).

