Computer Communications Spring 1998
Projects give you hands-on practice with network administration, protocols, and programming. The grading method for each project is described along with the detailed instructions for the project.
The purpose of this project is to demonstrate concurrent message processing and switching using sockets and datagrams. Your task is to write a program that forwards input datagrams to output ports. The program will have two input ports for data and a third input port for control. A diagram for the program follows.
                            |
                            |
                            |
                            |
                            V
              +----------------------------+
              |            (*)             |
------------> | (a)                    (b) | -------------->
              |                            |
              |                            |
------------> | (c)                    (d) | -------------->
              |                            |
              +----------------------------+
The three input ports in the diagram are (a), (c), and (*). Ports (a) and (c) are data ports -- data to be forwarded arrives on these two ports. Ports (b) and (d) are the output ports. Port (*) is a control port -- datagrams arriving at (*) control program behavior.
Initially, datagrams arriving at (a) are forwarded by sending the data on port (b), whereas datagrams arriving at (c) are forwarded through port (d). Whenever the program receives a datagram on port (*), the forwarding is ``switched'' in the following sense. The first datagram received on (*) causes the program to switch its forwarding behavior: data received on (a) will now be forwarded through (d) and data received on (c) will now be forwarded through (b). This new forwarding behavior persists until another datagram arrives at (*). The second datagram received on (*) causes the data forwarding to revert to the original (a)->(b) and (c)->(d) mapping. In general, each datagram received on (*) toggles the mapping of input data ports to output ports. The program thus functions as a kind of crossbar switch.
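The toggling rule can be captured in a few lines of C. This is only a sketch: the function names and the numbering of the ports (input 0 for (a), input 1 for (c), output 0 for (b), output 1 for (d)) are hypothetical choices made for illustration and are not part of the assignment.

```c
/* Hypothetical numbering (an assumption of this sketch):
   inputs:  0 = port (a), 1 = port (c)
   outputs: 0 = port (b), 1 = port (d)
   `swapped` is 0 for the original mapping and 1 after an odd number
   of control datagrams have arrived on (*). */

/* Called once for each datagram received on the control port (*). */
int toggle_mapping(int swapped)
{
    return !swapped;
}

/* Returns the output index to use for data received on `input`. */
int output_for(int input, int swapped)
{
    return swapped ? 1 - input : input;
}
```

With this arrangement, the relay only has to remember one integer of state between datagrams.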
There is no fixed order of data arrival on the three input ports -- datagrams can arrive at any time on these ports. This is a problem because if a program attempts to read() or recvfrom() one of these three ports while there is no data available, then the program waits for data -- and the program can be stuck. For instance, suppose the program uses recvfrom() on port (a) and therefore waits for data to arrive on (a). While the program is waiting, a datagram arrives on port (c), but the program will not read this data because it remains waiting on port (a)! How can we resolve this? One technique would be to use fork() and devote a concurrent process to each input port. But we use a different -- and simpler -- strategy for this project. Unix provides a system call that tells your program whether it would have to wait on a port before it attempts to read() or recvfrom(). This system call is select(), documented in the man pages (see man select).
The unix documentation for select() is not very thorough or helpful to get started programming, so it is best to see an example. The udpserv3.c program shows a version of the UDP server used in earlier homeworks and projects, but expanded to have five input ports. This program will wait simultaneously on these five input ports and only read from one of them when data is available. Therefore the program does not get ``stuck'' as in the example above. You should use the same select() technique (with FD_SET, FD_ZERO, and FD_ISSET) in your code for the project.
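For orientation, here is a minimal sketch of the select() pattern with FD_ZERO, FD_SET, and FD_ISSET. It is not the udpserv3.c code; the function name first_readable and its interface are assumptions made for this illustration. It works on any descriptors, including UDP sockets that have already been created and bound.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <unistd.h>

/* Block until at least one descriptor in fds[0..n-1] has data to read.
   Returns the index of the first readable descriptor, or -1 on error.
   (Hypothetical helper, not taken from udpserv3.c.) */
int first_readable(const int fds[], int n)
{
    fd_set readset;
    int i, maxfd = -1;

    FD_ZERO(&readset);                      /* start with an empty set   */
    for (i = 0; i < n; i++) {
        FD_SET(fds[i], &readset);           /* watch this descriptor     */
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    /* A NULL time-out means: wait as long as necessary. */
    if (select(maxfd + 1, &readset, NULL, NULL, NULL) < 0)
        return -1;

    for (i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &readset))     /* select() left it in the set */
            return i;
    return -1;
}
```

In a relay-like program, the caller would pass the three input socket descriptors and then call recvfrom() only on the descriptor that was reported readable, so that call does not block.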
Call your program relay.c. It has three fixed port numbers for the input ports: 5012 is the toggle control input port and ports 5013, 5014 are the data input ports. The relay program should have four command line arguments (argv[1] through argv[4]) that specify the output hosts and their port numbers. An example of the command syntax is:
relay localhost 6822 ox.cs.uiowa.edu 5689
This example specifies that the two outputs from relay are directed to (1) the same machine running relay, but with port 6822, and (2) the ox machine on port 5689.
The maximum buffer size for all UDP datagrams should be 1025 bytes, but you are welcome to tune this constant to other values during debugging and experimentation with your program. To test your relay program, two other programs are available. The first program is called pump and it copies data from stdin to a specified UDP port destination. Two examples of the syntax for pump are
pump 0 localhost 5013 < mydata
pump 1 localhost 5014 < mydata
The first example specifies that UDP datagrams be sent to localhost on port 5013, and the second example specifies port 5014. In each example, the input data is a file called mydata. The first argument, 0 in the first example and 1 in the second, specifies a delay time between sending datagrams. Please consult the source of the program pump.c for further details.
The companion to pump is the sink command. The sink command copies UDP datagrams it receives to stdout. Two examples of syntax are
sink 6048
sink 5114 > nudata
The first example specifies that the input port for sink is 6048 and the output will be displayed in the window running the sink command. The second example specifies an input port 5114 and directs the output to a file named nudata.
You can exercise pump and sink without the relay program, as follows. Suppose you have a data file called edata. Then try the commands
sink 5003 > fdata &
pump 0 localhost 5003 < edata
This should copy the edata file to the fdata file. For debugging, it is often better to open two windows for such an example, for instance, one xterm can do the pump and another xterm does the sink. It is, of course, important to start the sink before the pump -- otherwise the pump will fail because it won't find the sink waiting and ready for the datagrams. The source code for sink.c is also available.
Using two concurrent pump programs and two concurrent sink commands, you can test the relay. However, to show the switching capability, you would need a third program -- one that sends a datagram to the toggle control port of relay. This should be easy to do using something like the udpcli.c program we have seen before, but customized to use the appropriate port number. This part is left as part of the development and debugging phase of your project.
You can start work on developing the relay program immediately. However, you will need to also do some experiments for this project using relay in various settings.
To make significant tests of relay you will need to have sources of input and ways of evaluating outputs. The first useful tool for generating a source of input is the supply.c program. It generates 1K blocks of data. The syntax for the supply program is
supply m c
where m is the amount of data you want to generate and c is the fill character for the data generated. For example, the command supply 8 M will produce 8KB of output consisting of the letter M repeated in 64-character lines (each line is actually 63 M characters followed by a newline byte). You can use supply in combination with pump as follows.
supply 38 a | pump 0 localhost 5792
This example generates 38KB of 'a' characters, and pump will take this 38KB as input, sending it via UDP datagrams to port 5792. Read about the unix pipe facility (the | operator) in man csh if you have never seen this technique before.
It is also useful to test the output from sink rather than seeing it all on the terminal window or storing it in a file. We can use existing unix commands to do this. This example counts the number of bytes produced by sink from reading port 5792:
sink 5792 | wc -c
(See man wc for an explanation of the wc command.) Suppose we want to count only the amount of data produced by sink with the 'j' character in each line. We can use grep to do this.
sink 5792 | grep j | wc -c
In this example, all the sink output is filtered by the grep command, which passes only lines containing 'j' on to the wc command, which in turn counts the number of bytes in all such lines.
It is useful to see all of this together in one example, such as
sink 5792 | wc -c &
supply 64 z | pump 0 localhost 5792
The first line starts the sink-wc combination, running it in the background (you can read about background mode in man csh). The second line then starts the supply-pump combination. If you try this example, you should see that pump and wc report the same number of bytes. If this example is unclear, then try running the two lines in separate xterm windows so that the output from each command appears in its own window.
Experiment I is basic experimentation with relay. The basic experiments involve no networking and can be done on one workstation, using pump, sink, relay, and udpcli communicating via UDP. The basic experimentation just confirms that your relay works as expected. Here are a few things to try:
sink 5100 | wc -c &
relay localhost 5100 localhost 5100 &
supply 1024 z | pump 0 localhost 5013

sink 5100 | wc -c &
relay localhost 5100 localhost 5100 &
supply 15 z | pump 1 localhost 5013 &
supply 15 a | pump 1 localhost 5014 &
(This example may not do exactly what you expect!)

sink 5100 | wc -c &
sink 5101 | wc -c &
relay localhost 5100 localhost 5101 &
supply 15 z | pump 1 localhost 5013 &
supply 15 a | pump 1 localhost 5014 &
The relay program probably does not terminate automatically if you write it following the instructions. For the second experiment, add a time-out facility to your relay program. If the relay does not receive a datagram for 30 seconds, then it should terminate. So, for instance, after you exercise relay using pump and sink, the relay will automatically quit if you do not use it again within 30 seconds.
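One possible way to get such a time-out is the last argument of select(), a struct timeval. The sketch below watches a single descriptor for simplicity; the function name readable_within is a hypothetical helper, and a relay would put all three input descriptors into the set instead of one.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <unistd.h>

/* Wait up to `seconds` for fd to become readable.
   Returns 1 if data is available, 0 if the time-out expired,
   and -1 on error. (Hypothetical helper for illustration.) */
int readable_within(int fd, long seconds)
{
    fd_set readset;
    struct timeval tv;
    int rc;

    FD_ZERO(&readset);
    FD_SET(fd, &readset);

    /* The timeval must be set before every call: some systems
       modify it while select() is waiting. */
    tv.tv_sec = seconds;
    tv.tv_usec = 0;

    rc = select(fd + 1, &readset, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;      /* rc == 0 means nothing arrived in time */
}
```

A main loop could call this with a value of 30 and exit when the result is 0.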
Experiment III will test more intricate combinations of relay programs. Here are some things to try and questions to answer by experiment.
sink 5100 | wc -c &
relay localhost 5100 localhost 5012 &
supply 32 z | pump 1 localhost 5013 &
supply 32 a | pump 1 localhost 5014 &
What is the output from such an experiment? What percentage of the output received by the sink is from the first supply (the one providing z-characters)? Are these results repeatable, and what happens if you change the pump delay from 1 to zero? What if you have one pump with a delay of 1 and the other with a delay of 2?
       __________                  __________
       |        |                  |        |
       V        |                  V        |
   +-------+    |              +-------+    |
-->|       |----+              |       |----+
   |       |                   |       |
-->|       |------------------>|       |-------->
   +-------+                   +-------+
If equal supplies of 'a' and 'z' characters pump into the first relay, what comes out of the second relay?
Experiment IV will do performance testing of the relay using the lab of PCs in 311. You will add some timing measurements to pump and sink programs, and test relay across the Ethernet with some large data sizes (perhaps 40MB or more).
The first experiment will be to use one pump, one sink, and the relay all running on different machines, and send a reasonably large amount of data, say 10-50MB. The idea is to measure the running time and establish a baseline for further experiments.
The next step is to reduce the flow control and repeat the same experiment. To reduce the flow control you will need to modify pump and sink so that, instead of sending an ACK for each datagram (and expecting an ACK for each datagram sent), these programs send and expect an ACK only once per K datagrams, where K is a parameter experimentally determined. Clearly, if K is extremely large the programs use very little flow control. However, if K is made large then your relay will need to have K buffers and memory is limited. The main programming challenge will be to have the relay manage buffers so that the pump can get ahead of the sink, but not to discard datagrams when there is a buffer shortage. Flow control is the answer, but just how much flow control you need is the question. Less flow control will mean fewer ACK messages and the total running time will be less. Document the results of your experiment.
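The "one ACK per K datagrams" rule amounts to a simple counter test. The helper below is a hypothetical sketch (the name ack_due and the treatment of K = 0 as "never ACK" are assumptions, not part of pump.c or sink.c).

```c
/* Returns 1 when an ACK is due after `count` datagrams have been
   handled, given one ACK per `k` datagrams; otherwise returns 0.
   Assumption of this sketch: k == 0 means "never send an ACK". */
int ack_due(unsigned long count, unsigned long k)
{
    if (k == 0)
        return 0;
    return count > 0 && count % k == 0;
}
```

With k equal to 1 this reduces to the original behavior of one ACK per datagram.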
Optionally, you may also want to test how the relay performs when there are two pump and two sink programs, each sending a large amount of data. If you perform this experiment, compare the results to your baseline experiment.
Each experiment should be documented, on paper, describing the experiment and its results. Also please turn in the program listings for different versions of relay that you write. Please remember to include your student number.
The general guidelines are: 250 points total (this project counts 2.5 times any of the other projects). About 100 points of the total will be for the basic programming of relay, about 20 points for Experiment I, about 40 points for Experiment II, about 30 points for Experiment III, and 60 points for Experiment IV.
In the grading for each experiment, points are awarded for correctness, thorough documentation of how the experiment was performed and what the results were, and an explanation of the results. Some points may also be given for creative questions and variations on the experiments.
This, the second programming project, is a simple illustration of the HTTP protocol. You will write a program that communicates with a web server, request a web page, read the web page, and count the number of 'j' characters in that page.
The protocol for this assignment is TCP, and the network programming is basically an adaptation of the tcpcli.c program you have used previously. What you may not already know is how the HTTP protocol functions.
To see how the HTTP protocol works, try the following:
telnet www.cs.uiowa.edu 80
This command establishes a TCP connection between your shell and the CS Web Server at its well-known port, 80, the standard server port for the HTTP protocol. In response, you will see something like:
Trying...
Connected to www.cs.uiowa.edu.
Escape character is '^]'.
Then type:
GET /
In response, you should see the CS Department's main web page. The connection will also break after returning the web page.
GET /~herman/22C178/index.html
(Warning! HTTP is fussy about spelling mistakes.)
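A program has to send the same request text that was typed in the telnet session, terminated by a carriage return and line feed. The following sketch builds that line in a buffer; the function name make_get and the convention that the path is given without its leading slash are assumptions made for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Build the request line "GET /<path>" followed by CR LF in buf.
   `path` is given without the leading slash; use "" for the main page.
   Returns the request length, or -1 if buf is too small.
   (Hypothetical helper, not part of tcpcli.c.) */
int make_get(char *buf, size_t size, const char *path)
{
    int n = snprintf(buf, size, "GET /%s\r\n", path);

    if (n < 0 || (size_t)n >= size)
        return -1;              /* request did not fit in buf */
    return n;
}
```

The resulting bytes would then be written to the connected TCP socket, after which the client reads until the server closes the connection.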
The program you will write takes command-line arguments. For an example of how a program works with command-line arguments, compile and test the parms.c program using the commands
gcc -o parms parms.c
parms this is a test
A running example of the program you will write, webcli, is available; you can copy and run it on the departmental HP machines (sorry, it won't run on the SGI machines). Here are some example tests:
% webcli www.cs.uiowa.edu
Number of j's = 2
% webcli www.icaen.uiowa.edu
Number of j's = 1
% webcli www.cs.uiowa.edu "~jones/index.html"
Number of j's = 26
The webcli program reads a web page, counts the number of 'j' characters in that page (it does not count any 'J' characters), and prints the total.
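The counting step itself is small. Here is a sketch, assuming the page arrives in chunks from read() and that a hypothetical helper count_j is called on each chunk, with the results added together.

```c
#include <stddef.h>

/* Count lowercase 'j' characters in the first `len` bytes of buf.
   Uppercase 'J' is deliberately not counted.
   (Hypothetical helper for illustration.) */
size_t count_j(const char *buf, size_t len)
{
    size_t i, total = 0;

    for (i = 0; i < len; i++)
        if (buf[i] == 'j')
            total++;
    return total;
}
```

Passing the length explicitly, rather than relying on a terminating zero byte, matters because data read from a socket is not a C string.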
What to submit. On or before the due date, you should have written a program that behaves like the webcli program. When your program is ready, email it to me (herman@cs.uiowa.edu). The following guidelines are important:
Grading. The grading will be simple: we should be able to compile and test your program and see that it works.
This is the first programming project. Your task is to write the client program and communicate with a server process that has already been written and will be running until the due date of the project. The server program uses the following simple logic:
/* ----- subroutine validates student # ----------------------*/
int isValidStuId(int x) {
  /* The list holds 56 IDs, so the array size is left to the compiler
     and the count is computed rather than hard-coded. */
  const int stuIds[] = {
    166, 289, 408, 557, 613, 814, 935,1044,1083,1226,1464,1561,1743,
    1797,1975,2386,2400,2551,2613,2621,2703,2708,2797,2891,2990,3153,
    3174,3655,3721,3735,3848,3850,3932,3945,4200,4243,4501,4580,4925,
    5275,5308,6661,6871,7606,7710,7846,7907,7908,8144,8410,8978,9517,
    9579,9884,9999,4622 };
  const int numIds = sizeof(stuIds) / sizeof(stuIds[0]);
  int i;
  for (i=0; i<numIds; i++) if (x == stuIds[i]) return 1;
  return 0;
}
The isValidStuId function returns 1 only if the datagram received supplies the last four digits of the ID of a student registered in the course. If the datagram correctly has such a number, then the server writes a message to a log file containing the number it received.
The server generates a random number between 0 and 498. The server sends this random number to the client. The client reads this number, call it x. The client should then compute the polynomial
(a*x*x*x) + (b*x*x) + (c*x) + d
where a,b,c,d are the final four digits of the student ID. For instance, if the last four digits are 4567, then the client will compute
(4*x*x*x) + (5*x*x) + (6*x) + 7
The client then sends the computed number back to the server. Since the server knows the student ID from the earlier UDP datagram, it checks the calculation. If the calculation is correct, the server writes a message to its log file. The server then may send another random number to the client and repeat the test again, as above. However if the calculation appears incorrect, the server sends the number 1001 back to the client.
To end this procedure, the server can send the number 1000 to the client: this means the client should close the connection and stop.
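The polynomial step can be sketched as follows. The function name poly_from_id is hypothetical, and the sketch assumes the last four digits are passed as one integer between 0 and 9999 (so 0166 is passed as 166).

```c
/* Compute a*x^3 + b*x^2 + c*x + d, where a, b, c, d are the four
   decimal digits of last4 (the last four digits of the student ID)
   and x is the number received from the server.
   (Hypothetical helper for illustration.) */
long poly_from_id(int last4, long x)
{
    long a = (last4 / 1000) % 10;
    long b = (last4 / 100) % 10;
    long c = (last4 / 10) % 10;
    long d = last4 % 10;

    /* Horner form of (a*x*x*x) + (b*x*x) + (c*x) + d */
    return ((a * x + b) * x + c) * x + d;
}
```

For the example digits 4567 and x = 2 this gives 4*8 + 5*4 + 6*2 + 7 = 71. Even for digits 9999 and x = 498 the result is about 1.1 billion, which still fits in a 32-bit signed integer.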
The Project. Using the UDP and TCP sample programs given in homework assignments, write a client program that communicates with the server using your student ID.
What to submit. On or before the due date, turn in a listing of your program. Do not email me your program! Make sure your name and the last four digits of your student ID number are given in the listing.
Grading. The grading will be figured from two items, the listing you submit and the server log. We can see if your program ran (or at least that some program correctly ran) from the server log. The server log may also be valuable for us to monitor your progress in completing the assignment.
Note: It is possible that the server can crash, either because it has a bug or because of some hardware problem. If you are in doubt about whether the server is properly running you can test it using the sample client program pr3cli which tests the server (sorry, this program only runs on HP machines).
We take another look at our friend ping in this project. Our goal is to estimate network bandwidth by trying different ping commands. For this project you will need to experiment and save results of those experiments, analyze the results of the experiments, and explain your results. To justify your conclusions, some background reading will be useful. Here are some online sources for material about the ping command.
chmod +x tcpshow
chmod +x tcpdump
tcpshow < simp.log | more
This should show you a playback of the Ethernet frames actually observed during one ping execution.
The Project. The experimental part of this project is to use ping with different packet sizes. On a Linux system, the packet size is specified using the -s option, whereas on HP systems, the size is just a positional parameter (given just after the target address in the command). By testing different target addresses with different sizes, you should be able to deduce something about the bandwidth of your network.
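One rough way to turn two ping measurements into a bandwidth figure is sketched below. This is an assumption-laden illustration, not the required analysis: it supposes that the larger packet's extra bytes cross the same path twice (request and echo), that everything else about the two round trips is equal, and that the function name estimate_bps is a hypothetical helper. Real paths with several hops, queueing, or fragmentation will not fit this model exactly.

```c
/* Rough bandwidth estimate, in bits per second, from two ping
   measurements with different packet sizes.
   size1, size2: packet sizes in bytes (size2 > size1)
   rtt1, rtt2:   measured round-trip times in seconds
   Returns 0.0 when the measurements cannot be used (the larger
   packet was not slower). Assumes the extra bytes travel the path
   twice and all other delays are unchanged. */
double estimate_bps(double size1, double rtt1, double size2, double rtt2)
{
    double extra_bits = 2.0 * 8.0 * (size2 - size1);
    double extra_time = rtt2 - rtt1;

    if (extra_time <= 0.0)
        return 0.0;
    return extra_bits / extra_time;
}
```

Averaging the round-trip times over many pings before applying such a formula should reduce the effect of measurement noise.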
What to Submit. The result of this project is a paper, no more than three pages, documenting your experiments and explaining the measurements you collect. It should answer the following questions.
Grading Criteria.
Your project grade will be based on thoroughness, accuracy, clarity of your explanations, and creativity. Roughly, the 100 points for the project will be:
This project is essentially an application of the information provided in Chapter 5 of the Linux Network Administrator's Guide. To complete this project you will use commands to make a network of three machines in the 311 MLH lab. Here is what you are expected to do:
You are welcome to practice in the 311 MLH laboratory. Instructions on how to use the laboratory will be given in the course FAQ (after someone asks me questions!). In addition, you may want to get some idea of what the /etc directory looks like in advance; an approximate image of this directory is available via this link. In addition, it may be useful to look at the Linux man pages for the following commands: