Project I: System Calls and Error Handling

This project is a warm up assignment for the course. The basic concept is very simple: to write a program that copies a file from one place to another. However, the main challenge of engineering operating systems is dealing with errors and unexpected conditions. Thus, the main focus of this assignment will be on the correct handling of errors. The goals of this project are:
  • To review your knowledge of basic C programming.
  • To learn the most essential Unix system calls.
  • To gain experience in rigorous error handling.

Essential Requirements


You will write a program called copyit which simply copies a file from one place to another. The program will be invoked as follows:
copyit SourceFile TargetFile
copyit must create an exact copy of SourceFile under the new name TargetFile. Upon successful completion, copyit should report the total number of bytes copied and exit with result zero. For example:
copyit: Copied 38475 bytes from file foobar to bizbaz.
If the copy takes longer than one second, then every second the program will emit a short message:
copyit: still copying...
copyit: still copying...
copyit: still copying...
If copyit encounters any kind of error or user mistake, it must immediately stop and emit a message that states the program name, the failing operation, and the reason for the failure, and then exit with result 1. For example:
copyit: Couldn't open file foobar: Permission Denied.
copyit: Couldn't write to file bizbaz: Disk Full.

If the program is invoked incorrectly, then it should immediately exit with a helpful message:

copyit: Too many arguments!
usage: copyit <sourcefile> <targetfile>

To carry out this assignment, you will need to learn about the following system calls:
open, creat, read, write, close, strerror, errno, exit, signal, alarm
Manual pages ("man pages") provide the complete reference documentation for system calls. They are available on any Linux machine by typing man with the section number and topic name. Section 1 is for programs, section 2 for system calls, section 3 for C library functions. For example man 2 open gives you the man page for open. You can also use this online service which has the same information.

As you probably already know, man pages are a little difficult to digest, because they give complete information about one call, but not how they all fit together. However, with a little practice, you can become an expert with man pages. Consider the man page for open. At the top, it tells you what include files are needed in order to use open:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
It also gives the the prototype for several variations of the system call:
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
int creat(const char *pathname, mode_t mode);
The documentation contains many more details than you need, but it is usually a good bet to read the first paragraph, and then skim ahead until you find what you are looking for. It is also always fruitful to read the RETURN VALUE section, and to use the SEE ALSO section to look up related system calls.

signal in particular is a little confusing, so it's easiest if I give an example. signal tells the operating system that you want a particular function to be called whenever a signal arrives. We are going to use the SIGALRM alarm signal to trigger the "still copying" message each second. So, you need to write a short function that displays a message when the signal arrives:

void display_message( int s )
{
	printf("copyit: still copying...\n");
}
Then, inside of main, use signal to connect the function to SIGALRM:
#include <signal.h>
signal(SIGALRM,display_message);
Now, read the man page for alarm, which arranges for the signal to be delivered in a certain amount of time. Note that you will need to call alarm more than once in your program...

Handling Errors


The basic outline of your program will look like this:
open the source file
create the target file
loop {
        read a bit of data from the source file
        write a bit of data to the target file
}
close both files
print success message
In practice, it is going to be more complicated.

An operating system (and any program that deals with it) must be vigilant in dealing with errors and unexpected conditions. End users constantly attempt operations that are illegal or make no sense. Digital devices behave in unpredictable ways. So, when you write operating system code, you must always check return codes and take an appropriate action on failure.

How do you know if a system call succeeded or failed? In general read the RETURN VALUE section of the manual page, and it will tell you exactly how success or failure is indicated. However, nearly all Unix system calls that return an integer (open, read, write, close, etc.) have the following convention:

  • If the call succeeded, it returns an integer greater than or equal to zero.
  • If the call failed, it returns an integer less than zero, and sets the global variable errno to reflect the reason for the error.
And, nearly all C library calls that return a pointer (malloc, fopen, etc.) have a slightly different convention:
  • If the call succeeded, it returns a non-null pointer.
  • If the call failed, it returns null, and sets the global variable errno to reflect the reason for the error.
The errno variable is simply an integer that explains the reason behind an errno. Each integer value is given a macro name that makes it easy to remember, like EPERM for permission denied. All of the error types are defined in the file /usr/include/asm/errno.h. Here are a few of them:
#define EPERM            1      /* Operation not permitted */
#define ENOENT           2      /* No such file or directory */
#define ESRCH            3      /* No such process */
#define EINTR            4      /* Interrupted system call */
...
You can check for specific kinds of errors like this:
fd = open(filename,O_RDONLY,0);
if(fd<0) {
	 if(errno==EPERM) {
		printf("Error: Permission Denied\n");
	} else {
		printf("Some other error.\n");
		...
	}
	exit(1);
}
This would get rather tedious with about 130 different error messages. Instead, use the strerror routine to convert errno into a string, and print out a descriptive message like this:
fd = open(filename,O_RDONLY,0);
if(fd<0) {
	 printf("Unable to open %s: %s\n",filename,strerror(errno));
	 exit(1);
}
With read and write, you must be especially careful. If you request to read count bytes of data like this:
int result = read(fd,buffer,count);
There are several possible outcomes. If read was able to access some data, it will return the number of bytes actually read. The number might be as high as count, but it could also be smaller. For example, if you request to read 4096 bytes, but there are only 40 bytes remaining in the file, read will return 40. If there is nothing more left in the file, read will return zero. If there is an error, the result will be less than zero, as above. write operates in a very similar way.

There is still one more wrinkle. If your read or write operation is interrupted by a signal (e.g. the "still copying" message), the operating system will abort it prematurely. The call will return failure and set errno to EINTR, indicating that it was interrupted. In this case, the call should simply be tried again without changing any thing.

So, your program should look like this overall:

set up the periodic message

open the source file or exit with an error
create the target file or exit with an error

loop over {
     read a bit of data from the source file
     if the read was interrupted, try it again
     if there was an error reading, exit with an error
     if no data left, end the loop

     write a bit of data to the target file
     if the write was interrupted, try it again
     if not all the data was written, exit with an error
}
close both files
print success message

Testing


Make sure to use the "-Wall" flag for compilation to catch all warnings and remove all warnings before you submit your code. Make sure to test your program on a wide variety of conditions. Test small files and big files. Make sure to copy a really big file to make sure that your "still copying" message works and the copy recovers correctly from EINTR errors. How do you know if the file copy worked correctly? Use the program md5sum to take the checksum of both files, and double check that it matches:

% md5sum /tmp/SourceFile
b92891465b9617ae76dfff2f1096fc97  /tmp/SourceFile
% md5sum /tmp/TargetFile
b92891465b9617ae76dfff2f1096fc97  /tmp/TargetFile

Grading


Your grade will be based on:
  • Correct functioning of the file copy: 50%
  • Correct handling of all error conditions: 40%
  • Good coding style, including clear formatting, sensible variable names, and useful comments: 10%

Turning In


Turn in a source file named copyit.c and a makefile to your dropbox directory, which is:
/afs/nd.edu/coursesp.16/cse/cse30341.01/dropbox/YOURAFSNAME/project1
This assignment is due at 5PM on Tuesday, February 2nd. Late assignments will not be accepted.

Your program will be compiled and graded on the Linux machines in the Fitzpatrick computer lab. Therefore, you should do your development work either sitting in that lab or using ssh to connect to the machines remotely. The TA will hold office hours in the lab and will be happy to help you with those machines.

If you insist on doing the homework on your personal computer or laptop, then you are on your own. Please do not ask the TA to fiddle around with your personal computers. Leave extra time to ensure that your program works correctly when transferred to the Fitzpatrick machines.

Extra Credit


For ten percent extra credit, write a second version of copyit that copies directories recursively. Turn in this version as copyit_extracredit.c.