Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: THEend8_This C language Homework Writing Operating System Service requires students to explore the proc virtual file system, generate and display the process tree through system calls and dynamic memory allocation.
CSC209H: Software Tools and Systems Programming
The goal of this assignment is to practice system calls and dynamic memory allocation. You’ll be writing a program that explores the /proc virtual file system to generate (and display) a tree of the processes (the executing programs) on the system. To do so, you will need to dynamically allocate and manipulate nodes in a tree structure.
Please remember that we are using testing scripts to evaluate your work. As a result, it’s very important that (a) all of your files are named correctly and located in the specified locations in your repository and (b) your output matches the expected output precisely.
Pull your git repository. There should be a new folder named a2. All of your code for this assignment should be located in this folder. The starter code includes:
Like Assignment 1, I recommend that you start by doing a walkthrough of the Background section and then, as you explore, create tests that you can use to verify your program. You may want to create a few proc‑like directory trees that you can create for testing. Then, you can write a bash script that executes your code on the tests so that you can check the output.
Your Makefile contains an option to help you with testing. Open it with a text editor. You should see two lines that define FLAGS, one of which is commented out:
FLAGS = -Wall -Werror -std=c99 # FLAGS = -Wall -Werror -g -std=c99 -DTESTIf you with to run tests using your tests/ directory, rather than the actual /proc directory for the system, comment out the first line and uncomment the second. Then, when you compile your code, your program will look for process information in the tests/ directory rather than in /proc. You can add additional examples to your tests/ directory. We have included the processes necessary to reproduce the example at the top of this handout.
Important Note: The tests in the tests/ directory are likely to have small differences from the entries in /proc. For example, the exe file in our tests is a regular file, rather than a link, and none of the cmdline files include anything other than just the executable name.You should also test on a linux machine with /proc, but it’s useful to have a clean, replicable example to test on. The tests directory will also be useful if you are not yet writing your code on Linux, but at this point you should be using Linux, either on your own machine or SSH’d into one of the lab PCs. In this assignment, stopgap solutions such as using Bash on Windows may start to reveal their limitations; also note that macOS does not have a /proc directory since it is based on BSD, which uses a different interface
While creating new tests—or while looking for real processes in /proc to test with—you’ll find the ps command useful (see man ps ). In particular, if you type ps -u your_username , you will get a listing of the processes running on the system that are owned by you. You can open up a second terminal and run commands or open a web browser to get additional processes to test.
A process is an executing program. Every time that you run a program on the command line, you are creating a process. The processes on a Unix system are naturally organized in a tree structure, since each process has a parent. When you execute a program in a shell (on the command line), the shell process is the parent. The shell process also has a parent (which is often the login program that created the shell). Each process knows the identify of its parent as well as some information about itself.
The operating system must track processes to make sure that they are getting the resources they need. For many flavours of Linux, the information about processes is stored in the directory /proc. /proc is very special in that it is a virtual filesystem. It doesn’t contain “real” files; rather, it contains runtime system information that can be accessed as if it were stored in files (recall the Everything is a File UNIX design feature that we discussed in class). Many system utilities, including the ps utility discussed earlier, simply access data from the “files” in this directory. Your task is also to access the data in /proc to get information about the processes running on the system.
Every directory in /proc that we care about is a number. (There are quite a few non‑numeric directories. Those refer to mounted devices, memory information, and other system data. We’ll ignore them for this assignment.) That number is a process ID (PID), which is a unique identifier for a process on the system. Every process has a PID, and every process running on the system has an entry in the /proc directory. In this document, we’ll refer to the entry for a single PID as the directory /proc/[PID].
You will need to access three files inside of /proc/[PID]: exe, cmdline, and children:
Protip: You may want to check man proc and see the description of /proc/[pid]/cmdline to understand how it is formatted. It may not hurt to read other useful parts as well, like /proc/[pid], /proc/[pid]/exe, and so on.
In addition, you may find the file /proc/[PID]/status to be useful for verification purposes. It contains the parent process’s ID (PPID). Take a few minutes to log onto a Linux machine and browse through its /proc directory. Use cat and grep to view and search through
these files to make sure you know how they are formatted, so you can begin to think about extracting information from them. You’ll likely find that you don’t have access to the contents of one file; you can check if that link exists but cannot open the file it refers to (why?). You’ll find that the other files are sometimes empty (what does that suggest?).
To access the data in /proc, you’ll need to use a number of system calls that access the filesystem. A filesystem is an operating system module that encapsulates the data structures used to track files on a disk and the functions used to manipulate those data structures. For this course, we don’t need to know much about those structures to use files (but just wait until CSC369!). Instead, we simply need to understand the system calls that the operating system exposes for users to access the file system and the macros that C provides to make using those system calls easier.
Here is a set of system and library calls that we think you’ll need for this assignment. You should look each of them up with man, so you know what they do and how to use them. For accessing the filesystem, we expect you will need fopen, fread, and fclose to access and read data from files. You will need lstat to determine that a file exists without opening it. You will also need the library call malloc to dynamically create nodes for the tree structure you will be creating to track process information. You may find the call snprintf (note that this is the safe variant of sprintf—refer back to the week 4 slides to understand why this is important) helpful for
creating filenames, and various functions from the string library like strchr and strtok_r (the latter is less error‑prone to use than its ancestor strtok) helpful for parsing the contents of files. For any string functions that you use, ensure that you are using the safe variant, and use #define macros to keep track of your string lengths—there may be marks assigned to this. This is both for the purpose of getting you accustomed to safe programming practices, and to save you the inevitable debugging headaches that will arise from the buffer overflows when you use unsafe functions.
You may not use the system function or functions like it that allow you to execute commands on the shell. The intent of the assignment is to have you working with system calls—not to use shell programs that will invoke those system calls for you.
You must write two functions, generate_ptree and print_ptree, that are located in ptree.c. You must also write code in print_ptree.c to handle the expected command line arguments and to emit appropriate errors.
Your program should emit an appropriate error message (which we are not specifying) to stderr and return 1 if an error is detected in the command line arguments used. The arguments may be incorrect if an insufficient or too many arguments are provided, or if an option is provided that is not supported. The only option we will support is -d. Your program should return 0 if it is able to run without error. If there is some sort of error that is not a command line error, return 2 (for example, if a library call or generate_ptree were to fail).
The expected command line usage is already provided in the starter code. The program takes one optional flag, -d, which requires an argument. This sets the maximum depth of the process ID tree to be displayed. If the depth is not provided, it is assumed to be 0 (which means “print the entire tree”). The program requires one argument, a process ID, which must be an integer.
While it’s not required, it’s strongly recommended that you use getopt (see Parsing program options using getopt in the GNU C manual or man 3 getopt ) to parse the command line.
You should complete the print_ptree function before the other required function, since you can test it by manually creating small trees for it to print. (See the starter code file test_print.c for an example of how to test the print function separately.) Then, you can test the other function, generate_ptree, by printing the trees it generates.
This function should be recursive. It accepts the root of a tree, as described in the section about generate_ptree and a depth
argument. The function prints (to stdout) the tree to the desired depth, not including nodes at the specified depth, i.e., nodes from depth 0 to the specified depth are printed, not including the latter endpoint. If the depth argument is 0, then the entire tree should be printed.
The function will print one process per line. The root of the tree will be printed flush to the left. Each child process will be indented two spaces to the right with respect to its parent. Child processes should be displayed in the order in which they occur in the list of children. Each line will contain the process’s ID (its PID), a colon and a space, and the name of the executable. If the executable is not named, then the line will just contain the PID (no colon, no space, no executable name).
Here is the output of your function on a tree encoding the example in the a2/tests directory in your git repository:
This function may need a recursive helper function. It takes a PID as an argument and creates a tree that describes that process and all of its children. The root of the tree is “returned” by assigning it to the parameter root. The entire tree, including the name strings in each tree node, should be dynamically allocated using malloc, and it must be built as described below.
Each node in tree is a struct TreeNode. It contains the PID of the process it is describing, as well as the name of the executable associated with that PID. If there is no executable name (from /proc/[PID]/cmdline), then the name pointer should be NULL. The node also contains pointers to build a linked list. This linked list is used to store the children of this node. Why do we use a list for the child nodes?
In CSC148, you probably built binary trees: trees with two child nodes. In those cases, it makes sense to create links to just those two nodes (often named “left” and “right”). However, in this case, each node in your tree can have an arbitrary number of children: it’s not binary. Therefore, we have to use a different method for storing children: a list. If the child_procs pointer has value NULL, then the node has no children. If child_procs is non‑null, then the first element of the list is the first (left‑most, if you prefer) child. The node it points to, using the next_sibling pointer, is the second child. If the next_sibling pointer is NULL, then the node is the last child of its
parent. The children should appear in the list in the same order as they were observed in the children file.