Difficulty Gauge: 1 = Easy, 2, 3, 4, 5 = Hard
Grading (30 points total)
What You Need To Do
This assignment must be completed individually.
3 points off for each late day before waivers (waivers applied at the end of quarter).
Contact your instructor if you have any issues with the due date (e.g. illness, emergencies).
Remember: There will be questions on the material covered in this and future programming assignments in both quizzes and exams.
We will be testing each of the four (4) files you submit both independently and together in order to give you credit for what works. There are 30 points available as follows:
Up to 8 points (at our discretion) if the following files compile without warnings. Each file has to be a meaningful attempt to write code that meets the specifications of the program. Random C statements or empty files will not be awarded any points: csv.c (3 points), setorder.c (1 point), splitrow.c (2 points), wrtrow.c (2 points)
Up to 4 points (2 points each) for passing the two (2) tests in the test fixture.
Up to 18 points for passing the Gradescope tests (these will only be run after the late deadline). These tests check data handling and error-condition handling in both main and token.
Minus 6 points for one or more uses of array indexing in your code. There should be no use of [ ] anywhere in your code except for the array variable definitions in main() in the file csv.c. You must only use pointers in this assignment. This means [ or ] MUST NOT appear anywhere in the code, not even in comments, except in those array definitions.
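A minimal sketch of what pointer-only code can look like, assuming the row is a NUL-terminated string. The helper name count_commas is hypothetical and is not one of the required files or functions; it only illustrates replacing an indexed loop with a pointer walk.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the PA files): counts the commas in a
 * NUL-terminated string by walking a pointer instead of indexing.
 */
size_t count_commas(const char *row)
{
    size_t count = 0;

    assert(row != NULL);        /* caller must pass a valid string */
    while (*row != '\0') {      /* dereference replaces an indexed access */
        if (*row == ',')
            count++;
        row++;                  /* advance to the next character */
    }
    return count;
}
```

Called on the string "a,b,c\n", this sketch returns 2. Note that neither the code nor its comments contain a square bracket.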
Minus 1 point for each submitted file that compiles with a warning (max of 4 points deducted).
Minus 2 points, at our discretion, for not following the C style guidelines: CSE30 C Style Guide
Need Help?
You are always welcome to go to either Instructor or TA office hours. Their hours are listed on both the Canvas calendar and the autograder calendar. In office hours, conceptual questions will be prioritized.
If you need help while working on this PA, make sure to attend lab hours with a tutor at:
https://autograder.ucsd.edu
You can also post questions on Piazza.
https://piazza.com/ucsd/spring2024/cse30_sp24_a00/home
Assignment – Writing a Data Extraction Utility and Pointer Practice
The goals of PA4 are:
1. To get practice writing code using C pointers, string arrays, arrays of pointers to strings, and working with basic command line arguments (argc, argv).
2. Learn about text files whose contents are in the Comma-Separated Values (CSV) format.
3. Practice with C library routines for reading input one line at a time and converting strings to integer values.
4. Learn to develop and test individual code modules.
5. More practice in writing test data for a test fixture. You will need to do this as we only supply simple tests and much of your career writing software will include testing/debugging your code.
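As an illustration of goal 3, here is a hedged sketch of converting a string to an integer with the standard library routine strtol. The function name parse_positive_int and its return convention (0 on success, -1 on failure) are assumptions made for this example, not requirements of the PA.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: converts text to a long that must be >= 1.
 * Returns 0 and stores the value in *out on success, -1 otherwise.
 */
int parse_positive_int(const char *text, long *out)
{
    char *end;
    long value;

    assert(text != NULL && out != NULL);
    errno = 0;                          /* so ERANGE can be detected */
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0')    /* no digits, or trailing junk */
        return -1;
    if (errno == ERANGE || value < 1)   /* overflow, or not positive */
        return -1;
    *out = value;
    return 0;
}
```

Checking the end pointer is what distinguishes "12" from "12x". A string that still carries its trailing newline would be rejected by this sketch, so strip the newline first if the text comes from a line read with fgets.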
Overview of CSV File Format used in this PA
A common format for text files used to move data between computer programs is the CSV (comma-separated values) format. Spreadsheet programs like Excel and Google Sheets use CSV formatted files to import and export table data to and from other programs. Database systems use CSV formatted files to move data as part of ETL (extract, transform, and load) processing. For more information on what the ETL process is (if interested) you can start with:
https://en.wikipedia.org/wiki/Extract,_transform,_load
There are no formal standards for the format of CSV files. The list below gives the characteristics of the CSV files you will be working with in this programming assignment. Please read it very carefully.
In this writeup, a descending "b" is a single blank (space) and a descending "n" is a single newline.
1. A CSV file stores tabular data in printable ASCII (see the table in the lecture slides).
2. Each line (or row) of a CSV file, called a data record, is always terminated (the last character in the row) by a single newline n.
3. For this assignment you can assume that every row in the input data is properly terminated with a single newline.
4. There are no restrictions on the number of characters in a row. Caution: You need to be aware that a row may be truncated (the newline is missing) by your program due to buffer size limits when reading the row (discussed later in this writeup).
5. Rows in the input are numbered from the top of the file to the bottom of the file where the first row in the file is row number 1.
6. A valid CSV data record consists of two or more data fields (or columns) separated by a single delimiter.
7. Columns in each row are numbered from left to right in the row. The first column in the row is column number 1.
8. There are two types of delimiters that are used to separate columns in a row:
a. A single comma , character that separates two adjacent columns in a row.
b. A single newline n character. A newline in a CSV file has two (2) uses. It signifies both the end of the last column (rightmost column) in a row and the end of a row.
9. The data in a column starts with the first character right after the delimiter to the left and ends with the last character right before the delimiter to the right. For column 1, column data starts with the first character in the row and ends with the last character right before the delimiter to the right.
10. The first row of a CSV file is the required header row which has two uses:
a. A description of what data can be found in each column (how to interpret the data). In other words, the header row is a guide on how to use data in each column. You do not need to consider how to use column data (including the header row) in this PA.
b. The number of columns in the header row also specifies the number of columns in the rows that follow it. For example, here is a header row that contains three (3) columns. After reading the header row, we now know that each valid row in this CSV file always contains exactly three (3) columns.
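The delimiter rules in items 4, 8, and 10 can be sketched in pointer-only C as follows. Both function names are hypothetical and this is not the required design for splitrow.c; it assumes each row is held in a NUL-terminated buffer, for instance one filled by fgets.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if the buffer does not end with a
 * newline (the row was cut short or is empty), and 0 otherwise.
 */
int row_is_truncated(const char *row)
{
    assert(row != NULL);
    if (*row == '\0')
        return 1;                   /* empty buffer has no newline */
    while (*(row + 1) != '\0')      /* stop on the last character */
        row++;
    return *row != '\n';
}

/*
 * Hypothetical helper: counts the columns that are closed by a
 * delimiter. Each comma or newline ends exactly one column.
 */
size_t count_columns(const char *row)
{
    size_t columns = 0;

    assert(row != NULL);
    for (; *row != '\0'; row++) {
        if (*row == ',' || *row == '\n')
            columns++;
    }
    return columns;
}
```

For a made-up header row such as "name,age,city\n", count_columns returns 3 and row_is_truncated returns 0. For the truncated text "name,age" the sketch reports 1 completed column and flags the row as truncated, since the last column never reached its delimiter.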