Learning Objectives
To practice with:
. structs
. binary file I/O
. data representations
. dynamic memory allocation
. random numbers
. makefile
. development tools `gdb`, `valgrind`, `git`
For this project, you will be working with images in the PPM format (described below). You will need to view images directly on the ugrad servers, which requires a few extra steps:
1. Set up X-tunnelling by installing Xming on Windows or XQuartz on Mac (reach out to us if you need help setting this up).
2. Once you have the appropriate program running, enable X-tunnelling when you connect to the ugrad server: on a Mac, use -Y when you `ssh` into ugrad; with PuTTY, simply enable the X11 forwarding option before connecting.
3. `feh` is a very simple command-line image viewer. It is available on the ugrad machines; run it with the name of an image file as a command-line argument, and it will [slowly] display the image on your screen, e.g.
$ feh myimage.ppm
Please note that if you are using emacs while X-tunnelling is enabled, you will have to run it with the command `emacs -nw` to keep it running in the terminal ("nw" stands for "no window"). Other in-terminal editors may have similar options. Note: Before connecting to ugrad and running `feh`, make sure either Xming or XQuartz is running and X-tunnelling is enabled as described above.
If you are using a different platform, you are welcome to use an image viewer of your choice; feh is easy to install using most Linux package managers, but there are other open source image viewing programs, as well as alternatives for Windows and macOS.
Program Description
This program will be an image processing program, in the vein of Photoshop. It will have a command-line-based user interface (UI), so there will be no graphical interface, and the range of operations will be limited, but the algorithms you will use are similar to the ones used in programs like Photoshop or GIMP.
At a basic level, your program will be able to read image files from disk (i.e., the file system), perform one of a variety of image processing tasks, and then write the result back to disk as a new image file. Since your program will not have a GUI, you will use external programs to view the images. If you are on ugrad (either locally, or remotely with X-tunnelling), you can use the program feh.
While there are many formats for storing image files, your program will only need to read and write one, the PPM format. This is essentially the simplest and easiest format to read and write, which is why it was chosen; its main drawback is that it does not use any kind of compression, so images stored in this format tend to be on the large side when compared to formats like JPEG or PNG or GIF. An implementation to read PPM files is provided for you. However, you will need to write the corresponding function to write to a PPM file format. (See ppm_io.h and ppm_io.c in the starter code.)
Starter Files
Make sure to do a `git pull` on the public repo before starting to work to get the starter files for this project. You must work with the starter files!
PPM image format
For this assignment, we will use a very simple image-file format called PPM. It’s an uncompressed format, meaning that the images will take up a lot of disk space (compared to JPG or PNG files), but it’s very easy to read and write from C code (which is why we’re using it). For the formal “official” description of the PPM format, see the netpbm site. Because these PPM files can be very large, be careful not to fill up your ugrad storage quota with too many cat pictures …
'convert' Command
NOTE: you can use a Unix program called `convert` to convert between image formats; e.g. to convert an existing file called "selfie.jpg" into a PPM, you would type:
$ convert selfie.jpg selfie.ppm
This works for most image format file extensions; it converts to/from most known image formats, including .jpg, .gif, .png, .ppm, .tiff, and .pdf, and is installed on the ugrad machines. If it's not installed on your local machine (or virtual machine), most linux package managers can install it (or can install ImageMagick, which is the suite of tools that `convert` is part of).
The PPM format itself is pretty simple (compared to most other image formats). Basically, at the top of the file will be a special “tag” that marks the file as a PPM; this should be P6. Then, there are three numbers, separated by whitespace; these numbers represent the size of the image as columns, rows, and colors. Columns and rows specify the width and height of the image (in pixels) respectively. (BEWARE: columns come before rows in this format!) The colors value encodes how many different shades of each color a pixel can take on; for this assignment, this number must always be 255 (you must reject any image that uses a different value, but you’re unlikely to encounter one). Immediately after the 255, the binary data encoding all the image pixels will begin.
Optionally, there may be lines starting with a #, which are comments and should be ignored; these may be intermixed with the above information. You don’t need to store these; if you read a file and then re-write it, it’s fine if the comments get lost. The files we test your code with, however, will have either 0 or 1 comment lines just after the P6 tag, but no comment lines between the other header values (see trees.ppm in the course public repo for an example).
All of this will be plain (ASCII) text, so you can use the normal text I/O functions (e.g. fgetc(), fscanf(), fprintf() etc.) to read/write the header information.
After the color size specification, there will be a single whitespace character (usually a newline, but that’s not guaranteed), after which the remainder of the file will be the actual pixel values. Basically, each “pixel” consists of three values; the first value is the “red” channel, the second value is the “green” channel, and the third value is the “blue” value. Taken together, these three values specify a single color, which is the RGB color value of that pixel. Since the max color value is 255, each of these values will be in the range 0-255, which fits exactly in one byte of memory. For more information about RGB color codes see Wikipedia.
The easiest way to read the pixel values is to create a struct that contains three unsigned char variables, one per color channel. Then, create an array of your pixel structs with rows * cols elements. At that point, you can just use fread() to read the entire array of pixels from the file in one go. Similarly, you can use fwrite() to write the whole pixel array with a single function call. We’ve started this off for you in the provided ppm_io.h and ppm_io.c files.
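The write side of the approach just described can be sketched as follows. This is only an illustration under stated assumptions: the `Pixel` and `Image` layouts and the name `write_ppm_sketch` are hypothetical, and the struct definitions and function signatures in the provided ppm_io.h are the ones your code must actually follow.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical layouts for illustration; the real ones are in ppm_io.h. */
typedef struct { unsigned char r, g, b; } Pixel;
typedef struct { Pixel *data; int rows; int cols; } Image;

/* Write a P6 header followed by the raw pixel bytes.
   Returns the number of pixels written, or -1 on any error. */
int write_ppm_sketch(FILE *fp, const Image *im) {
    if (fp == NULL || im == NULL || im->data == NULL) {
        return -1;
    }
    /* Header is text: tag, then columns BEFORE rows, then the max value.
       Exactly one whitespace character follows the 255. */
    if (fprintf(fp, "P6\n%d %d\n255\n", im->cols, im->rows) < 0) {
        return -1;
    }
    /* The whole pixel array goes out in a single fwrite call. */
    size_t n = (size_t)im->rows * (size_t)im->cols;
    size_t written = fwrite(im->data, sizeof(Pixel), n, fp);
    return written == n ? (int)written : -1;
}
```

Because the pixel data is binary, the file should be opened in binary mode (for example with the "wb" mode string) on platforms where that matters.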
Your first coding task for this project is to write a few of the functions in the ppm_io.c implementation file:
. write_ppm - function to write from an Image variable to an external file in the PPM format
. make_image - function to allocate memory for an Image of a specified size
. free_image - function to free the dynamically allocated memory for an Image
We have provided the implementation of read_ppm (function to read a PPM formatted file into an Image), along with the struct definitions this project must use.
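As a rough sketch of the allocation and cleanup pair listed above, assuming a hypothetical `Image` that owns a single heap-allocated pixel array (the real struct definitions and required signatures are in ppm_io.h, and the `_sketch` names here are invented for illustration):

```c
#include <stdlib.h>

/* Hypothetical layouts for illustration; the real ones are in ppm_io.h. */
typedef struct { unsigned char r, g, b; } Pixel;
typedef struct { Pixel *data; int rows; int cols; } Image;

/* Allocate an Image and its pixel array; returns NULL on failure. */
Image *make_image_sketch(int rows, int cols) {
    if (rows <= 0 || cols <= 0) {
        return NULL;
    }
    Image *im = malloc(sizeof *im);
    if (im == NULL) {
        return NULL;
    }
    /* calloc zero-fills, so every pixel starts out black. */
    im->data = calloc((size_t)rows * (size_t)cols, sizeof(Pixel));
    if (im->data == NULL) {
        free(im);              /* do not leak the outer struct */
        return NULL;
    }
    im->rows = rows;
    im->cols = cols;
    return im;
}

/* Release the pixel array first, then the struct that points to it. */
void free_image_sketch(Image *im) {
    if (im == NULL) {
        return;
    }
    free(im->data);
    free(im);
}
```

Running your program under `valgrind` is a good way to confirm that every allocation made this way is eventually freed.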
Operational Overview
Your program will be a command line tool, always run with the name of the executable file project followed by (minimally) the name of an input PPM file, the name of a desired output PPM file, and the specific lower-case name of an image processing operation, as listed below. Some operations require additional arguments, which will also be supplied at the command line by the user, at the end of the line. There are no prompts and no input entered by the user interactively.
First Two Commandline Args
Regardless of the desired operation, the first two arguments after the executable name project are always interpreted as the input file name followed by the output file name. The next argument is always interpreted as the operation name, and the operation's arguments (if any) come after that.
The operations your program will be able to recognize and perform are all of the following, listed roughly from easiest to hardest to implement. The word at the start of each item is the operation name to be entered at the command line by the user. More detail for each operation is provided below.
1. invert - invert the colors (i.e. black becomes white, white becomes black, etc.)
2. crop - crop the input image given corner pixel locations
3. zoom_out - zoom out on an image
4. binarize - convert the input image to black and white by thresholding
5. pointilism - apply a pointilist filter to the input
6. blur - blur the image using a Gaussian filter with a prescribed standard deviation sigma
For example, at the command prompt, a user of your program might type:
$ ./project building.ppm building_crop.ppm crop 50 50 500 500
to crop the input image building.ppm (in PPM format) and output the cropped image to building_crop.ppm, where (50, 50) and (500, 500) specify the top-left and bottom-right pixel locations of the cropped region.
For another example, at the command prompt, a user of your program might type:
$ ./project trees.ppm trees_blur.ppm blur 3
to blur the input image trees.ppm (in PPM format) and output the blurred image to trees_blur.ppm, where 3 specifies the standard deviation of the Gaussian filter.
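The argument layout described above can be checked before any file is opened. The following is a minimal sketch, not the required design: the helper names are hypothetical, and the per-operation argument counts are inferred from the examples in this document (crop takes 4, binarize and blur take 1, the others take none).

```c
#include <string.h>

/* Number of arguments expected after the operation name, inferred from
   this document's examples; returns -1 for an unrecognized name. */
int extra_args_for(const char *op) {
    if (strcmp(op, "invert") == 0)     return 0;
    if (strcmp(op, "crop") == 0)       return 4;
    if (strcmp(op, "zoom_out") == 0)   return 0;
    if (strcmp(op, "binarize") == 0)   return 1;
    if (strcmp(op, "pointilism") == 0) return 0;
    if (strcmp(op, "blur") == 0)       return 1;
    return -1;
}

/* argv layout: ./project <input> <output> <operation> [operation args] */
int args_are_valid(int argc, char *argv[]) {
    if (argc < 4) {
        return 0;   /* missing input, output, or operation name */
    }
    int want = extra_args_for(argv[3]);
    return want >= 0 && argc - 4 == want;
}
```

A real program would likely report a distinct error for each failure (too few arguments, unknown operation, wrong argument count) rather than a single yes/no answer.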
Once you implement the missing ppm_io.c functions, you can then check out the provided demo program checkerboard. Compile the demo program by running make checkerboard; an executable checkerboard should be generated. The program demonstrates how to use ppm_io to read and write PPM formatted files, and also shows how the struct Pixel and struct Image are used.
Compare
In the starter code, we also provide a helper executable `img_cmp`, which you can run on a ugrad machine to check whether two PPM files are the same up to a tolerance. Its usage is:
$ ./img_cmp PPM_file1 PPM_file2 [tolerance = 0]
The program takes two PPM files with the same dimensions and compares them pixel by pixel. It counts how many pairs of pixels are within the given tolerance. A pair of pixels (with the same row and column indices) from the two images is said to be within the tolerance if each absolute difference of their three channel values is less than or equal to the tolerance. For example, if you run:
$ ./img_cmp checkerboard1.ppm checkerboard2.ppm 5
it should tell you how many pairs of pixels in `checkerboard1.ppm` and `checkerboard2.ppm` have an absolute difference of more than 5 (i.e. intensity difference).
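The per-pixel tolerance rule described above can be written as a small predicate. This sketch is not the source of `img_cmp`; the `Pixel` layout and the function name are hypothetical and only illustrate the comparison itself.

```c
#include <stdlib.h>

/* Hypothetical layout for illustration; the real one is in ppm_io.h. */
typedef struct { unsigned char r, g, b; } Pixel;

/* Returns 1 when every channel differs by at most tol, 0 otherwise. */
int within_tolerance(Pixel a, Pixel b, int tol) {
    /* unsigned char operands promote to int, so the subtraction
       cannot wrap around before abs() is applied. */
    return abs(a.r - b.r) <= tol
        && abs(a.g - b.g) <= tol
        && abs(a.b - b.b) <= tol;
}
```

With a tolerance of 0, this reduces to an exact equality test on all three channels.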
After you have reviewed checkerboard.c and ppm_io.h, and completed the ppm_io.c file, as an initial test to be sure you’re on the right track, try to read in a PPM file and write it out unchanged. Use the img_cmp program to verify the two files are exactly the same (tolerance 0). Once this works well, begin successively working through the operational commands as listed.
Scaffolding Folder
The scaffolding (i.e. starter code) folder for this project (available in the public repository) provides you with ppm_io.c, ppm_io.h, checkerboard.c, project.c, img_cmp.c, and a Makefile for the project. It also contains some testing PPM files in a folder named data and some expected results, also in the PPM format, in a subfolder named results. Lastly, we provide starter files image_manip.h and image_manip.c, which are where your implementations of the various transformation operations should be added.
Note
Note that the results displayed on this page are PNG versions. You should use the provided PPM ones for comparison.
We encourage you to store the provided PPM images and all created images in a subfolder of your own repository named data, to keep your images separate from your source code files. You don’t need to submit any PPM files to us; keeping them in a separate folder will help you avoid accidentally including them.
Tip
If you're using the `data` subfolder, we suggest you execute your code from within the `data` folder by typing ../project, so you can refer to input filenames directly as `kitten.ppm`, rather than `data/kitten.ppm`, saving yourself the extra typing while testing.
Implementation Details
This section contains detailed descriptions of the image processing operations that you will implement for this assignment. We use the following sample images for all the examples/operations illustrated below.
The original kitten image
The original trees image
Invert
Inverting color values is very straightforward; simply take the value of each component of each pixel, and calculate its “inverse” by subtracting its value from 255. If you apply the invert transform to the kitten.ppm and trees.ppm images, the results should be as shown below. If you invert that resulting photo, you should get the original photo back.
The inverted kitten image
The inverted trees image
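The invert rule described above amounts to one subtraction per channel. A minimal sketch, assuming a hypothetical `Pixel` layout and an in-place function over a flat pixel array (the real project may instead return a new image):

```c
#include <stddef.h>

/* Hypothetical layout for illustration; the real one is in ppm_io.h. */
typedef struct { unsigned char r, g, b; } Pixel;

/* Replace every channel value v with 255 - v, in place. */
void invert_sketch(Pixel *px, size_t n) {
    for (size_t i = 0; i < n; i++) {
        px[i].r = (unsigned char)(255 - px[i].r);
        px[i].g = (unsigned char)(255 - px[i].g);
        px[i].b = (unsigned char)(255 - px[i].b);
    }
}
```

Since 255 - (255 - v) = v, applying the function twice restores the original values, which matches the round-trip property mentioned above.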
Crop
Cropping an image is pretty common. For this operation the user must specify the two corners of the section they want to crop (i.e., keep): one inclusive and one exclusive. That means 4 integer values: the column and then row of the upper-left corner (both inclusive values), and the column and then row of the lower-right corner (both exclusive values). By looking at the differences between those values, you can calculate the size of the new image; this will let you allocate the correct amount of space for the pixel array. Once you’ve done that, you can just use a loop to go through the pixels of the specified region in the original image, and copy each component of each pixel to the new image. You should check that exactly 4 additional arguments are provided for the cropping operation, and that the specified corners make sense (for example, that they lie within the image and that the upper-left corner comes before the lower-right corner in both dimensions). You should report appropriate errors.
If you crop the kitten.ppm image from (top col=200, top row=200) to (bottom col=300, bottom row=300), the result should have 100 rows and 100 columns and look like:
The kitten image cropped with 200 200 300 300
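The crop steps described above (validate the corners, compute the new size, copy the region) could look roughly like this. The `Pixel` layout, the function name, and the exact validity checks are assumptions made for illustration; the assignment leaves the precise error conditions to you.

```c
#include <stdlib.h>

/* Hypothetical layout for illustration; the real one is in ppm_io.h. */
typedef struct { unsigned char r, g, b; } Pixel;

/* Copy columns [c0, c1) and rows [r0, r1) of a row-major image into a
   newly allocated array. Returns NULL for corners that make no sense. */
Pixel *crop_sketch(const Pixel *src, int rows, int cols,
                   int c0, int r0, int c1, int r1,
                   int *out_rows, int *out_cols) {
    if (c0 < 0 || r0 < 0 || c1 > cols || r1 > rows ||
        c0 >= c1 || r0 >= r1) {
        return NULL;
    }
    int nr = r1 - r0;   /* exclusive end, so no +1 */
    int nc = c1 - c0;
    Pixel *dst = malloc((size_t)nr * (size_t)nc * sizeof *dst);
    if (dst == NULL) {
        return NULL;
    }
    for (int r = 0; r < nr; r++) {
        for (int c = 0; c < nc; c++) {
            dst[r * nc + c] = src[(r0 + r) * cols + (c0 + c)];
        }
    }
    *out_rows = nr;
    *out_cols = nc;
    return dst;
}
```

With corners 200 200 300 300 this yields 300 - 200 = 100 rows and 100 columns, matching the kitten example.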
Binarize
To binarize an image into a black and white format, we use thresholding. This operation therefore takes one additional input parameter, the \(threshold\), which is expected to be an integer in the range \(0\) to \(255\) inclusive. In your program, you should check that exactly one parameter is provided for the binarize operation; otherwise, you should report an error. You also need to check that the input \(threshold\) is an integer and that it lies between \(0\) and \(255\). If not, you should report an error.
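One way to perform the threshold checks described above is with `strtol`, which reports where parsing stopped and so can reject inputs such as "12.5" or "abc". The function name and return convention below are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse s as an integer threshold in [0, 255].
   Returns 0 and stores the value in *out on success, -1 otherwise. */
int parse_threshold(const char *s, int *out) {
    char *end = NULL;
    errno = 0;
    long v = strtol(s, &end, 10);
    /* end == s: no digits at all; *end != '\0': trailing junk. */
    if (end == s || *end != '\0' || errno == ERANGE) {
        return -1;
    }
    if (v < 0 || v > 255) {
        return -1;
    }
    *out = (int)v;
    return 0;
}
```

By contrast, `atoi` cannot distinguish the input "0" from a non-numeric string, which is why it is avoided here.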
To implement this operation, you will need to first convert each pixel to a grayscale version using the provided pixel_to_gray helper function. Then, you can calculate a single \(binary\) value by comparing the \(grayscale\) value with the \(threshold\) value. The \(binary\) value is set to \(0\) if the \(grayscale\) value is smaller than the threshold. Otherwise, it is set to \(255\). For each pixel, assign the same \(binary\) value to all three color channels of your output image.
For example, if you run the below command:
$ ./project kitten.ppm kitten-binarize-127.ppm binarize 127
the result should look like:
The binarized kitten image with threshold 127
If you run:
$ ./project kitten.ppm kitten-binarize-200.ppm binarize 200
the result should look like:
The binarized kitten image with threshold 200
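The thresholding step can be sketched as below. Note the assumptions: the signature of the provided pixel_to_gray helper is not shown in this document, so the sketch uses a stand-in named `gray_standin` with a common integer luminance weighting that may differ from the real helper; the `Pixel` layout and the in-place style are also hypothetical.

```c
#include <stddef.h>

/* Hypothetical layout for illustration; the real one is in ppm_io.h. */
typedef struct { unsigned char r, g, b; } Pixel;

/* Stand-in for the provided pixel_to_gray helper (assumed weighting;
   use the real helper in your project). */
static unsigned char gray_standin(Pixel p) {
    return (unsigned char)((30 * p.r + 59 * p.g + 11 * p.b) / 100);
}

/* Set each pixel to black (0) if its gray value is below the
   threshold, and to white (255) otherwise, in place. */
void binarize_sketch(Pixel *px, size_t n, int threshold) {
    for (size_t i = 0; i < n; i++) {
        unsigned char gray = gray_standin(px[i]);
        unsigned char v = (gray < threshold) ? 0 : 255;
        px[i].r = v;
        px[i].g = v;
        px[i].b = v;
    }
}
```

A gray value exactly equal to the threshold maps to 255, because only strictly smaller values become 0.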
Zoom_out
To keep things straightforward, you will only implement a single zoom out scale. In order to perform a zoom out, we take a 2x2 square of pixels in the input image and average each of the three color channels of the four pixels to make a single pixel. This means a zoomed out picture has half as many rows and half as many columns as the original image. However, note that the number of rows and/or columns in the input image might be odd. In this case we will simply discard the data in the odd bottom row and/or odd rightmost column.
$ ./project kitten.ppm kitten_zoom_out.ppm zoom_out
will result in:
The zoomed out kitten image
and,
$ ./project trees.ppm trees_zoom_out.ppm zoom_out
will result in:
The zoomed out trees image
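The 2x2 averaging described above can be sketched as follows. The `Pixel` layout and function name are hypothetical, and the sketch assumes the average is truncated by integer division, since the rounding rule is not stated here; compare against the provided expected results to confirm.

```c
#include <stdlib.h>

/* Hypothetical layout for illustration; the real one is in ppm_io.h. */
typedef struct { unsigned char r, g, b; } Pixel;

/* Halve a row-major image by averaging each 2x2 block. An odd final
   row or column is dropped because rows / 2 and cols / 2 truncate. */
Pixel *zoom_out_sketch(const Pixel *src, int rows, int cols,
                       int *out_rows, int *out_cols) {
    int nr = rows / 2;
    int nc = cols / 2;
    if (nr <= 0 || nc <= 0) {
        return NULL;
    }
    Pixel *dst = malloc((size_t)nr * (size_t)nc * sizeof *dst);
    if (dst == NULL) {
        return NULL;
    }
    for (int r = 0; r < nr; r++) {
        for (int c = 0; c < nc; c++) {
            const Pixel *tl = &src[(2 * r) * cols + 2 * c]; /* top-left */
            const Pixel *tr = tl + 1;                       /* top-right */
            const Pixel *bl = tl + cols;                    /* bottom-left */
            const Pixel *br = bl + 1;                       /* bottom-right */
            Pixel *o = &dst[r * nc + c];
            o->r = (unsigned char)((tl->r + tr->r + bl->r + br->r) / 4);
            o->g = (unsigned char)((tl->g + tr->g + bl->g + br->g) / 4);
            o->b = (unsigned char)((tl->b + tr->b + bl->b + br->b) / 4);
        }
    }
    *out_rows = nr;
    *out_cols = nc;
    return dst;
}
```

The channel sums are computed in `int`, so adding four values of up to 255 cannot overflow before the division.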
Tip
Make sure that your output image has the correct dimensions; that is, the number of rows and the number of columns in the zoomed out image are each half of the corresponding value in the original image (rounded down when the original value is odd). Also note that "zoom_out" is the command line argument to type when executing this operation.
Pointilism
Pointilism is a painting technique that uses distinct colored dots. You can read about it here. In this part, we would like to apply a pointilism-like effect to an input image. In order to accomplish this, we randomly select a small set (3%) of the pixels in the input image and apply the effect to them. To do so, imagine that each randomly selected pixel is at the center of a circle with a random radius of between 1 and 5. Now, applying the pointilism effect comes down to coloring all the pixels that reside in that circle with the same color as the color of the randomly selected pixel which is at the center of the circle. This gives the input image a cool “painting-like” look by creating a set of small filled circles (i.e., dots) across the image, where each dot is uniformly colored with the color of the randomly selected pixel which is located at the center of that dot. Note that if a randomly selected pixel is near or at a boundary, then you do not need to apply the effect to the parts that may extend past the borders of the image.
For example, if we do:
$ ./project trees.ppm trees_pointilism.ppm pointilism
the following will result:
Pointilism applied on trees image
Note 1
Make sure to apply the effect to only 3% of the total pixels of the input image. For instance, if the input image dimensions are 800x600, then you should apply the effect to only 14400 randomly selected pixels (0.03 x 480000). Also, for each randomly selected pixel, use a random radius between 1 and 5 (i.e., 1 ≤ radius ≤ 5) when creating the filled colored dot.
Note 2
When implementing this operation, you will need to generate random numbers. Use the `srand` function to set the random seed according to the operation's `seed` parameter. To facilitate the testing and grading of your work, do not make any other calls to the `srand` function. When calling the pointilism function, always pass an argument value of 1 for the `seed` parameter.
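Putting the description and both notes together, the pointilism loop might be sketched as below. Several details are assumptions rather than requirements: the `Pixel` layout and function names are hypothetical, the order of the `rand` calls (column, then row, then radius) is a guess, and the sketch paints in place. Matching the provided expected results exactly depends on these choices, so treat this only as an outline of the technique.

```c
#include <stdlib.h>

/* Hypothetical layout for illustration; the real one is in ppm_io.h. */
typedef struct { unsigned char r, g, b; } Pixel;

/* Number of dot centers: 3% of all pixels, in integer arithmetic. */
long dot_count(int rows, int cols) {
    return (long)rows * cols * 3 / 100;
}

/* Paint filled dots of radius 1..5 around randomly chosen pixels. */
void pointilism_sketch(Pixel *px, int rows, int cols, unsigned int seed) {
    if (px == NULL || rows <= 0 || cols <= 0) {
        return;
    }
    srand(seed);                        /* the only call to srand */
    long n = dot_count(rows, cols);
    for (long i = 0; i < n; i++) {
        int cx = rand() % cols;         /* assumed call order */
        int cy = rand() % rows;
        int radius = rand() % 5 + 1;    /* 1 <= radius <= 5 */
        Pixel color = px[cy * cols + cx];
        for (int y = cy - radius; y <= cy + radius; y++) {
            if (y < 0 || y >= rows) continue;      /* clip to image */
            for (int x = cx - radius; x <= cx + radius; x++) {
                if (x < 0 || x >= cols) continue;  /* clip to image */
                int dx = x - cx;
                int dy = y - cy;
                if (dx * dx + dy * dy <= radius * radius) {
                    px[y * cols + x] = color;
                }
            }
        }
    }
}
```

Comparing squared distances keeps the circle test in integer arithmetic and avoids calling `sqrt` for every candidate pixel.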