IO Functions

C provides a number of functions for communicating with external devices, called input and output functions, or I/O functions for short. Input refers to getting data from outside the program, and output refers to passing data from the program to the outside.

Caching and byte streams

Strictly speaking, the input and output functions do not communicate directly with external devices, but indirectly through a cache (buffer). This sub-section describes what a cache is.

Ordinary files are generally stored on disk, and reading or writing data to or from disk is a very slow operation compared to the CPU. The solution in C is to set up a buffer in memory for a file whenever it is opened.

When the program writes to the file, it puts the data into the cache first, and when the cache is full, it writes the data to the disk file in one go. At that point, the cache is empty, and the program puts new data into the cache and repeats the process.

When a program reads data from a file, a chunk of the file is first copied into the cache, and the program then takes its data from the cache. When the cache is empty, the next chunk is loaded from disk into the cache and the whole process repeats.

Memory can be read and written much faster than disk, and the cache design reduces the number of reads and writes to disk, greatly improving the efficiency of program execution. In addition, moving large chunks of data at once is much faster than moving small chunks of data multiple times.
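The buffering described above can be made visible with the standard functions setvbuf() and fflush(). The following is only a minimal sketch: the helper name write_buffered() is hypothetical, and the choice of a BUFSIZ-byte, fully buffered stream is an assumption made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: write `text` to `fp` through a fully buffered
   stream. Returns 0 on success and -1 on failure. */
int write_buffered(FILE *fp, const char *text)
{
    /* Ask the library to allocate a BUFSIZ-byte buffer for this stream.
       setvbuf() must be called before any other operation on fp. */
    if (setvbuf(fp, NULL, _IOFBF, BUFSIZ) != 0)
        return -1;

    /* The characters are placed in the in-memory buffer first. */
    if (fputs(text, fp) == EOF)
        return -1;

    /* fflush() forces whatever is still buffered out to the file. */
    return fflush(fp) == 0 ? 0 : -1;
}
```

Without the fflush() call, the data would normally reach the file only when the buffer fills up or the stream is closed.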

This read/write pattern resembles a stream: the program does not read or write all the data at once, but processes it continuously. It handles one portion of the data, waits until the cache has been emptied or refilled, and then handles the next portion. This process is called a byte stream operation.

Because data leaves the cache once it has been consumed, data in a byte stream can be read only once; it cannot be read a second time. This is very different from reading a file directly, where the same data can be read repeatedly.

C's input and output functions, where they involve reading or writing to a file, are all byte stream operations. The input function gets data from the file and operates on the input stream; the output function writes data to the file and operates on the output stream.

printf()

printf() is the most commonly used output function and writes to the screen. Its prototype is defined in the header file stdio.h; see the chapter Basic Syntax for details.

scanf()

Basic usage

The scanf() function is used to read the user's keyboard input. When the program runs to this statement, it stops and waits for the user to type from the keyboard. Once the user has entered the data and pressed the enter key, scanf() processes the user's input and stores it in a variable. Its prototype is defined in the header file stdio.h.

The syntax of scanf() is similar to that of printf().

scanf("%d", &i);

Its first argument is a format string containing placeholders (essentially the same as those in printf()), which tells scanf() how to interpret the user's input and what type of data to extract. This is necessary because C data is typed, and scanf() must know in advance what type of data the user is entering in order to process it. The remaining arguments are the variables that hold the user input; there must be as many variables as there are placeholders in the format string.

In the example above, the first argument to scanf(), %d, indicates that the user input should be an integer. %d is a placeholder, % is the sign of the placeholder, and d indicates an integer. The second argument, &i, indicates that the integer entered by the user from the keyboard will be stored in the variable i.

Note that each variable must be preceded by the & operator (except for pointer variables), because scanf() needs not the variable's value but its address, so that it can store the value entered by the user at that address. If the variable is already a pointer (for example, a character array that holds a string), the & operator is not needed.

Here is an example of reading keyboard input into more than one variable at a time.

scanf("%d%d%f%f", &i, &j, &x, &y);

In the example above, the format string %d%d%f%f indicates that the first two user inputs are integers and the last two are floating point numbers, e.g. 1 -20 3.4 -4.0e3. These four values are put into the four variables i, j, x and y in that order.

When scanf() handles numeric placeholders, it automatically skips whitespace characters, including spaces, tabs, line breaks, etc. Therefore, one or more spaces between the values entered by the user do not affect the way scanf() interprets the data. Likewise, pressing the enter key to break the input into several lines does not affect interpretation.

1
-20
3.4
-4.0e3

In the above example, the user has split the input into four lines and the result is exactly the same as a single line of input. Each time the enter key is pressed, scanf() will start interpreting, and if the first line matches the first placeholder, then the next time the enter key is pressed, it will start interpreting from the second placeholder.
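The whitespace handling described above can be checked with a fixed input. The sketch below uses sscanf() (covered later in this chapter) in place of scanf() so that the input is a known string rather than the keyboard; the helper name parse_four() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parse two integers and two floats from `input`.
   Returns the number of values assigned (4 on full success). */
int parse_four(const char *input, int *i, int *j, float *x, float *y)
{
    /* Numeric placeholders skip leading spaces, tabs and newlines,
       so the values may be separated by any mix of them. */
    return sscanf(input, "%d%d%f%f", i, j, x, y);
}
```

Calling parse_four() with "1 -20 3.4 -4.0e3" and with "1\n-20\n3.4\n-4.0e3\n" stores the same four values in both cases.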

The way scanf() handles user input is that the input is first put into a cache, and when the Enter key is pressed, the cache is interpreted according to the placeholders. Interpretation starts with the first character left over from the previous interpretation and continues until the cache has been read completely, or until the first character that does not match the placeholder is encountered.

int x;
float y;

// user input " -13.45e12# 0"
scanf("%d", &x);
scanf("%f", &y);

In the above example, when scanf() reads the user input, the %d placeholder ignores the space at the beginning and fetches the data from -, stopping at -13 because the . is not a valid character for an integer. This means that the placeholder %d will read up to -13.

The second call to scanf() continues reading from where the previous call stopped. This time the first character read is the period (.), and since the corresponding placeholder is %f, it reads .45e12, which is a floating point number in scientific notation. The # that follows is not a valid character for a floating point number, so reading stops there.

Since scanf() can handle multiple placeholders in succession, the above example could also be written as follows.

scanf("%d%f", &x, &y);
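The stopping points in this example can be observed with the %n placeholder, which stores the number of characters consumed so far. This is only a sketch: it uses sscanf() on a fixed string instead of keyboard input, and the helper name parse_int_then_float() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parse an int, then a float, and report through
   *consumed how many characters of `input` were read. */
int parse_int_then_float(const char *input, int *x, float *y, int *consumed)
{
    *consumed = 0;
    /* %n stores the count of characters read so far. It assigns a value
       but is not included in the return value. */
    return sscanf(input, "%d%f%n", x, y, consumed);
}
```

For the input " -13.45e12# 0" the function returns 2, x receives -13, y receives .45e12, and input[*consumed] is the # character where parsing stopped.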

The return value of scanf() is an integer indicating the number of variables successfully assigned. If the first item fails to match, 0 is returned. If the end of the input is reached, or a read error occurs, before any item has been read, the constant EOF is returned.
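Checking the return value is the only way to know how many variables actually received a value. The sketch below uses sscanf(), which follows the same return-value rules as scanf(), so that each case can be tried with a fixed string; the helper name count_two_ints() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: try to read two integers from `input` and
   return what sscanf() returns. */
int count_two_ints(const char *input)
{
    int a, b;

    return sscanf(input, "%d%d", &a, &b);
}
```

With this helper, "1 2" gives 2, "1 x" gives 1 (the second item does not match), "x" gives 0, and an empty string gives EOF because the input ends before anything can be read.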

Placeholders

The placeholders commonly used by scanf() are as follows, and are essentially the same as those used by printf().

  • %c: character.
  • %d: integers.
  • %f: float type floating point numbers.
  • %lf: double type floating point number.
  • %Lf: long double type floating point number.
  • %s: string.
  • %[]: specifies a set of matching characters in square brackets (e.g. %[0-9]); matching stops when a character not in the set is encountered.

All of the above placeholders, except %c and %[], automatically skip whitespace characters at the beginning. %c does not skip whitespace and always returns the first character, whether or not it is a space. To force leading whitespace to be skipped before a character, write scanf(" %c", &ch): the space before %c tells scanf() to skip zero or more whitespace characters.
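The difference between "%c" and " %c" is easy to see side by side. The following sketch uses sscanf() on a fixed string rather than keyboard input, and both helper names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: read the very first character, whitespace included. */
char first_char_raw(const char *input)
{
    char ch = '\0';

    sscanf(input, "%c", &ch);
    return ch;
}

/* Hypothetical helper: skip leading whitespace, then read one character. */
char first_char_skip_ws(const char *input)
{
    char ch = '\0';

    /* The space in the format matches zero or more whitespace characters. */
    sscanf(input, " %c", &ch);
    return ch;
}
```

For the input "  a", first_char_raw() returns a space, while first_char_skip_ws() returns the letter a.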

The following is a special mention of the placeholder %s, which cannot really be simply equated to a string. Its rule is that it is read from the first non-whitespace character present until it encounters a whitespace character (i.e. space, newline, tab, etc.). Because %s will not contain whitespace characters, it cannot be used to read multiple words unless multiple %s are used together. This also means that scanf() is not suitable for reading strings that may contain spaces, such as book titles or song titles. Also, scanf() will store a null character \0 at the end of a string variable when it encounters a %s placeholder.
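As a sketch of how %s splits input at whitespace, the helper below reads two words with two %s placeholders. The helper name split_two_words(), the buffer size of 20 and the matching width of 19 are assumptions chosen for illustration; sscanf() stands in for scanf() so the input is a fixed string.

```c
#include <stdio.h>

/* Hypothetical helper: read the first two whitespace-separated words.
   Each buffer must hold at least 20 characters. */
int split_two_words(const char *input, char first[20], char second[20])
{
    /* Each %19s skips leading whitespace, then reads up to the next
       whitespace character and appends a terminating '\0'. */
    return sscanf(input, "%19s%19s", first, second);
}
```

Given "  War and Peace", the function stores "War" and "and"; the third word is never read because there is no third placeholder.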

When scanf() reads a string into an array of characters, it does not check whether the string exceeds the length of the array. Storing a string may therefore write past the bounds of the array, leading to unintended results. To prevent this, specify the maximum length when using the %s placeholder, i.e. write it as %ms, where m is an integer giving the maximum number of characters to read. Characters beyond that limit are not stored in the array; they remain in the cache for the next read.

char name[11];
scanf("%10s", name);

In the above example, name is an array of characters of length 11. The placeholder %10s in scanf() means that at most 10 characters of the user input will be read, leaving room for the terminating null character, so there is no risk of array overflow. Any remaining characters stay in the cache.
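A width limit stops the read but does not consume the rest of the word: the leftover characters stay in the input and are picked up by the next placeholder. The sketch below shows this with sscanf() on a fixed string; the helper name split_with_width() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: read at most 10 characters into `head`, then at
   most 10 more into `rest`. Both buffers hold 11 characters. */
int split_with_width(const char *input, char head[11], char rest[11])
{
    /* The first %10s stops after 10 characters even though the word
       continues; the second %10s resumes from that point. */
    return sscanf(input, "%10s%10s", head, rest);
}
```

For the input "abcdefghijklmno", head receives "abcdefghij" and rest receives "klmno".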

The assignment suppression character

Sometimes the user input may not conform to the intended format.

scanf("%d-%d-%d", &year, &month, &day);

In the example above, if the user enters 2020-01-01, the year, month and day will be interpreted correctly. The problem is that the user may enter another format, such as 2020/01/01, in which case scanf() will fail to parse the data.

To avoid this, scanf() provides an assignment suppression character *. As long as * is added after the percent sign of any placeholder, that placeholder will not return a value and will be discarded after parsing.

scanf("%d%*c%d%*c%d", &year, &month, &day);

In the example above, %*c is a %c placeholder with the assignment suppression character * added after the percent sign. It indicates that this placeholder has no corresponding variable: the character is read and then discarded.
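A quick way to confirm that the separator no longer matters is to run the same format over two differently punctuated dates. This is a minimal sketch using sscanf() on fixed strings; the helper name parse_date() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parse a date whose fields are separated by any
   single character. Returns the number of fields assigned (3 on success). */
int parse_date(const char *input, int *year, int *month, int *day)
{
    /* Each %*c reads one separator character and throws it away. */
    return sscanf(input, "%d%*c%d%*c%d", year, month, day);
}
```

Both "2020-01-01" and "2020/01/01" yield year 2020, month 1 and day 1.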

sscanf()

The sscanf() function is very similar to scanf(), with the difference that sscanf() takes data from inside a string, rather than from user input. Its prototype is defined in the header file stdio.h.

int sscanf(const char* s, const char* format, ...) ;

The first argument to sscanf() is a pointer to a string from which to retrieve data. All other arguments are the same as scanf().

sscanf() is mainly used to process strings read in by other input functions to extract data from them.

fgets(str, sizeof(str), stdin);
sscanf(str, "%d%d", &i, &j);

In the above example, fgets() first fetches a line of data from standard input (see the next chapter for a detailed description of fgets()) and deposits it into the character array str. Then, sscanf() takes two integers from the string str and puts them into the variables i and j.
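The fgets() plus sscanf() combination can be wrapped in a small function. The sketch below is one possible arrangement, not the only one: the helper name read_two_ints(), the FILE * parameter and the 128-character line buffer are all assumptions.

```c
#include <stdio.h>

/* Hypothetical helper: read one line from `in` and extract two integers.
   Returns the number of integers assigned, or EOF if no line is available. */
int read_two_ints(FILE *in, int *i, int *j)
{
    char line[128];   /* assumed to be long enough for one input line */

    /* fgets() returns NULL at end of input or on a read error. */
    if (fgets(line, sizeof line, in) == NULL)
        return EOF;

    /* Parsing happens on the stored line, not on the stream itself. */
    return sscanf(line, "%d%d", i, j);
}
```

Calling read_two_ints(stdin, &i, &j) reads from the keyboard; any other open stream can be passed in the same way.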

One advantage of sscanf() is that its data source is not stream data, so it can be used over and over again, unlike scanf() whose data source is stream data and can only be read once.
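Because the string is not consumed, the same text can be parsed more than once, for instance first to check its general shape and then to extract every field. The following is only a sketch of that idea; the helper name parse_twice() and the 16-character word buffer are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parse `line` twice from the beginning.
   Returns the number of fields assigned by the second pass, or 0 if the
   line does not start with an integer. */
int parse_twice(const char *line, int *number, char word[16])
{
    /* First pass: does the line start with an integer at all? */
    if (sscanf(line, "%d", number) != 1)
        return 0;

    /* Second pass: the same string is read again from its first character. */
    return sscanf(line, "%d %15s", number, word);
}
```

For "42 apples" the second pass returns 2, with 42 in number and "apples" in word.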

The return value of sscanf() is the number of variables successfully assigned, or the constant EOF if the end of the string is reached before any value has been converted.

getchar(), putchar()

(1) getchar()

The getchar() function returns a character entered by the user from the keyboard and is used without any arguments. The program runs to this command and pauses, waiting for the user to type from the keyboard, which is equivalent to reading a character using the scanf() method. Its prototype is defined in the header file stdio.h.

char ch;
ch = getchar();

// Equivalent to
scanf("%c", &ch);

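getchar() does not ignore whitespace at the beginning and always returns the first character currently read, whether it is a space or not. If the read fails, the constant EOF is returned. Because EOF is a negative value (usually -1) that must be distinguishable from every valid character, the return type is int, not char, and the result should be stored in an int variable whenever the program needs to test for EOF.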

Since getchar() returns the character read, it can be used in a loop condition.

while (getchar() != '\n')
  ;

The above example exits the loop only when the character read is the newline character (\n), and is commonly used to skip the rest of a line. The body of the while loop is empty, meaning that nothing is done with the characters read.

The following example counts the characters on a line.

int len = 0;
while (getchar() != '\n')
  len++;

In the above example, the variable len is incremented by 1 for each character read by getchar() until a newline is read, at which point len is the number of characters on that line.

The following example skips space characters.

while ((ch = getchar()) == ' ')
;

The above example ends the loop with the variable ch equal to the first character that is not a space.
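A loop that waits only for '\n' never ends if the input runs out before a newline arrives, because getchar() then keeps returning EOF. The sketch below adds an EOF check to the line-counting and space-skipping loops. It uses getc(), of which getchar() is the stdin-only form (getchar() is equivalent to getc(stdin)), so the functions work on any stream; both helper names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: count the characters up to the next newline
   or the end of input, whichever comes first. */
long count_line_chars(FILE *in)
{
    int c;          /* int, not char, so EOF stays distinguishable */
    long len = 0;

    while ((c = getc(in)) != '\n' && c != EOF)
        len++;
    return len;
}

/* Hypothetical helper: skip spaces and return the first character that
   is not a space, or EOF if the input ends first. */
int skip_spaces(FILE *in)
{
    int c;

    while ((c = getc(in)) == ' ')
        ;
    return c;
}
```

Calling count_line_chars(stdin) or skip_spaces(stdin) behaves like the getchar() loops, except that both return when the input ends.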

(2) putchar()

The putchar() function outputs its argument character to the screen, which is equivalent to outputting a character using printf(). Its prototype is defined in the header file stdio.h.

putchar(ch);
// Equivalent to
printf("%c", ch);

When the operation succeeds, putchar() returns the output character, otherwise it returns the constant EOF.
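The return value makes it possible to stop as soon as a write fails. The following sketch writes a string one character at a time; it uses putc(), of which putchar() is the stdout-only form (putchar(ch) is equivalent to putc(ch, stdout)), and the helper name put_string() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: write `s` to `out` one character at a time.
   Returns the number of characters written, or EOF on a write error. */
int put_string(FILE *out, const char *s)
{
    int count = 0;

    for (; *s != '\0'; s++) {
        /* putc() returns the character written, or EOF on failure. */
        if (putc(*s, out) == EOF)
            return EOF;
        count++;
    }
    return count;
}
```

Calling put_string(stdout, "abc") prints abc on the screen and returns 3.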

(3) Summary

The functions getchar() and putchar() are simpler to use than scanf() and printf() and, because they are usually implemented as macros, they are typically faster as well. When manipulating a single character, these two functions are recommended in preference.

puts()

The puts() function is used to display the argument string on the screen (stdout) and automatically adds a newline character to the end of the string. Its prototype is defined in the header file stdio.h.

puts("Here are some messages:");
puts("Hello World");

In the above example, puts() outputs two lines on the screen.

On a successful write, puts() returns a non-negative integer, otherwise it returns the constant EOF.

gets()

The gets() function was formerly used to read an entire line of input from stdin. It was deprecated and then removed from the standard in C11; it is presented here only for reference.

This function reads a line of user input, without skipping leading whitespace, until it encounters a newline. It discards the newline character, stores the remaining characters in the argument variable, and appends a null character \0 to make a string.

It is often used in conjunction with puts().

char words[81];

puts("Enter a string, please");
gets(words);

The above example uses puts() to output a prompt on the screen, and then uses gets() to get the user's input.

Because the line read by gets() may be longer than the character array that receives it, gets() is a security risk. It is recommended not to use it, but to use fgets() instead.
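As a rough replacement for gets(), the sketch below reads a line with fgets() and then removes the trailing newline, which fgets() keeps and gets() discarded. The helper name read_line() is hypothetical, and stripping the newline with strcspn() is one common approach rather than the only one.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` into `buf`, which holds
   `size` characters. Returns buf, or NULL at end of input or on error. */
char *read_line(FILE *in, char *buf, int size)
{
    /* fgets() stores at most size - 1 characters plus a '\0',
       so it cannot write past the end of buf. */
    if (fgets(buf, size, in) == NULL)
        return NULL;

    /* strcspn() gives the index of the first '\n', or the string length
       if there is none; either way the assignment is in bounds. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

With char words[81], the call read_line(stdin, words, sizeof words) takes the place of gets(words). If a line is longer than the buffer, the remainder stays in the input and is returned by the next call.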