I suggest using an inner loop to handle the special characters. In C99:
Code:
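#include <stdio.h>
#include <ctype.h>

int main(void)
{
    int c;

    /* Skip all initial whitespace. */
    c = getchar();
    while (isspace(c))
        c = getchar();

    while (c != EOF) {
        if (isspace(c)) {
            int newline = 0;

            /* Consume the whole whitespace run; note whether it had a line break. */
            while (isspace(c)) {
                if (c == '\n' || c == '\r')
                    newline = 1;
                c = getchar();
            }

            /* Whitespace that runs up to EOF is trailing, so end the line. */
            if (newline || c == EOF)
                putchar('\n');
            else
                putchar(' ');
        } else {
            putchar(c);
            c = getchar();
            /* Input ended without a newline: add one. */
            if (c == EOF)
                putchar('\n');
        }
    }
    return 0;
}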
If you save it as filter.c, you can compile it using
Code:
gcc -std=c99 filter.c -Wall -O3 -fomit-frame-pointer -o filter.exe
This code does not have the exact same functionality, though. This code
- Uses the locale convention for whitespace
In the default POSIX/C locale: spaces, tabs (\t and \v), form feeds (\f), linefeeds (\n) and carriage returns (\r)
- Skips all initial whitespace in the file
- Removes all leading and trailing whitespace on each line
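- Combines any sequence of whitespace (excluding newlines) into a single space
- Collapses any run of whitespace that contains a newline into a single newline, so empty lines are dropped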
- Converts all newline conventions to the default one (\n)
(Note that on Windows, if the standard output is open in text mode, each \n may be written out as CR LF, i.e. \r\n. I don't use Windows, so I'm not sure.)
- Makes sure the file ends with a single newline
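To check these rules on small inputs, here is a minimal sketch of the same logic working on a string instead of standard input. The name normalize_ws and its buffer-size contract are my own assumptions for illustration; it is not part of the program above.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of the original program): applies the
 * whitespace rules listed above to a NUL-terminated string.
 * `out` must have room for strlen(in) + 2 bytes: the text, a possible
 * added final newline, and the terminating NUL.
 * Returns the number of characters written, excluding the NUL. */
size_t normalize_ws(const char *in, char *out)
{
    size_t n = 0;

    /* Skip all initial whitespace. */
    while (isspace((unsigned char)*in))
        in++;

    while (*in != '\0') {
        if (isspace((unsigned char)*in)) {
            int newline = 0;

            /* Consume the whole run, noting whether it held a line break. */
            while (isspace((unsigned char)*in)) {
                if (*in == '\n' || *in == '\r')
                    newline = 1;
                in++;
            }

            /* A run that reaches the end of the input is trailing whitespace. */
            out[n++] = (newline || *in == '\0') ? '\n' : ' ';
        } else {
            out[n++] = *in++;
            /* Input ended without a newline: add one. */
            if (*in == '\0')
                out[n++] = '\n';
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, the input "  a \t b \r\n\r\n c  " comes out as "a b\nc\n": the leading spaces vanish, the space-tab-space run becomes one space, the CR LF pairs and the whitespace around them become one newline, and the trailing spaces turn into the final newline.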
The main difference from the other ones in this thread is that in this one, c is read from input during the iteration (in the loop body), not at the start of the iteration. Because of this, we can use inner loops to read further input. (If you always read the next character at the start of the iteration, the inner loops would need to use ungetc() to push back the character that ended the inner loop, making for complex and confusing code.)
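For comparison, here is a rough sketch of what the read-at-the-start style might look like with ungetc(). The function name filter_with_ungetc, the FILE * parameters, and the two flag variables are my own assumptions for illustration; this is not anyone's actual code from the thread.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical variant for comparison: the outer loop reads one character
 * at the start of every iteration, so the inner loop has to push back the
 * character that ended it. */
void filter_with_ungetc(FILE *in, FILE *out)
{
    int c;
    int started = 0;  /* set once the first non-whitespace character is written */
    int need_nl = 0;  /* last character written was text, so a final newline is owed */

    while ((c = getc(in)) != EOF) {
        if (isspace(c)) {
            int newline = 0;

            /* Consume the whole whitespace run. */
            do {
                if (c == '\n' || c == '\r')
                    newline = 1;
                c = getc(in);
            } while (isspace(c));

            if (c != EOF)
                ungetc(c, in);  /* hand the terminator back to the outer loop */
            if (!started)
                continue;       /* initial whitespace: emit nothing */

            /* A run that reaches EOF is trailing whitespace: end the line. */
            putc((newline || c == EOF) ? '\n' : ' ', out);
            need_nl = 0;
        } else {
            putc(c, out);
            started = 1;
            need_nl = 1;
        }
    }

    /* Input ended without a newline: add one. */
    if (need_nl)
        putc('\n', out);
}
```

It should produce the same output, but it needs two extra flags plus the push-back, because the initial skip and the final newline can no longer be handled where the character is read.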