Imagine a list of files (either in a text file, or the output of a command). First make sure the list is newline separated. Why newline? Because newlines almost never show up in filenames. Strictly speaking a filename can contain a newline (the only bytes it can't contain are the null char and the slash), so a null-separated list is the fully safe option, but we won't cover that here: most file lists you are handed come as one file path per line.

So imagine a list of files:

file.txt contents:
/cloudfs/cc1/path1/asdfasdf.txt
/cloudfs/cc1/path1/asdfasdf.txt
/cloudfs/cc1/path1/asdfa sdf.txt

Most people would do this:

(IFS=$'\n'
for i in `cat file.txt`; do
dosomething "$i"
done)

By default, the unquoted output of cat is split on spaces, tabs and newlines before the for loop iterates over it, so it's important to set the IFS variable. IFS is a shell variable that's always there (read up more here: http://www.infotinks.com/ifs-cheatsheet-setting-to-default-to-newline-for-files-folders-with-spaces/). If you forgot the IFS=$'\n', the loop above would have processed that last file as two different files: /cloudfs/cc1/path1/asdfa and another called sdf.txt. Since we did set it, the path stays in one piece. I used a subshell because IFS should be set back after you change it. A subshell gets its own copy of your variables, so changes made inside it (and in its children) never touch your own shell, and you don't have to worry about anything.
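To see the difference for yourself, here is a small sketch that counts how many items the for loop sees with and without the IFS change. The temp file and the count_default / count_newline names are made up for this demo, and IFS is written with a literal newline (same effect as IFS=$'\n') so it also runs under plain sh.

```shell
# Throwaway list: one normal path and one path containing a space.
list=$(mktemp)
printf '%s\n' '/cloudfs/cc1/path1/asdfasdf.txt' '/cloudfs/cc1/path1/asdfa sdf.txt' > "$list"

# Default IFS (space, tab, newline): the second path is split in two.
count_default=0
for i in $(cat "$list"); do
  count_default=$((count_default + 1))
done

# IFS set to newline only, inside a subshell so the parent IFS is untouched.
count_newline=$(
  IFS='
'
  n=0
  for i in $(cat "$list"); do
    n=$((n + 1))
  done
  echo "$n"
)

echo "default IFS: $count_default items"   # default IFS: 3 items
echo "newline IFS: $count_newline items"   # newline IFS: 2 items
rm -f "$list"
```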

Another option is to do this:

OLDIFS=$IFS
IFS=$'\n'
for i in `cat file.txt`; do
dosomething "$i"
done
IFS=$OLDIFS

That way the IFS variable gets set back. However, that costs extra lines. I like the subshell method better: it saves you from having to set variables (like IFS) back to their previous values. Also, when you paste a subshell into a terminal, the shell waits for the closing parenthesis and then runs everything at once, like a script, instead of going line by line and showing you the PS1 prompt every time (very annoying; subshells avoid this). And a subshell keeps that i variable from carrying over into your main/parent shell.

So what's the best way? Not a for loop. You don't even need a subshell, and you don't need to change IFS for the whole shell. While read loops are the way to go.

Why are while read loops best for reading lists of files? Because read consumes exactly one line at a time, so a space inside a path never splits it into two items.

# IFS= and -r make read keep leading/trailing spaces and backslashes as-is
(cat file.txt | while IFS= read -r i; do
dosomething "$i"
done)

Or without a subshell. (I personally prefer to run everything in a subshell, so I would use the method above. I only avoid subshells in actual shell scripts. I do a lot of copy-paste commands: write in Notepad, copy, paste into PuTTY/shell. So subshells are my best buddy.)

cat file.txt | while IFS= read -r i; do
dosomething "$i"
done
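One detail worth knowing about read: called plainly, it trims leading and trailing whitespace and treats backslashes as escape characters, which can quietly change an odd filename. The sketch below compares plain read with IFS= read -r on a made-up path (the file name and the plain / exact variables are just for this demo).

```shell
list=$(mktemp)
# One hypothetical path containing a backslash and a trailing space.
printf '%s\n' '/tmp/dir\name.txt ' > "$list"

# Plain read: the backslash is swallowed and the trailing space is trimmed.
while read i; do plain=$i; done < "$list"

# IFS= read -r: the line comes through exactly as written.
while IFS= read -r i; do exact=$i; done < "$list"

printf 'plain: [%s]\n' "$plain"   # plain: [/tmp/dirname.txt]
printf 'exact: [%s]\n' "$exact"   # exact: [/tmp/dir\name.txt ]
rm -f "$list"
```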

Here is an example of a “tee” replacement (in case you are on some minimal OS that doesn't ship tee):

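: > output.txt
somecommand | while IFS= read -r i; do
echo "$i"
echo "$i" >> output.txt
done

# or in 1 line
: > output.txt; somecommand | while IFS= read -r i; do echo "$i"; echo "$i" >> output.txt; done

The file is emptied once before the loop and appended to inside it. A plain > inside the loop would truncate output.txt on every line and leave only the last line in it. To get append mode (like tee -a), just drop the : > output.txt part.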
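If you use this often, it can be wrapped in a function. This is only a sketch: poor_tee is a made-up name, and instead of redirecting on every line it opens the file once for the whole loop on file descriptor 3, so the file is truncated a single time and every line lands in it.

```shell
# poor_tee FILE: copy stdin to stdout and to FILE (hypothetical helper).
poor_tee() {
  while IFS= read -r line; do
    printf '%s\n' "$line"        # pass the line through to stdout
    printf '%s\n' "$line" >&3    # and write a copy to the file on fd 3
  done 3> "$1"                   # file is opened (and truncated) once here
}

out=$(mktemp)
shown=$(printf '%s\n' one two three | poor_tee "$out")
saved=$(cat "$out")
rm -f "$out"
```

For append mode, 3> "$1" would become 3>> "$1".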

The end.

One thought on “Don’t Use ‘for’ Loops for File Iteration – Use ‘while read’ Loops”

  1. The only caveat to this method is that variables set inside the while loop can't be accessed after the loop. Variables from outside the loop, as long as they are set before it, can be read inside it. With for loops it works both ways: variables set inside are visible afterwards, and variables from outside are visible inside. The way I think about it, the while loop acts like a subshell (and in bash it really is one here, because each part of a pipeline such as cat file.txt | while ... runs in its own subshell).
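A quick sketch of that caveat and one way around it, assuming bash with default settings (some shells, such as ksh and zsh, run the last part of a pipeline in the current shell and behave differently). Feeding the loop with a redirect instead of a pipe keeps it in the current shell, so the variable survives. The temp file and the piped / redirected counters are made up for the demo.

```shell
list=$(mktemp)
printf '%s\n' 'a.txt' 'b c.txt' > "$list"

# Piped: the loop runs in a subshell, so the count is lost afterwards.
piped=0
cat "$list" | while IFS= read -r i; do piped=$((piped + 1)); done

# Redirected: the loop runs in the current shell, so the count survives.
redirected=0
while IFS= read -r i; do redirected=$((redirected + 1)); done < "$list"

echo "piped: $piped, redirected: $redirected"   # piped: 0, redirected: 2
rm -f "$list"
```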
