Saturday, November 8, 2014

Nothing beats Bash scripting for file manipulation

One of its strengths, of course, is that it comes standard with most Linux distros. You can also have the power of Bash scripts under Windows using Cygwin. A powerful element of Bash scripting is its capacity to use external utilities like AWK or sed.
I use AWK a lot, so let me share with you a way to embed an AWK script in a Bash script without having to use a separate file. Here is the simplest way I found:

1) Put the script in a variable using cat and heredoc

vtest=$(cat - <<-'_EOT_'
    /^[ \t]*package[ \t]+[a-zA-Z_]+/ {
        print $0
    }
    /^[ \t]*import[ \t]+/ {
        print $0
    }
    {}
_EOT_
)
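Simply feed the heredoc to cat and then put the result in a variable using command substitution. Don't forget the single quotes around the _EOT_ at the top: they stop the shell from expanding things like $0 inside the script before AWK gets to see it.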
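To see why the quotes matter, here is a small sketch that captures the same one-line AWK program twice, once with a quoted delimiter and once without. The variable names quoted and unquoted are made up for this illustration.

```bash
#!/usr/bin/env bash

# Quoted delimiter: the body is taken literally, so $0 reaches AWK untouched.
quoted=$(cat <<'_EOT_'
{ print $0 }
_EOT_
)

# Unquoted delimiter: the shell expands $0 (the name of the running shell
# or script) before the text is ever stored in the variable.
unquoted=$(cat <<_EOT_
{ print $0 }
_EOT_
)

printf 'quoted:   %s\n' "$quoted"
printf 'unquoted: %s\n' "$unquoted"
```

The first line printed still contains $0, while in the second it has been replaced by whatever name your shell or script was started with. Also note that the dash in <<- strips leading tabs only, not spaces. AWK doesn't care about the indentation either way.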

2) Feed the script to AWK using process substitution, so that the output of echo is read as if it were a file (-f).

awk -f <(echo "$vtest") file_to_process

Notice the double quotes around the variable here: they preserve the newlines and the rest of the formatting. I use this when the AWK script gets complicated enough that readability becomes a factor. Of course, you don't want to abuse this by putting really large scripts inside variables. In my example the script was made meaningless by removing a few lines for simplification, but because of the nice formatting with the heredoc you can still clearly see that both the Java package and import statements received special treatment (the script was used to process Java source files).
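If you want to try the two steps together, here is a minimal end-to-end sketch. The sample Java file and the variable names sample, result and direct are invented for the demo, and I dropped the empty {} rule since it does nothing. Process substitution is a Bash feature, so this needs bash rather than plain sh.

```bash
#!/usr/bin/env bash

# Hypothetical sample input, written to a temporary file just for this demo.
sample=$(mktemp)
cat > "$sample" <<'_EOT_'
package com.example.demo;

import java.util.List;

public class Demo {
    List<String> names;
}
_EOT_

# Step 1: keep the AWK script in a variable (quoted delimiter, so $0 survives).
vtest=$(cat <<'_EOT_'
    /^[ \t]*package[ \t]+[a-zA-Z_]+/ {
        print $0
    }
    /^[ \t]*import[ \t]+/ {
        print $0
    }
_EOT_
)

# Step 2: process substitution presents the variable's contents as a file for -f.
result=$(awk -f <(echo "$vtest") "$sample")

# For comparison: AWK also accepts the program text directly as its first argument.
direct=$(awk "$vtest" "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

With this sample file, only the package line and the import line are printed. Passing "$vtest" directly gives the same result here. The -f form is mostly useful when you want AWK to treat the script exactly as it would a script file.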
What do you think about this trick?