Automating the Boring Stuff, Too

In our last post we discussed using Bash to transform data from one format into an arrangement suitable for import into Google Groups. Now, let's take a soup-to-nuts look at the process.

For reference, our source data takes the following format:

# list-o-Steves
#
Stephen Gary Wozniak <imalegend@example.com>
Stephen William Hawking <imaboygenius@example.com>
Steven Anthony Ballmer <imapc@example.com>
Steve Austin <imastonecoldsunofagun@example.com>
Steven Paul Jobs <imamac@example.com>
Stevland Hardaway Morris <imawonder@example.com>

and each individual record in the source needs to be modified to the following:

,<email@example.com>,USER,MEMBER,Firstname

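So how can we get there? Well, we can start by discarding lines that begin with a # in our source file, as they are irrelevant. To do that, ask grep for only those lines whose first character is something other than #. The leading ^ anchors the match to the start of the line, and the bracketed [^#] matches any single character except # (which also drops empty lines):

grep "^[^#]" "$file"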

This yields:

Stephen Gary Wozniak <imalegend@example.com>
Stephen William Hawking <imaboygenius@example.com>
Steven Anthony Ballmer <imapc@example.com>
Steve Austin <imastonecoldsunofagun@example.com>
Steven Paul Jobs <imamac@example.com>
Stevland Hardaway Morris <imawonder@example.com>
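
To try the filter on its own, here is a minimal sketch. It assumes a throwaway temp file in place of the real "$file", and the variable name filtered is made up for illustration.

```shell
# Hypothetical sample file standing in for "$file"
file=$(mktemp)
printf '%s\n' \
  '# list-o-Steves' \
  '#' \
  'Stephen Gary Wozniak <imalegend@example.com>' \
  'Steve Austin <imastonecoldsunofagun@example.com>' > "$file"

# Keep only lines that have a first character and whose first character is not "#"
filtered=$(grep "^[^#]" "$file")
printf '%s\n' "$filtered"

# Clean up the temp file
rm -f "$file"
```

Only the two name-and-address records should survive; both comment lines are dropped.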

Notice that in each row of the returned list, the columns are separated by a single blank space. That's awfully useful for our purposes, as we can inform awk that a blank space " " marks the boundary between fields (-F " "), which also happens to be awk's default. Then, when awk reads through each row, we can tell it exactly what we want to print:

'{print "", $NF, "USER", "MEMBER", $1}'

which, once awk's output separator is set to a comma (covered below), yields:

,<imalegend@example.com>,USER,MEMBER,Stephen
,<imaboygenius@example.com>,USER,MEMBER,Stephen
,<imapc@example.com>,USER,MEMBER,Steven
,<imastonecoldsunofagun@example.com>,USER,MEMBER,Steve
,<imamac@example.com>,USER,MEMBER,Steven
,<imawonder@example.com>,USER,MEMBER,Stevland

That's nifty! Here's a breakdown:

""        $NF                  "USER"   "MEMBER"   $1
(blank)   last column of row   USER     MEMBER     first column of row
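
The reason for $NF rather than a fixed column number is that the names vary in length: "Steve Austin" has three fields while the others have four, yet the address is always last. A small sketch of that, using two records from the list above (the variable name rows is made up for illustration):

```shell
# Two records with different field counts (3 and 4)
rows=$(printf '%s\n' \
  'Steve Austin <imastonecoldsunofagun@example.com>' \
  'Stevland Hardaway Morris <imawonder@example.com>' |
  awk -v OFS=, -F " " '{print "", $NF, "USER", "MEMBER", $1}')

# $NF picks the address in both cases; $1 is always the first name
printf '%s\n' "$rows"
```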

Taken all together:

grep "^[^#]" "$file" | awk -v OFS=, -F " " '{print "", $NF, "USER", "MEMBER", $1}'

wherein we pipe (|) the results from grep to awk, and tell awk to separate each value it prints with a comma (-v OFS=,). We can then extend the example by appending standard output to the .csv file we created in Part 1 of this project with:

>> "$file"."$file_type"

The result? A nicely formatted .csv file that's easily imported into Google Groups:

Group Email [Required], Member Email, Member Type, Member Role, Member Name
,<imalegend@example.com>,USER,MEMBER,Stephen
,<imaboygenius@example.com>,USER,MEMBER,Stephen
,<imapc@example.com>,USER,MEMBER,Steven
,<imastonecoldsunofagun@example.com>,USER,MEMBER,Steve
,<imamac@example.com>,USER,MEMBER,Steven
,<imawonder@example.com>,USER,MEMBER,Stevland
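
For anyone who wants to reproduce this end to end, here is a rough sketch under a few assumptions: "$file" is an extensionless source list, "$file_type" is csv, and the printf that writes the header row is only a stand-in for whatever Part 1 actually did. The workdir and csv variable names are likewise made up for illustration.

```shell
# Assumed names: "$file" is the source list, "$file_type" is the output extension
workdir=$(mktemp -d)
file="$workdir/list-o-Steves"
file_type=csv

# Sample source: two comment lines plus two records from the list above
printf '%s\n' \
  '# list-o-Steves' \
  '#' \
  'Stephen Gary Wozniak <imalegend@example.com>' \
  'Steven Paul Jobs <imamac@example.com>' > "$file"

# Stand-in for Part 1 (an assumption): create the .csv with its header row
printf '%s\n' 'Group Email [Required], Member Email, Member Type, Member Role, Member Name' > "$file"."$file_type"

# Filter out comments, reshape each record, and append to the .csv
grep "^[^#]" "$file" | awk -v OFS=, -F " " '{print "", $NF, "USER", "MEMBER", $1}' >> "$file"."$file_type"

# Show the finished file, then clean up
csv=$(cat "$file"."$file_type")
printf '%s\n' "$csv"
rm -rf "$workdir"
```

Because the redirect is >> rather than >, the header row written first is preserved and the records land beneath it.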

Next, one could iterate over all of their lists using the snippets provided, and "Automate the Boring Stuff" without a line of Python. We did exactly that here.
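
One possible shape for that loop, offered as a sketch rather than the script we actually ran: it assumes every list lives as an extensionless file in a single directory (the hypothetical "$list_dir"), and it writes the header itself instead of relying on Part 1.

```shell
# Hypothetical layout: one extensionless list file per group in "$list_dir"
list_dir=$(mktemp -d)
file_type=csv
header='Group Email [Required], Member Email, Member Type, Member Role, Member Name'

# Two small sample lists built from records shown earlier
printf '%s\n' '# list one' 'Steve Austin <imastonecoldsunofagun@example.com>' > "$list_dir/list-a"
printf '%s\n' '# list two' 'Steven Anthony Ballmer <imapc@example.com>' 'Steven Paul Jobs <imamac@example.com>' > "$list_dir/list-b"

for file in "$list_dir"/*; do
  # Skip .csv output left over from an earlier run
  case "$file" in *."$file_type") continue ;; esac
  # Start each output file with the header, then append the reshaped records
  printf '%s\n' "$header" > "$file"."$file_type"
  grep "^[^#]" "$file" | awk -v OFS=, -F " " '{print "", $NF, "USER", "MEMBER", $1}' >> "$file"."$file_type"
done

# Count the lines in each result (header plus records), then clean up
lines_a=$(wc -l < "$list_dir/list-a.csv" | tr -d ' ')
lines_b=$(wc -l < "$list_dir/list-b.csv" | tr -d ' ')
printf '%s %s\n' "$lines_a" "$lines_b"
rm -rf "$list_dir"
```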

Cheers.