In our last post, we discussed using Bash to transform data from one format into an arrangement suitable for import into Google Groups. Now, let's take a soup-to-nuts look at the process.

For reference, our source data takes the following format:

# list-o-Steves
#
Stephen Gary Wozniak <imalegend@example.com>
Stephen William Hawking <imaboygenius@example.com>
Steven Anthony Ballmer <imapc@example.com>
Steve Austin <imastonecoldsunofagun@example.com>
Steven Paul Jobs <imamac@example.com>
Stevland Hardaway Morris <imawonder@example.com>

and each individual record in the source needs to be modified to the following:

,<email@example.com>,USER,MEMBER,Firstname.

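So how can we get there? Well, we can start by discarding lines that begin with a # in our source file, as they are irrelevant. To do that, ask grep for only the lines whose first character is anything other than #. In the pattern below, the leading ^ anchors the match to the start of the line, and [^#] matches any single character except #. As a side effect, empty lines are dropped as well, since they have no first character to match:

grep "^[^#]" "$file"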

This yields:

Stephen Gary Wozniak <imalegend@example.com>
Stephen William Hawking <imaboygenius@example.com>
Steven Anthony Ballmer <imapc@example.com>
Steve Austin <imastonecoldsunofagun@example.com>
Steven Paul Jobs <imamac@example.com>
Stevland Hardaway Morris <imawonder@example.com>
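
To see the filter in action, here is a minimal sketch that runs the same pattern against a throwaway file. The temporary file and its contents are assumptions made purely for illustration; the empty line is included on purpose to show that it is discarded along with the comments.

```shell
# Hypothetical sample file; its name and contents are for illustration only.
sample=$(mktemp)
printf '%s\n' \
  '# list-o-Steves' \
  '#' \
  '' \
  'Steve Austin <imastonecoldsunofagun@example.com>' \
  'Steven Paul Jobs <imamac@example.com>' > "$sample"

# The leading ^ anchors at the start of the line; [^#] requires one
# character there that is not "#". Comment lines and empty lines fail to match.
filtered=$(grep '^[^#]' "$sample")
printf '%s\n' "$filtered"

rm -f "$sample"
```

Only the two records should be printed; the two comment lines and the empty line are filtered out.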

Notice that in each row of the returned list, the columns are separated by a single blank space. That's awfully useful for our purposes, as we can inform awk that a blank space " " separates fields (-F " "); whitespace is also awk's default separator, so the flag simply makes our intent explicit. Then, as awk reads through each row, we can tell it exactly what we want to print:

'{print "", $NF, "USER", "MEMBER", $1}'

With commas placed between the printed values (awk's -v OFS=, option, covered below), this yields:

,<imalegend@example.com>,USER,MEMBER,Stephen
,<imaboygenius@example.com>,USER,MEMBER,Stephen
,<imapc@example.com>,USER,MEMBER,Steven
,<imastonecoldsunofagun@example.com>,USER,MEMBER,Steve
,<imamac@example.com>,USER,MEMBER,Steven
,<imawonder@example.com>,USER,MEMBER,Stevland
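
The commas in that output come from awk's output field separator rather than from the print statement itself. As a small sketch (the record variable and the two result variables are names made up for this example), here is the same print statement run twice on one record, first with awk's default separator and then with OFS set to a comma:

```shell
# One record from the source list, held in a variable for illustration.
record='Stephen Gary Wozniak <imalegend@example.com>'

# Default OFS is a single space, so the printed values are space-separated.
with_spaces=$(printf '%s\n' "$record" | awk '{print "", $NF, "USER", "MEMBER", $1}')

# With OFS set to a comma, the same print statement emits comma-separated values.
with_commas=$(printf '%s\n' "$record" | awk -v OFS=, '{print "", $NF, "USER", "MEMBER", $1}')

# Brackets make the leading blank value visible in both results.
printf '[%s]\n[%s]\n' "$with_spaces" "$with_commas"
```

The empty string printed first is what produces the leading separator in each result, which leaves the first CSV column blank.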

That's nifty! Here's a breakdown:

""        $NF                  "USER"   "MEMBER"   $1
(blank)   last column of row   USER     MEMBER     first column of row

Taken all together:

grep "^[^#]" "$file" | awk -v OFS=, -F " " '{print "", $NF, "USER", "MEMBER", $1}'

Here we pipe (|) the results from grep to awk, and tell awk to separate each value it prints with a comma (-v OFS=,). We can then extend the example by appending standard output to the .csv file we created in Part 1 of this project with:

>> "$file"."$file_type"

The result? A nicely formatted .csv file that's easily imported into Google Groups:

Group Email [Required], Member Email, Member Type, Member Role, Member Name
,<imalegend@example.com>,USER,MEMBER,Stephen
,<imaboygenius@example.com>,USER,MEMBER,Stephen
,<imapc@example.com>,USER,MEMBER,Steven
,<imastonecoldsunofagun@example.com>,USER,MEMBER,Steve
,<imamac@example.com>,USER,MEMBER,Steven
,<imawonder@example.com>,USER,MEMBER,Stevland
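
Putting the pieces together, here is a minimal end-to-end sketch. The scratch directory, the list name steves, and the step that writes the header row are assumptions standing in for Part 1; only the grep and awk pipeline comes from the walkthrough above.

```shell
workdir=$(mktemp -d)      # scratch directory, for illustration only
file="$workdir/steves"    # hypothetical source list name
file_type="csv"

# A shortened copy of the source list.
cat > "$file" <<'EOF'
# list-o-Steves
#
Stephen Gary Wozniak <imalegend@example.com>
Steve Austin <imastonecoldsunofagun@example.com>
EOF

# Assumed stand-in for Part 1: create the .csv and write the header row.
printf '%s\n' 'Group Email [Required], Member Email, Member Type, Member Role, Member Name' > "$file.$file_type"

# Drop comment lines, reshape each record, and append the rows to the .csv.
grep '^[^#]' "$file" | awk -v OFS=, -F ' ' '{print "", $NF, "USER", "MEMBER", $1}' >> "$file.$file_type"

cat "$file.$file_type"
```

Because >> appends rather than overwrites, running the pipeline a second time against the same .csv would add the rows again, so it is worth recreating the file before each run.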

Next, one could iterate over all of their lists using the snippets provided, and "Automate the Boring Stuff" without a line of Python. We did exactly that here.
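
One way such a loop might look is sketched below. It assumes every list lives in a single directory as a .txt file; that layout, the convert_lists function name, and the choice to name each output after its list with a .csv extension are all hypothetical, and not necessarily how our own script does it.

```shell
# Hypothetical helper: convert every *.txt list in a directory to a .csv.
convert_lists() {
  list_dir=$1
  for file in "$list_dir"/*.txt; do
    [ -e "$file" ] || continue        # skip when the glob matches nothing
    out="${file%.txt}.csv"            # assumed naming: one.txt -> one.csv
    # Start each .csv fresh with the header row, then append the records.
    printf '%s\n' 'Group Email [Required], Member Email, Member Type, Member Role, Member Name' > "$out"
    grep '^[^#]' "$file" | awk -v OFS=, '{print "", $NF, "USER", "MEMBER", $1}' >> "$out"
  done
}

# Usage with two throwaway lists.
demo_dir=$(mktemp -d)
printf '%s\n' '# list one' 'Steven Paul Jobs <imamac@example.com>' > "$demo_dir/one.txt"
printf '%s\n' '# list two' 'Stevland Hardaway Morris <imawonder@example.com>' > "$demo_dir/two.txt"
convert_lists "$demo_dir"
cat "$demo_dir"/*.csv
```

Writing the header with > before appending the rows means each run rebuilds the .csv from scratch instead of piling up duplicate records.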

Cheers.