In our last post we discussed using Bash to transform data from one format into an arrangement suitable for import into Google Groups. Now, let's take a soup-to-nuts look at the process.
For reference, our source data takes the following format:
# list-o-Steves
#
Stephen Gary Wozniak <imalegend@example.com>
Stephen William Hawking <imaboygenius@example.com>
Steven Anthony Ballmer <imapc@example.com>
Steve Austin <imastonecoldsunofagun@example.com>
Steven Paul Jobs <imamac@example.com>
Stevland Hardaway Morris <imawonder@example.com>
.
and each individual record in the source needs to be modified to the following:
,<email@example.com>,USER,MEMBER,Firstname
.
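So how can we get there? Well, we can start by discarding lines that begin with # in our source file, as they are irrelevant. To do that, give grep a pattern that matches only lines whose first character is not #. The leading ^ anchors the match to the start of the line, and the ^ inside the brackets negates the set (which also means empty lines are dropped, since they have no first character to match):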
grep "^[^#]" "$file"
.
This yields:
Stephen Gary Wozniak <imalegend@example.com>
Stephen William Hawking <imaboygenius@example.com>
Steven Anthony Ballmer <imapc@example.com>
Steve Austin <imastonecoldsunofagun@example.com>
Steven Paul Jobs <imamac@example.com>
Stevland Hardaway Morris <imawonder@example.com>
.
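As a minimal sketch of what that pattern does and does not match, the snippet below compares it with grep's -v (invert match) option on a small sample that includes an empty line. The function names and the temporary sample file are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: keep lines whose first character exists and is not "#".
# Empty lines are dropped too, because "[^#]" needs a character to match.
strip_comments() {
  grep '^[^#]' "$1"
}

# Hypothetical alternative: -v inverts the match, so only lines that
# start with "#" are dropped and empty lines are kept.
strip_comments_keep_blank() {
  grep -v '^#' "$1"
}

# Build a small sample: two comment lines, one empty line, one record.
sample=$(mktemp)
printf '%s\n' '# list-o-Steves' '#' '' \
  'Steve Austin <imastonecoldsunofagun@example.com>' > "$sample"

strip_comments "$sample"             # prints only the Steve Austin line
strip_comments_keep_blank "$sample"  # prints an empty line, then the Steve Austin line

rm -f "$sample"
```

For this project the first form is the convenient one, since an empty line would otherwise reach awk and produce a row with no address or name.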
Notice that in each row of the returned list, the columns are separated by a single blank space. That's awfully useful for our purposes, as we can inform awk that a blank space " " separates fields (-F " "). Then, when awk reads through each row, we can tell it exactly what we want to print:
'{print "", $NF, "USER", "MEMBER", $1}'
which, once awk's output separator is set to a comma (more on that below), yields:
,<imalegend@example.com>,USER,MEMBER,Stephen
,<imaboygenius@example.com>,USER,MEMBER,Stephen
,<imapc@example.com>,USER,MEMBER,Steven
,<imastonecoldsunofagun@example.com>,USER,MEMBER,Steve
,<imamac@example.com>,USER,MEMBER,Steven
,<imawonder@example.com>,USER,MEMBER,Stevland
That's nifty! Here's a breakdown:
"" | $NF | "USER" | "MEMBER" | $1 |
---|---|---|---|---|
(blank) | last column of row | USER | MEMBER | first column of row |
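The reason for $NF rather than a fixed column number is that the names vary in length: some records have a middle name and some do not, but the address is always the last field. A small sketch, using a hypothetical to_csv_row function that reads records on standard input:

```shell
#!/usr/bin/env bash

# Hypothetical helper: turn "First [Middle] Last <address>" records into CSV rows.
to_csv_row() {
  # $1 is the first whitespace-separated field (the first name);
  # $NF is the last field (the bracketed address), however many names precede it.
  # OFS=, makes print join its comma-separated arguments with a literal comma.
  awk -v OFS=, -F ' ' '{print "", $NF, "USER", "MEMBER", $1}'
}

# Four fields: $NF is field 4.
echo 'Stevland Hardaway Morris <imawonder@example.com>' | to_csv_row
# prints: ,<imawonder@example.com>,USER,MEMBER,Stevland

# Three fields: $NF is field 3, so the same program still works.
echo 'Steve Austin <imastonecoldsunofagun@example.com>' | to_csv_row
# prints: ,<imastonecoldsunofagun@example.com>,USER,MEMBER,Steve
```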
Taken all together:
grep "^[^#]" "$file" | awk -v OFS=, -F " " '{print "", $NF, "USER", "MEMBER", $1}'
wherein we pipe (|) the results from grep to awk, and tell awk to separate each value it prints with a comma (-v OFS=,). We can then extend the example by redirecting standard output so that it is appended (>>) to the .csv file we created in Part 1 of this project with:
>> "$file"."$file_type"
.
The result? A nicely formatted .csv file that's easily imported into Google Groups:
Group Email [Required], Member Email, Member Type, Member Role, Member Name
,<imalegend@example.com>,USER,MEMBER,Stephen
,<imaboygenius@example.com>,USER,MEMBER,Stephen
,<imapc@example.com>,USER,MEMBER,Steven
,<imastonecoldsunofagun@example.com>,USER,MEMBER,Steve
,<imamac@example.com>,USER,MEMBER,Steven
,<imawonder@example.com>,USER,MEMBER,Stevland
.
Next, one could iterate over all of their lists using the snippets provided, and "Automate the Boring Stuff" without a line of Python. We did exactly that here.
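A minimal sketch of such a loop follows. It is not the script from the linked post: the function names, the directory argument, and the *.txt naming of the source lists are all assumptions for illustration, and writing the header row here stands in for whatever Part 1 did when it created the .csv file.

```shell
#!/usr/bin/env bash

# Header row as shown in the finished file above.
header='Group Email [Required], Member Email, Member Type, Member Role, Member Name'
file_type='csv'

# Hypothetical helper: convert one source list into "<file>.<file_type>".
convert_list() {
  local file="$1"
  local out="$file.$file_type"
  # Start the output with the header row (assumed to mirror Part 1).
  printf '%s\n' "$header" > "$out"
  # Drop comment lines, then append one formatted row per record.
  grep '^[^#]' "$file" \
    | awk -v OFS=, -F ' ' '{print "", $NF, "USER", "MEMBER", $1}' >> "$out"
}

# Hypothetical helper: convert every *.txt list in a directory.
convert_all() {
  local lists_dir="$1"
  local file
  for file in "$lists_dir"/*.txt; do
    # Skip the literal pattern when the directory holds no matching files.
    [ -e "$file" ] || continue
    convert_list "$file"
  done
}
```

Calling convert_all with a directory of lists, for example convert_all ./lists, would leave a matching .csv file beside each source list.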
Cheers.