In his book "Automate the Boring Stuff with Python", Al Sweigart details how to automate boring, repetitive digital tasks using the Python programming language. A job that could take hours of manual work, he explains, can be completed by a computer program faster, and with greater accuracy, than by a human being.

Python can add a layer of complexity to a pipeline's automation, though. For example, Python frequently requires the import of one or more modules to achieve a task that a command-line interpreter such as Bash can do directly.
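
To make that concrete, here is a small sketch of a job the shell handles with built-ins and standard utilities alone: gathering every .txt file in a directory into an archive folder. The directory and file names are made up for illustration; in Python, the same job would typically start with importing a module such as glob or shutil.

```bash
#!/usr/bin/env bash
# Scratch directory and file names are hypothetical, for illustration only.
workdir="$(mktemp -d)"
touch "$workdir/list-a.txt" "$workdir/list-b.txt"
mkdir "$workdir/archive"

# Pattern matching (globbing) is built into the shell, and mv is a standard
# utility, so there is nothing to import.
mv "$workdir"/*.txt "$workdir/archive/"

# List what ended up in the archive folder.
ls "$workdir/archive"
```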

Let's look at a real-world example: migrating scores of mailing lists from a decrepit Sendmail server to Google Groups. Our source material takes the following form (abridged):

# List-o-Steves
#
Stephen Gary Wozniak <imalegend@example.com>
Stephen William Hawking <imaboygenius@example.com>
Steven Anthony Ballmer <imapc@example.com>
Steve Austin <imastonecoldsunofagun@example.com>
Steven Paul Jobs <imamac@example.com>
Stevland Hardaway Morris <imawonder@example.com>

Google Groups, however, requires formatted .csv files that look like this:

Header: Group Email [Required],Member Email,Member Type,Member Role
Entry: yourgroup@email.com,membername@email.com,USER,MEMBER

For reference:
ggroups.
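
As a rough sketch of what one such entry looks like when assembled in the shell, the snippet below joins four fields with commas using printf. The variable names group_email, member_email, and row are assumptions made for this example, and the addresses are the placeholder values from the entry above.

```bash
#!/usr/bin/env bash
# Hypothetical values for illustration; substitute your own group and member.
group_email="yourgroup@email.com"
member_email="membername@email.com"

# Member Type and Member Role use the values shown in the sample entry.
row="$(printf '%s,%s,%s,%s' "$group_email" "$member_email" "USER" "MEMBER")"

# Print the finished row, one entry per line.
printf '%s\n' "$row"
```

Using a separate %s for each field keeps the commas in the format string, so the fields themselves never need to carry their own separators.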

The task, then, is to extract data from our source and transform it into a format suitable for import into Google Groups. But how?

Well, let's start by declaring our variables:

domain="example.com"
file_type="csv"
file="list-o-steves"

and creating a .csv file with appropriately named headers:

printf "%s\n" "Group Email [Required], Member Email, Member Type, Member Role, Member Name" > "${file}.${file_type}"

which creates this file:

list-o-steves.csv

with the following on its first line:

Group Email [Required], Member Email, Member Type, Member Role, Member Name
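
For anyone following along, the steps so far can be gathered into one small script that also reads the header back as a sanity check. This is only a sketch: the out, header, and first_line variables are names introduced here for convenience, and the script writes to the current directory.

```bash
#!/usr/bin/env bash
# Variables from the steps above.
domain="example.com"   # declared now, used once we start building entries
file_type="csv"
file="list-o-steves"

# Hypothetical helper variables, introduced for this sketch.
out="${file}.${file_type}"
header="Group Email [Required], Member Email, Member Type, Member Role, Member Name"

# Create (or overwrite) the .csv file with the header as its first line.
printf '%s\n' "$header" > "$out"

# Read the first line back and compare it with what we meant to write.
first_line="$(head -n 1 "$out")"
if [ "$first_line" = "$header" ]; then
  echo "header OK: $out"
else
  echo "unexpected header in $out" >&2
fi
```

Because the redirect uses >, rerunning the script starts the file over from scratch; the data rows we add later will need >> to append instead.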

Progress! We'll cover the nuts and bolts of extracting our data in a coming post.

Cheers.