Nik Kantar

Sunday, July 18, 2021

Quick and Dirty Python: HOWTO

Let’s build up a small Python script.

A few months ago I shared a script I wrote to automate a tedious task. While that’s all well and good, I thought it might be fun to go through the process of actually writing it, step by step. For more thorough context you can refer to that post, but we’ll go over the basics here for the sake of completeness.

The Problem

We need to figure out the best health insurance plan based on our main need: out-of-network mental health coverage.

Here is the data we have to work with:

Plan Name | Deductible | Percentage Reimbursed | Monthly Premium
Bronze    | $150       | 20%                   | $0
Silver    | $300       | 35%                   | $25
Gold      | $500       | 50%                   | $75
Platinum  | $1000      | 65%                   | $100

And here’s the formula to actually calculate the annual cost:

  (annual cost - deductible) * percentage paid
+ deductible
+ annual premiums
= total

The target output is probably a list of totals. We’ll figure that out when we get there.
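To make the formula concrete, here is a minimal sketch that applies it by hand to the Bronze plan. It assumes an annual cost of $5,200 (52 weekly sessions at $100 each, the figure used later in this post). The variable names are purely illustrative.

```python
# Worked example of the formula for the Bronze plan.
annual_cost = 52 * 100  # assumed: 52 sessions at $100 each
deductible = 150
percentage_reimbursed = 0.20
monthly_premium = 0

# We pay whatever share the plan does not reimburse.
percentage_paid = 1 - percentage_reimbursed

total = (
    (annual_cost - deductible) * percentage_paid
    + deductible
    + 12 * monthly_premium
)
print(round(total, 2))  # 4190.0
```

Note that "percentage paid" in the formula is our share, so it is one minus the reimbursed percentage from the table.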

The Skeleton

I start every Python script with the following boilerplate:

#!/usr/bin/env python3


def main():
    ...  # TODO


if __name__ == "__main__":
    main()

The Skeleton, Part One

The first part is a hashbang:

#!/usr/bin/env python3

This line enables us to mark the file as executable (via chmod +x) and run it without specifying the Python interpreter (./calc.py instead of python3 calc.py). I don’t actually do this too often, but it’s not a bad habit, since it does happen sometimes.

The Skeleton, Part Two

The second part is just an empty function:

def main():
    ...  # TODO

This is where we’ll put stuff we actually want to do. Its value will become clear in the next section.

The Skeleton, Part Three

The last part is one of my favorite bits of Python magic:

if __name__ == "__main__":
    main()

In short, these two lines ensure that the main function is executed only when the file is run directly. Merely importing something from it in another file won’t trigger the execution, meaning our script is immediately also a reusable module. How’s that for convenience? You can read heaps more about this on Real Python.
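Here is a rough sketch of that behavior in action. It writes a throwaway copy of the skeleton to a temporary calc.py and executes it twice with the standard library's runpy module: once with __name__ set to "__main__" (as when running the file) and once with __name__ set to "calc" (as it would be on import). runpy is only a stand-in here for running vs. importing, and the RAN_MAIN flag and ran_main helper are hypothetical names added for the demonstration.

```python
import runpy
import tempfile
from pathlib import Path

# A copy of the skeleton, with a flag so we can see whether main() ran.
SOURCE = '''
RAN_MAIN = False


def main():
    global RAN_MAIN
    RAN_MAIN = True


if __name__ == "__main__":
    main()
'''


def ran_main(run_name):
    """Execute the skeleton with the given __name__ and report if main() ran."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "calc.py"
        path.write_text(SOURCE)
        namespace = runpy.run_path(str(path), run_name=run_name)
    return namespace["RAN_MAIN"]


as_script = ran_main("__main__")  # like ./calc.py or python3 calc.py
as_import = ran_main("calc")      # like `import calc` from another file
print(as_script, as_import)  # True False
```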

The Data

Since we already know what the source data looks like, we can add it now, starting with the plans:

PLANS = [
    {"name": "Bronze", "deductible": 150, "percentage": 0.2, "monthly": 0},
    {"name": "Silver", "deductible": 300, "percentage": 0.35, "monthly": 25},
    {"name": "Gold", "deductible": 500, "percentage": 0.5, "monthly": 75},
    {"name": "Platinum", "deductible": 1000, "percentage": 0.65, "monthly": 100},
]

Looking at our formula, another fixed data point is the annual cost of therapy:

SESSION = 100  # single session cost
GROSS = 52 * SESSION

Protip: since utmost performance isn’t a consideration in this case, there’s nothing wrong with calculating the annual cost by multiplying the cost of a single session by the number of weeks in a year. This is a lot more legible and less error-prone than hardcoding GROSS = 5200, and it really shows its benefits in more complex cases, such as the number of seconds in a year: YEAR_IN_SECONDS = 365 * 24 * 60 * 60 vs. YEAR_IN_SECONDS = 31_536_000.
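A quick way to convince ourselves the two spellings agree is to compare them directly. This is just a sanity check; YEAR_IN_SECONDS_HARDCODED is a made-up name for the literal version.

```python
# The computed form documents where the number comes from.
YEAR_IN_SECONDS = 365 * 24 * 60 * 60  # days * hours * minutes * seconds

# The hardcoded form is easy to mistype and hard to verify by eye.
YEAR_IN_SECONDS_HARDCODED = 31_536_000

print(YEAR_IN_SECONDS == YEAR_IN_SECONDS_HARDCODED)  # True
```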

I like to keep fixed data in constants at the top of the file, where it’s separate from the mutable world of business logic and visually out of the way, so we get this:

#!/usr/bin/env python3


SESSION = 100  # single session cost
GROSS = 52 * SESSION
PLANS = [
    {"name": "Bronze", "deductible": 150, "percentage": 0.2, "monthly": 0},
    {"name": "Silver", "deductible": 300, "percentage": 0.35, "monthly": 25},
    {"name": "Gold", "deductible": 500, "percentage": 0.5, "monthly": 75},
    {"name": "Platinum", "deductible": 1000, "percentage": 0.65, "monthly": 100},
]


def main():
    ...  # TODO


if __name__ == "__main__":
    main()

The Business Logic

The Test

Our script doesn’t actually do anything useful yet, so let’s change that by testing out what we’ve set up so far:

def main():
    for plan in PLANS:
        print(plan["name"])

Running this produces the following output:

Bronze
Silver
Gold
Platinum

Great—we’re in business!

The Math

Now that we have iteration and output, we can start implementing the math:

def plan_total(plan):
    post_deductible = GROSS - plan["deductible"]
    percentage = 1 - plan["percentage"]  # we want percentage of cost, not reimbursement
    premiums = 12 * plan["monthly"]
    total = (post_deductible * percentage) + plan["deductible"] + premiums
    return total


def calculate_plan_totals():
    for plan in PLANS:
        text = f"{plan['name']}: {plan_total(plan)}"
        print(text)


if __name__ == "__main__":
    calculate_plan_totals()

At this point we can already separate plan total calculation into its own function for legibility. I’m a big fan of small functions like this because of how neatly they package discrete functionality.
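One payoff of a small function like this is that it can be checked in isolation. The sketch below repeats plan_total so it runs on its own, then feeds it two made-up plans whose totals are obvious by inspection. Both plans are hypothetical and not part of the real data.

```python
GROSS = 52 * 100  # annual cost of therapy, as defined earlier


def plan_total(plan):
    post_deductible = GROSS - plan["deductible"]
    percentage = 1 - plan["percentage"]  # our share of the cost
    premiums = 12 * plan["monthly"]
    return (post_deductible * percentage) + plan["deductible"] + premiums


# Hypothetical plan: reimburses everything and costs nothing, so we pay 0.
free_plan = {"name": "Free", "deductible": 0, "percentage": 1.0, "monthly": 0}
assert plan_total(free_plan) == 0

# Hypothetical plan: reimburses nothing, so we pay full cost plus premiums.
no_coverage = {"name": "Nothing", "deductible": 0, "percentage": 0.0, "monthly": 10}
assert plan_total(no_coverage) == GROSS + 12 * 10

print("plan_total checks passed")
```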

We can also rename main to calculate_plan_totals, as that’s what it actually does.

Running this produces effectively what we’re looking for:

Bronze: 4190.0
Silver: 3785.0
Gold: 3750.0
Platinum: 3670.0

What’s not so ideal is that my actual list was much longer than this, and it wasn’t exactly a given that I’d spot the lowest number myself. Wouldn’t it be great if the script could tell us that? Of course it would!

The Lowest

For the sake of a satisfyingly pretty UI, let’s add an arrow pointing at the lowest plan. For that we need to calculate all the totals and find the lowest value before outputting anything:

def calculate_plan_totals():
    plans = [{"name": plan["name"], "total": plan_total(plan)} for plan in PLANS]
    lowest = min([plan["total"] for plan in plans])
    for plan in plans:
        text = f"{plan['name']}: {plan['total']}"
        if plan["total"] == lowest:
            text = f"{text} <-"
        print(text)

We now loop over the plans more than once, which does cost a little performance, but we can probably spare a few extra milliseconds, especially considering how cool our output is now:

Bronze: 4190.0
Silver: 3785.0
Gold: 3750.0
Platinum: 3670.0 <-

The Script

In its final form, the script looks like this:

#!/usr/bin/env python3


SESSION = 100  # single session cost
GROSS = 52 * SESSION
PLANS = [
    {"name": "Bronze", "deductible": 150, "percentage": 0.2, "monthly": 0},
    {"name": "Silver", "deductible": 300, "percentage": 0.35, "monthly": 25},
    {"name": "Gold", "deductible": 500, "percentage": 0.5, "monthly": 75},
    {"name": "Platinum", "deductible": 1000, "percentage": 0.65, "monthly": 100},
]


def plan_total(plan):
    post_deductible = GROSS - plan["deductible"]
    percentage = 1 - plan["percentage"]  # we want percentage of cost, not reimbursement
    premiums = 12 * plan["monthly"]
    total = (post_deductible * percentage) + plan["deductible"] + premiums
    return total


def calculate_plan_totals():
    plans = [{"name": plan["name"], "total": plan_total(plan)} for plan in PLANS]
    lowest = min([plan["total"] for plan in plans])
    for plan in plans:
        text = f"{plan['name']}: {plan['total']}"
        if plan["total"] == lowest:
            text = f"{text} <-"
        print(text)


if __name__ == "__main__":
    calculate_plan_totals()

Pretty cool, right?

The Bonus

There are always things we can do to refactor a given piece of code. Always.

We could use sorted and define a custom sorting function to use the calculated totals and not iterate through the list twice ourselves. We could probably clean up some variable names—I’m mildly annoyed at the lack of clarity of percentage, for example. We could add function parametrization to make everything even more reusable. We could add types, tests, and docs.
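For the curious, here is one possible sketch of a few of those ideas combined: sorted with a key function, a gross parameter instead of the global, and type hints. It is only one way to do it, not a replacement for the script above. The Plan type, the rank_plans function, and the share_paid name are all invented for this example, and the list[...] annotations assume Python 3.9 or newer.

```python
from typing import TypedDict


class Plan(TypedDict):
    name: str
    deductible: int
    percentage: float  # share reimbursed by the plan
    monthly: int


PLANS: list[Plan] = [
    {"name": "Bronze", "deductible": 150, "percentage": 0.2, "monthly": 0},
    {"name": "Silver", "deductible": 300, "percentage": 0.35, "monthly": 25},
    {"name": "Gold", "deductible": 500, "percentage": 0.5, "monthly": 75},
    {"name": "Platinum", "deductible": 1000, "percentage": 0.65, "monthly": 100},
]


def plan_total(plan: Plan, gross: float) -> float:
    """Return the total annual cost of a plan for a given gross annual cost."""
    share_paid = 1 - plan["percentage"]
    premiums = 12 * plan["monthly"]
    return (gross - plan["deductible"]) * share_paid + plan["deductible"] + premiums


def rank_plans(plans: list[Plan], gross: float) -> list[tuple[str, float]]:
    """Return (name, total) pairs sorted from cheapest to most expensive."""
    totals = [(plan["name"], plan_total(plan, gross)) for plan in plans]
    return sorted(totals, key=lambda pair: pair[1])


ranked = rank_plans(PLANS, gross=52 * 100)
for position, (name, total) in enumerate(ranked):
    arrow = " <-" if position == 0 else ""
    print(f"{name}: {total}{arrow}")
```

With the data from this post, Platinum comes out first, followed by Gold, Silver, and Bronze. One trade-off: if two plans tie for the lowest total, this version marks only the first, whereas the original marks every plan that matches the minimum.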

But this is probably enough. It’s a simple script that does what it needs to do. Just some quick and dirty Python. :)