Tutorial: Defensive programming in Python; Using and extending the typeconstraints module


Repository

https://github.com/pibara-utopian/typeconstraints

What Will I Learn?

In this tutorial you will learn how using typeconstraints can help prevent specific kinds of type-related bugs in your code. You will learn:

  • The kinds of easy-to-make, hard-to-find bugs that can arise from a lack of type constraints.
  • How adding proxy methods and constraint-checking code can prevent argument-type-related bugs, at the price of adding considerably more code to each of your methods.
  • How you can do the same thing with the typeconstraints decorator, with very little code.
  • How the typeconstraints library implements type constraints and how you can add your own custom type constraint assertion classes to your code if you need to.

Requirements

To follow this tutorial, you should:

  • Have a good working knowledge of Python
  • Understand the difference between statically typed and dynamically typed languages.
  • Have at least some experience with a statically typed language.

Difficulty

  • Intermediate

A first bug: forgotten square brackets

Imagine that you wrote this code: a simple function that calculates the average length of the strings in a list and returns the result in a dictionary. For flexibility, you added an optional argument that lets the caller choose a different key name for the result within the returned dictionary:

def average_string_length(stringlist,fieldname="stringlen"):
    #Initialize empty rval
    rval = dict()
    #Initialize the total count of all characters in all strings
    totalcount = 0
    #Count all characters in each of the strings
    for string in stringlist:
        totalcount += len(string)
    #Put the average string count in the return value.
    rval[fieldname] = totalcount/len(stringlist)
    return rval

A correct invocation of the above code could look something like this:

#Correct invocation of our function
r1 = average_string_length(["amsterdam","cotonou"])

If we were to print r1, it would look something like this:

{'stringlen': 8.0}

Now consider what would happen if we made the small mistake of forgetting the square brackets:

#Incorrect invocation of our function
r1 = average_string_length("amsterdam","cotonou") 

The code would run as if everything was just fine, but what it ends up doing is totally wrong. If you looked at the return value, it would be obvious that something is off, but chances are the problem will not surface anywhere close to the place where things went wrong.

{'cotonou': 1.0}

Have a look at the code and try to understand how it results in the above dict being returned.
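One way to check your understanding is to trace the faulty call by hand. The snippet below is a minimal sketch that repeats the steps of the function body with the two arguments bound the way the faulty call binds them; the variable names lengths, average and result are only used for illustration here.

```python
# In the faulty call, "amsterdam" is bound to stringlist and
# "cotonou" is bound to fieldname.
stringlist = "amsterdam"
fieldname = "cotonou"

# Iterating over a str yields one-character strings,
# so every "string" in the loop has length 1.
lengths = [len(string) for string in stringlist]

# Nine characters of length 1, divided by len("amsterdam") == 9.
average = sum(lengths) / len(stringlist)
result = {fieldname: average}
print(result)  # {'cotonou': 1.0}
```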

Type asserts without the typeconstraints decorator

So how can we prevent a bug like this from arising? One way is to add assert statements inside our function.

def average_string_length(stringlist,fieldname="stringlen"):
    #assert the types of our two function arguments is correct
    assert isinstance(stringlist,list)
    assert isinstance(fieldname,str)
    #assert the type of each individual element in the string list is correct
    for stringn in stringlist:
        assert isinstance(stringn,str)

    #Original body of our function
    rval = dict()
    totalcount = 0
    for string in stringlist:
        totalcount += len(string)
    rval[fieldname] = totalcount/len(stringlist)
    return rval

We check the shallow type of the function arguments using asserts and then dive into the string list to assert that each list element is of the expected type. This way, running the faulty code results in an exception being raised, and the bug is quickly found and located.
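To see the asserts in action, here is a small self-contained sketch that repeats the guarded function in compact form and then makes the faulty call. The caught flag is only there for illustration.

```python
def average_string_length(stringlist, fieldname="stringlen"):
    # Shallow checks on both arguments
    assert isinstance(stringlist, list)
    assert isinstance(fieldname, str)
    # Deep check on every element of the list
    for stringn in stringlist:
        assert isinstance(stringn, str)
    totalcount = 0
    for string in stringlist:
        totalcount += len(string)
    return {fieldname: totalcount / len(stringlist)}

# The faulty call now fails right at the function boundary.
try:
    average_string_length("amsterdam", "cotonou")
    caught = False
except AssertionError:
    caught = True
print(caught)  # True
```

Keep in mind that assert statements are skipped when Python runs with the -O option, so checks like these guard development and test runs rather than optimized production runs.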

A second bug: returning different types

Now imagine the following code:

def do_something():
   ...
   if (somecondition):
       ...
       raise SomeException("OOPS")
   else:
       ...
       return rval

Suppose the missing code denoted by the ellipses is rather long. Now someone comes along, looks at the code, and thinks: hey, I could fix this condition:

def do_something():
   ...
   if (somecondition):
       ...
       if not fix_error_condition():
           raise SomeException("OOPS")
   else:
       ...
       return rval

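This seems innocent enough, but consider the return value if the condition is actually fixed: no exception is raised, the function falls off the end of the if branch without reaching a return statement, and the caller silently receives None instead of the expected value. Just as unintended wrong types going into a function can lead to hard-to-find bugs, so can returning the wrong type from a function.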
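The effect is easy to reproduce. The following is a minimal runnable sketch of the modified function; the somecondition parameter, the stub fix_error_condition and the return value True are assumptions made only to keep the example self-contained.

```python
class SomeException(Exception):
    pass

def fix_error_condition():
    # Hypothetical helper: pretend the error condition could be repaired.
    return True

def do_something(somecondition):
    if somecondition:
        if not fix_error_condition():
            raise SomeException("OOPS")
        # No return statement on this path: the function falls off
        # the end and implicitly returns None.
    else:
        return True

print(do_something(False))  # True
print(do_something(True))   # None
```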

Return type check wrapper without typeconstraints decorator

Asserting the type of a return value is a bit more involved than asserting the types of function arguments. We need to move our function body into an inner function and then add an assert to the outer function.

def do_something():
    #Our real business logic that may exit in many ways
    def _do_something():
        ...
    #Invoke the real business logic
    rval = _do_something()
    #Assert the type of the return value is as expected
    assert isinstance(rval,bool)
    return rval

Combining the two

Let's go back to our original problem code. What should the code look like if we want to combine both function argument asserts and return value asserts?

def average_string_length(stringlist,fieldname="stringlen"):
    #Our real business logic
    def _average_string_length(stringlist,fieldname):
        rval = dict()
        totalcount = 0
        for string in stringlist:
            totalcount += len(string)
        rval[fieldname] = totalcount/len(stringlist)
        return rval 

    #Assert the function arguments have the proper types
    assert isinstance(stringlist,list)
    assert isinstance(fieldname,str)
    for stringn in stringlist:
        assert isinstance(stringn,str)
    #Call the real business logic
    rval = _average_string_length(stringlist,fieldname)
    #Assert the return value has the proper type.
    assert isinstance(rval, dict)
    for k in rval.keys():
        assert isinstance(k,str)
        assert isinstance(rval[k],float)
    return rval

The alternative: typeconstraints

Did you notice that the assert code in our example was longer than our actual code? Defensive programming in Python this way carries quite a bit of code overhead. I hope by now it has become clear that there is a lot to be gained if we can do the same in a more concise way.

This is where the typeconstraints module comes in. Before we can use the module, we need to install it with pip3:

pip3 install typeconstraints

Then, at the start of our code, we import the typeconstraints decorator and whatever helper classes we need in our program:

from typeconstraints import typeconstraints, ARRAYOF,DICTOF

Let's jump right in. Before explaining the how and why, let us have a look at what our example code looks like using typeconstraints:

@typeconstraints([ARRAYOF(str),str],[DICTOF(float)])
def average_string_length(stringlist,fieldname="stringlen"):
    rval = dict()
    totalcount = 0
    for string in stringlist:
        totalcount += len(string)
    rval[fieldname] = totalcount/len(stringlist)
    return rval

The above code gives you the same assertion guarantees as the combined code from the previous section, with a lot less code to write and maintain. One single, admittedly cryptic, line gives us all the type constraints we need.

Understanding the type constraints

Let us examine the decorator line a bit closer using a simpler example first:

@typeconstraints([str,int],[float])
def silly_function(message,divider):
    return len(message)*1.0/divider

The best way to understand the above code is to compare it with a C++ version of the same code:

float silly_function(string message, int divider) {
    return message.size()*1.0 / divider;
}

The typeconstraints decorator takes two arguments:

  • A list of type constraint definitions for the function arguments in the same order as the argument list.
  • A list of type constraint definitions for the return value(s)
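To make the two-list idea concrete, here is a minimal sketch of how a decorator with this shape could be written. It is a hypothetical stand-in named simple_typeconstraints, not the code of the typeconstraints library, and it assumes a single return value that is checked against the first entry of the second list.

```python
import functools
import inspect

def simple_typeconstraints(arg_constraints, return_constraints):
    """Hypothetical stand-in, NOT the typeconstraints implementation."""
    def check(value, constraint):
        # A plain type is checked with isinstance; anything else is
        # assumed to be a callable that returns a boolean.
        if isinstance(constraint, type):
            return isinstance(value, constraint)
        return bool(constraint(value))

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Map the call onto parameter names, filling in defaults.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            pairs = zip(bound.arguments.items(), arg_constraints)
            for (name, value), constraint in pairs:
                assert check(value, constraint), "bad type for " + name
            rval = func(*args, **kwargs)
            # Assumption: one return value, checked against entry 0.
            assert check(rval, return_constraints[0]), "bad return type"
            return rval
        return wrapper
    return decorator

@simple_typeconstraints([str, int], [float])
def silly_function(message, divider):
    return len(message) * 1.0 / divider

print(silly_function("hello", 2))  # 2.5
```

Calling silly_function(5, 2) with this sketch raises an AssertionError before the function body runs, because 5 is not a str.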

Simulating static types

The simple example above used just basic types. You can use typeconstraints like that using types like int, float, str, bool, list, and dict. Just realize that when using dicts or lists, the asserts will be shallow. In a language like C++, dicts and lists will always be typed. That is, a list is always a list of elements of one specific type and the same is true of a dict-like container. The typeconstraints library comes with two type constraint classes that simulate such constraints:

  • ARRAYOF
  • DICTOF

Let us look at the first lines of the typeconstraint decorator usage for our original function once more:

@typeconstraints([ARRAYOF(str),str],[DICTOF(float)])
def average_string_length(stringlist,fieldname="stringlen"):
    ...

ARRAYOF(str) denotes that an argument should be of a list type and that each element in the list should be of the str type. Likewise, DICTOF(float) denotes a dict where each value should be of the type float; here it constrains the return value.
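The difference between a shallow and a deep check can be sketched with two small callables. ArrayOfSketch and DictOfSketch are hypothetical names used for illustration; they mimic the behaviour described above and are not the library's own classes.

```python
class ArrayOfSketch:
    """Hypothetical: a list whose elements all have one type."""
    def __init__(self, element_type):
        self.element_type = element_type

    def __call__(self, arg):
        return isinstance(arg, list) and all(
            isinstance(element, self.element_type) for element in arg)

class DictOfSketch:
    """Hypothetical: a dict whose values all have one type."""
    def __init__(self, value_type):
        self.value_type = value_type

    def __call__(self, arg):
        return isinstance(arg, dict) and all(
            isinstance(value, self.value_type) for value in arg.values())

strings = ArrayOfSketch(str)
print(strings(["amsterdam", "cotonou"]))  # True
print(strings("amsterdam"))               # False: a str is not a list
print(strings(["amsterdam", 7]))          # False: 7 is not a str
print(DictOfSketch(float)({"stringlen": 8.0}))  # True
```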

Dicts as a statically typed struct

While single-type dicts do occur in Python programming, more often than not dicts are used more like records or structs, where different fields have different types but the type belonging to the field designated by a specific key is fixed. For these scenarios, we have the MIXEDDICT class to help us. The one mandatory argument to MIXEDDICT is a dict defining the type constraints for each relevant key.

@typeconstraints([MIXEDDICT({"name": str, "salary": float, "position": str})], 
    [bool])

Often a field is optional but should have a certain type if present. For that, we can define some of the fields as optional.

@typeconstraints([MIXEDDICT({"name": str, "salary": float, "position": str}, 
    optionals=["position"])], [bool])

By default, MIXEDDICT will fail its assert if there are unknown keys. This behaviour can be changed by setting the ignore_extra argument.

@typeconstraints([MIXEDDICT({"name": str, "salary": float, "position": str},
    optionals=["position"],ignore_extra=True)], [bool])
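The three rules described above (a fixed type per key, optional keys, and tolerance for extra keys) can be sketched as follows. MixedDictSketch is a hypothetical illustration of the described behaviour, not the library's MIXEDDICT implementation.

```python
class MixedDictSketch:
    """Hypothetical: a dict used as a record with typed fields."""
    def __init__(self, constraints, optionals=None, ignore_extra=False):
        self.constraints = constraints
        self.optionals = set(optionals or [])
        self.ignore_extra = ignore_extra

    def __call__(self, arg):
        if not isinstance(arg, dict):
            return False
        for key, expected_type in self.constraints.items():
            if key not in arg:
                # A missing key is only acceptable if it is optional.
                if key not in self.optionals:
                    return False
            elif not isinstance(arg[key], expected_type):
                return False
        if not self.ignore_extra:
            # Unknown keys are rejected unless explicitly tolerated.
            for key in arg:
                if key not in self.constraints:
                    return False
        return True

fields = {"name": str, "salary": float, "position": str}
strict = MixedDictSketch(fields, optionals=["position"])
lenient = MixedDictSketch(fields, optionals=["position"], ignore_extra=True)

print(strict({"name": "Ann", "salary": 10.0}))              # True
print(strict({"name": "Ann", "salary": 10.0, "age": 30}))   # False
print(lenient({"name": "Ann", "salary": 10.0, "age": 30}))  # True
```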

Lists as statically typed struct (of sorts)

With Python lists being as type-flexible as they are, lists are often used as structs as well, where the position of an element in the list determines its meaning and thereby its type. The MIXEDARRAY class allows us to define a constraint for these purposes.

@typeconstraints([MIXEDARRAY([str, float,str])], [bool])
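A positional check of this kind could look like the sketch below. MixedArraySketch is a hypothetical name, and the requirement that the list length matches the number of declared types is an assumption of this sketch rather than documented library behaviour.

```python
class MixedArraySketch:
    """Hypothetical: a list used as a record with typed positions."""
    def __init__(self, element_types):
        self.element_types = element_types

    def __call__(self, arg):
        # Assumption: the list must have exactly one element per type.
        if not isinstance(arg, list) or len(arg) != len(self.element_types):
            return False
        return all(isinstance(value, expected)
                   for value, expected in zip(arg, self.element_types))

record = MixedArraySketch([str, float, str])
print(record(["Ann", 10.0, "developer"]))  # True
print(record(["Ann", "developer", 10.0]))  # False: wrong positions
```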

Nonnable arguments

If an argument is a container, like a dict or list, and should default to an empty container, it is unsafe in Python to define [], list(), {} or dict() as the default, because the default value is created only once and is then shared between calls. For that reason, it is common practice to implement such a default like this:

def some_function(arg1, arg2=None):
    if arg2 is None:
        arg2 = []
    ...
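The reason a mutable default is unsafe is easiest to see by running it. In this small sketch, append_unsafe and append_safe are hypothetical function names used only for the demonstration; the second one follows the None-default pattern.

```python
def append_unsafe(item, target=[]):
    # The default list is created once, when the function is defined,
    # and is reused by every call that omits target.
    target.append(item)
    return target

def append_safe(item, target=None):
    # A fresh list is created on each call that omits target.
    if target is None:
        target = []
    target.append(item)
    return target

first = append_unsafe("a")
second = append_unsafe("b")
print(second)           # ['a', 'b']
print(first is second)  # True: both calls returned the same list

print(append_safe("a"))  # ['a']
print(append_safe("b"))  # ['b']
```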

To accommodate this usage pattern, the typeconstraints library has the NONNABLE class.

@typeconstraints([str,NONNABLE(ARRAYOF(str))],[bool])
def some_function(arg1, arg2=None):
    if arg2 is None:
        arg2 = []

This allows the argument to be of the specified type but also allows the default None as a valid value for the argument.
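Conceptually, such a constraint wraps another constraint and lets None through. The class below is a hypothetical sketch of that idea (NonnableSketch is not the library's NONNABLE), accepting either a plain type or a callable constraint as its inner check.

```python
class NonnableSketch:
    """Hypothetical: accept None, otherwise defer to an inner constraint."""
    def __init__(self, constraint):
        self.constraint = constraint

    def __call__(self, arg):
        if arg is None:
            return True
        # A plain type is checked with isinstance; anything else is
        # assumed to be a callable that returns a boolean.
        if isinstance(self.constraint, type):
            return isinstance(arg, self.constraint)
        return bool(self.constraint(arg))

maybe_list = NonnableSketch(list)
print(maybe_list(None))     # True
print(maybe_list(["a"]))    # True
print(maybe_list("a"))      # False
```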

Arguments that can take multiple types

While strict constraints are useful for many cases, there are situations where we know our code works for a range of types not necessarily connected hierarchically. For those situations, the ANYOF class can help us define a more flexible constraint.

@typeconstraints([ANYOF(int,float), ANYOF(int,float)],[ANYOF(int,float)])
def multiply(x,y):
    return x*y
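In plain Python, the same "one of several types" check can be written by passing a tuple of types to isinstance. The sketch below wraps that in a hypothetical AnyOfSketch class to show the idea; it is not the library's ANYOF implementation.

```python
class AnyOfSketch:
    """Hypothetical: accept a value that matches any of the given types."""
    def __init__(self, *types):
        self.types = types

    def __call__(self, arg):
        # isinstance accepts a tuple of types and succeeds on any match.
        return isinstance(arg, self.types)

number = AnyOfSketch(int, float)
print(number(3))      # True
print(number(2.5))    # True
print(number("3"))    # False
```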

Duck Typing

One of the core idioms of Python with respect to types for objects is duck typing. With duck typing, an object's suitability for usage by a method or function is determined by the presence of certain methods. The DUCK class gives us a rudimentary handle for using duck typing based type constraints in our typeconstraints.

class Foo(object):
    def method1(self,arg1,arg2):
        ...
    def method2(self,arg1,arg2):
        ...

class Bar(object):
    def method1(self,arg1,arg2):
        ...
    def method2(self,arg1,arg2):
        ...

@typeconstraints([DUCK(Foo)],[bool])
def baz(foo):
    ...
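A rudimentary duck-typing check can be sketched by comparing method names. The DuckSketch class below is a hypothetical illustration that only looks at the presence of public callables with the same names as those of the prototype class; how the library's DUCK class decides is not shown in this tutorial, so treat the details as assumptions.

```python
def public_methods(cls):
    # Names of all public callables defined on (or inherited by) cls.
    return {name for name in dir(cls)
            if not name.startswith("_") and callable(getattr(cls, name))}

class DuckSketch:
    """Hypothetical: accept any object offering the prototype's methods."""
    def __init__(self, prototype):
        self.required = public_methods(prototype)

    def __call__(self, arg):
        return all(callable(getattr(arg, name, None))
                   for name in self.required)

class Foo:
    def method1(self, arg1, arg2): ...
    def method2(self, arg1, arg2): ...

class Bar:
    # Unrelated to Foo, but offers the same methods (and one more).
    def method1(self, arg1, arg2): ...
    def method2(self, arg1, arg2): ...
    def extra(self): ...

class Qux:
    def method1(self, arg1, arg2): ...

looks_like_foo = DuckSketch(Foo)
print(looks_like_foo(Bar()))  # True
print(looks_like_foo(Qux()))  # False: method2 is missing
```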

Writing your own type constraint class

While the type constraint classes provided by the typeconstraints library cover many scenarios, chances are you will run into situations where none of them is a sufficient fit. In those cases, you will need to write a type constraint class yourself. A type constraint class is a callable that takes one argument: the value that needs to be type asserted. Note that the callable should not raise AssertionError when called (raising it from the constructor is OK, though). The callable should return a boolean. Furthermore, for clarity of nested error information, it is advisable to give your class an error_msg attribute.

Let us revisit our previous example:

@typeconstraints([str,int],[float])
def silly_function(message,divider):
    return len(message)*1.0/divider

While there are better ways to fix this, we may want a type constraint here that assures us the divider is a positive number: not negative and not zero.

Let's see what a class for that could look like:

class POSITIVEINT(object):
    def __init__(self):
        #Initialize error_msg
        self.error_msg = ""
    def __call__(self,arg):
        #Clear error_msg
        self.error_msg = ""
        #Validate that the argument supplied is an integer
        if isinstance(arg, int):
            #Validate the integer is positive
            if arg > 0:
                return True
            else:
                #Set our internal error message
                self.error_msg = "Argument is not a positive number"
        else:
            #Set our internal error message
            self.error_msg = "Argument is not an integer."
        return False

Now we can use the new class in our typeconstraints invocation

@typeconstraints([str,POSITIVEINT()],[float])
def silly_function(message,divider):
    return len(message)*1.0/divider
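Because a constraint class is just a callable, it can be tried out on its own before wiring it into the decorator. The sketch below repeats the class in a slightly flattened form, with the same checks and messages, and calls it directly; the check variable is only used for illustration.

```python
class POSITIVEINT(object):
    def __init__(self):
        self.error_msg = ""

    def __call__(self, arg):
        # Clear any message left over from a previous call
        self.error_msg = ""
        if not isinstance(arg, int):
            self.error_msg = "Argument is not an integer."
            return False
        if arg <= 0:
            self.error_msg = "Argument is not a positive number"
            return False
        return True

check = POSITIVEINT()
print(check(3))                     # True
print(check(0), check.error_msg)    # False Argument is not a positive number
print(check("3"), check.error_msg)  # False Argument is not an integer.
```

One thing to be aware of: bool is a subclass of int in Python, so this check also accepts True. If that matters for your code, an explicit test for bool would have to be added.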

Contribute

If you end up writing type constraint classes that you feel you can reuse, chances are other people would find them useful as well. Consider forking typeconstraints on GitHub and submitting your type constraint class in a pull request so others can use it too. If you do, please also look at the test.py script and be sure to add at least one positive and at least one negative test to the script for each type constraint class you submit.


Thank you for your contribution @mattockfs.
We've been reviewing your tutorial and suggest the points below:

  • We suggest you enter comments in your code. It is very important to have comments in the code as it helps the reader to better understand the code that you have written.

Your tutorial is very well explained and detailed. Thank you for your work.


So I'm really excited for somebody to come out with a DAPP on top of steem for learning code.. Thanks for sharing!

