Flask and Pytest coverage

I have written before about Flask and obtaining test coverage results here and with an update here. This is pretty trivial if you're writing unit tests that directly call the application, but if you actually want to write tests which animate a browser, for example with selenium, then it's a little more complicated, because the browser/test code has to run concurrently with the server code.

Previously I would have the Flask server run in a separate process and run 'coverage' over that process. This was slightly unsatisfying, partly because you sometimes want coverage analysis of your actual tests. Test suites, just like application code, can grow in size with many utility functions, imports and so on, which may eventually end up not actually being used. So it is good to know that you're not needlessly maintaining some test code which is never actually invoked.

We could probably get around this restriction by running coverage in both the server process and the test-runner's process and combining the results (or simply viewing them separately). However, this was unsatisfying simply because it felt like something that should not be necessary. Today I spent a bit of time setting up a scheme to test a Flask application without the need for a separate process.

I solved this by not using Flask's included Werkzeug server and instead using the WSGI server from the standard-library wsgiref.simple_server module. Here is a minimal example:

import flask

class Configuration(object):
    TEST_SERVER_PORT = 5001

application = flask.Flask(__name__)
application.config.from_object(Configuration)


@application.route("/")
def frontpage():
    if False:
        pass # Should not be covered
    else:
        return 'I am the lizard queen!' # Should be in coverage.



# Now for some testing.
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import pytest
# Currently just used for the temporary hack to quit the phantomjs process;
# see below in finalise.
import signal

import threading

import wsgiref.simple_server

class ServerThread(threading.Thread):
    def setup(self):
        application.config['TESTING'] = True
        self.port = application.config['TEST_SERVER_PORT']

    def run(self):
        self.httpd = wsgiref.simple_server.make_server('localhost', self.port, application)
        self.httpd.serve_forever()

    def stop(self):
        self.httpd.shutdown()

class BrowserClient(object):
    """Interacts with a running instance of the application via animating a
    browser."""
    def __init__(self, browser="phantom"):
        driver_class = {
            'phantom': webdriver.PhantomJS,
            'chrome': webdriver.Chrome,
            'firefox': webdriver.Firefox
            }.get(browser)
        self.driver = driver_class()
        self.driver.set_window_size(1200, 760)


    def finalise(self):
        self.driver.close()
        # A bit of a hack: there is a bug, I believe in phantomjs rather
        # than selenium, which means the phantomjs process is not being
        # killed, so we do so explicitly here for the time being. We can
        # remove this when that bug is fixed.
        # See: https://github.com/SeleniumHQ/selenium/issues/767
        self.driver.service.process.send_signal(signal.SIGTERM)
        self.driver.quit()


    def log_current_page(self, message=None, output_basename=None):
        content = self.driver.page_source
        # This is frequently what we really care about so I also output it
        # here as well to make it convenient to inspect (with highlighting).
        basename = output_basename or 'log-current-page'
        file_name = basename + '.html'
        with open(file_name, 'w') as outfile:
            if message:
                outfile.write("<!-- {} --> ".format(message))
            outfile.write(content)
        file_name = basename + '.png'
        self.driver.save_screenshot(file_name)

def make_url(endpoint, **kwargs):
    with application.app_context():
        return flask.url_for(endpoint, **kwargs)

# TODO: Ultimately we'll need a fixture so that we can have multiple
# test functions that all use the same server thread and possibly the same
# browser client.
def test_server():
    server_thread = ServerThread()
    server_thread.setup()
    server_thread.start()

    client = BrowserClient()
    driver = client.driver

    try:
        port = application.config['TEST_SERVER_PORT']
        application.config['SERVER_NAME'] = 'localhost:{}'.format(port)

        driver.get(make_url('frontpage'))
        assert 'I am the lizard queen!' in driver.page_source

    finally:
        client.finalise()
        server_thread.stop()
        server_thread.join()

To run this you will of course need flask as well as pytest, pytest-cov, and selenium:

$ pip install flask pytest pytest-cov selenium

In addition you will need phantomjs to run the browser tests:

$ npm install phantomjs
$ export PATH=$PATH:./node_modules/.bin/

Then to run it, the command is:

$ py.test --cov=./ app.py
$ coverage html

The coverage html step is of course optional, needed only if you wish to view the results in a friendly HTML format.

Notes

I've not used this extensively myself yet, so there may be some problems when using a more interesting Flask application.

Don't put your virtual environment directory in the same directory as app.py because in that case it will perform coverage analysis over the standard library and dependencies.

In a real application you will probably want to make a pytest fixture out of the server thread and browser client, so that you can use each across multiple separate test functions. Essentially your test function should then just be the part inside the try clause.

I have not used the log_current_page method in this example, but I frequently find it quite useful, so I have included it here nonetheless.

Python and Lambdas

I come from a functional programming background, so I have a lot of love for functions and so-called anonymous functions, or lambdas. However, I have realised that I don't make much use of Python's lambda syntax, and I wanted to articulate why. This may mostly sound pretty negative towards lambdas, but bear in mind that I'm certainly not against lambdas in general.

A lambda expression can always be avoided by simply defining the function in question. So the question becomes: when is a lambda more readable than a definition? Almost always when the function you're attempting to create is so simple that the definition syntax gets in the way. A lambda is often used when the function in question is going to be passed as an argument to some method. For example, the sort method on a list takes as an argument the key to be used when comparing the elements. So if you have a list of items and you want to sort them by their prices you can do something such as:

list_of_items.sort(key=lambda x: x.price)

We could of course have avoided the use of lambda with a definition:

def price(item):
    return item.price
list_of_items.sort(key=price)

This is not quite equivalent because it introduces a new name into the current scope that would not otherwise have been there, but I don't recall ever reaching for a lambda expression in order to avoid polluting the namespace.

We could even define a generic higher-order function to make a function out of attribute access.

def f_attr(attr_name):
    def attr_getter(item):
        return getattr(item, attr_name)
    return attr_getter

list_of_items.sort(key=f_attr('price'))

This is surely worse than the first two attempts, but if you have multiple sorts to do, then using definitions can get tedious:

def price(item):
    return item.price
list_of_items.sort(key=price)
do_something(list_of_items)

def quantity(item):
    return item.quantity
list_of_items.sort(key=quantity)
do_something(list_of_items)

def margin(item):
    return item.margin
list_of_items.sort(key=margin)
do_something(list_of_items)

as compared with the lambda version:

list_of_items.sort(key=lambda x: x.price)
do_something(list_of_items)
list_of_items.sort(key=lambda x: x.quantity)
do_something(list_of_items)
list_of_items.sort(key=lambda x: x.margin)
do_something(list_of_items)

The f_attr version is similar:

list_of_items.sort(key=f_attr('price'))
do_something(list_of_items)
list_of_items.sort(key=f_attr('quantity'))
do_something(list_of_items)
list_of_items.sort(key=f_attr('margin'))
do_something(list_of_items)

In fact that could be done with a loop:

for attribute in ['price', 'quantity', 'margin']:
    list_of_items.sort(key=f_attr(attribute))
    do_something(list_of_items)
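Incidentally, the standard library already provides the f_attr helper under the name operator.attrgetter. A self-contained sketch, using types.SimpleNamespace objects as stand-ins for the items in the running example:

```python
from operator import attrgetter
from types import SimpleNamespace

# Stand-ins for the items in the running example.
list_of_items = [SimpleNamespace(price=3), SimpleNamespace(price=1),
                 SimpleNamespace(price=2)]

# attrgetter('price') builds the same key function as f_attr('price').
list_of_items.sort(key=attrgetter('price'))
print([item.price for item in list_of_items])  # [1, 2, 3]
```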

I think the lambda version is at least sometimes more readable. However, I often miss the self-documenting nature of a definition, in that giving the function a name acts as some pretty decent documentation especially for the kind of simple function you might otherwise use a lambda expression for.

So, for a lambda expression to be the right choice, the function in question has to be so simple as to not benefit from the extra documentation afforded by its name (such as simple attribute access).

I just don't seem to find myself in that situation very often. Perhaps I'm just not making my own functions general enough by accepting functions as arguments.

Minor Refactorings

Looking through the code of an open-source Python project I came across code that amounts to the following:

    host_name = self.connection.getsockname()[0]
    if self.server.host_name is not None:
        host_name = self.server.host_name

This looks odd because if the condition is True then we throw away the host_name derived from the call to self.connection.getsockname. However, it could be that that call has some side-effect. Assuming that it does not, we can re-write the above code as:

    if self.server.host_name is not None:
        host_name = self.server.host_name
    else:
        host_name = self.connection.getsockname()[0]

Or even, if you prefer:

    host_name = self.server.host_name if self.server.host_name is not None else self.connection.getsockname()[0]

Or:

    host_name = self.server.host_name
    if host_name is None:
        host_name = self.connection.getsockname()[0]

Or finally:

    host_name = self.server.host_name or self.connection.getsockname()[0]

Note that this last version has slightly different semantics, in that it will use the call to getsockname if self.server.host_name is at all falsy, in particular if it is ''.
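The difference is easy to demonstrate. A small sketch, with two pick functions (names invented for illustration) standing in for the two refactorings:

```python
def explicit_pick(server_host_name, fallback):
    # The 'is not None' version: falls back only when host_name is None.
    return server_host_name if server_host_name is not None else fallback

def or_pick(server_host_name, fallback):
    # The 'or' version: falls back for any falsy value, including ''.
    return server_host_name or fallback

assert explicit_pick('', 'from-getsockname') == ''
assert or_pick('', 'from-getsockname') == 'from-getsockname'
# Both behave the same when host_name really is None.
assert explicit_pick(None, 'from-getsockname') == 'from-getsockname'
assert or_pick(None, 'from-getsockname') == 'from-getsockname'
```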

I love finding minor refactoring opportunities like this, and I enjoy going through code looking for such cases. I believe that if a project's source code has many such opportunities it hints that the authors do not engage in routine refactoring. However, this post is more about how such a refactoring interacts with Python's @property decorator.

The @property decorator

In this case we call getsockname, so when considering our refactoring we have to be careful that there are no significant side-effects. However even if the code had been:

    host_name = self.connection.host_name
    if self.server.host_name is not None:
        host_name = self.server.host_name

We would still have to go and check the code to make sure that neither connection nor host_name is implemented as a property, or, if one is, that it does not have significant side-effects.
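To make the concern concrete, here is a hypothetical Connection class (the names are invented for illustration) in which host_name reads like plain attribute access but is actually a property with a side-effect:

```python
class Connection(object):
    """Hypothetical stand-in: host_name looks like a simple attribute but
    actually does work on every access."""
    def __init__(self):
        self.lookups = 0

    @property
    def host_name(self):
        self.lookups += 1  # the hidden side-effect
        return 'example.local'

connection = Connection()
connection.host_name
connection.host_name
print(connection.lookups)  # 2: each "attribute access" did work
```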

One question arises: what to do if it does have significant side-effects? Leaving it as-is means subsequent readers of the code are likely to go through the same thought process, because the code as written looks suspicious.

First off, if I could, I'd refactor self.connection.host_name so that it does not have side-effects; perhaps the side-effects could be moved into a separate method that is called in any case:

    self.connection.setup_host_name()
    if self.server.host_name is not None:
        host_name = self.server.host_name
    else:
        host_name = self.connection.host_name

Assuming I don't have the ability to modify self.connection.host_name, I could still refactor my own code to have the side-effects separately. Something like:

class Whatever(...):
    def setup_host_name(self):
        self._host_name = self.connection.host_name

    ....
    self.setup_host_name()
    if self.server.host_name is not None:
        host_name = self.server.host_name
    else:
        host_name = self._host_name

I think that is better than the easy opt-out of a comment. Obviously in this case we have some control over the class in question, since we're referencing self. But imagine that self were some other variable whose class we don't necessarily have the ability to modify; then a more lightweight solution would be something like:

    if other.server.host_name is not None:
        other.connection.host_name
        host_name = other.server.host_name
    else:
        host_name = other.connection.host_name

This of course assumes that the side-effects in question are not related to other.server.host_name. An alternative in either case:

    connection_host_name = other.connection.host_name
    if other.server.host_name is not None:
        host_name = other.server.host_name
    else:
        host_name = connection_host_name

I'm not sure that either really conveys that other.connection.host_name is evaluated for its side-effects. Hence, in this case, I would opt for a comment, but you may disagree.

Conclusion

The @property decorator in Python is very powerful and I love it and use it regularly. However, we should note that it comes with the drawback that it can make code more difficult to evaluate locally.

A comprehensive test-suite would allay most fears of making such a minor refactoring.

Finally, such code is well worth refactoring when you find it in a code-base. The odd bit of code like this is always likely to sneak in unnoticed, and can often be the result of several other simplifications. However, if you find many of these in a project's source code, I believe it is symptomatic of at least one of two things: either the authors don't engage in much refactoring or general code maintenance, or there is a lack of a decent test-suite, which makes such refactorings more of a chore.

Final irrelevant aside

In a previous post I made a suggestion about if-as syntax. I'm still against this idea; however, this is the sort of situation it would (potentially) benefit. Our second syntax possibility could have been:

    host_name = self.server.host_name as hn if hn is not None else self.connection.getsockname()[0]

Whilst this is shorter and reduces some repetition, it's unclear to me if this is more readable.

List arguments and isinstance

More than a decade and a half ago, Kragen Javier Sitaker wrote a blog post isinstance() considered harmful, a lot of which I believe holds up pretty well today. It's well worth reading in its entirety, but the gist is that rather than testing a value against a specific type, you generally wish to check that a value conforms to a particular interface. Or, if you prefer, using isinstance imposes nominative typing in a language where duck typing is the convention.

I came across what I initially thought was a reasonable use of isinstance, and this post is about my attempts to remove the reliance on isinstance.

The case

Writing a web application often involves checking the contents of the current page for specific content, usually in the form of CSS selectors or simply text. I had written a simple method to check whether specific texts were contained in any elements matching a given CSS selector. For example, you may have some way of flashing messages and you want to check that after a given request the resulting web page contains the list of messages that you expect.

However, the general idea can be condensed to the simpler task of checking whether a list of texts is present within a main text. So something like the following:

def contains_texts(content, texts):
    for t in texts:
        if t not in content:
            return False
    return True

test_content = "abcdefg qwerty"
assert contains_texts(test_content, ['abcdefg', 'qwerty']) # Passes
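As an aside, the explicit loop is equivalent to the built-in all() over a generator expression; this version behaves identically, including the single-string pitfall:

```python
def contains_texts(content, texts):
    # True only if every text in texts occurs within content.
    return all(t in content for t in texts)

test_content = "abcdefg qwerty"
assert contains_texts(test_content, ['abcdefg', 'qwerty'])
```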

This works fine, but this is testing code, so we want to make sure that tests do not inadvertently pass. The texts argument is intended to be a list or, more generally, a container of strings. The potential problem is that it is easy enough to mistakenly call this method with a single string. Strings are iterable, so unfortunately the following assertion passes:

test_content = "abcdefg qwerty"
assert contains_texts(test_content, 'gfedcba') # Passes

This is because each of the characters in the single string argument is itself in the main content, even though the string as a whole is not.
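Concretely, the reversed string is absent as a whole, yet every one of its characters is present:

```python
test_content = "abcdefg qwerty"
assert 'gfedcba' not in test_content                # the string itself is absent
assert all(ch in test_content for ch in 'gfedcba')  # but each character is present
```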

A potential fix for this is to use isinstance on the argument:

def isinstance_contains_texts(content, texts):
    if isinstance(texts, str):
        return texts in content
    for t in texts:
        if t not in content:
            return False
    return True

test_content = "abcdefg qwerty"
assert isinstance_contains_texts(test_content, 'abcdefg') # Passes
assert not isinstance_contains_texts(test_content, 'gfedcba') # Passes

In this case we do the right thing if the method is called with a single string as the argument, but you could of course instead raise an error. So this basically solves our problem, but at the cost of using isinstance.

What happens if the content becomes a bytestring, and the input strings themselves are bytestrings?

assert isinstance_contains_texts(b'hello world', [b'hello', b'world']) # Passes
assert isinstance_contains_texts(b'hello world', b'olleh') # Passes

We have the same problem as before. We could of course solve this by using isinstance(texts, (str, bytes)), but that feels like we're not really solving the essence of the problem.

Without isinstance

So instead of using isinstance we could check which interface is supported by the argument. If we want to check that it is a singleton argument then we would be looking for something supported by strings and bytestrings, but not by lists, sets, or other general containers:

>>> str_and_bytestring = [x for x in dir('') if x in dir(b'')]
>>> stringy_but_not_listy = [x for x in str_and_bytestring if x not in dir([])]
>>> stringy_but_not_listy
['__getnewargs__', 'capitalize', 'center', 'endswith', 'expandtabs', 'find', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']

Okay so our test could be whether or not the texts argument has an attribute capitalize:

def interface_contains_texts(content, texts):
    if hasattr(texts, 'capitalize'):
        return texts in content
    for t in texts:
        if t not in content:
            return False
    return True

test_content = "abcdefg qwerty"
assert interface_contains_texts(test_content, 'abcdefg') # Passes
assert not interface_contains_texts(test_content, 'gfedcba') # Passes
assert interface_contains_texts(b'hello world', [b'hello', b'world'])
assert not interface_contains_texts(b'hello world', b'olleh')

Sort of success, but this feels as if we're not really writing down what we mean. We don't really care if texts has a capitalize attribute at all. What we want to avoid is texts being a single argument that happens to be iterable. We can check which attributes containers such as lists and sets have that strings or bytestrings do not:

>>> list_and_set = [x for x in dir([]) if x in dir(set())]
>>> str_or_bytestring = dir('') + dir(b'')
>>> listy_but_not_string = [x for x in list_and_set if x not in str_or_bytestring]
>>> listy_but_not_string
['clear', 'copy', 'pop', 'remove']

So we can check if the first argument has a 'clear' attribute:

def interface_contains_texts(content, texts):
    if not hasattr(texts, 'clear'):
        return texts in content
    for t in texts:
        if t not in content:
            return False
    return True

test_content = "abcdefg qwerty"
assert interface_contains_texts(test_content, 'abcdefg') # Passes
assert not interface_contains_texts(test_content, 'gfedcba') # Passes
assert interface_contains_texts(b'hello world', [b'hello', b'world']) # Passes
assert not interface_contains_texts(b'hello world', b'olleh') # Passes

Unfortunately, you now cannot pass a generator; the following fails with a TypeError because interface_contains_texts attempts to check whether the generator is in the given string.

assert interface_contains_texts(test_content, (x for x in test_content.split())) 

Unfortunately there is no attribute that is present on lists, sets, and generators but not on strings or bytestrings. However, generators do have a close attribute, so we could do:

def interface_contains_texts(content, texts):
    if not hasattr(texts, 'clear') and not hasattr(texts, 'close'):
        ...

This does feel close to what we want. We are saying "if the argument is not a list- or set-style container, nor a generator, then we treat it as a singleton".
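Spelled out in full (assuming the elided body is the same membership loop as before), the final check might look like:

```python
def interface_contains_texts(content, texts):
    # Neither list/set-style ('clear') nor generator-style ('close')
    # methods: treat texts as a single text rather than a container.
    if not hasattr(texts, 'clear') and not hasattr(texts, 'close'):
        return texts in content
    for t in texts:
        if t not in content:
            return False
    return True

test_content = "abcdefg qwerty"
assert interface_contains_texts(test_content, 'abcdefg')
assert not interface_contains_texts(test_content, 'gfedcba')
# Generators now work, since they have a 'close' attribute.
assert interface_contains_texts(test_content, (x for x in test_content.split()))
```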

All of this could be avoided

Another way to do this would be to allow a variable number of arguments into the contains_texts function:

def variable_args_contains_texts(content, *texts):
    for t in texts:
        if t not in content:
            return False
    return True

test_content = "abcdefg qwerty"
assert variable_args_contains_texts(test_content, 'abcdefg') # Passes
assert not variable_args_contains_texts(test_content, 'gfedcba') # Passes
assert variable_args_contains_texts(test_content, *['abcdefg', 'qwerty']) # Passes
assert variable_args_contains_texts(b'hello world', b'hello', b'world') # Passes
assert not variable_args_contains_texts(b'hello world', b'olleh') # Passes
assert variable_args_contains_texts(test_content, *(x for x in test_content.split())) # Passes

This works for all our cases. You do have to star the argument if you already have a list or generator:

assert variable_args_contains_texts(test_content, *['abcdefg', 'qwerty']) # Passes
assert variable_args_contains_texts(test_content, *(x for x in test_content.split())) # Passes

However, recall that our original problem was that if you accidentally passed in a single string as the argument your test function might quietly, and erroneously, pass. If you forget the star with this function:

>>> variable_args_contains_texts(test_content, ['abcdefg', 'qwerty'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in variable_args_contains_texts
TypeError: 'in <string>' requires string as left operand, not list

You get an error. That's quite alright with me: it flags up immediately that I've written my test incorrectly so that I can fix it. Even if I had the original contains_texts and updated it to this variable-args version, re-running my test-suite would let me know of any call sites I'd forgotten to update.

Conclusion

It's debatable whether or not the interface_contains_texts version was better than the original isinstance_contains_texts version. I prefer it because I feel it is a bit more general. I definitely prefer the eventual variable_args_contains_texts version. I guess that's the main conclusion for now: re-examining your uses of isinstance may lead to a better implementation.

Although I was able to use variable arguments to solve the problem in this case, there will obviously be cases where that fix does not apply. However, the main takeaway is that you don't necessarily have to swap your isinstance use for a hasattr use, thereby directly translating the use of isinstance. It could be that by taking a slightly broader look at the general problem you're trying to solve, another solution becomes available.

Login not required pattern

Introduction

Typically, routes in a web application that require login are explicitly marked as such, whilst routes that are open to general (anonymous) users are left unmarked and hence implicitly do not require login. Because the default is to allow all users to visit a particular route, it is easy to forget to mark a route as requiring login.

I'm going to show a small pattern for making sure that all routes in a web application are explicitly marked as either requiring login or not-requiring login. As an example this will be done in a Python, Flask-based web application using Flask-Login but the general idea probably works in at least some other Python web application frameworks.

The Basics

Let's start with a very simple Flask web application that just has two routes:

import flask

app = flask.Flask(__name__, static_folder=None)

@app.route('/unprotected')
def unprotected():
    return 'Hello everyone, anonymous users included!'

@app.route('/protected')
def protected():
    return 'Hello users, only those of you logged-in!'

if __name__ == '__main__':
    app.run()

Now, obviously we want the protected route to be available only to those users who have logged in, whilst the unprotected route is available to all. We'll leave out the details of how users actually sign up, log in and log out; see the Flask-Login documentation for examples.

To mark our protected route we can use the decorator provided by Flask-Login:

import flask_login
...

@app.route('/protected')
@flask_login.login_required
def protected():
    return 'Hello users, only those of you logged-in!'
...

I only added two lines: the import of flask_login and a second decorator on the protected method. This is how Flask-Login works: you decorate those routes that you wish to be protected. As I said, this scheme is fine, but it is easy enough to forget to mark a route that should be protected.

Explicitly mark all routes

The scheme I used was to make a decorator that accepts a parameter. If the parameter is True then all routes associated with the decorated view function are marked as requiring login. In addition, the decorator sets an attribute on the view function itself indicating whether or not login is required. We can then check all view functions and assert that they have the chosen attribute. So first the new view functions, with the decorator we'll define shortly:

@app.route('/unprotected')
@login_required(False)
def unprotected():
    return 'Hello everyone, anonymous users included!'

@app.route('/protected')
@login_required(True)
def protected():
    return 'Hello users, only those of you logged-in!'

So fairly simple stuff. Now to create the decorator:

def login_required(required):
    def decorator(f):
        if required:
            f = flask_login.login_required(f)
        f.login_required = required
        return f
    return decorator

This just calls the original @flask_login.login_required decorator in the case that the argument is True, but in addition adds an attribute to the view function that we can check later.
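Stripped of Flask entirely, the mechanism is simply that a decorator is free to attach attributes to the function it wraps. A minimal sketch (the view bodies are invented for illustration):

```python
def login_required(required):
    def decorator(f):
        # Record the declaration on the function itself so that it can
        # be audited later.
        f.login_required = required
        return f
    return decorator

@login_required(True)
def protected():
    return 'members only'

@login_required(False)
def unprotected():
    return 'open to all'

assert protected.login_required is True
assert unprotected.login_required is False
```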

Ensure all are marked

So to ensure that all routes have been marked in some way you just need to check all view functions have the login_required attribute set:

for name, view_f in app.view_functions.items():
    message = "{} needs to set whether login is required or not".format(name)
    assert hasattr(view_f, 'login_required'), message

if __name__ == '__main__':
    app.run()

Now if you run this against a typical application you will get such an error. That is because you don't set up all of your own routes; in particular, Flask by default provides a static route (we suppressed it above by passing static_folder=None, but most applications will have it). That's easy enough to ignore though:

ignored_views = ['static']

for name, view_f in app.view_functions.items():
    message = "{} needs to set whether login is required or not".format(name)
    assert name in ignored_views or hasattr(view_f, 'login_required'), message

if __name__ == '__main__':
    app.run()

Tidy up

There are just a couple of small caveats. Firstly, Flask-Login provides a decorator to indicate that not only is login required, but it must be a fresh login. We can allow passing "fresh" as an argument to our login_required decorator:

def login_required(required):
    def decorator(f):
        if required == 'fresh':
            f = flask_login.fresh_login_required(f)
        elif required:
            f = flask_login.login_required(f)
        f.login_required = required
        return f
    return decorator

To avoid polluting the namespace with our ignored_views name and to indicate what the code is doing without needing a comment we can wrap our check in a function:

def check_all_views_declare_login_required():
    ignored_views = ['static']

    for name, view_f in app.view_functions.items():
        message = "{} needs to set whether login is required or not".format(name)
        assert name in ignored_views or hasattr(view_f, 'login_required'), message
check_all_views_declare_login_required()

if __name__ == '__main__':
    app.run()

Flask extensions

Many Flask extensions provide their own routes which are mapped to view functions that they have defined. You could add those to the ignored_views list, but this means additional maintenance. Worse, something like Flask-Admin adds a large number of view functions. In addition, if you set up Flask-Admin to automatically generate model views for each of the models in your database, then whenever you add to your database model it will generate more view functions and you will have to update your ignored_views list again. The alternative is to perform your check_all_views_declare_login_required before you call init_app on your admin instance. Suppose you have:

import flask
import flask_login

import flask_admin
import flask_debugtoolbar
from flask_sqlalchemy import SQLAlchemy

database = SQLAlchemy()
admin = flask_admin.Admin(name='My app', template_mode='bootstrap3')

.... Code to setup the database and add flask-admin model views ...

.... Actual web application code ...

check_all_views_declare_login_required()
# Initialise the Flask-Admin and Flask-DebugToolbar extensions after we have
# checked our own views for declaring login required.
admin.init_app(app)
flask_debugtoolbar.DebugToolbarExtension(app)

if __name__ == '__main__':
    app.run()

Final code

Just to provide the code in as simple a form as possible:

import flask
import flask_login

app = flask.Flask(__name__, static_folder=None)

def login_required(required):
    def decorator(f):
        if required == 'fresh':
            f = flask_login.fresh_login_required(f)
        elif required:
            f = flask_login.login_required(f)
        f.login_required = required
        return f
    return decorator


@app.route('/unprotected')
@login_required(False)
def unprotected():
    return 'Hello everyone, anonymous users included!'

@app.route('/protected')
@login_required(True)
def protected():
    return 'Hello users, only those of you logged-in!'

def check_all_views_declare_login_required():
    ignored_views = ['static']

    for name, view_f in app.view_functions.items():
        message = "{} needs to set whether login is required or not".format(name)
        assert name in ignored_views or hasattr(view_f, 'login_required'), message
check_all_views_declare_login_required()

if __name__ == '__main__':
    app.run()

Conclusion

I quite like that I now explicitly state for each route whether I require login or not, rather than relying on a default. You could of course extend this to your particular privileges scheme, for example you may have levels of authentication, such as, administrator, super-user, paid-user, normal-user, anonymous, or whatever.

Because the check is done whenever the module is imported, this check will also be performed when running your test-suite.

This won't work if your view functions are actually view methods, because you won't be able to set the attribute on the view method. It is for exactly this reason that we had to use a list of ignored_views and could not just do:

    static_view_function = app.view_functions['static']
    static_view_function.login_required = False

An alternative to this scheme would be to create your own route decorator that takes as parameter whether login is required as well as the normal route information. Essentially combine route and login decorators.
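A minimal sketch of that combined decorator (all names here are mine; TinyApp is a tiny stand-in for flask.Flask so the example is self-contained):

```python
class TinyApp:
    """Stand-in for flask.Flask: just enough to register views."""
    def __init__(self):
        self.view_functions = {}

    def add_url_rule(self, rule, endpoint, view_func, **options):
        self.view_functions[endpoint] = view_func

app = TinyApp()

def route(rule, login_required, **options):
    """Register a route and declare its login requirement in one step."""
    def decorator(f):
        # A real version would wrap f with flask_login.login_required
        # when login_required is truthy.
        f.login_required = login_required
        app.add_url_rule(rule, f.__name__, f, **options)
        return f
    return decorator

@route('/protected', login_required=True)
def protected():
    return 'members only'
```

Since login_required is a mandatory argument, forgetting to declare it raises a TypeError at import time, which makes the separate import-time check unnecessary.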

if as syntax possibility

I have a small niggle that comes up in my Python programming. I'm going to describe it and propose a possible addition to the Python programming language that would mostly solve the problem. However, I have not thought through this language modification at all thoroughly, so at the moment it's just the germ of an idea. It wouldn't be a major change or fix any important problem; it would just solve my own personal peeve.

Properties

First of all, a brief reminder of the very useful property decorator in Python. You may have an attribute on a class which is initially a simple value. However, at some point later you realise that you need to calculate that attribute every time it is accessed. You may have something like:

class Person(object):
    def __init__(self, gender):
        self.gender = gender
        self.age = 0

You then realise that this means you would have to continually update the person's age. So instead you implement age as a "property":

from datetime import date

class Person(object):
    def __init__(self, gender):
        self.gender = gender
        self.dob = date.today()

    @property
    def age(self):
        # timedelta has no 'years' attribute, so approximate via days
        return (date.today() - self.dob).days // 365

For the most part I like this ability and use it frequently. You may even have two distinct classes which both implement a particular protocol, part of which is that the object in question must have a particular attribute. It may be that for one kind of object the attribute is indeed a simple attribute, but that for another it must be calculated.

So properties are, I think, generally a good addition to the language. One downside is that sometimes a simple attribute access, my_object.my_attribute, can be a more expensive operation than it looks, because it is actually implemented as a property.

Peeve

In Python I find myself doing something like the following quite often:

if object.attribute:
    do_something(object.attribute)
else:
    do_something_else(object)

For example, I might be showing a user of a web application why they cannot perform some action they are attempting to:

if post.unpublishable_reasons:
    for reason in post.unpublishable_reasons:
        flask.flash(reason)
    return render_template('error.html', ...)
else:
    return render_template('published.html', ...)

Where post.unpublishable_reasons might be a property on a class which may take some time to calculate:

class Post(database.Model):
    ....
    @property
    def unpublishable_reasons(self):
        ... something that takes time and ultimately calculates reasons...
        return reasons

Which then leads me to have code like:

reasons = post.unpublishable_reasons
if reasons:
    for reason in reasons:
        flask.flash(reason)
    return render_template('error.html', ...)
else:
    return render_template('published.html', ...)

This just irks me as a little inelegant. So I confess that I would quite like Python to allow something like:

if post.unpublishable_reasons as reasons:
    for reason in reasons:
        flask.flash(reason)
    return render_template('error.html', ...)
else:
    return render_template('published.html', ...)
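As it happens, Python 3.8 gained something very close to this: assignment expressions (PEP 572), which let the condition bind a name. A self-contained illustration (this Post class is just a stand-in for the real one):

```python
class Post:
    @property
    def unpublishable_reasons(self):
        # stand-in for something that takes time to calculate
        return ['missing title']

post = Post()

# The property is evaluated once and bound to 'reasons' in the condition.
if reasons := post.unpublishable_reasons:
    result = 'error: ' + ', '.join(reasons)
else:
    result = 'published'
```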

Comprehensions

This might also partially solve a problem with comprehensions in that you can filter-then-map but not map-then-filter. A filter-then-map looks like:

positive_doubles = [i * 2 for i in my_numbers if i > 0]

We first filter each number (keeping those greater than zero), then each number which makes it through the filter is multiplied by two. It's a bit awkward to do the opposite: mapping the value first and then filtering on the result of the map. So if we want all squares of a list of numbers where the square is less than 100, we have to write the operation twice:

small_squares = [i * i for i in my_numbers if (i * i) < 100]

Notice we had to have i * i twice. It might be that the operation in question is quite expensive, so instead we can have two comprehensions:

small_squares = [x for x in (i * i for i in my_numbers) if x < 100]

Of course if the list is very long this might look a bit slow, because we appear to iterate through it twice. Edit: not true, since the inner loop is a generator, as pointed out in the comments below.
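To make the two formulations concrete, a runnable comparison (my_numbers is just an example list):

```python
my_numbers = [-3, 2, 5, 9, 12]

# filter-then-map: filter each number, then transform the survivors.
positive_doubles = [i * 2 for i in my_numbers if i > 0]

# map-then-filter: the inner generator computes each square exactly once,
# and, being lazy, is consumed as the outer comprehension iterates, so the
# list is only traversed once overall.
small_squares = [x for x in (i * i for i in my_numbers) if x < 100]
```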

Now if we allow some kind of if-as syntax we might be able to do something like:

new_list = [y for x in old_list if f(x) as y]

This doesn't allow general filtering but would allow filtering out of falsey values. I'm much less keen on this as I feel if Python were to attack the map-then-filter problem for comprehensions then it should solve it completely.

In particular this would not work for the small_squares example, for that we would need to allow something like:

small_squares = [x for i in my_numbers if (i * i as x) < 100]

Note that this is even more of an extension than the one proposed above: it binds x to a particular sub-expression of the condition.
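For what it's worth, Python 3.8's assignment expressions (PEP 572) did end up supporting this comprehension pattern directly, binding the mapped value inside the condition:

```python
my_numbers = [-3, 2, 5, 9, 12]

# i * i is computed once per element; the walrus operator names it x,
# which is then usable as the output expression (Python 3.8+).
small_squares = [x for i in my_numbers if (x := i * i) < 100]
```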

Lazy calculation

In many cases whilst programming there is a decision to be made as to whether to store some state, or (re)calculate it as and when needed. Obviously every situation is different and therefore there is no one answer which fits all. In this post I'm going to attempt to explain the distinction and the benefits/drawbacks of either approach. I hope that just remembering that this choice exists will force me to make it explicitly, such that I think about it a bit more.

Distinction

The distinction is a little akin to that between eager evaluation and lazy evaluation, but it is not the same; the distinction here is about code maintenance. I'll start with a very simple example, and then move to a more realistic example which involves access to a database, which makes the decision a bit more interesting.

Suppose you have a very simple class representing a person and the children that they may have:

class Person(object):
    def __init__(self, gender):
        self.gender = gender
        self.boys = []
        self.girls = []

    def have_baby_girl(self):
        girl = Person('female')
        self.girls.append(girl)
        return girl

    def have_baby_boy(self):
        boy = Person('male')
        self.boys.append(boy)
        return boy

Now, suppose somewhere in your code you wish to return the number of children that a particular person has. You can either keep track of this, or calculate it on the fly, here is the keep-track-of-it version:

class Person(object):
    def __init__(self, gender):
        self.gender = gender
        self.boys = []
        self.girls = []
        self.number_of_children = 0

    def have_baby_girl(self):
        girl = Person('female')
        self.girls.append(girl)
        self.number_of_children += 1
        return girl

    def have_baby_boy(self):
        boy = Person('male')
        self.boys.append(boy)
        self.number_of_children += 1
        return boy

Here is a possible calculate-it-on-the-fly version:

class Person(object):
    def __init__(self, gender):
        self.gender = gender
        self.boys = []
        self.girls = []

    @property
    def number_of_children(self):
        return len(self.boys) + len(self.girls)

    def have_baby_girl(self):
        girl = Person('female')
        self.girls.append(girl)
        return girl

    def have_baby_boy(self):
        boy = Person('male')
        self.boys.append(boy)
        return boy

In this particular case I prefer the calculate-it-on-the-fly approach. I like that I did not have to modify any existing code, only add some. If I add some other method (suppose an adopt method), then in the keep-track-of-it version I have to make sure I update the state variable number_of_children appropriately. Finally, if we change our definition of what a 'child' is, suppose they have to be under 18 years of age, then keep-track-of-it might not work at all, or if it does, I have to be very careful to update parents whenever a child ages.

In terms of performance, this is often tricky to evaluate correctly. Essentially, you're asking whether calculating the state on the fly is more expensive than the code to keep track of it. This of course depends hugely on how often you inspect the state. You may do a lot of work to keep track of a state variable that is never inspected, or inspected only very rarely. On the other hand, if it is inspected often but not updated much, the calculate-it-on-the-fly approach may needlessly redo the exact same computation many times.

As a side-note, there is a lazy package for Python that lets you calculate attributes once when needed, and then stores the result for later retrieval. Of course, if you update anything the calculation depends upon, you have to make sure to invalidate the stored result. I've found it useful in the case that an attribute won't ever need to be re-calculated, but might never need to be calculated at all (and is expensive to do so).
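A minimal sketch of the same idea using only the standard library: functools.cached_property (Python 3.8+) computes the value once on first access and stores it on the instance:

```python
from functools import cached_property

class Report:
    def __init__(self, numbers):
        self.numbers = numbers
        self.computations = 0  # count evaluations, just to show the caching

    @cached_property
    def total(self):
        self.computations += 1
        return sum(self.numbers)

report = Report([1, 2, 3])
first = report.total   # computed on first access
second = report.total  # fetched from the instance dict, not recomputed
# If the underlying data changes, invalidate with: del report.total
```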

More interesting example

For in-memory occurrences such as the simple example above, often the choice is pretty clear. Code is often clearer if you calculate-it-on-the-fly, and only resort to keep-track-of-it whenever the value is somewhat expensive to calculate, used very often, or particularly simple to keep track of.

However, the choice becomes more interesting when the calculation of the value involves some external state, even if keeping-track-of-it is done on the external state. A common case is when the state is kept in a database. In Python, you may well be using an ORM to access the external state.

Consider an online game website, for some game that has four players. Let's suppose the game is turn-based, and players are not expected to be logged in for the entire duration of a game, but just take their turn whenever they are online. A game can then be in one of several states: waiting for players, running, or finished. So in some ORM you might describe a game like:

class Game(database.Model):
    player_one_id = Column(Integer, ForeignKey('user.id'))
    player_two_id = Column(Integer, ForeignKey('user.id'))
    player_three_id = Column(Integer, ForeignKey('user.id'))
    player_four_id = Column(Integer, ForeignKey('user.id'))

    winner_id = Column(Integer, ForeignKey('user.id'))

    .... bunch of other columns that actually describe the game state ....

Now, suppose you wish to display a list of running games for a user, you could do it something like this:

class Game(database.Model):
    ... as before ...
    state_names = ['waiting', 'running', 'finished']
    state = Column(Enum(*state_names), nullable=False, default='waiting')

    @staticmethod
    def running_games(user):
        games = Game.query.filter(
            Game.state == 'running',
            or_(
                Game.player_one_id == user.id,
                Game.player_two_id == user.id,
                Game.player_three_id == user.id,
                Game.player_four_id == user.id,
            )
        ).all()
        return games

    @staticmethod
    def open_games(user):
        games = Game.query.filter(
            Game.state == 'waiting',
            and_(
                Game.player_one_id != user.id,
                Game.player_two_id != user.id,
                Game.player_three_id != user.id,
                Game.player_four_id != user.id,
            )
        ).all()
        return games

With this approach you would have to make sure that when the fourth player joins a game the state is set to 'running', and that when the game finishes the state is set to 'finished', along with the winner being set.
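A sketch of the bookkeeping this implies when a player joins (the add_player helper and the toy Game class here are mine, not the ORM model above):

```python
class Game:
    """Toy stand-in for the ORM model: just the player slots and state."""
    SLOTS = ('player_one_id', 'player_two_id', 'player_three_id', 'player_four_id')

    def __init__(self):
        for slot in self.SLOTS:
            setattr(self, slot, None)
        self.state = 'waiting'

def add_player(game, user_id):
    """Assign user_id to the first empty slot, updating the tracked state."""
    for slot in Game.SLOTS:
        if getattr(game, slot) is None:
            setattr(game, slot, user_id)
            break
    else:
        raise ValueError('game is already full')
    # The keep-track-of-it burden: every code path that changes the player
    # slots must also remember to update the state column.
    if all(getattr(game, slot) is not None for slot in Game.SLOTS):
        game.state = 'running'
```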

Instead one could use calculate-it-on-the-fly as in:

class Game(database.Model):
    ... as before ...
    # No state column

    @staticmethod
    def running_games(user):
        games = Game.query.filter(
            Game.player_one_id != None,
            Game.player_two_id != None,
            Game.player_three_id != None,
            Game.player_four_id != None,
            or_(
                Game.player_one_id == user.id,
                Game.player_two_id == user.id,
                Game.player_three_id == user.id,
                Game.player_four_id == user.id,
            ),
            Game.winner_id == None
        ).all()
        return games

    @staticmethod
    def open_games(user):
        games = Game.query.filter(
            or_(
                Game.player_one_id == None,
                Game.player_two_id == None,
                Game.player_three_id == None,
                Game.player_four_id == None,
            ),
            and_(
                Game.player_one_id != user.id,
                Game.player_two_id != user.id,
                Game.player_three_id != user.id,
                Game.player_four_id != user.id,
            )
        ).all()
        return games

This could be simplified a bit if we assume that players fill slots in order, so that it is never the case that Game.player_four_id is non-null whilst one of the other slots is null.

In this particular case then I think the keep-track-of-it is a little simpler. But this hugely depends on the logic for joining a game, taking ones turn, and finishing the game. In particular when a game is finished you have to set the winner_id column anyway, so it seems like not too much of a burden to additionally set the state column.

However, there are many situations in which calculate-it-on-the-fly is more appropriate. A common case occurs when the rules for state changes may themselves change. For example, in the game case, we may decide that as long as there are two players people can play the game, which is therefore running. People may join a running game, provided it is not full, and may leave a running game. In this case, calculate-it-on-the-fly simply updates the rules for when a game is running. Keep-track-of-it, however, not only has to update its rules for when to modify the state (including now reverting a game from running to waiting when someone leaves a two-player game), but must also go through the database and modify the state column of any existing games. To do this, the database update code will essentially have to mimic the calculate-it-on-the-fly code anyway.

Conclusion

There is often a choice to be made between maintaining some kind of state up-front, and calculating it whenever it is required. Both are useful and should be used depending on the situation. Recalling that there is such a choice may help a programmer to explicitly make that choice, and perhaps even record the reasons for it.

Covering dead code

Dougal Matthews has written a blog post detailing how Vulture can be used to find some dead code. For me this was an important reminder not to rely on coverage analysis to detect dead code and remove it from your maintenance burden. More generally, whilst I adore automated analysis tools that assist the developer in maintaining their code, such automated analysis can give a false sense of completeness, or lead the developer to believe that their code is "good enough". It is not a problem I have any solution for, though. The rest of the post will try to illuminate this viewpoint through the example of dead-code removal.

Dead code seems like something that should be automatically detected by tools such as Vulture and coverage.py, and indeed many instances of dead code are automatically detected by such tools. However, it is worth remembering that there are instances of dead code which can never be automatically detected.

As a brief reminder, dead code is code that we should delete. We should delete it generally because it either has no way of being invoked, or because we no longer require its functionality. Because the former category has a more or less formal definition, much of it can (at least in theory) be detected automatically. The latter category is often more difficult to detect because there are no hard rules for it. For example, you may have some code to log the state of a particular object, and this code is invoked by production code; however, the reason for logging the state of that object no longer applies. Pretty much no automated analysis can detect this, because simply writing down the rules for when such code is dead is at best non-trivial.

Here are some example categories of dead code along with how we might detect/track such dead code.

Unused Variables

If you define a variable, but then never use it, the definition is likely dead-code.

def my_function():
    x = assigned-expr
    # some code that never uses x

Unless the right-hand side of the definition (assigned-expr) has some important side-effect, the assignment is dead code and should be removed. Note that coverage analysis would tell you that the line is being executed.

Detection

As noted, coverage analysis won't work here; you would have to use something like Vulture. Many decent IDEs will also warn you in most such circumstances.
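As a toy illustration of the sort of analysis such a tool performs, here is a sketch using the standard library's ast module to flag names that are assigned but never read (a real tool handles vastly more cases):

```python
import ast

def unused_assignments(source):
    """Return names that are assigned somewhere but never read."""
    tree = ast.parse(source)
    assigned, read = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.id)
            else:
                read.add(node.id)
    return assigned - read

code = """
def my_function():
    x = compute()   # assigned, with a side-effecting right-hand side
    return compute()
"""
print(unused_assignments(code))  # → {'x'}
```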

Unused Methods/Class definitions

If you simply define a method or class which you never then invoke, that definition is dead code. The exception here is if you're developing a library or otherwise exposing an interface; in that case you should have some automated tests which invoke the method/class.

Detection

This can generally be done by coverage analysis. There are, however, some tricky situations, which were described in the above-mentioned blog post. Essentially, you may add a unit test for a particular method which later becomes unused by the actual application but is still invoked by the unit test.

Unused Counters

At one point, you may have decided to keep a count of some particular occurrence, such as the number of guesses. Perhaps at one stage you displayed the number of guesses remaining, but later decided to make the number of guesses unlimited. You may end up with code that looks something like:

guesses = 0
def make_guess():
    guess = get_input()
    global guesses
    guesses += 1
    return guess

Originally your get_input looked something like this:

total_guesses = 20
def get_input():
    remaining = total_guesses - guesses
    return input('You have {} guesses remaining:'.format(remaining))

But since you decided to give unlimited guesses you got rid of that and it is now simply:

def get_input():
    return input("Please input a guess:")

Detection

This one is slightly more tricky, since the variable guesses is inspected: it is inspected in the update guesses += 1. Still, you could ask that your automated tool ignore such uses and, in this case, still report the variable as being defined but not used (perhaps Vulture allows this, I don't know).

However, it is not hard to come up with similar examples in which some value is maintained but never actually used. For example we might have written something like:

if total_guesses - guesses > 0:
    guesses += 1

Which would likely fool most automated analyses.

Of course I've called this category "Counters", but it refers to maintaining any kind of state that you don't ultimately make use of. You may have originally kept a list/set of guesses made so far, so as to prevent someone making the same guess more than once. If you later decided against this, you might forget to remove the code which updates the set of guesses that have been made.

Unused Web Application Routes

You may have a route in your web application which is never linked to by any part of the rest of your application. Using Flask, for this example:

@route('/misc/contact', methods=['GET'])
def contact_page():
    """Display a contact us page"""
    return flask.render_template('contact.jinja')

Now, if, in the rest of your application, you never link to this page, then the page is not likely to be discovered by a user. You may even have a different contact page, perhaps called "support" or "feedback". Perhaps this new contact page was built to replace the older one which it has done, but you left the code for the old route available.

Detection

This is tricky. First of all, you may perfectly well have a page which is not linked to within the remainder of your application but you do want to have available. For example you may have a route (or routes) for an API used by your associated mobile application.

If you have some tests, you can use coverage analysis; but in that case you likely originally had some test which covered this page, even if that test only visited the page and checked that it contained some content. For example, you may have had:

def test_contact_page(self):
    rv = self.app.get('/misc/contact')
    assert b'id="contact-form"' in rv.data

If this test still runs, then your dead route will still be covered by your tests. Checking whether the view function is ever referenced directly will not work either: such a check either sees the function as used (it is referenced by the @route decorator call), or, if it ignores that, flags all your routes as unused.

The only relatively robust way would be to check for calls to flask.url_for('contact_page') (the endpoint name defaults to the view function's name). Such a check would have to look in templates as well. It may still fail, because such a call might never actually be invoked, so the check would have to consult the coverage analysis as well.

Conclusion

I take it for granted that checking for (and removing) dead code is a useful activity that improves your code quality and removes some of your technical debt burden. In other words, I take it for granted that dead code represents a form of technical debt. With that in mind it seems useful to deploy any automated analyses which can do part of the job for you. However, any code analysis tool (whether static or dynamic) that cannot detect all (of a class of) problems, has the disadvantage that it will tend to foster a false sense of completeness.

The hope is that doing the automatable part automatically frees the developer up to do the non-automatable parts. In practice I've found that there is a tendency to move the goal-posts from "remove all dead-code" to "remove all dead-code that the automated analysis complains about". More generally from "maintain code free from problem X" to "maintain code such that the automated tools do not complain about problem X".

I'm certainly not arguing not to use such automated analyses. However I don't have a solution for the problem of this implicit and accidental moving (or rather widening) of the goal posts.

In what ways are dynamically typed languages more productive?

Introduction

I do not aim to answer the question in the title so much as to raise it. The question implies that dynamically typed languages are more productive in at least some ways; it does not imply that statically typed languages are less productive in general, nor the opposite.

Before going any further: I'm talking about the distinction between static and dynamic typing. Static means the type checking is done at compile time, before the program is run, whilst dynamic means types are checked while the program is running. This is not the same distinction as weak vs strong typing, and also not the same as explicit vs implicit typing.

Background

My PhD involved the development of a novel static type system. When I began my PhD I was firmly of the opinion that statically typed languages were better than dynamically typed languages. My basic contention was that all the guarantees you get from static types you get largely for free, so throwing those away is a stupefying act of self-harm. I believe that prior to the turn of the century (if not a bit later) this was the majority view in programming language research. It is still a position commonly held, but I'm unsure whether it remains the majority view.

New data should force us to seek out new theories which explain all of the available data. In the past 20 years one thing is obvious: dynamically typed languages have seen significant growth and success. Some researchers choose to ignore this; some explain the success as a fluke, claiming that such languages have become popular despite being dynamically typed. This is a recurrence of an argument made by functional programmers/researchers when C++ and then Java became wildly popular.

The prevailing view amongst researchers was that functional languages were inherently better than Java, but Java had a lot of financial support, which meant that a lot of effort went into, in particular, its standard library. This meant that programmers were particularly productive using Java, but that increase in productivity was misattributed by the programmers to the language, when it should have been placed firmly with the excellent standard library support. Much of this support is work that open-source programmers are not quite so keen on because, frankly, it's a bit boring and unrewarding. I personally find this argument at least mildly compelling.

However, as I said, in the first decade of this century dynamically typed languages have had significant success. Python, PHP, and Ruby are the most obvious examples. None of these were backed financially by any large corporation, at least not prior to their success. I again suspect that much of the productivity gained with the use of such languages can be attributed to library support. But that does not explain where the library support has come from: if dynamically typed languages were so obviously counter-productive, then why did anyone waste their time writing library support code in, and for, them?

Some Wild Hypotheses Appear

I am now going to state some hypotheses to explain this. This does not mean I endorse any of these.

Short term vs Long Term

One possible answer: dynamically typed languages increase short-term productivity at the cost of long-term productivity. I don't personally believe this, but I do find it plausible. However, I do not know of any evidence for or against this position; I'm not even sure there is much of a logical argument for it.

The kinds of bugs that functional programming languages help prevent are the kind of bug that is hard to demonstrate. It is easy enough to show a bug in an imperative program that would not occur in a functional program because you do not have mutable state. However, such demonstration bugs tend to be a bit contrived, and it is hard to show that such bugs come up in real code frequently. On top of that, to show that functional languages are more productive one would have to show that by restricting mutable state you do not lose more productivity than you gain by avoiding such bugs. If you did manage to show this, you would have a reasonable argument that functional languages are bad for short-term productivity, due to the restrictions on mutable state changes, but compensate in greater long-term productivity.

A similar kind of argument could be made for statically typed languages: one would have to show that they prevent a certain class of bugs, and that the long-term productivity gained from that more than compensates for any short-term loss in productivity brought on by the restrictions of the type system.

So my verdict on this hypothesis is: I believe it to be false, but it is plausible. Just to note, there is no great evidence either that statically typed languages have greater long-term productivity, or that dynamically typed languages have greater short-term productivity.

Testing Tools

A trend that seems to have tracked the rise of dynamically typed languages (i.e. correlates with it, whether via causation in either direction or by coincidence) is the trend towards more rigorous testing, or rather the rise in popularity of more rigorous testing. In particular, test-driven-development style methodologies have gained significant support.

I believe that having a comprehensive test suite somewhat dilutes the benefits gained from a static type system. Here is a good challenge: try to find a bug that is caught by the type checker but would not be caught by a test suite with 100% code coverage.

One possibility is a non-exhaustive pattern match; however, if your test suite is not catching this, it's not a great test suite. Still, exhaustiveness checking is something you get more or less for free with a static type checker, whilst a test suite has to work at it.

It is certainly possible to come up with a bug that is caught by static type checking and not by a test suite that has full coverage. Here is an example:

    if a and b:
        x = 1
    else:
        x = "1"
    if b:
        return str(x + 2)
    else:
        return x

This is a bug, because b might be true whilst a is false, which would mean that x is set to a string but later treated as an integer. A good test suite will of course catch this bug, but it is still possible to achieve 100% code coverage (at the statement level) and not catch it. Still, you have to try quite hard to arrange this.
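To make this concrete, here is the example as runnable Python (wrapping it in a function is my addition), together with two calls that execute every statement yet never trigger the bug:

```python
def convert(a, b):
    if a and b:
        x = 1
    else:
        x = "1"
    if b:
        return str(x + 2)  # bug: if a is false but b is true, x is a string here
    else:
        return x

# These two calls give 100% statement coverage:
full_coverage_results = [convert(True, True), convert(False, False)]
# Yet the buggy path is only reached by convert(False, True),
# which raises a TypeError ("1" + 2).
```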

Mutation testing, which tests your tests rather than your implementation code, should catch this simple example: it will mutate the condition a and b into a or b, which makes no difference if your tests never caught the bug in the first place. The mutant will therefore pass all tests, and at that point you should realise your tests are not comprehensive enough.

Dynamically Typed Language Benefits

So you should have a comprehensive test suite for all code, whether you are using a statically or dynamically typed language. We may then accept the theory that a comprehensive test suite somewhat dilutes any benefits of using a statically typed language. However, that theory does not give any reason why a static type system is detrimental to productivity.

This was always my main contention, that whatever might be the benefits of static typing, you are getting them for free, so why not? I honestly do not know what, if any, benefit there is from having a dynamic type system. I can think of some plausible candidates, but have no evidence that any of these are true:

  1. Freeing the language from a type system, allows the language designers to include some productivity boosting features that are not available in a statically typed language. I find this suggestion a little weak, no one has ever been able to point me to such a language feature.
  2. Knowing there is no safety net of the type system encourages developers to test (and document) more. I find this theory more compelling.
  3. One can try simple modifications without the need to make many fiddly changes, such as, for example, adding an exception to a method signature, often in many places.

I suspect, that if there are significant benefits to using a dynamically typed language, then it is a combination of 2 and 3, or some other reason.

For the third, a rebuttal often mentions automatic refactoring tools, which may well in theory be something of a good rebuttal; but in practice developers simply don't use such tools often. I'm not sure why not; I myself have never taken to them. So perhaps there is a productivity gain from using a dynamically typed language which would be all but negated if only developers used automatic refactoring tools, but they don't. So it shouldn't be a productivity win for dynamically typed languages, but in practice it is (this is all still conjecture).

The second one has some evidence from psychology. There is a lot of evidence to suggest that safety mechanisms often do not increase overall safety but simply allow more risky behaviour. A famous example is a seat-belt study done in Germany, in which mandating the wearing of seat-belts caused drivers to drive faster: drivers became more likely to have a crash, but less likely to be seriously injured in one. Similar results have been found with anti-lock braking systems: the brakes being significantly better did not reduce accidents but rather increased risky driving, so that the number of accidents remained largely constant.

I mentioned documentation because it is an important one. There are plenty of libraries for statically typed languages for which the only documentation for many of the functions/methods is the type signature. This is often seen as "good enough", or at least good enough that API documentation is not at the top of the todo stack. A dynamically typed language does not typically have signatures for methods/functions. As a result, such languages tend to have fewer undocumented libraries, simply because the developer of a library knows that its methods will otherwise not be used. And if that is the case, what is the point of the library? So they tend to write something, and once you are writing something it is not so hard to write something useful.

Summary

This is getting too long. So I'll stop there for now. The main points are:

  1. Dynamically typed languages have had a lot of success this century, which remains largely unexplained
  2. I think that with comprehensive testing, the gains from a static type system are diluted significantly
  3. It might be that dynamically typed languages encourage more/better testing (or have some other non-obvious advantage)
  4. Otherwise, there is scant evidence for anything a dynamically typed language actually does better
  5. I think an obvious win for statically typed languages would be to make the type checking phase optional.

I am not sure why we do not really see any examples of languages that have optional static type checking; in other words, languages in which the user decides when type checking should be done.

As a final point: for dynamically typed languages, it is common to deploy some form of static analyser. Often these static analysers fall short of the kind of guarantees afforded by a static type system, but they have two significant advantages. Firstly, you can run the program without running the static analyser. In particular, you can run your test suite, which may well give you more information about the correctness of your code than a type checker would, especially when the type checker fails: it tells you about one specific problem in your code, but not about how the rest of your code does or does not pass your tests. Secondly, you can deploy different static analysers for different problems. For example, a statically typed language has to decide whether or not to include exceptions in types, whereas a dynamically typed language can easily offer both. I suppose a statically typed language could offer both as well.
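To make the idea of a checker that is separate from running the program concrete, here is a small sketch in Python, where type annotations carry no runtime enforcement and an external analyser such as mypy is only consulted when the developer chooses to run it (the function itself is just an illustrative example, not from the post):

```python
def apply_discount(price: float, rate: float) -> float:
    # The annotations are purely advisory at runtime: Python ignores
    # them when executing, but an external checker such as mypy can
    # verify them whenever (and only when) you choose to run it.
    return price * (1.0 - rate)

# The program, and any test suite exercising it, runs regardless of
# whether the checker has ever been invoked.
print(apply_discount(100.0, 0.25))
```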

Selenium and Javascript Events

Selenium is a great way to test web applications and it has Python bindings. I explained in a previous post how to set this up with coverage analysis.

However, writing tests is non-trivial; in particular, it is easy enough to write tests that suffer from race conditions. Suppose you write a test that includes a check for the existence of a particular DOM element. Here is a convenient method that makes doing so a one-liner. It assumes that you are within a class that has the web driver as a member and that you are using 'pytest', but you can easily adapt it to your own needs.

import pytest
from selenium.common.exceptions import NoSuchElementException

def assertCssSelectorExists(self, css_selector):
    """ Asserts that there is an element that matches the given
    css selector."""
    # We do not actually need to do anything special here: if the
    # element does not exist we will fail with a NoSuchElementException.
    # However, we wrap this up in a pytest.fail because the error message
    # is then a bit nicer to read.
    try:
        self.driver.find_element_by_css_selector(css_selector)
    except NoSuchElementException:
        pytest.fail("Element {0} not found!".format(css_selector))

The problem is that this test might fail if it is performed too early. If you are merely testing after loading a page, this should work; however, you may be testing after some click by the user which invokes a Javascript method.

Suppose you have an application which loads a page, and then loads all comments made on that page (perhaps it is a blog engine). Now suppose you wish to allow re-loading the list of comments without re-loading the entire page. You might have an Ajax call.

As before, I tend to write my Javascript in Coffeescript, so suppose I have a Coffeescript function which is called when the user clicks on the #refresh-comment-feed-button button:

refresh_comments = (page_id) ->
  posting = $.post '/grabcomments', page_id: page_id
  posting.done receive_comments
  posting.fail (data) ->
    ...

So this makes an Ajax call which will invoke the function receive_comments when the Ajax call returns (successfully). We write receive_comments as:

receive_comments = (data) ->
  ... code to delete current comments and replace them with those returned

Typically data will be some JSON data, perhaps the comments associated with the page_id we gave as an argument to our Ajax call.
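The shape of that payload is not shown in the post, but for concreteness it might look something like the following sketch, where the comment store and the 'comments' key are hypothetical and purely illustrative:

```python
import json

# A hypothetical in-memory store of comments keyed by page id.
COMMENTS = {'42': ['First comment', 'Second comment']}

def grab_comments_payload(page_id):
    # Models what a '/grabcomments' endpoint might return: a JSON
    # document containing the comments for the requested page.
    return json.dumps({'comments': COMMENTS.get(page_id, [])})
```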

To test this you would navigate to the page in question and check that there are no comments, then open a new browser window and make two comments (or alternatively add the comments directly to the database). You would then switch back to the first browser window and perform the following steps:

    refresh_comment_feed_css = '#refresh-comment-feed-button'
    self.click_element_with_css(refresh_comment_feed_css)
    self.check_comments([first_comment, second_comment])

Where self.check_comments is a method that checks the particular comments exist on the current page. This could be done by using find_elements_by_css_selector and then looking at the text attributes of each returned element.
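self.check_comments itself is not shown here; a minimal sketch, assuming comments are rendered in elements matching a hypothetical '.comment' selector, might look like this, with the pure comparison pulled out into a helper so it can be exercised without a browser:

```python
def missing_comments(element_texts, expected_comments):
    # Pure helper: which of the expected comments are absent from the
    # texts collected from the page?
    return [c for c in expected_comments if c not in element_texts]

class CommentChecks(object):
    # Assumes 'self.driver' is a Selenium WebDriver, as elsewhere in
    # the post; the '.comment' selector is an assumption.
    def check_comments(self, expected_comments):
        elements = self.driver.find_elements_by_css_selector('.comment')
        texts = [element.text for element in elements]
        missing = missing_comments(texts, expected_comments)
        assert not missing, "Comments not found: {0}".format(missing)
```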

The problem is that the final line is likely to run before the results of the Ajax call, invoked by the click on #refresh-comment-feed-button, are returned to the page.

A quick trick to get around this is simply to change the Javascript to record when the Ajax results are returned, and then use Selenium to wait until the relevant Javascript expression evaluates to true.

So we change our receive_comments method to be:

comments_successfully_updated = 0
receive_comments = (data) ->
  ... code to delete current comments and replace them with those returned
  comments_successfully_updated += 1

Note that we only increment the counter after we have updated the page.

Now, we can update our Selenium test to be:

    refresh_comment_feed_css = '#refresh-comment-feed-button'
    self.click_element_with_css(refresh_comment_feed_css)
    self.wait_for_comment_refresh_count(1)
    self.check_comments([first_comment, second_comment])

The 1 argument assumes that this will be the first time the comments are updated during your test; as you proceed through your test you can increase this argument as required. The code for wait_for_comment_refresh_count is given by:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.common.by import By
...

class MyTest(object):
    ... # assume that 'self.driver' is set appropriately.
    def wait_for_comment_refresh_count(self, count):
        def check_refresh_count(driver):
            script = 'return comments_successfully_updated;'
            feed_count = driver.execute_script(script)
            return feed_count == count
        WebDriverWait(self.driver, 5).until(check_refresh_count)

The key point is executing the Javascript to check the comments_successfully_updated variable with driver.execute_script. We then use a WebDriverWait to wait a maximum of 5 seconds until our condition is satisfied.
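To see what WebDriverWait is doing for us, here is a rough pure-Python model of the polling loop behind its until method. This is a simplification for illustration only, not Selenium's actual implementation (which, for instance, raises TimeoutException rather than a plain RuntimeError):

```python
import time

def wait_until(condition, timeout=5.0, interval=0.1):
    # Repeatedly call `condition` until it returns a truthy value,
    # or give up once `timeout` seconds have elapsed, much as
    # WebDriverWait(driver, timeout).until(condition) does.
    deadline = time.time() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.time() >= deadline:
            raise RuntimeError(
                "condition not met within {0} seconds".format(timeout))
        time.sleep(interval)
```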

Conclusion

Updating a Javascript counter to record when Javascript events have occurred can allow your Selenium tests to synchronise, that is, to wait until the correct time to check the results of a Javascript event.

This can solve problems of getting a StaleElementReferenceException or a NoSuchElementException because your Selenium test is running a check on an element too early before your page has been updated.