Covering dead code

Dougal Matthews has written a blog post detailing how Vulture can be used to find some dead code. For me this was an important reminder not to rely on coverage analysis to detect dead code and remove it from your maintenance burden. More generally, whilst I adore automated analysis tools that assist the developer in maintaining their code, such tools can give a false sense of completeness, or lead the developer to believe that their code is "good enough". It is not a problem I have any solution for, though. The rest of the post will try to illuminate this viewpoint through the example of dead-code removal.

Dead code seems like something that should be automatically detectable by tools such as Vulture and coverage.py, and indeed many instances of dead code are caught by such tools. However, it is worth remembering that there are instances of dead code which can never be detected automatically.

As a brief reminder, dead code is code that we should delete, generally either because it has no way of being invoked, or because we no longer require its functionality. Because the former category has a more or less formal definition, much of it can (at least in theory) be detected automatically. The latter category is more difficult to detect because there are no hard rules for it. For example, you may have some code that logs the state of a particular object, and this code is invoked by production code; however, the reason for logging that state no longer applies. Pretty much no automated analysis can detect this, because simply writing down the rules for when such code is dead is at best non-trivial.

Here are some example categories of dead code along with how we might detect/track such dead code.

Unused Variables

If you define a variable, but then never use it, the definition is likely dead-code.

def my_function():
    x = assigned_expression()
    # ... code that never uses x ...

Unless the right-hand side of the definition (assigned_expression() here) has some important side-effect, the assignment is dead code and should be removed. Note that coverage analysis would still tell you that the line is being executed.

Detection

As noted, coverage analysis won't work here; you would have to use something like Vulture. Many decent IDEs will also warn you about most such cases.

Unused Methods/Class definitions

If you simply define a method or class which you then never invoke, that definition is dead code. The exception is if you're developing a library or otherwise exposing an interface; in that case you should have some automated tests which invoke the method/class.

Detection

This can generally be done by coverage analysis. There are, however, some tricky situations, described in the above-mentioned blog post. Essentially, you may add a unit test for a particular method which later becomes unused by the actual application but is still invoked by the unit test.

Unused Counters

At one point, you may have decided to keep a count of some particular occurrence, such as the number of guesses. Perhaps at one stage you displayed the number of guesses remaining, but later decided to make the number of guesses unlimited. You may end up with code that looks something like:

guesses = 0
def make_guess():
    guess = get_input()
    global guesses
    guesses += 1
    return guess

Originally your get_input looked something like this:

total_guesses = 20
def get_input():
    remaining = total_guesses - guesses
    return input('You have {} guesses remaining:'.format(remaining))

But since you decided to give unlimited guesses you got rid of that and it is now simply:

def get_input():
    return input("Please input a guess:")

Detection

This one is slightly more tricky, since the variable guesses is inspected: it is read in the update guesses += 1. Still, you could ask that your automated tool ignore such uses and, in this case, still report the variable as being defined but never otherwise used (perhaps Vulture allows this, I don't know).

However, it is not hard to come up with similar examples in which some value is maintained but never actually used. For example we might have written something like:

if total_guesses - guesses > 0:
    guesses += 1

Which would likely fool most automated analyses.

Of course I've called this category "Counters", but it refers to maintaining any kind of state that you don't ultimately make use of. You may have originally kept a list/set of guesses made so far so as to prevent someone making the same guess more than once. If you later decided against this, you might forget to remove the code which updates the set of guesses that have been made.

Unused Web Application Routes

You may have a route in your web application which is never linked to from any other part of your application. Using Flask for this example:

@app.route('/misc/contact', methods=['GET'])
def contact_page():
    """Display a contact us page."""
    return flask.render_template('contact.jinja')

Now, if, in the rest of your application, you never link to this page, then the page is not likely to be discovered by a user. You may even have a different contact page, perhaps called "support" or "feedback". Perhaps this new contact page was built to replace the older one which it has done, but you left the code for the old route available.

Detection

This is tricky. First of all, you may perfectly well have a page which is not linked to within the remainder of your application but you do want to have available. For example you may have a route (or routes) for an API used by your associated mobile application.

If you have some tests you can use coverage analysis, but in that case you likely had some test which covered this page originally, even if that test only visited the page and checked that it contained some content. For example, you may have had:

def test_contact_page(self):
    rv = self.app.get('/misc/contact')
    assert b'id="contact-form"' in rv.data

If this test still runs, then your dead route will still be covered by your tests. Checking whether the view function is ever referenced directly will not work either: the function is referenced by the @app.route decorator call, so a naive check will not flag it, while a check which ignores the decorator would flag all of your routes as unused.

The only relatively robust way would be to check for calls to flask.url_for('contact_page'). Such a check would have to look in templates as well. It may still fail, because such a call might itself never actually be invoked, so the check would have to consult the coverage analysis as well.
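
A rough sketch of such a textual check, assuming (purely for illustration) that your Python code and Jinja templates all live under an app/ directory:

import re
from pathlib import Path

# Hypothetical layout: code and templates both live under 'app/'.
# The check is purely textual, so a match may itself be dead code.
ENDPOINT = 'contact_page'
PATTERN = re.compile(r"""url_for\(\s*['"]%s['"]""" % ENDPOINT)

def endpoint_is_referenced(root='app'):
    """Return True if any Python file or template under 'root' contains
    a url_for call naming the endpoint."""
    for path in Path(root).rglob('*'):
        if path.suffix in {'.py', '.jinja', '.html'}:
            if PATTERN.search(path.read_text(errors='ignore')):
                return True
    return False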

Conclusion

I take it for granted that checking for (and removing) dead code is a useful activity that improves your code quality and removes some of your technical debt burden. In other words, I take it for granted that dead code represents a form of technical debt. With that in mind it seems useful to deploy any automated analyses which can do part of the job for you. However, any code analysis tool (whether static or dynamic) that cannot detect all problems of a given class has the disadvantage that it tends to foster a false sense of completeness.

The hope is that doing the automatable part automatically frees the developer up to do the non-automatable parts. In practice I've found that there is a tendency to move the goal-posts from "remove all dead code" to "remove all dead code that the automated analysis complains about", and more generally from "maintain code free from problem X" to "maintain code such that the automated tools do not complain about problem X".

I'm certainly not arguing not to use such automated analyses. However I don't have a solution for the problem of this implicit and accidental moving (or rather widening) of the goal posts.

In what ways are dynamically typed languages more productive?

Introduction

I do not aim to answer the question in the title so much as to raise it. The question implies that dynamically typed languages are more productive in at least some ways; it does not imply that statically typed languages are less (or more) productive in general.

Before going any further, I'm talking about the distinction between static and dynamic typing, which is not the same as strong vs weak typing. Static means type checking is done at compilation time, before the program is run, whilst dynamic means types are checked whilst the program is running. This is also not the same distinction as explicit vs implicit typing.

Background

My PhD involved the development of a novel static type system. When I began my PhD I was firmly of the opinion that statically typed languages were better than dynamically typed languages. My basic contention was that all the guarantees you get from static types you get largely for free, so throwing those away is a stupefying act of self-harm. I believe that prior to the turn of the century (if not a bit later), this was the majority view in programming language research. It is still a commonly held position, but I'm unsure whether it remains the majority view.

New data should force us to seek out new theories which explain all of the available data. In the past 20 years, one thing is obvious: dynamically typed languages have seen significant growth and success. Some researchers choose to ignore this. Some explain the success as a fluke: such languages have become popular despite being dynamically typed. This is a recurrence of an argument made by functional programmers/researchers when C++ and then Java became wildly popular.

The prevailing view amongst researchers was that functional languages were inherently better than Java, but Java got a lot of financial support, which meant that a lot of effort went into, in particular, the standard library. This meant that programmers were particularly productive using Java, but that increase in productivity was mis-attributed by programmers to the language, when it should have been attributed to the excellent standard library support. Much of this support is work that open source programmers are not quite so keen on because, frankly, it's a bit boring and unrewarding. I personally find this argument at least mildly compelling.

However, as I said, in the first decade of this century dynamically typed languages have had significant success. Python, PHP, and Ruby are the most obvious examples. None of these were backed financially by any large corporation, at least not prior to their success. I again suspect that much of the productivity gained with such languages can be attributed to their library support. But that does not explain where the library support came from: if dynamically typed languages were so obviously counter-productive, then why did anyone waste their time writing library support code in and for them?

Some Wild Hypotheses Appear

I am now going to state some hypotheses to explain this. This does not mean I endorse any of these.

Short term vs Long Term

One possible answer: dynamically typed languages increase short-term productivity at the cost of long-term productivity. I don't personally believe this, but I do find it plausible. However, I do not know of any evidence for or against this position; I'm not even sure there is much of a logical argument for it.

The kinds of bugs that functional programming languages help prevent are the kind of bug that is hard to demonstrate. It is easy enough to show a bug in an imperative program that would not occur in a functional program because you do not have mutable state. However, such demonstration bugs tend to be a bit contrived, and it is hard to show that such bugs come up in real code frequently. On top of that, to show that functional languages are more productive one would have to show that by restricting mutable state you do not lose more productivity than you gain by avoiding such bugs. If you did manage to show this, you would have a reasonable argument that functional languages are bad for short-term productivity, due to the restrictions on mutable state changes, but compensate in greater long-term productivity.

A similar kind of argument could be made for statically typed languages: if you could show that statically typed languages prevent a certain class of bugs, and that the long-term productivity gained from that more than compensates for any short-term loss in productivity brought on by restrictions due to the type system, then you would have a reasonable argument that static typing trades short-term productivity for long-term productivity.

So I will leave my verdict on this hypothesis as: I believe it to be false, but it is plausible. Just to note, there is no good evidence either that statically typed languages have greater long-term productivity, or that dynamically typed languages have greater short-term productivity.

Testing Tools

A trend that seems to have tracked (i.e. correlates with, whether by causation in either direction or by coincidence) the rise of dynamically typed languages is the trend towards more rigorous testing, or rather the rise in popularity of more rigorous testing. In particular, test-driven development style methodologies have gained significant support.

I believe that having a comprehensive test suite somewhat dilutes the benefits gained from a static type system. Here is a good challenge: try to find a bug that is caught by the type checker that would not be caught by a test suite with 100% code coverage.

One possibility is a non-exhaustive pattern match; however, if your test suite is not catching that, it's not a great test suite. Still, exhaustiveness checking is something you get more or less for free from a static type checker, whilst a test suite has to work for it.

It is certainly possible to come up with a bug that is caught by static type checking and not by a test suite that has full coverage. Here is an example:

    def convert(a, b):
        if a and b:
            x = 1
        else:
            x = "1"
        if b:
            return str(x + 2)
        else:
            return x

This is a bug, because b might be true whilst a is false, which would mean that x is set to a string but later treated as an integer (because b is true). A good test suite will of course catch this bug. But it is still possible to achieve 100% code coverage (at the statement level) and not catch it. Still, you have to try quite hard to arrange this.
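
To make that concrete, here is a minimal sketch of such a test suite for the (hypothetically named) convert function above; both tests pass, and together they execute every statement, yet the buggy combination is never exercised:

# 'convert' is the function defined above. These two tests together
# execute every statement of it, giving 100% statement coverage, yet
# neither exercises a=False, b=True, which raises a TypeError.
def test_convert_both_true():
    assert convert(True, True) == "3"

def test_convert_both_false():
    assert convert(False, False) == "1"

Note also that mutating the condition (a and b) to (a or b) makes no difference to either of these tests, which is exactly the situation described next.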

Mutation testing, which tests your tests rather than your implementation code, should catch this simple example: it will mutate the condition (a and b) to (a or b), which makes no difference if your tests never caught the bug initially. The mutant will therefore pass all tests, and you should at that point realise your tests are not comprehensive enough.

Dynamically Typed Language Benefits

So you should have a comprehensive test suite for all code, whether you are using a statically or dynamically typed language. We may then accept the theory that a comprehensive test suite somewhat dilutes any benefits of using a statically typed language. However, that theory does not give any reason why a static type system is detrimental to productivity.

This was always my main contention, that whatever might be the benefits of static typing, you are getting them for free, so why not? I honestly do not know what, if any, benefit there is from having a dynamic type system. I can think of some plausible candidates, but have no evidence that any of these are true:

  1. Freeing the language from a type system allows the language designers to include some productivity-boosting features that are not available in a statically typed language. I find this suggestion a little weak; no one has ever been able to point me to such a language feature.
  2. Knowing there is no safety net of the type system encourages developers to test (and document) more. I find this theory more compelling.
  3. One can try simple modifications without the need to make many fiddly changes, such as, for example, adding an exception to a method signature in many places.

I suspect that if there are significant benefits to using a dynamically typed language, then they come from a combination of 2 and 3, or some other reason entirely.

For the third, a rebuttal often mentions automatic refactoring tools, which may well be something of a good rebuttal in theory, but in practice developers simply don't use such tools very often. I'm not sure why not; I have never taken to them myself. So perhaps there is a productivity gain from using a dynamically typed language which would be all but negated if only developers would use automatic refactoring tools, but they don't. So it shouldn't be a productivity win for dynamically typed languages, but in practice it is (this is all still conjecture).

The second one has some evidence from psychology. There is a lot of evidence to suggest that safety mechanisms often do not increase overall safety but simply allow more risky behaviour. A famous example is a seat-belt study done in Germany, in which mandating the wearing of seat-belts caused drivers to drive faster: you are more likely to have a crash, but less likely to be seriously injured in one. Similar results have been found with anti-lock braking systems, where significantly better brakes did not reduce accidents but rather increased risky driving, so that the number of accidents remained largely constant.

I mentioned documentation because it's an important one. There are plenty of libraries for statically typed languages for which the only documentation for many of the functions/methods is the type signature. This is often seen as "good enough", or at least good enough to mean that API documentation is not at the top of the todo stack. A dynamically typed language does not typically have type signatures for methods/functions, and as a result such languages tend to have fewer undocumented libraries, simply because the developer of the library knows that their methods will otherwise not be used; and if that is the case, what is the point of the library? So they tend to write something, and once you are writing something it isn't so hard to write something useful.

Summary

This is getting too long. So I'll stop there for now. The main points are:

  1. Dynamically typed languages have had a lot of success this century, which remains largely unexplained
  2. I think that with comprehensive testing, the gains from a static type system are diluted significantly
  3. It might be that dynamically typed languages encourage more/better testing (or have some other non-obvious advantage)
  4. Otherwise, there is scant evidence for anything a dynamically typed language actually does better
  5. I think an obvious win for statically typed languages would be to make the type checking phase optional.

I am not sure why we do not really see examples of languages with optional static type checking; in other words, languages in which the user decides when type checking should be done.

As a final point: for dynamically typed languages it is common to deploy some form of static analyser. Often these static analysers fall short of the kind of guarantees afforded by a static type system, but they have two significant advantages. Firstly, you can run the program without running the static analyser. In particular, you can run your test suite, which may well give you more information about the correctness of your code than a type checker would, especially in the case that the type checker fails: it tells you about one specific problem in your code, but not how the rest of your code does or does not pass your tests. Secondly, you can deploy different static analysers for different problems. For example, a statically typed language has to decide whether or not to include exceptions in types; a dynamically typed language can easily offer both. I suppose a statically typed language could offer both as well.

Selenium and Javascript Events

Selenium is a great way to test web applications and it has Python bindings. I explained in a previous post how to set this up with coverage analysis.

However, writing tests is non-trivial; in particular, it is easy enough to write tests that suffer from race conditions. Suppose you write a test that includes a check for the existence of a particular DOM element. Here is a convenient method to make doing so a one-liner. It assumes that you are within a class that has the web driver as a member and that you're using pytest, but you can easily adapt it for your own needs.

import pytest
from selenium.common.exceptions import NoSuchElementException

def assertCssSelectorExists(self, css_selector):
    """ Asserts that there is an element that matches the given
    css selector."""
    # We do not actually need to do anything special here: if the
    # element does not exist we will fail with a NoSuchElementException.
    # However, we wrap this up in a pytest.fail because the error message
    # is then a bit nicer to read.
    try:
        self.driver.find_element_by_css_selector(css_selector)
    except NoSuchElementException:
        pytest.fail("Element {0} not found!".format(css_selector))

The problem is that this test might fail if it is performed too early. If you are merely testing after loading a page this should work; however, you may be testing after some click by the user which invokes a Javascript method.
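
For simple presence checks, one option (shown here as a sketch alongside the helper above, not a replacement for what follows) is to use an explicit wait, so that the assertion retries for a few seconds rather than failing immediately:

import pytest
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.ui import WebDriverWait

def assertCssSelectorAppears(self, css_selector, timeout=5):
    """Like assertCssSelectorExists, but retries for up to 'timeout'
    seconds, which tolerates elements that are added shortly after an
    event rather than being present on the initial page load."""
    condition = expected_conditions.presence_of_element_located(
        (By.CSS_SELECTOR, css_selector))
    try:
        WebDriverWait(self.driver, timeout).until(condition)
    except TimeoutException:
        pytest.fail("Element {0} did not appear!".format(css_selector))

This only helps with elements appearing, though; it does not help when existing content is replaced, which is the situation described below.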

Suppose you have an application which loads a page, and then loads all comments made on that page (perhaps it is a blog engine). Now suppose you wish to allow re-loading the list of comments without re-loading the entire page. You might have an Ajax call.

As before I tend to write my Javascript in Coffeescript, so suppose I have a Coffeescript function which is called when the user clicks on a #refresh-comment-feed-button button:

refresh_comments = (page_id) ->
  posting = $.post '/grabcomments', page_id: page_id
  posting.done receive_comments
  posting.fail (data) ->
    ...

So this makes an Ajax call which will call the function receive_comments when the Ajax call returns (successfully). We write the receive_comments as:

receive_comments = (data) ->
  ... code to delete current comments and replace them with those returned

Typically data will be some JSON data, perhaps the comments associated with the page_id we gave as an argument to our Ajax call.

To test this you would navigate to the page in question and check that there are no comments, then open a new browser window and make two comments (or alternatively add the comments directly to the database), then switch back to the first browser window and perform the following steps:

    refresh_comment_feed_css = '#refresh-comment-feed-button'
    self.click_element_with_css(refresh_comment_feed_css)
    self.check_comments([first_comment, second_comment])

Where self.check_comments is a method that checks that the given comments exist on the current page. This could be done by using find_elements_by_css_selector and then looking at the text attributes of each returned element.
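
For example, a minimal sketch of check_comments, assuming each comment's text is rendered in an element matching a (hypothetical) .comment-text selector:

def check_comments(self, expected_comments):
    """Assert that each expected comment's text appears on the page.
    Assumes 'self.driver' is the web driver and that each comment is
    rendered in an element matching the hypothetical '.comment-text'
    CSS class."""
    elements = self.driver.find_elements_by_css_selector('.comment-text')
    displayed = [element.text for element in elements]
    for comment in expected_comments:
        assert comment in displayed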

The problem is that the final line is likely to run before the results of the Ajax call, invoked by the click on #refresh-comment-feed-button, have been returned to the page.

A quick trick to get around this is to change the Javascript to record when the Ajax results have been returned, and then use Selenium to wait until the relevant Javascript expression evaluates to true.

So we change our receive_comments method to be:

comments_successfully_updated = 0
receive_comments = (data) ->
  ... code to delete current comments and replace them with those returned
  comments_successfully_updated += 1

Note that we only increment the counter after we have updated the page.

Now, we can update our Selenium test to be:

    refresh_comment_feed_css = '#refresh-comment-feed-button'
    self.click_element_with_css(refresh_comment_feed_css)
    self.wait_for_comment_refresh_count(1)
    self.check_comments([first_comment, second_comment])

The 1 argument assumes that this will be the first time the comments are updated during your test; as your test proceeds you can increase this argument as required. The code for wait_for_comment_refresh_count is given by:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.common.by import By
...

class MyTest(object):
    ... # assume that 'self.driver' is set appropriately.
    def wait_for_comment_refresh_count(self, count):
        def check_refresh_count(driver):
            script = 'return comments_successfully_updated;'
            feed_count = driver.execute_script(script)
            return feed_count == count
        WebDriverWait(self.driver, 5).until(check_refresh_count)

The key point is executing the Javascript to check the comments_successfully_updated variable with driver.execute_script. We then use a WebDriverWait to wait, for a maximum of 5 seconds, until our condition is satisfied.

Conclusion

Updating a Javascript counter to record when Javascript events have occurred allows your Selenium tests to synchronise, that is, to wait for the correct time to check the results of a Javascript event.

This can solve problems of getting a StaleElementReferenceException or a NoSuchElementException because your Selenium test is running a check on an element too early before your page has been updated.

Method Cascading

Vasudev Ram has a thoughtful post about method chaining/cascading, which I picked up from Planet Python, in which he basically argues for the use of method cascading. I'm going to disagree; essentially, I simply don't see any benefit to using cascading. It's a nice post though, and includes some references to other method cascading links.

Method chaining is the writing of multiple method calls directly after one another, usually on the same line, such as (to take Vasudev's example):

foo.bar().baz()

Cascading is the specific case of chaining in which each intermediate object is the same object. To achieve this, bar must return self (in Python; this in other object-oriented languages).
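
For concreteness, a minimal sketch of a class that supports cascading (anticipating the Foo used in the example below) has each method end with return self:

class Foo(object):
    """A toy class whose methods return 'self' so that calls can be
    cascaded: foo.bar().baz()."""
    def bar(self):
        # ... do something to self ...
        return self

    def baz(self):
        # ... do something else to self ...
        return self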

Here is Vasudev's first example:

Let's say we have a class Foo that contains two methods, bar and baz. We create an instance of the class Foo:

foo = Foo()

Without method chaining, to call both bar and baz in turn, on the object foo, we would do this:

# Fragment 1
foo.bar() # Call method bar() on object foo.
foo.baz() # Call method baz() on object foo.

With method chaining, we can do this:

# Fragment 2
# Chain calls to methods bar() and baz() on object foo.
foo.bar().baz()

So the claim for method cascading then is:

One advantage of method chaining is that it reduces the number of times you have to use the name of the object: only once in Fragment 2 above, vs. twice in Fragment 1; and this difference will increase when there are more method calls on the same object. Thereby, it also slightly reduces the amount of code one has to read, understand, test, debug and maintain, overall. Not major benefits, but can be useful.

So method cascading reduces the number of times you have to use the name of an object, but this makes it inherently less explicit that you're operating on the same object. Looking at foo.bar().baz() does not tell me that baz is being called on the same object as bar; unless you're keen on method cascading and use it yourself, it rather suggests the opposite.

Method cascading may therefore reduce

the amount of code one has to read, understand, test, debug and maintain, overall.

However it does so only in a "code-golf" way. There is no point in reducing the amount of code to understand if, by doing so, you increase the difficulty of understanding it.

A common example of method cascading is one Vasudev includes: string processing. Here we have a line such as (which I've translated into Python 3):

print('After uppercase then capitalize:',
      sp.dup().uppercase().capitalize().rep())

Whilst it is quite nice to be able to do this in one line without using a new variable name, I would write this without method cascading as:

duplicate = sp.dup()
duplicate.uppercase()
duplicate.capitalize()
print('After uppercase then capitalize:', duplicate.rep())

Now it is obvious that dup returns something new, in this case it is a duplicate of the original string. It is also clear that uppercase and capitalize do not return new objects but modify the duplicate object.

So, I'm afraid I just don't see the use case for cascading.

Test First and Mutation Testing

I'm going to argue that mutation testing has a strong use in a test first development environment and I'll conclude by proposing a mechanism to link mutation testing to the source code control mechanism to further aid test first development.

Test First

Just to be clear, when I say 'test first' I mean development in which, before writing a feature or fixing a bug, you first write a test which should only pass once you have completed that feature. For the purposes of this post you needn't be doing that for every line of code you write: the idea here applies whether you write the odd feature by first writing a test for it, or whether you have a strict policy of writing no code until there is a test for it.

Mutation Testing

Mutation testing is the process of automatically changing some parts of your source code, generally to check that your test suite is not indifferent to the change. For example, your source code may contain a conditional statement such as the following:

    if x > 0:
        do_something()

Now, if we suppose that the current condition is correct, then changing it to a similar but different condition, for example x >= 0 or x > 1, would presumably turn correct code into incorrect code. If your tests are comprehensive then at least one of them should fail against the now incorrect code.
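
As a rough illustration (with hypothetical names; record.append stands in for the do_something above), a test that pins down the boundary will fail under either of those mutations:

def process(x, record):
    # Original condition; a mutant might change '>' to '>=' or 0 to 1.
    if x > 0:
        record.append(x)

def test_boundary_is_exclusive():
    record = []
    process(0, record)      # the 'x >= 0' mutant would append 0 here
    assert record == []
    process(1, record)      # the 'x > 1' mutant would not append 1 here
    assert record == [1]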

Fail First

It's easy enough to unintentionally write a test that always passes, or at least passes too easily. One of the reasons for writing the test first is to make sure that it fails when the feature has not yet been implemented (or the bug not yet fixed). However, such a test can often fail for trivial reasons. For example, you may write a unit test that fails simply because the method it tests is not yet defined; similarly, a web test may fail because the route is not yet defined. Unless you continue to run the test during development of your feature, you won't necessarily know that your test is particularly effective at catching when your feature is broken.

Fail After

Whether you write the test before your new feature or after the feature is ready, mutation testing can assist with the problem of non-stringent tests. It can reassure you that your new test is effective at catching errors, whether those errors are introduced while the feature is developed or through later changes. If you apply lots of mutations to your code and your new test never fails, then there is a strong likelihood that you have an ineffective test that passes too easily.

Source Code control

A feature I would like to add to a mutation testing package is integration with a source code control system such as Git. The mutation tester must choose lines of the program to mutate; however, your new test is presumably aimed at testing the new code that you write. Hence we could use the source code control system to mutate only lines of code that are newer than the test, or newer than some specified commit. That way we would focus the mutation testing on the efficacy of the new test(s) with respect to the new or changed lines of code.

This does not preclude doing general mutation testing for features that, for example, depend upon a lot of existing code. Perhaps your new feature is simply a display of existing calculations.
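
As a very rough sketch of how such targeting might work (this is not a feature of any existing mutation testing tool that I know of; it simply parses the hunk headers produced by git diff -U0):

import re
import subprocess

def changed_line_numbers(path, since='HEAD~1'):
    """Return the line numbers in 'path' added or modified since the
    given commit, by parsing the '@@ -a,b +c,d @@' hunk headers that
    'git diff -U0' produces. A mutation tester could then restrict
    its mutants to these lines."""
    diff = subprocess.check_output(
        ['git', 'diff', '-U0', since, '--', path]).decode()
    lines = set()
    for match in re.finditer(r'^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@',
                             diff, flags=re.MULTILINE):
        start = int(match.group(1))
        count = int(match.group(2)) if match.group(2) is not None else 1
        lines.update(range(start, start + count))
    return lines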

Conclusion

In summary:

  • Mutation testing helps find tests that are ineffective.
  • This plays particularly well with a test-first development process, in which the test often fails the first time for trivial reasons, giving you false assurance that your test is capable of failing.
  • Integrating source code control to target the mutations towards new code could improve this significantly, or at least make it a bit more convenient.

Placement of Python Import Statements

PEP 8 specifies that all import statements should be "put at the top of the file, just after any module comments and docstrings, and before module globals and constants." However, it does not really explain the logic behind this. I'm going to try to articulate some reasons to have import statements somewhere other than directly at the top of the file, and I'll also state some arguments for keeping them at the top.

Note that PEP 8 also says it is important to "know when to be inconsistent -- sometimes the style guide just doesn't apply. When in doubt, use your best judgment". So the purpose of this post is to suggest some reasons why you might deviate from the style guide with respect to imports at the top.
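
To make the kind of deviation I have in mind concrete, here is one common pattern (the library name is just a placeholder): delaying an import until it is actually needed, for example because the dependency is heavy or optional.

def export_report(table, filename):
    """Export 'table' to a spreadsheet. The import is done here rather
    than at the top of the module, so that users who never export do not
    need the (hypothetical) heavy_spreadsheet_library installed and do
    not pay its import cost at start-up."""
    import heavy_spreadsheet_library
    heavy_spreadsheet_library.write(table, filename)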


Update: Flask+Coverage Analysis

In a previous post I demonstrated how to get coverage analysis working for a Flask web application in a relatively simple manner. In the section "At the end of your tests" I stated that you needed your tests to clean up by telling the server to shut down. The end of your test code would look something like this:

finally:
    driver.get(get_url('shutdown'))
    ...

This could have made things a little fiddly, since your test code would have had to make sure to access the shutdown route exactly once, regardless of how many tests were run.

However, I realised that we could remove this burden from the test code by simply doing it in the manage.py file.

Updated manage.py

Previously, we had the following code within our manage.py script within the run_with_test_server method:

test_process = subprocess.Popen(test_command)
test_process.wait(timeout=60)
server_return_code = server.wait(timeout=60)

We now update this to be:

import urllib.request  # this import belongs at the top of manage.py

test_process = subprocess.Popen(test_command)
test_process.wait(timeout=60)
port = application.config['TEST_SERVER_PORT']
shutdown_url = 'http://localhost:{}/shutdown'.format(port)
response = urllib.request.urlopen(shutdown_url)
print(bytes.decode(response.read()))
server_return_code = server.wait(timeout=60)

Doing so means you can just write your tests without any need to worry about shutting down the server. The example repository has been appropriately updated.

Flask + Coverage Analysis

This post demonstrates a simple web application written in Flask with coverage analysis for the tests. The main idea should be pretty translatable into most Python web application frameworks.

Update

I've updated this scheme and described the update here.

tl;dr

If you're having difficulty getting coverage analysis to work with Flask then have a look at my example repository. The main takeaway is that you simply start the server in a process of its own, using coverage to start it. However, in order for this to work you have to make sure you can shut the server process down from the test process. To do this we simply add a new "shutdown" route which is only available under testing. Your test code, whether written in Python or, say, Javascript, can then make a request to this "shutdown" route once it completes its tests. This allows the server process to shut down naturally and therefore allows coverage to complete.

Introduction

It's a good idea to test the web applications you author, and most web application frameworks provide relatively solid means to do this. However, if you're doing browser-automated functional tests against a live server, I've found that getting coverage to work is non-trivial. A quick search will reveal similar difficulties, such as this Stack Overflow question, which ultimately points to the coverage documentation on sub-processes.

Part of the reason for this might be that the Flask-Testing extension provides a live server testing class that starts your server in testing mode as part of the test's start-up. It then also shuts the server process down, but in doing so does not allow coverage to complete.

A simpler method is to start the server process yourself under coverage. You then only need a means to shut the server down programmatically; I do this by adding a shutdown route.
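
A sketch of such a shutdown route, relying on the Werkzeug development server's shutdown hook and shown with an illustrative guard so it is only registered under testing (the exact wiring will depend on how your configuration is loaded), might look something like this:

import flask

app = flask.Flask(__name__)
# ... normal routes ...

# Only register this route when the application is configured for
# testing; how you arrange that depends on your configuration set-up.
if app.config.get('TESTING'):
    @app.route('/shutdown')
    def shutdown():
        """Stop the Werkzeug development server so that the 'coverage'
        process wrapping it can exit cleanly and write its data file."""
        stop = flask.request.environ.get('werkzeug.server.shutdown')
        if stop is None:
            raise RuntimeError('Not running the Werkzeug development server.')
        stop()
        return 'Server shutting down...'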


Selenium vs CasperJS

Suppose you have a web service built with a Python web framework such as Flask, and you wish to do full user tests by automating the browser. Should you use the Python bindings for Selenium, or CasperJS? In this post I wish to detail some of the advantages and drawbacks of both.

Python + Selenium Setup

This is fairly straightforward: you are simply writing some Python code that happens to call a library which binds to Selenium. In order for this to work you will also need to install PhantomJS, but using npm this is pretty trivial.

Javascript + CasperJS Setup

I say Javascript, but at least when I'm using CasperJS it tends to be from Coffeescript; whichever is your fancy works well enough. Similarly to the above, you will need to install CasperJS through npm, but again this is pretty trivial.

Speed

For my sample project the CasperJS tests are faster, by quite a bit. This certainly warrants some more investigation as to exactly why, but for now my sample project runs the CasperJS tests in around 3 seconds whilst the Selenium ones take around 12 seconds to do the same amount of work. I'm not sure whether this is a constant factor or whether it will get worse with more tests.


Welcome to Coding Diet

This is a simple blog intended to relate to good software development practices, but it will probably contain more practical examples that I come across as I code. I mainly use Python, but since a proportion of the coding I do is for web development I end up writing CoffeeScript or vanilla Javascript as well. I'm not averse to other languages either, and have a fairly strong background in Haskell, which I may occasionally refer to for examples.