Conditional refactorings
I have been reading through some of the source code from the requests library. This is one of the more well-known, well-used, well-loved, and well-praised Python libraries I know of. I came across some conditional code which I personally would refactor. A first refactor I'm pretty confident about whilst a second I'm less certain is an improvement. I'll detail them both and welcome any comments anyone has.
A specific example of this kind of conditional comes in the adapters module specifically the cert_verify method. Here is the code in question, slightly snipped for brevity:
def cert_verify(self, conn, url, verify, cert): """Verify a SSL certificate. This method should not be called from user ... :param verify: Either a boolean, in which case it controls whether we verify """ if url.lower().startswith('https') and verify: cert_loc = None # Allow self-specified cert location. if verify is not True: cert_loc = verify if not cert_loc: cert_loc = DEFAULT_CA_BUNDLE_PATH if not cert_loc or not os.path.exists(cert_loc): raise IOError("Could not find a suitable TLS CA certificate bundle, " "invalid path: {0}".format(cert_loc)) conn.cert_reqs = 'CERT_REQUIRED' if not os.path.isdir(cert_loc): conn.ca_certs = cert_loc else: conn.ca_cert_dir = cert_loc else: conn.cert_reqs = 'CERT_NONE' conn.ca_certs = None conn.ca_cert_dir = None
The two refactors I would consider is the setting of cert_loc
and the order of the two conditional branches, ie. the if
and else
branches.
Setting cert_loc
This kind of pattern is quite common where a variable is set to a particular value and then a condition is evaluated which may result in the variable being updated. In this particular example we have:
cert_loc = None # Allow self-specified cert location. if verify is not True: cert_loc = verify if not cert_loc: cert_loc = DEFAULT_CA_BUNDLE_PATH
First of all note that verify
cannot be Falsey
. So at the end of these conditionals cert_loc
will either be DEFAULT_CA_BUNDLE_PATH
if verify
is exactly True
or it will be set to whatever verify
is. So we can write this more succinctly as:
cert_loc = DEFAULT_CA_BUNDLE_PATH if verify is True else verify
If you don't like the conditional expression you can do this as:
if verify is True: cert_loc = DEFAULT_CA_BUNDLE_PATH else: cert_loc = verify
All this was only really possible because we knew that verify
could not be Falsey
. Still I consider the setting of a variable to a default before then considering whether or not to update it to be a very minor anti-pattern. So imagine that we could not be sure that verify
is not Falsey
, we could still re-write our conditional without the default setting as:
if verify is True or not verify: cert_loc = DEFAULT_CA_BUNDLE_PATH else: cert_loc = verify
More generally where you see the pattern:
x = None if cond1: x = val1 if cond2: x = val2 ... if not x: x = default_x
Then you can change this to be:
if cond1 and val1: x = val1 elif cond2 and val2: x = val2 else: x = default_x
Sometimes this is awkward if one or more of the val_?
values are interesting expressions.
One defence of the original style is that you know your variable will be set to something so you will never end up with an undefined variable error. However, if this were ever the case then you have not fully written down your logic, and hence the early error is likely a good thing.
Switch the if
and else
branches
A suggestion I often make when you have two branches, such that one is significantly shorter than the other, is to have the shorter branch first. The reason for this is to give the reader less of a "stack" to recall as they are reading through the code. Often when reading code you get to the end of the long branch, to find a shorter second branch but you've forgotten what the original condition was. So, it's tempting to suggest to modify the original condition as:
if not (url.lower().startswith('https') and verify): conn.cert_reqs = 'CERT_NONE' conn.ca_certs = None conn.ca_cert_dir = None cert_loc = None else: # Allow self-specified cert location. if verify is not True: cert_loc = verify if not cert_loc: cert_loc = DEFAULT_CA_BUNDLE_PATH if not cert_loc or not os.path.exists(cert_loc): raise IOError("Could not find a suitable TLS CA certificate bundle, " "invalid path: {0}".format(cert_loc)) conn.cert_reqs = 'CERT_REQUIRED' if not os.path.isdir(cert_loc): conn.ca_certs = cert_loc else: conn.ca_cert_dir = cert_loc
I hesitate to suggest it in this case because the original if
branch has code which depends upon the condition, namely that verify
is not Falsey
. However, more generally this can be something of a win for the person reading the code.
So, two pretty simple refactors, thoughts?