Parameter lists and python's def syntax

It is possible, by moderately extending the python syntax for defining functions, to greatly enhance the flexibility of python. This may be read as `add enough rope to make it hard for novices to avoid accidentally hanging themselves', which might fairly persuade Guido not to do it. However, I make use of this idea in my pseudo-pythonic implementations of classes and similar, so here I document the idea.

Status Quo Ante: what python 1 provides

In python 1, functions are specified using the def statement, of form:

`def' name `(' params `)' `:' suite

in which params is a (potentially empty) comma-separated sequence of terms; it may be described by:

[ name `,' ...] [ name `=' value `,' ...] [ `*' name `,' ] [ `**' name `,' ]

except that the last `,' in the sequence may be elided. Terms of the first kind I'll describe as `simple parameters'; those of the second kind as `defaulted parameters' - the value serves as `default' for the given parameter - while the two * forms are special parameters. Terms are only allowed in the given order in python 1: the thrust of my proposed change is to allow more flexibility here. No name may be used twice in params (as a name; names used in default values are a separate matter).

When the function is called, it receives a tuple of `positional arguments' and a dictionary of `keyword arguments'; the caller can either supply these as tuple and dictionary separately using the python built-in apply or by invoking the function in its own right, using an expression of form:

name `(' args `)'

in which args is, like params in the function definition, a comma-separated sequence of terms, but only (in python 1) of two forms, matching the first two forms of params,

[ value `,' ...] [ name `=' value `,' ...]

again with the final `,' optional. The terms, in args, of the first kind (like simple parameters, though each is a value rather than a name) are known as `positional arguments' while those of the second kind (like defaulted parameters) are known as `keyword arguments'. The positional arguments are gathered into a tuple, in the order given; the keyword arguments into a dictionary, using each name as a key mapped to the given value.

The arguments supplied to the call are used to initialise the parameters supplied to the function's def statement; the suite of the function is then executed and the subject of its return (implicitly None if no return is executed) is used as the value of the function-call expression. If a *name parameter was defined, its value will always be a tuple; if a **name parameter was defined, its value will always be a dictionary; each of which may be empty.
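
As a minimal sketch of the gathering just described (written for a Python 3 interpreter rather than python 1, which is an assumption; the function name show is hypothetical), a function with both special parameters can simply report what it received:

```python
def show(*args, **kwargs):
    # args is always a tuple, kwargs always a dictionary; either may be empty.
    return args, kwargs

print(show())                # no arguments at all: an empty tuple and dictionary
print(show(1, 2, key="v"))   # two positional arguments, one keyword argument
```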

Positional arguments initialise the non-special parameters in corresponding position. If more positional arguments are given than there are non-special parameters, the interpreter raises a TypeError unless a *name parameter is supplied; in which case the extra positional arguments are packed up as a tuple whose value initialises the given name.

If enough positional arguments were supplied to provide a value for some non-special parameter, it is an error for any keyword argument to use that parameter's name. If any simple name parameter is not given a value by the positional arguments, it is an error for its name to not be used in a keyword argument.

A keyword argument using the name of a non-special parameter supplies its value to that parameter. In the absence of a **name parameter, no other keyword argument names are allowed (i.e. if any are used, TypeError gets raised); in the presence of **name, keyword arguments with names other than those of non-special parameters are gathered together into the dictionary used to initialise the given name.

If the above gives no value to a defaulted parameter, that parameter uses its default; the above implies that a TypeError will be raised if any simple parameter is not given a value or if both positional and keyword arguments supply a value for some given parameter.
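
The binding rules above can be exercised with a small sketch. It assumes a Python 3 interpreter, where these particular rules are, as far as I know, unchanged; the names greet and outcome are hypothetical helpers, not part of anything described here.

```python
def greet(name, greeting="hello"):
    return greeting + ", " + name

def outcome(*args, **kwargs):
    # Call greet and report either its result or the fact that TypeError was raised.
    try:
        return greet(*args, **kwargs)
    except TypeError:
        return "TypeError"

print(outcome("world"))                   # default used for greeting
print(outcome("world", greeting="hi"))    # keyword fills the defaulted parameter
print(outcome(name="world"))              # keyword fills the simple parameter
print(outcome())                          # simple parameter left without a value
print(outcome("world", "hi", "extra"))    # more positionals than parameters, no *name
print(outcome("world", name="again"))     # name supplied both ways
print(outcome("world", volume=11))        # unknown keyword, no **name
```

The last four calls each report TypeError, one for each of the error cases listed above.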

Note that if apply is used to invoke a function which has a **bok parameter, the dictionary passed to apply and the one used to initialise the function's bok are separate: apply does not modify the dictionary it is passed, it just transcribes data from it into a fresh dictionary which the interpreter creates, which is used to initialise the ** parameter.

Ramifications

Illustrations, first without using any special parameters:

def double(x):
    return x + x

A function which has only simple parameters can only be called with exactly as many arguments as it has parameters - in this case 1 - though the argument can be specified as a keyword, e.g. either double(6) or double(x=6).

def fibonacci(what, zero=0, one=1):
    if what < 1: return zero
    was, result = zero, one

    while what > 1:
        was, result = result, result + was
        what = what - 1

    return result

This has a pair of optional arguments - the entries at positions 0 and 1 in a sequence specified by: `the next entry is the sum of the previous two' - but requires a first argument (the index of the entry sought). If it is only given this first argument, it uses the values 0 and 1 for the given entries (so its entries starting at what=1 are: 1, 1, 2, 3, 5, ... as familiar to those who recognise the function's name).
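
A short usage sketch, assuming Python 3 (the list comprehensions are not python 1 syntax), shows the optional arguments being left out, supplied positionally and supplied by keyword:

```python
def fibonacci(what, zero=0, one=1):
    # Same definition as above, repeated so this block runs on its own.
    if what < 1: return zero
    was, result = zero, one

    while what > 1:
        was, result = result, result + was
        what = what - 1

    return result

print([fibonacci(n) for n in range(1, 6)])     # [1, 1, 2, 3, 5]
print([fibonacci(n, 2, 1) for n in range(5)])  # [2, 1, 3, 4, 7]
print(fibonacci(4, one=3))                     # 9, from the sequence 0, 3, 3, 6, 9
```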

def remember(value, row=[]):
    if value:
        if value not in row: row.append(value)
        return tuple(row)
    row = None
    return ()

Default values for parameters are computed when the function is defined: thus row is here initialised to point to a list object. The same list object will be used as row's initialiser at each call to remember with just one argument, even if remember has been called with a false value in the past and hence executed the assignment row = None (which rebinds only the local name, so has no effect on the default). Furthermore, on successive calls to remember with distinct true values, the calls to append modify the list being used as row's initialiser: the same list is used each time, but lists are mutable. Of course, if remember is called with two arguments then row will use the supplied value and its default will not be involved.
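
The behaviour can be watched directly. This sketch assumes Python 3, where the tuple of defaults is exposed as the function's __defaults__ attribute:

```python
def remember(value, row=[]):
    # Same definition as above, repeated so this block runs on its own.
    if value:
        if value not in row: row.append(value)
        return tuple(row)
    row = None   # rebinds the local name only; the default list is untouched
    return ()

print(remember(1))             # (1,)
print(remember(2))             # (1, 2)
print(remember(0))             # ()
print(remember(3))             # (1, 2, 3): the false call did not reset the list
print(remember(4, []))         # (4,): an explicit row bypasses the default
print(remember.__defaults__)   # ([1, 2, 3],)
```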

import math
def circumference(radius, pi=math.pi):
    return pi * 2 * radius
del math

Sometimes (as artificially arranged here by del; but a function defined within the body of a function might use a parameter of the outer as the default for the inner) a function needs to remember a value which was available when the function was defined but the source will have been removed by the time the function is due to be called. This can be achieved by `tunnelling' the datum through the default of a `parameter' which the caller is told to not supply via an argument.

Of course, if such a function is called other than in accord with the given constraint, it'll give wrong answers rather than raising the TypeError one might have expected; caveat invocator.
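
To make the hazard concrete, here is the same function called properly and then improperly, as a sketch under Python 3:

```python
import math

def circumference(radius, pi=math.pi):
    return pi * 2 * radius

del math   # the default was captured at definition, so the function still works

print(circumference(1))      # uses the tunnelled math.pi
print(circumference(1, 3))   # 6: the caller's 3 silently replaces pi, no TypeError
```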

Now, with special parameters:

def conjoin(*args): # and
    for arg in args:
        if not arg: return arg
    return 1

Here arbitrarily many arguments are allowed: args is initialised as a tuple of all the positional arguments supplied (and no keyword arguments are allowed). Note, however, that if we wanted to supply a particular value to be used as the final true value, in place of 1,

>>> def conjoin(*args, true=1):
  File "<stdin>", line 1
    def conjoin(*args, true=1):
                          ^
SyntaxError: invalid syntax

we'd be stumped: there is no way to declare an optional keyword parameter for a function taking arbitrarily many positional arguments. Putting the defaulted parameter first does not help: it obliges every caller to specify a value for it (since, otherwise, it'll be over-ridden by the first positional argument).
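
A partial workaround, offered only as a sketch, is to accept a ** parameter as well and pick the option out of the dictionary by hand. The parameter name options and the explicit check for stray keywords are assumptions of this sketch; it is written for Python 3 but relies on nothing beyond the * and ** parameters described above.

```python
def conjoin(*args, **options):
    # Pull the optional 'true' value out of the keyword dictionary by hand.
    true = options.pop('true', 1)
    if options:
        # Mimic the interpreter's complaint about unexpected keywords.
        raise TypeError('unexpected keyword arguments: %s' % sorted(options))
    for arg in args:
        if not arg: return arg
    return true

print(conjoin(1, 'a', [0]))         # 1: every argument is true
print(conjoin(1, '', 3))            # the empty string, the first false argument
print(conjoin(1, 2, true='yes'))    # yes: the caller-chosen final value
```

The price is that the parameter list no longer declares the option, and the checking the interpreter would normally do has to be done by the function itself.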

def dictionary(**what):
    return what

If this is invoked as dictionary(a=1,b=2,c=3) it'll return the dictionary built out of these, {'a':1, 'b':2, 'c':3}. Note that

>>> bok = {}
>>> new = apply(dictionary, (), bok)
>>> bok['a'] = 1
>>> new
{}

the dictionary received as what is separate from the one passed down to apply, if that is used to invoke dictionary. (This is a Good Thing, trust me: Guido got it right. ;^) Finally, let me introduce a limited form of currie from an imaginary lazy-lookup scheme:

def bymeth(key, obj=self):
    func = getattr(obj.__lazy__, key)
    def meth(k, s=obj, f=func):
        return f(s,k)
    return meth

This returns a lookup (cf. lookup.html), due to be called with only one input (an attribute name, k). Because we know the signature of func, taking the object on which to do the lookup as well as the attribute name, and we know what we're returning will only be called with one input, we can get away with tunnelling the function and its first argument through defaults of parameters we know won't be supplied. If, however, we wanted to implement the sort of currie()ing that method-lookup for classes does, supplying the instance as first argument but allowing the method to take arbitrarily many further arguments, we'd need to replace meth's single `real' parameter, k (the other two are tunnels), with *args, **what, i.e. `arbitrary parameters'. Unfortunately,

def bymeth(key, obj=self, src=klaz):
    func = getattr(src, key) # may raise AttributeError; let it.
    def meth(*args, **what, f=func, s=obj):
        return apply(f, (s,) + args, what)
    return meth

would get a SyntaxError again.

Limitations and a suggestion

There is no way to tunnel a datum from the context defining a function to its invocations without potential for arguments to hide the tunnelled datum from some invocations.

Likewise, there is no way to have a function take arbitrarily many positional arguments and yet honour an optional keyword argument; nor can one insist (e.g. for forward compatibility reasons) that certain parameters only be provided by keyword arguments, not positional ones.

These can be bypassed by using a class whose instances support a __call__ method; its initialiser receives the data we want to tunnel and the __call__ method packages this up with the arguments received when the instance was used as a callable. Thus Guido's design decisions are good and solve the problem for real python users: while the following addresses the problems another way, it does so for the sake of my warped perspective - I do not claim it is what Guido should change python to do (though I'd welcome some or all of it if Guido persuades himself that it wouldn't unduly confuse innocents).
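
A minimal sketch of that class-based bypass follows, assuming Python 3 (where the call `self.func(self.first, *args, **what)` stands in for python 1's `apply(self.func, (self.first,) + args, what)`). The names Curried and describe are hypothetical, chosen only for illustration.

```python
class Curried:
    # Hypothetical helper: the initialiser receives the data to be tunnelled.
    def __init__(self, func, first):
        self.func = func
        self.first = first

    def __call__(self, *args, **what):
        # Package the stored argument with whatever the caller supplied.
        return self.func(self.first, *args, **what)

def describe(owner, item, sep=': '):
    return owner + sep + item

bound = Curried(describe, 'eddy')
print(bound('def.html'))               # eddy: def.html
print(bound('def.html', sep=' -> '))   # eddy -> def.html
```

No argument the caller supplies can displace func or first, since neither appears in the parameter list of __call__.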

However, given my interest in `under the bonnet' experiments and my choice that functions are `more fundamental' than class structuring (essentially because I am interested in arbitrary variants on class structure), this doesn't help me: I need currie()ing in order to implement classes (and their variants), so I can't use classes to implement it !

So, what happens if we liberalise the restriction on the order of the various kinds of parameter ?

Clearly allowing any defaulted parameter to the left of a simple name parameter will lead to bizarre results: one is obliged to supply a value for each simple parameter, as it has no default, so either one must supply it as a keyword argument or one must supply enough positional arguments to supply it, in which case one has supplied the defaulted parameter via it. The clear intent of keyword arguments is to supply values for optional parameters; which is mocked by having a required keyword argument. All the same, this is clearly what it would mean to have a defaulted parameter before a simple one.

Equally clearly, no positional argument can supply a value to any parameter after the * one; by analogy, it makes sense to insist that no keyword argument can supply a value for any parameter after the ** one. In such a case,

def many(first, *rest, keyonly=None, **more, puretunnel=fred): ...
def two(first, **more, posonly=None, *rest, puretunnel=fred): ...

the keyonly parameter can only be supplied by a keyword argument while the posonly parameter can only be supplied by a positional argument (the second positional argument to the call, with all subsequent positional arguments, if any, then going in rest). If we omitted the default for keyonly, every call to many would be obliged to use the name keyonly for a keyword argument; if we omitted the default for posonly, every call to two would have to receive (at least) two positional arguments. In the latter case, note that we'd be obliged to supply first by a positional argument, so the second definition is equivalent to

def two(**more, first, posonly=None, *rest, puretunnel=fred): ...

In both cases, the puretunnel parameter cannot be supplied by any argument: it is after both special parameters. Consequently, it must be given a default. (One could frob the semantics, following a suggestion made on types-sig (by Evan, I think) back in 1998, so that a pure tunnel parameter given no default, e.g. plain name, is implicitly expanded to name=name, i.e. it passes through the thus-named variable visible in the context defining the function; but I shall ignore this tweak, as I don't need it and I can imagine it confusing innocents.) I describe such parameters as `pure tunnels' since all they do is carry some information from the context which defined the function to its invocations.

Now, pure tunnels solve the currie() problem and defaulted parameters after a * parameter suffice to provide for a function taking arbitrarily many positional arguments, yet supporting an optional keyword argument. So it suffices to only loosen up the rules to allow defaulted parameters after either special, while still requiring all simple parameters to come before all others and retaining the order constraint on special parameters.
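
For what it is worth, Python 3 (which postdates this page) accepts defaulted parameters between a * parameter and a ** parameter, with essentially the keyword-only meaning described above. It accepts nothing after the ** parameter, so pure tunnels are not available there. A sketch, runnable only under Python 3, using the many example above without its puretunnel:

```python
def many(first, *rest, keyonly=None, **more):
    return first, rest, keyonly, more

print(many(1, 2, 3))                   # 2 and 3 land in rest; keyonly keeps its default
print(many(1, 2, keyonly='k', z=0))    # keyonly can only be set by name
```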

I am not at all convinced I ever want a mandatory keyword argument or a parameter which can only be supplied by a positional argument (i.e. a simple parameter after a special one), so the above would suffice. However, one twist remains: one may wish to have a pure tunnel, or a parameter which can only be supplied in keyword form (e.g. for forward compatibility reasons, when one suspects some further parameters may need to be added before it), without accepting any more arguments than specific parameters - i.e., to have things to the right of special parameters without supporting the `and arbitrarily many further arguments' features supported by the special parameters.

Fortunately, there is a natural way out of this: allow special parameters without names. These have the same meaning as the corresponding special with a name, save that the function implicitly begins with a test that the value the name would have received is false (i.e. the tuple or dictionary is empty), raising TypeError if not. Thus a parameter consisting of a * simply says that no more positional arguments may be supplied than there are parameters before the * - it serves to protect parameters after it from being supplied by surplus positional arguments. Likewise a ** parameter with no name says that keyword arguments can only have the names of non-special parameters appearing before the **.
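
The implicit test can be written out by hand. The sketch below emulates nameless specials using named ones plus explicit checks; the names limited, _rest and _more are hypothetical, and placing option after the * parameter relies on Python 3 syntax (which also accepts a bare * in that position, though not a bare **).

```python
def limited(first, *_rest, option=None, **_more):
    # Emulate a nameless *: no surplus positional arguments are allowed.
    if _rest:
        raise TypeError('too many positional arguments')
    # Emulate a nameless **: no keyword names beyond the declared parameters.
    if _more:
        raise TypeError('unexpected keyword arguments: %s' % sorted(_more))
    return first, option

print(limited(1))             # (1, None)
print(limited(1, option=2))   # (1, 2)
```

Calling limited(1, 2) or limited(1, other=3) raises TypeError, which is what the nameless forms are meant to guarantee.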

In such a case, a parameter sequence without a flavour of special parameter is equivalent to one with a nameless parameter of its kind `as far right as possible': this is at the end of the parameters except in the case of an absent * parameter in a parameter list which includes a ** parameter, in which case the implicit nameless * appears as the last parameter before the ** (at least, if we retain the order constraint between the two specials).

Consequently, web pages and illustrative code in this webspace presume that the parameter sequence, params, has form:

[ name `,' ...] [ name `=' value `,' ...] [ `*' [name] `,' ]
                [ name `=' value `,' ...] [ `**' [name] `,' ] [ name `=' value `,' ...]

with its final `,' optional.

Written by Eddy.
$Id: def.html,v 1.2 2002/01/30 17:52:21 eddy Exp $