I am going to IPC8 - the eighth International Python Conference, 2000/Jan, 24 to 27 - and I intend to make some suggestions at the discussion of the shape of python 2. What I intend to suggest is a consolidation of what python already does right, which:
- goes for the throat of what `object oriented' means;
- just takes what we have and pretends we got it another way;
- makes everything fit neatly together; and
- mostly lets old things fail to notice that the world around them has changed.
The nub of the changes:
My main objective is to unify the way the python interpreter looks values up in namespaces: specifically, to reduce each variant to a sequence of carefully crafted python function calls, which can then be packaged in a single function (which calls each in turn) to which the interpreter simply passes a name. Several existing types (classes, instances, modules and packages) may sensibly be subsumed into a single type of this form.
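A minimal sketch, in present-day Python, of what "a sequence of function calls packaged in a single function" could look like. The helper names dict_lookup and chain are illustrative assumptions, not taken from python.py.

```python
def dict_lookup(namespace):
    """Return a lookup function that reads names from one dictionary."""
    def lookup(name):
        try:
            return namespace[name]
        except KeyError:
            raise AttributeError(name) from None
    return lookup


def chain(lookups):
    """Package a sequence of lookup functions as a single lookup function."""
    def lookup(name):
        for each in lookups:
            try:
                return each(name)
            except AttributeError:
                continue  # this namespace lacks the name; try the next one
        raise AttributeError(name)
    return lookup


own = {"colour": "red"}
fallback = {"colour": "blue", "size": 3}
find = chain([dict_lookup(own), dict_lookup(fallback)])

print(find("colour"))  # found in the first namespace
print(find("size"))    # falls through to the second
```

The interpreter would then only ever need to call find(name); which namespaces take part, and in what order, is decided by whoever assembled the list.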
- uniformity
- spread the magic namespace protocols around
- Builtin dictionaries, lists, tuples, etc. support attribute lookup for all the names, like __getitem__, that one has to implement to mimic them; likewise, numbers and strings support their __add__, __mul__ and similar attributes.
- If a module defines __call__ in its namespace, the module is callable; if a module defines __add__ in its namespace, one may perform addition on it: module + thing is evaluated as module.__add__(thing); and so on. More generally ...
- If namespace-lookup on an object yields a value in response to one of python's methods with magic names, the resulting value is interpreted as the appropriate method on that object; if a python built-in object supports the functionality associated with a magic name, then it does have the appropriate method (albeit the interpreter may short-cut using this).
- You can import from any namespace, including instances of classes. (So, e.g., os.path could be an instance of some class - it need not be a module - but we can still from os.path import join: crucially, one can re-implement a package, replacing some of its sub-modules with other objects, e.g. instances of classes, without breaking code which imports from those submodules)
- The hierarchical import namespace (i.e. packages) is `just another' variety of namespace lookup (with heavy use of laziness).
- builders
- generalise class to a more primitive tool using which class itself may be implemented
- The magic word class gets downgraded from reserved to built-in
- This built-in is a builder: a function which returns a pair
- Where python 1 allowed class, allow any builder
- builder name [ bases ] : suite
- The expressions builder and bases are evaluated; the former should yield a builder, the latter a sequence (exceptions will be raised at run-time if not).
- the builder is called with tuple(bases) (empty tuple in the absence of bases) as its argument list (and, at the interpreter's discretion, keyword arguments for stuff like __file__);
- the pair returned is read as (built-object, suite-wrapper)
- the built-object is assigned under the given name and
- the suite is executed in the namespace of suite-wrapper; name binding actions in the code become namespace modification, mediated by methods of the suite-wrapper. These methods will typically modify the built-object's namespace, though the attribute modification functionality used during the suite need not be accessible from the built-object itself - which is all we have access to once the suite's execution has completed.
- Where a build statement is processed, as above, within the body of a function: use the function's local namespace as globals when executing the suite of the build statement (which will finish executing before the function's locals get discarded) but,
- as at present, use the globals used by the function (i.e. containing module) as the globals for any functions (e.g. methods) being defined within the build statement
note that any (default) values supplied for parameters of a function are evaluated in the build statement's suite, so can be used to `tunnel' data from the outer function (read as globals) to the inner. See Novelties, below, for illustrations.
- tunnelling
- liberty in the parameter-list's placement of *s
Safe tunnels make it possible for the functional programmers to implement their favourite tools (notably currie) in python: without them the implementation of class would have to ask for currie as a built-in (see currie.py and below).
- Allow the parameter-list of a def statement (or lambda expression), after its arbitrarily-many purely positional (i.e. without default) parameters, to have arbitrarily many name=value parameters arbitrarily interspersed with up to one each of
- a *-item
- a **-item
each consisting of the appropriate number of *s, optionally followed by an identifier. (In the absence of either item, the parameter-list implicitly ends with the relevant item without an identifier.)
- As at present, no identifier may be used (as a parameter name or item name) more than once in a parameter list.
- Each keyword argument passed to a call of the function using the same name as some (positional or name=value) parameter preceding the ** initialises that parameter, over-riding any default.
- If the ** item has no identifier, no other keyword arguments may be passed. Otherwise, all other keyword arguments are packed up as a dictionary (defaulting to the empty dictionary, { } if none remain) and bound to the **'s identifier.
- Positional arguments can only provide values to (positional and name=value) parameters preceding both the (single) * and all positional and name=value parameters which have been initialised by keyword arguments (but the ** item is ignored here).
- The candidate parameters this allows are initialised from positional arguments; spare arguments may remain, or some later candidates might not be initialised, if the two lists' lengths don't match. A TypeError is raised at this point if:
- any purely positional parameter has not been initialised (too few arguments)
- the * lacks an identifier and there were spare arguments
- if the lone * has an identifier, any leftover arguments are packed up as a tuple (defaulting to the empty tuple, ( ) if none are left over) which is bound to this identifier.
- Defaults for parameters appearing after the lone * can only be over-ridden using name=value arguments.
- Defaults for parameters appearing after the ** can only be over-ridden by positional arguments
- Parameters appearing after both are safe tunnels between the context which created the function object and the context in which it runs itself.
- See also, above, the namespace tweak for builders.
These make it possible to do bizarre things of the kinds some folk want to do (meta-classes and disturbed variants on class) while providing for the things we already have as particular mechanisms. I've sketched out how to do classes and instances; I'm looking at modules and packages, but I'm pretty sure they're icing (especially given python.builder).
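The builder protocol can be approximated in present-day Python, as a rough sketch only. Everything named here (Wrapper, submodule, build) is a hypothetical stand-in: the real build statement would be syntax, whereas this emulation passes the suite as a string and leaves the caller to bind the result to a name.

```python
import types


class Wrapper(dict):
    """Hypothetical suite-wrapper: a mapping whose name-binding
    writes through to the attributes of the object being built."""

    def __init__(self, target):
        super().__init__()
        self._target = target

    def __setitem__(self, name, value):
        super().__setitem__(name, value)    # so the suite can read it back
        setattr(self._target, name, value)  # and the built object sees it


def submodule(*bases):
    """Hypothetical primitive builder: returns (built-object, suite-wrapper)."""
    built = types.SimpleNamespace()
    for base in reversed(bases):            # crude borrowing: copy attributes,
        vars(built).update(vars(base))      # earlier bases taking precedence
    return built, Wrapper(built)


def build(builder, bases, suite, globs):
    """Emulate `builder name bases: suite`; the caller binds the result."""
    built, wrapper = builder(*tuple(bases))
    exec(suite, globs, wrapper)             # name binding goes via the wrapper
    return built


sun = build(submodule, (), "name = 'Sun'\nkind = 'G2 V'\n", {})
moon = build(submodule, (sun,), "name = 'Moon'\n", {})
print(sun.name, sun.kind)
print(moon.name, moon.kind)
```

Once build returns, the wrapper is discarded, so the attribute-modification route used during the suite is no longer reachable from the built object.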
Note: Tibs tells me there's a good python-to-HTML converter out there; so I'll produce prettified versions of the various .py files indicated from here; please accept the plain form for the present (2000/Jan/15). [I've also discovered a bug in grail 0.5 when viewing .py files: if you get an exception, view it (say yes to the dialog); if it's in .../grail/filetypes/text_plain.py at line 12 (YMMV), where it says headers.get('content-type'), then change get to getheaders and re-start grail.]
Jargon (there are two ways objects delegate attribute lookup to one another; I need distinct words for them; so I play Humpty-Dumpty): `inherit' involves binding an object in as the first argument, ready-supplied, of a function (this is also called currie()ing), `borrow' doesn't; an instance inherits methods from its class, but borrows other attributes; a class borrows from its bases, potentially `bequeathing' (the flip-side of inheriting) some of what it borrows to its instances.
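To make the two words concrete, here is a small sketch in present-day Python, using functools.partial for the binding step. The function names inherit and borrow are chosen only to mirror the jargon; they are assumptions for illustration.

```python
from functools import partial


def inherit(owner, function):
    """Inherit: bind owner in as the ready-supplied first argument."""
    return partial(function, owner)


def borrow(function):
    """Borrow: hand the value over unchanged."""
    return function


def describe(self, suffix):
    return self["name"] + suffix


thing = {"name": "thing"}
bound = inherit(thing, describe)
plain = borrow(describe)

print(bound("!"))         # thing is already supplied as self
print(plain(thing, "?"))  # the caller must still supply self
```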
The attentive among you will be wondering how I avoid terrible problems with classes, e.g. defining a __call__ callable for its instances to inherit, conflicting with the __call__ I insist the class be using to do instantiation. This inescapably means that the class must have `somewhere else' to store methods for its instances to inherit - indeed, several have the wrong signature, if used as attributes of the class itself. Consequently, we're left with one infidelity of this scheme: by moving the methods off the class, onto a sub-object (call it .__method__ for now), I've dodged the original problem (and instances won't notice) but I've forced methods of derived classes to refer to methods of their bases via this attribute name.
Furthermore, instances borrow attributes from their class, which now has a __call__: so, if the class hasn't provided the instance with a method of this name, the instance is going to borrow the class' __call__ (and likewise for any other magic attributes the class itself possesses). This, in turn, obliges the class to have a second sub-object (call it .__data__ for now) to carry the non-methods in its namespace: instances then borrow from this object, instead of the class. The class can also borrow safely from this object, so that the data seen by instances do appear in the class' own namespace.
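A rough sketch of the two sub-objects in present-day Python. Klass, Instance and the plain attribute names method and data are hypothetical stand-ins for the built class, its instances, .__method__ and .__data__; this is not the implementation in python.py.

```python
from functools import partial


class Klass:
    """Stand-in for a built class: methods live in one sub-namespace,
    data in another, and calling the class makes an instance."""

    def __init__(self, method, data):
        self.method = dict(method)  # what instances inherit (bound to them)
        self.data = dict(data)      # what instances borrow (unchanged)

    def __getattr__(self, name):
        # The class borrows from its data, but not from its methods.
        try:
            return self.data[name]
        except KeyError:
            raise AttributeError(name) from None

    def __call__(self):
        return Instance(self)


class Instance:
    def __init__(self, klass):
        self._klass = klass

    def __getattr__(self, name):
        klass = self._klass
        if name in klass.method:
            return partial(klass.method[name], self)  # inherit
        if name in klass.data:
            return klass.data[name]                   # borrow
        raise AttributeError(name)


def call(self, x):
    return ("instance called", x)


point = Klass(method={"__call__": call, "double": lambda self: 2 * self.size},
              data={"size": 3})
p = point()                      # the class' own __call__ instantiates
print(p.double(), point.size)
print(p.__call__(1))             # the instance's inherited __call__
print(hasattr(point, "double"))  # methods are not visible on the class
```

Because the instance never consults the class object itself, only its two sub-namespaces, the class' instantiating __call__ cannot leak into the instance. (Present-day Python looks special methods up on the type, so p(1) would not work here; the explicit p.__call__(1) is used to keep the sketch short.)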
With the one unavoidable infidelity, class can be implemented in python as a carefully-crafted builder. This is so organised that:
Old code which refers to a method provided by a class (for its instances) using attribute lookup directly on the class - i.e. as class.method - will present problems. Even if the class borrowed attributes from both sub-objects, any methods it has `for itself' (notably __call__) must hide the attributes available to be borrowed, or the class itself would fail to support instantiation. Workaround: insert .__method__ (the name of the callables' sub-object) in code wherever such lookups are done directly (painful). The typical place where existing code does this is in an __init__ method, so I'll illustrate with that:
class new (recent, old):
    __recinit, __oldinit = recent.__init__, old.__init__
    def __init__(self, first, *args, **what):
        self.__recinit(first)
        apply(self.__oldinit, args, what)
will need to use old.__method__.__init__ in place of old.__init__ (it may also wish to access it via a tunnel so as to only do the lookup once, when defining new.__init__, but that's another matter) and likewise for recent.
[See also: asides.] Another potential way out of this would be to hack the details of how attribute lookup is done during the body of a method (analogously to __private name munging) to include the dereference, but this would have to filter the attribute lookups quite thoroughly to identify the ones to which to apply this. A few changes to code aren't such a huge price!
New magic names:
__lookups__: present in the namespace of the initialisation wrapper produced by primitive builders (notably within: see python.py), but absent from the namespace left behind once the suite has been executed in this wrapper's namespace. A more sophisticated builder may (at its option) build an object which persistently remembers its __lookups__ or whose __lookups__ is inaccessible by the time its suite gets to be executed.
Entries in __lookups__ are `lookup' functions - taking one argument (attribute name), raising AttributeError on failure or returning a value for the attribute named by the one argument. Typically, one of these will carry a dictionary (in a safe tunnel) in which it looks up attributes; this lookup, or another acting in collusion with it, may well also provide values for __setattr__ and __delattr__ which happen to modify the given dictionary; the name __dict__ might also be accessible. See python.py's instance, builder and class builders for variations on the theme; compare and contrast with check.py's equivalents.
__data__: The sub-namespace in which a class carries round the data its instances borrow from it -- e.g. fall-back values for attributes that they might set on themselves. It is easy to have the class itself borrow from its own __data__ (before the first base of the class) so as to expose these as attributes of the class itself.
__method__: The sub-namespace in which a class carries round the unbound methods (functions expecting a first argument called self, typically) the class' instances will inherit. Having the class itself show these attributes in its namespace would cause problems, so the class doesn't borrow from its __method__.
There follow various fragments of possible (if not necessarily sensible) code that one could write given the above changes:
4.__rsub__(7) # -> 3
def currie(func, *prior, **named):
    def result(*args, **what,
               func=func, prior=prior, given=named):
        what.update(given) # but see discussion in currie.py
        try: return apply(func, prior + args, what)
        except TypeError:
            return apply(currie, (func,) + prior + args, what)
    return result
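For comparison, a runnable approximation of currie in present-day Python. It is a sketch under two assumptions: it relies on a closure, rather than the proposed tunnel parameters, to carry func, prior and named; and it drops the TypeError retry of the fragment above, which would also swallow TypeErrors raised inside func. The helper add3 exists only for the demonstration.

```python
def currie(func, *prior, **named):
    """Return func with some arguments ready-supplied."""
    def result(*args, **what):
        merged = dict(what)
        merged.update(named)  # ready-supplied keywords win, as in the original
        return func(*(prior + args), **merged)
    return result


def add3(a, b, c=0):
    return a + b + c


add_one = currie(add3, 1)
print(add_one(2))
print(currie(add3, 1, c=10)(2))
```

The caller of result has no parameter through which to replace func, prior or named, which is the property the safe tunnel is meant to guarantee.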
import python
submodule = python.instance # a primitive builder - see python.py
# Now for a function which knows when it is being called without argument:
# no matter what argument you try to give it, you can't fool it.
submodule sentinel: pass # we could equally use: sentinel = [] or {}
def witharg(value=sentinel, *, **, sentry=sentinel):
    """Returns true if you called it with an argument, false if not.
    Accepts one optional argument. """
    if value is sentry:
        return 0 # witharg()
    return 1 # witharg(anything)
del sentinel
# Now, no-one but witharg has access to sentinel.
# (Well, OK, witharg(witharg.func_defaults[0]) returns 0, but
# introspection is cheating.)
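Present-day Python offers part of this. A bare * in a parameter list makes the parameters after it settable only by name=value arguments, but there is no syntax for a parameter the caller cannot set at all; a closure can play the part of the tunnel instead. The sketch below shows both; make_witharg and keyed are hypothetical names used only for illustration.

```python
def make_witharg():
    sentinel = object()  # unique; reachable only through this closure

    def witharg(value=sentinel):
        """Return True if called with an argument, False if not."""
        return value is not sentinel  # sentinel arrives via the closure

    return witharg


witharg = make_witharg()


def keyed(value, *, scale=1):
    """scale follows the lone *, so only scale=... can override it."""
    return value * scale


print(witharg(), witharg(None))
print(keyed(2, scale=3))
try:
    keyed(2, 3)          # a positional argument cannot reach scale
except TypeError as error:
    print("TypeError:", error)
```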
# Random illustration of local and global nestings:
from python import class
import bonnet
submodule innards:
    """So innards is a sub-namespace of the containing module."""
    def spawn(prompt, *classes):
        # To refer to itself, this function would need to call itself
        # innards.spawn, going via global - which also provides builders class
        # and submodule
        sub = submodule # copied from globals so new's suite can see it
        class new classes: # any tuple will do for bases-tuple
            # we can see prompt as a global: tunnel it safely into dawn
            def dawn(self, *, **, ask=prompt):
                print innards.__doc__, ask # here we see innards as a global
            # we can't see submodule, but we can see sub in spawn's locals
            sub _pawn:
                given = classes
                said = prompt
            def __init__(self, obj):
                # We can have self borrow off the given source
                self.__lookups__.insert(-1, bonnet.aslookup(obj))
                # aslookup(obj)(key) is getattr(obj, key); see bonnet.py
        return new

hello = innards.spawn('Hello') # a class
hi = hello([]) # an instance, borrowing from a list
hi.append(0) # borrowing [].append
`hi` # borrowed __repr__, yields '[0]'
hi.dawn() # print innards.__doc__, 'Hello'
from hi._pawn import given, said # so what if hi and its ._pawn aren't modules ?
hi.dawn('say something') # raises TypeError, no args allowed
innards.spawn = None # raises AttributeError, innards now lacks __setattr__
del innards.spawn # likewise __delattr__
Note that it is possible to cope with the above if build statements insist on the bases-tuple looking like a tuple, unlike `class new classes', though it'd be slightly more fiddly, and depend on bonnet.py's primitive builder within:
new, wrap = apply(class, classes)
bonnet.within tmp (wrap):
    def dawn( ... ) ...
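The `from hi._pawn import given, said' line in the earlier fragment has a rough counterpart in present-day Python: the import statement accepts whatever object is registered in sys.modules, module or not. A minimal sketch, in which the class Settings and the name demo_settings are made up for the purpose:

```python
import sys


class Settings:
    """An ordinary class; its instance will stand in for a module."""

    def __init__(self):
        self.given = ("a", "b")
        self.said = "Hello"


# Assumption: "demo_settings" is an invented name, not a real module.
sys.modules["demo_settings"] = Settings()

from demo_settings import given, said

print(given, said)
```

This covers only names registered by hand; the proposal goes further, letting any namespace reachable by attribute lookup serve as the source of an import.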
Some rather more sensible novelties appear in novelty.py and check.py. Another might be the building of a light-weight hierarchical namespace (e.g. simply to hold a body of data which naturally falls into a hierarchy), such as
submodule Sun:
    """The star at a focus of Earth's orbit."""
    name = 'Sun'
    type = 'G2 V'
    magnitude = 4.79
    def __repr__(): return 'Sol'

submodule Earth:
    """The blue planet, cradle of life."""
    name = 'Earth'
    submodule surface:
        """Where most of the life is."""
        radius = 6.37102e6 # metres
        gravity = 9.80665 # m/s/s
        flattening = 1 / 298.25
        nature = { 'Land': .292, 'Ocean': .708 }
    submodule orbit:
        """How our home moves about."""
        radius = .1496e12 # metres
        period = 773 * 7 * pow(18, 3) # seconds (Gregorian approximation)
        eccentricity = .0167
        centre = Sun
    def __repr__(): return 'Terra'

Earth.surface.radius / Earth.orbit.radius # -> 4.25870320856e-05
and I could have hours of fun replacing some of those uses of submodule with uses of some of the neat things one could build with python.py's gennie, the builder of class and other builder-classes.
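Something close to this light-weight hierarchy can be written in present-day Python with types.SimpleNamespace, though as nested calls rather than nested suites, and without the docstrings or the __repr__ overrides. A sketch using the same data:

```python
from types import SimpleNamespace

Sun = SimpleNamespace(name="Sun", type="G2 V", magnitude=4.79)

Earth = SimpleNamespace(
    name="Earth",
    surface=SimpleNamespace(
        radius=6.37102e6,              # metres
        gravity=9.80665,               # m/s/s
        flattening=1 / 298.25,
        nature={"Land": .292, "Ocean": .708},
    ),
    orbit=SimpleNamespace(
        radius=.1496e12,               # metres
        period=773 * 7 * pow(18, 3),   # seconds (Gregorian approximation)
        eccentricity=.0167,
        centre=Sun,
    ),
)

print(Earth.surface.radius / Earth.orbit.radius)
print(Earth.orbit.period)
```

What the builder form adds is that each level is an ordinary suite, so it can hold statements, functions and further build statements rather than only keyword arguments.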
See also discussion of __data__ and __method__, above: and note that I do not put these names forward as `what we should use' but as something to call them for the present.
The implementation of class, in python.py, involves various design choices, primarily concerned with which attributes hang off suite-wrappers and which are carried by the built object. While I could have kept the differences between it and existing python 1 to the bare minimum (only the insertion of .__method__ when referencing methods of classes), I have chosen to do some things differently just because they seem like a nice idea and serve to illustrate the freedoms open to those who wish to hack `how classes behave'. I offer the following not as `how it should be done', merely as `ways it could be done'.
Where an instance of something class-like has, at the same time, some bases from which it borrows, we need to choose whether it borrows attributes from the __data__ of its class before or after it borrows from its bases. No existing python code involves an object with bases being an instance of something, so I have no precedent to guide me on this one. For the present, I've settled on class-before-bases.
At present, there is no way to hang a function off a class for use as a function, rather than bound to an instance; and instances are affected by changes to the attributes of a class, even if these are made outside its defining suite. I have so implemented class that its two sub-objects are frozen once the suite has executed: but the class built has a __dict__ attribute (visible to the outside world) and attribute-modification methods acting on this (which don't come into play until after the suite has run). Consequently, each instance really sees the data and methods defined within the suite; it can access things stored subsequently on its class, via .__class__, but it doesn't see these on itself. At the same time, we gain the ability to add non-method functions to the namespace of the class (as long as we do so after the suite).
At present, for an instance to have access to its __dict__ attribute, the rest of the universe has to be given access to it, as well. I have so implemented instantiation that the instance, though it looks attributes up in the relevant dictionary, does not provide a name by which to refer to it: however, an associated object is also created, seeing the name __dict__ for this dictionary (and even having fall-back attribute modification methods, acting on this, for use if the class has not provided them); this object borrows from the exposed instance and is used as the self parameter of the methods inherited by the instance. Consequently, methods can modify the dictionary but the outside world can only do so in so far as the class has provided methods for the purpose.
At present, one can modify arbitrary attributes of a module from other code. We have the option of providing attribute-modification functionality for the module during its suite but inaccessible to external code: save that the module is, naturally, at liberty to define __setattr__(key, val) within its body (and, indeed, the other __*attr__ functions), by which to provide the outside world with the liberty to modify such attributes as it wishes to let them modify.
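A sketch of the same idea in present-day Python, where a module object may be an instance of a subclass of types.ModuleType that vets attribute changes. The class Guarded, its writable set and the module name settings are assumptions made for illustration; writing through vars() stands in for the module's own suite.

```python
import types


class Guarded(types.ModuleType):
    """Hypothetical module type that vets attribute changes from outside."""

    writable = frozenset({"verbose"})  # names outsiders may rebind

    def __setattr__(self, name, value):
        if name not in self.writable:
            raise AttributeError(
                "cannot set %r on module %s" % (name, self.__name__))
        super().__setattr__(name, value)

    def __delattr__(self, name):
        raise AttributeError(
            "cannot delete %r from module %s" % (name, self.__name__))


settings = Guarded("settings")
# Stand-in for the module's own suite: write straight to the namespace,
# bypassing the guarded __setattr__.
vars(settings).update(limit=10, verbose=False)

settings.verbose = True      # allowed: the module chose to expose this
try:
    settings.limit = 99      # refused: not in the writable set
except AttributeError as error:
    print(error)
print(settings.limit, settings.verbose)
```

Unlike the proposal, the namespace here is still reachable through vars(), so this restricts only the ordinary attribute-assignment route.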
See also: asides and digressions.
helped me make it all a lot more intelligible, reducing the likelihood of your head exploding.
introduced me to `closures' which have become the safe tunnels above.
did such a good job in the first place.