Development

Python's Magical Self p.2: Rebuild Everything

My last blog post tried to explain the reasons behind explicitly declaring self in instance methods. It showed the mechanics of instancemethod – the object used to bind classes and class instances to functions – but I didn't explain how this binding occurs or why it is desirable.

A note about semantics: I will use the word "method" to discuss functions that are somehow bound to an object. I will use "function" to reference an unbound function.

A Quick Recap

We will be expanding on the object used last time:

def bar(self,y):
    return self.x + y

class Foo(object):
    x = 9
    def __init__(self,x):
        self.x = x
    bar = bar

As we saw:

foo = Foo(5)
bar is not foo.bar
bar is not Foo.bar
foo.bar is not Foo.bar

However, we only compared objects against each other. Strangely enough:

Foo.bar is not Foo.bar
foo.bar is not foo.bar

How can this be? It seems like Python is saying an object isn't itself. The answer is in the details of how and when the function bar becomes the instancemethod returned by foo.bar or Foo.bar. As one astute redditor pointed out: 

Foo.__dict__['bar'] is bar
Foo.__dict__['bar'] is not Foo.bar
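
The instance-side comparisons above can be checked with a small, self-contained sketch. It assumes only the bar and Foo definitions from the recap; the names first, second, and the result variables are hypothetical. The class-side results (Foo.bar is not Foo.bar, Foo.__dict__['bar'] is not Foo.bar) come from Python 2; on Python 3, Foo.bar returns the plain function, so only the instance-side behaviour is exercised here.

```python
def bar(self, y):
    return self.x + y


class Foo(object):
    def __init__(self, x):
        self.x = x
    bar = bar


foo = Foo(5)

# Each attribute access builds a new bound-method object.
first = foo.bar
second = foo.bar
distinct = first is not second

# Both objects wrap the same function and the same instance.
same_function = first.__func__ is bar and second.__func__ is bar
same_instance = first.__self__ is foo and second.__self__ is foo

# They compare equal even though they are not identical.
equal = first == second

# The class dictionary still holds the original, untouched function.
stored_is_bar = Foo.__dict__['bar'] is bar
```

The function itself never changes; what differs on each access is the wrapper built around it.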

If you haven't seen __dict__ before, it is the dictionary that maps an object's attribute names to their values. Foo.bar is usually equivalent to Foo.__dict__['bar'], so what gives?

The Magic of Descriptors

Let's examine some Python magic you've likely seen before. The property decorator turns a method that takes a single argument (self) into a getter. Optionally, a setter and deleter can also be set. Essentially, it's syntactic sugar for leaving off the parentheses.

class Container(object):
    def __init__(self, value):
        self._value = value

    @property
    def value(self):
        return self._value
        
    @value.setter
    def value(self,value):
        self._value = value

hi = Container("Hi!")
hi.value == "Hi!"

hi.value = "bye"
hi.value == "bye"

try:
    del hi.value
except AttributeError:
    print(True)

The property decorator accomplishes this with descriptors. Descriptors are objects that implement at least one of three magic methods. 

__get__(self, instance, owner)

Evaluating hi.value calls property's __get__ method. hi.value returns whatever __get__ returns. instance is a reference to hi, and owner is a reference to Container.

__set__(self, instance, value)

Evaluating hi.value = "bye" calls this method. instance is a reference to hi, and value is the string "bye".

__delete__(self, instance)

Evaluating del hi.value calls this method. 
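
To see which call triggers which method, here is a minimal sketch of a descriptor that only records how it was invoked. Traced, Box, and the calls list are hypothetical names invented for this illustration; they are not part of property.

```python
class Traced(object):
    """Hypothetical descriptor that records every protocol call it receives."""

    def __init__(self):
        self.calls = []

    def __get__(self, instance, owner):
        self.calls.append(("get", instance, owner))
        return 42

    def __set__(self, instance, value):
        self.calls.append(("set", instance, value))

    def __delete__(self, instance):
        self.calls.append(("delete", instance))


class Box(object):
    attr = Traced()


box = Box()

result = box.attr   # triggers __get__(box, Box)
box.attr = "bye"    # triggers __set__(box, "bye")
del box.attr        # triggers __delete__(box)

# Reading through __dict__ bypasses the descriptor protocol entirely,
# so this lookup is not recorded.
traced = Box.__dict__["attr"]
```

After running this, traced.calls holds one entry per statement, in order, with the arguments described above.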

Descriptors manipulate results of the dot notation (e.g. foo.bar) and can masquerade as something they're not. Using these methods, it's easy to create our own version of the property decorator (Note: This isn't a complete reconstruction of property):

class myproperty(object):
    def __init__(self,fget,fset=None,fdel=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel

    def __get__(self, instance, owner):
        if instance is None:
            return self
        else:
            return self.fget(instance)

    def __set__(self,instance,value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        else:
            return self.fset(instance,value)

    def __delete__(self,instance):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        else:
            return self.fdel(instance)

    def getter(self, fget):
        self.fget = fget
        return self

    def setter(self, fset):
        self.fset = fset
        return self

    def deleter(self, fdel):
        self.fdel = fdel
        return self

Using myproperty to create Container, we see the same behavior as above:

class Container(object):
    def __init__(self, value):
        self._value = value

    @myproperty
    def value(self):
        return self._value
        
    @value.setter
    def value(self,value):
        self._value = value

hi = Container("Hi!")
hi.value == "Hi!"

hi.value = "bye"
hi.value == "bye"

try:
    del hi.value
except AttributeError:
    print(True)

Analyzing myproperty, the property decorator becomes a lot less mysterious. If a myproperty descriptor (e.g. value) is referenced on the class object (e.g. Container.value), __get__ returns the descriptor itself. If it's referenced on an instance (e.g. hi.value), __get__ returns the result of fget. Notice that instance is explicitly passed to fget, fset, and fdel. This is because myproperty is given the raw, unbound function.
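
The built-in property behaves the same way on these two points, which the following sketch checks. It uses the real property rather than myproperty so that it stands alone; the variable names (descriptor, on_class, by_hand, via_fget) are hypothetical.

```python
class Container(object):
    def __init__(self, value):
        self._value = value

    @property
    def value(self):
        return self._value


hi = Container("Hi!")

# The property object as stored on the class, without invoking __get__.
descriptor = Container.__dict__["value"]

# Referenced on the class, the property returns itself.
on_class = Container.value is descriptor

# Referenced on an instance, this is roughly what hi.value does.
by_hand = descriptor.__get__(hi, Container)

# fget is the raw function, so the instance is passed explicitly.
via_fget = descriptor.fget(hi)
```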

Note: If you want a different example of using descriptors, Rafe Kettler wrote a phenomenal article about Python's magic methods. His example is a favorite of mine.

Simplifying Descriptors

property is familiar, but rebuilding it is complicated. Let's create a simple descriptor that defines a unique identifier for every object it's attached to:

class Identity(object):
    def __get__(self, instance, owner):
        if instance is None:
            return id(owner)
        else:
            return id(instance)

identity_instance = Identity()

class SomeClass(object):
    id = identity_instance

some_object = SomeClass()

SomeClass.id is not identity_instance
SomeClass.id == id(SomeClass)
some_object.id == id(some_object)

Unlike myproperty, SomeClass.id does not return the descriptor itself. This is the magic of descriptors: they completely change how dot syntax behaves. What if you want to reference identity_instance?

SomeClass.__dict__['id'] is identity_instance

Look familiar? In fact, with a reference to identity_instance, we can call __get__ directly:

identity_instance.__get__(some_object, SomeClass) == some_object.id
identity_instance.__get__(None, SomeClass) == SomeClass.id

SomeClass.__dict__['id'].__get__(some_object, SomeClass) == some_object.id
SomeClass.__dict__['id'].__get__(None, SomeClass) == SomeClass.id

Back to Functions

Everything in Python is an object. Functions are objects, and they define a __get__ method.

bar.__get__(foo,Foo)(4) == 9
Foo.__dict__['bar'].__get__(foo,Foo)(4) == 9
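
Because __get__ lives on the function itself, the function does not even need to be stored on the class to be bound. The sketch below assumes a Foo without a bar attribute (a deliberate change from the recap) and binds bar by hand; bound, result, and explicit are hypothetical names.

```python
def bar(self, y):
    return self.x + y


class Foo(object):
    def __init__(self, x):
        self.x = x


foo = Foo(5)

# Bind the free function to foo manually, as dot syntax would.
bound = bar.__get__(foo, Foo)
result = bound(4)

# Calling the function directly with an explicit instance gives the same answer.
explicit = bar(foo, 4)
```

The bound object remembers both halves of the pairing: bound.__self__ is foo and bound.__func__ is bar.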

Knowing this, we can redefine bar to show explicitly what a function's __get__ method does. We'll reuse myinstancemethod from the previous blog post:

class mybar(object):
    def __call__(self,instance,y):
        return instance.x + y
        
    def __get__(self, instance, owner):
        return myinstancemethod(self,owner,instance)

Foo rebuilt with mybar:

class NewFoo(object):
    x = 9
    def __init__(self, x):
        self.x = x
    bar = mybar()

new_foo = NewFoo(5)
new_foo.bar is not new_foo.bar
NewFoo.bar is not NewFoo.bar

Now those last two lines should make a lot more sense. A new myinstancemethod is instantiated every time you access new_foo.bar or NewFoo.bar. Python functions' __get__ method automagically does this for you, so you rarely need to think about it.
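
myinstancemethod is defined in the previous post and is not reproduced here. For readers who want to run the idea on its own, this is a minimal stand-in, not the original class: sketch_instancemethod, sketch_bar, and SketchFoo are hypothetical names, and the constructor order (func, owner, instance) is an assumption taken from the call in mybar above.

```python
class sketch_instancemethod(object):
    """Hypothetical stand-in that pairs a callable with an instance."""

    def __init__(self, func, owner, instance):
        self.func = func
        self.owner = owner
        self.instance = instance

    def __call__(self, *args, **kwargs):
        if self.instance is None:
            # Accessed on the class: the caller supplies the instance.
            return self.func(*args, **kwargs)
        # Accessed on an instance: prepend it as the first argument.
        return self.func(self.instance, *args, **kwargs)


class sketch_bar(object):
    def __call__(self, instance, y):
        return instance.x + y

    def __get__(self, instance, owner):
        # A new wrapper is built on every attribute access.
        return sketch_instancemethod(self, owner, instance)


class SketchFoo(object):
    def __init__(self, x):
        self.x = x
    bar = sketch_bar()


sketch_foo = SketchFoo(5)
from_instance = sketch_foo.bar(4)
from_class = SketchFoo.bar(sketch_foo, 4)
fresh_each_time = sketch_foo.bar is not sketch_foo.bar
```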

The Zen of Self

Some people have asked why self isn't implicit; many languages give you a convenient this keyword. In Python, self is not a keyword. You can use 'me', 'this', 'context', 'banana', or any other valid variable name you like. As we saw in myproperty, __get__'s second argument (instance) is passed to fget as self.

In many languages there is a concrete separation between static methods, instance methods, and global functions. This is not true in Python; the only difference between functions, instance methods, class methods, and static methods is the context they explicitly manipulate. The desired context is passed in as the first argument (though this is not strictly required), or not at all for functions. The first argument is determined by __get__, @classmethod, @staticmethod, @property, or whatever other technique you happen to invoke.
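
A small sketch makes the point concrete: one function, exposed three ways, receives a different first argument each time. describe, Demo, and Fruit are hypothetical names made up for this illustration.

```python
def describe(*args):
    """Return whatever arguments arrive, so the injected context is visible."""
    return args


class Demo(object):
    as_method = describe                # bound to the instance on access
    as_class = classmethod(describe)    # bound to the class on access
    as_static = staticmethod(describe)  # nothing is injected


demo = Demo()

method_args = demo.as_method(1)   # (demo, 1)
class_args = demo.as_class(1)     # (Demo, 1)
static_args = demo.as_static(1)   # (1,)
plain_args = describe(1)          # (1,)


class Fruit(object):
    # The first parameter's name is only a convention.
    def __init__(banana, name):
        banana.name = name


fruit = Fruit("plantain")
```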

This is immensely powerful. Concepts that are rigid paradigms in many other languages – such as static methods – are simply different ways of manipulating context in Python. If you want a classmethod that's accessible only from the class, you're a few short lines away:

class classonlymethod(object):
    def __init__(self, func):
        self.func = func
    
    def __get__(self, instance, owner):
        if instance is None:
            return lambda *args, **kwargs: self.func(owner,*args,**kwargs)
        else:
            raise AttributeError("'%s' can't be referenced on instances of '%s'" % (self.func.__name__, owner.__name__))
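
Here is a short usage sketch. The descriptor is repeated so the example runs on its own; Registry and create are hypothetical names.

```python
class classonlymethod(object):
    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner):
        if instance is None:
            return lambda *args, **kwargs: self.func(owner, *args, **kwargs)
        else:
            raise AttributeError("'%s' can't be referenced on instances of '%s'" % (self.func.__name__, owner.__name__))


class Registry(object):
    @classonlymethod
    def create(cls, name):
        return (cls.__name__, name)


# Works on the class: owner is passed in as cls.
created = Registry.create("a")

# Fails on an instance, with the message built in __get__.
try:
    Registry().create("a")
except AttributeError as error:
    message = str(error)
```

One side effect of raising AttributeError from __get__ is that hasattr(Registry(), "create") reports False, so the method looks absent on instances.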

You can redefine static methods to behave more like Java:

class trulystaticmethod(object):
    def __init__(self, func):
        self.func = func
    
    def __get__(self, instance, owner):
        if instance is None:
            return self.func
        else:
            raise AttributeError("'%s' is static, and can't be referenced from an instance" % self.func.__name__)
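
And a matching usage sketch, again repeating the descriptor so it stands alone; MathUtil and double are hypothetical names.

```python
class trulystaticmethod(object):
    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner):
        if instance is None:
            return self.func
        else:
            raise AttributeError("'%s' is static, and can't be referenced from an instance" % self.func.__name__)


class MathUtil(object):
    @trulystaticmethod
    def double(n):
        return n * 2


# On the class, the raw function comes back with no context injected.
doubled = MathUtil.double(4)
returns_raw_function = MathUtil.double is MathUtil.__dict__["double"].func

# On an instance, access is refused.
try:
    MathUtil().double(4)
except AttributeError as error:
    message = str(error)
```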

Python lets you define the language as you see fit.
