Development

Python's Magical Self

(We've recently posted a riposte to this post, please read it here.) - Joey Bruckner, Editor

Python’s self argument drives some people crazy. For one, you must explicitly declare it in every method you define on a class. It then rudely injects itself in places it’s not wanted.

class Foo(object):
    x = 9
    def __init__(self,x):
        self.x = x

    def bar(self,y):
        return self.x + y

If you come from a C++, Java, or similar background, self in __init__ and bar seems redundant. Python brags about simple and elegant code, so what gives?

Scope Happens

Scope in Python is very simple. Everything in Python is an object, and almost everything is scoped at the object level. Write a new module?

# test.py
def say_hi():
    print 'Hi!'

You just created a new module object with the say_hi attribute.
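
To make "module object" concrete, here is a minimal sketch that builds one by hand instead of importing a file. It is written for Python 3. The name test stands in for test.py, and say_hi returns its greeting rather than printing it so the result is easy to check; both choices are assumptions made for illustration.

```python
import types

# Source of a tiny module; it mirrors test.py, except that say_hi
# returns the greeting instead of printing it (illustrative change).
source = '''
def say_hi():
    return 'Hi!'
'''

# Create an empty module object and run the source inside its namespace.
test = types.ModuleType("test")
exec(source, test.__dict__)

# say_hi is now an attribute stored on the module object.
print(isinstance(test, types.ModuleType))  # True
print("say_hi" in vars(test))              # True
print(test.say_hi())                       # Hi!
```

Importing a real test.py gives you the same kind of object; the import machinery just does the file handling for you.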

Define a new class?

class Foo(object):
    x = 9
    def __init__(self,x):
        self.x = x

    def bar(self,y):
        return self.x + y

You just made a new class object with the attributes x, __init__, and bar.
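
A quick way to check this claim is to look at the class object's own namespace with vars(). The sketch below assumes Python 3 and filters the namespace down to the three names discussed here, since the interpreter adds a few of its own.

```python
class Foo(object):
    x = 9

    def __init__(self, x):
        self.x = x

    def bar(self, y):
        return self.x + y


# Everything named in the class body lands in the class object's namespace.
names = sorted(n for n in vars(Foo) if n in ("x", "__init__", "bar"))
print(names)   # ['__init__', 'bar', 'x']
print(Foo.x)   # 9
```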

Instantiate Foo?

foo = Foo(5)

You created a new Foo object with the attributes x, __init__, and bar. Keep in mind that the three attributes on foo are different from those on Foo. We'll get to why in a moment.
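
One way to see the difference, as a sketch for Python 3: the instance itself stores only what __init__ assigned to it, while __init__ and bar are found on the class whenever you look them up through foo.

```python
class Foo(object):
    x = 9

    def __init__(self, x):
        self.x = x

    def bar(self, y):
        return self.x + y


foo = Foo(5)

# The instance's own namespace holds only the x set in __init__.
print(vars(foo))   # {'x': 5}

# foo.x shadows the class attribute; Foo.x is untouched.
print(foo.x)       # 5
print(Foo.x)       # 9

# bar is stored on the class, not on the instance.
print("bar" in vars(foo))   # False
print("bar" in vars(Foo))   # True
```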

Context is Everything

Let's deconstruct Foo:

def bar(self,y):
    return self.x + y

class Foo(object):
    x = 9
    def __init__(self,x):
        self.x = x
    bar = bar

Ignore that bar's first argument is "self". If we consider bar as just another function, the following is very natural.

foo = Foo(5)

print bar(foo,4) == 9
print bar(Foo,0) == 9

It seems to follow that this same pattern could be applied to Foo.bar.

print Foo.bar(foo,4) == 9
print Foo.bar(Foo,0) == 9

The first line prints True. The second line raises TypeError: unbound method bar() must be called with Foo instance as first argument (got type instance instead). Instantiating a Foo modifies bar further by hiding the self argument.

print foo.bar(foo,4) == 9
print foo.bar(foo,0) == 9

Both lines raise TypeError: bar() takes exactly 2 arguments (3 given). If you've ever wondered why you get an error claiming you passed three arguments instead of two, you're about to find out.
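
Before digging into the mechanism, the symptom can be reproduced with a minimal sketch. It is written for Python 3, where the wording of the error differs from the Python 2 message quoted above, so the sketch only checks that a TypeError is raised.

```python
class Foo(object):
    x = 9

    def __init__(self, x):
        self.x = x

    def bar(self, y):
        return self.x + y


foo = Foo(5)

# Calling through the instance supplies foo as the first argument for you.
print(foo.bar(4) == 9)        # True

# The same call spelled out through the class.
print(Foo.bar(foo, 4) == 9)   # True

# Passing foo yourself makes three arguments: foo (injected), foo, and 4.
error = None
try:
    foo.bar(foo, 4)
except TypeError as exc:
    error = exc

print(type(error).__name__)   # TypeError
```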

Binding Self

If you examine the type of the three bars, you'll see they're not the same.

print type(bar)
# <type 'function'>
print type(Foo.bar)
# <type 'instancemethod'>
print type(foo.bar)
# <type 'instancemethod'>

Looking up a function through a class or one of its instances wraps it in an instancemethod object. instancemethod acts as glue, binding the class, instance (if any), and original function together.

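# Accessed through the class: no instance is bound yet.
print Foo.bar.im_class == Foo
print Foo.bar.im_func == bar
print Foo.bar.im_self is None
# Accessed through the instance: foo is bound as im_self.
print foo.bar.im_class == Foo
print foo.bar.im_func == bar
print foo.bar.im_self == foo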
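
The wrapping step can also be triggered by hand. Functions have a __get__ method, and attribute lookup calls it to build the method object. The sketch below assumes Python 3, where the bound method exposes __self__ and __func__ (the counterparts of im_self and im_func); the variable names plain and bound are made up for illustration.

```python
class Foo(object):
    x = 9

    def __init__(self, x):
        self.x = x

    def bar(self, y):
        return self.x + y


foo = Foo(5)

# The class namespace stores the plain, unwrapped function.
plain = vars(Foo)["bar"]
print(type(plain).__name__)       # function

# Do by hand what the lookup foo.bar does: bind the function to foo.
bound = plain.__get__(foo, Foo)

print(bound.__self__ is foo)      # True: the instance (im_self)
print(bound.__func__ is plain)    # True: the original function (im_func)
print(bound(4) == foo.bar(4))     # True: both return 9
```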

It's straightforward to create a simplified instancemethod class with Python code.

class myinstancemethod(object):
    def __init__(self,func,cls,instance=None):
        self.im_func = func
        self.im_class = cls
        self.im_self = instance

    def __call__(_self,*args,**kwargs):
        args = list(args)
        if _self.im_self is not None:
            args.insert(0,_self.im_self)
            
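        # Report the real function and class names instead of hard-coding bar and Foo.
        if len(args) == 0:
            raise TypeError("unbound method %s() must be called with %s instance as first argument (got nothing instead)" % (_self.im_func.__name__, _self.im_class.__name__))
        elif not isinstance(args[0],_self.im_class):
            raise TypeError("unbound method %s() must be called with %s instance as first argument (got %s instead)" % (_self.im_func.__name__, _self.im_class.__name__, type(args[0]).__name__))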
        else:
            return _self.im_func(*args,**kwargs)

myinstancemethod mimics the actual instancemethod class with surprising accuracy. It exhibits the same behavior as Foo.bar and foo.bar in the above examples, and it also handles a few edge cases of calling methods through the class and through an instance.

# Wrap bar by hand: once without an instance, once bound to foo.
my_unbound = myinstancemethod(bar,Foo)
my_bound = myinstancemethod(bar,Foo,foo)

my_unbound(self=foo,y=4)
# TypeError: unbound method bar() must be called with Foo instance as first argument (got nothing instead)
Foo.bar(self=foo,y=4)
# TypeError: unbound method bar() must be called with Foo instance as first argument (got nothing instead)

my_bound(self=foo,y=4)
# TypeError: bar() got multiple values for keyword argument 'self'
foo.bar(self=foo,y=4)
# TypeError: bar() got multiple values for keyword argument 'self'

This is why you can pass around a reference to foo.bar on its own: the bound method carries foo with it, so the receiver doesn't need foo in order to call it.
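
As a sketch of what that buys you (Python 3 assumed): the helper apply_to_all below is hypothetical and knows nothing about Foo, yet it can call foo.bar because the instance travels inside the method object. The functools.partial line is not how Python implements binding; it is only a hand-rolled equivalent for comparison.

```python
import functools


class Foo(object):
    x = 9

    def __init__(self, x):
        self.x = x

    def bar(self, y):
        return self.x + y


def apply_to_all(func, values):
    # Hypothetical helper: calls func on each value, unaware of Foo or foo.
    return [func(v) for v in values]


foo = Foo(5)

# Pass the bound method alone; foo rides along inside it.
results = apply_to_all(foo.bar, [1, 2, 3])
print(results)   # [6, 7, 8]

# Roughly the same effect by hand: freeze foo as the first argument.
by_hand = functools.partial(vars(Foo)["bar"], foo)
print(apply_to_all(by_hand, [1, 2, 3]) == results)   # True
```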

Closing

foo is a completely different beast than Foo. Every variable in Python is a reference to an object in memory – attributes are no different. Foo.x, Foo.__init__, and Foo.bar point to different memory locations than foo.x, foo.__init__, and foo.bar.

print Foo.x is not foo.x
print Foo.__init__ is not foo.__init__
print Foo.bar is not foo.bar
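
A related detail, sketched here for Python 3 on CPython: every lookup of foo.bar builds a fresh method object, so two lookups are equal but are not the same object. The names first and second are made up for illustration.

```python
class Foo(object):
    x = 9

    def __init__(self, x):
        self.x = x

    def bar(self, y):
        return self.x + y


foo = Foo(5)

# The class holds 9 and the instance holds 5: two different objects.
print(Foo.x is not foo.x)                  # True

# Each lookup of foo.bar wraps the function in a new method object.
first, second = foo.bar, foo.bar
print(first is not second)                 # True

# They still compare equal: same function, same instance.
print(first == second)                     # True
print(first.__func__ is second.__func__)   # True
```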

foo and Foo are separate entities that happen to reference each other, in all the right places.
