Development

Development

Python's Hidden New

__new__ is one of the most easily abused features in Python. It's obscure, riddled with pitfalls, and almost every use case I've found for it has been better served by another of Python's many tools. However, when you do need __new__, it's incredibly powerful and invaluable to understand.

The predominant use case for __new__ is in metaclasses. Metaclasses are complex enough to merit their own article, so I don't touch on them here. If you already understand metaclasses, great. If not, don't worry; understanding how Python creates objects is valuable regardless.

Constructors

With the proliferation of class-based languages, constructors are likely the most popular method for instantiating objects.

Java

class StandardClass {
    private int x;
    public StandardClass() {
        this.x = 5;
    }
    
    public int getX() {
        return this.x;
    }
}

Python

class StandardClass(object):
    def __init__(self, x):
        self.x = x

Even JavaScript, a protypical language, has object constructors via the new keyword.

function StandardClass(x) {
    this.x = x;
}

var standard = new StandardClass(5);
alert(standard.x == 5);

Newer is Better

In Python, as well as many other languages, there are two steps to object instantiation:

The New Step

Before you can access an object, it must first be created. This is not the constructor. In the above examples, we use this or self to reference an object in the constructor; the object had already been created by then. The New Step creates the object before it is passed to the constructor. This generally involves allocating space in memory and/or whatever language specific actions newing-up an object requires.

The Constructor Step

Here, the newed-up object is passed to the constructor. In Python, this is when __init__ is called.

Python Object Creation

This is the normal way to instantiate a StandardClass object:

standard = StandardClass(5)
standard.x == 5

StandardClass(5) is the normal instance creation syntax for Python. It performs the New Step followed by the Constructor Step for us. Python also allows us to deconstruct this process:

# New Step
newed_up_standard = object.__new__(StandardClass)
type(newed_up_standard) is StandardClass
hasattr(newed_up_standard,'x') is False

# Constructor Step
StandardClass.__init__(newed_up_standard, 5)
newed_up_standard.x == 5

object.__new__ is the default New Step for object instantiation. It's what creates an instance from a class. This happens implicitly as the first part of StandardClass(5)

Notice, x is not set until after newed_up_standard is run through __init__. This is because object.__new__ doesn't call __init__. They are disparate functions. If we wanted to perform checks on newed_up_standard or manipulate it before the constructor is run, we could. However, explicitely calling the New Step followed by Constructor Step is neither clean nor scalable. Fortunately, there is an easy way.

Controlling New with __new__

Python allows us to override the New Step of any object via the __new__ magic method.

class NewedBaseCheck(object):
    def __new__(cls):
        obj = super(NewedBaseCheck,cls).__new__(cls)
        obj._from_base_class = type(obj) == NewedBaseCheck
        return obj
    def __init__(self):
        self.x = 5

newed = NewedBaseCheck()
newed.x == 5
newed._from_base_class is True

__new__ takes a class instead of an instance as the first argument. Since it creates an instance, that makes sense. super(NewedClass, cls).__new__(cls) is very important. We don't want to call object.__new__ directly; again, you'll see why later.

Why is from_base_class defined in __new__ instead of __init__? It's metadata about object creation, which makes more semantic sense in __new__. However, if you really wanted to, you could place define _from_base_class:

class StandardBaseCheck(object):
    def __init__(self):
        self.x = 5
        self._from_base_class == type(self) == StandardBaseCheck

standard_base_check = StandardBaseCheck()
standard_base_check.x == 5
standard_base_check._from_base_class is True

There is a major behavioral difference between NewBaseCheck and StandardBaseCheck in how they handle inheritance:

class SubNewedBaseCheck(NewedBaseCheck):
    def __init__(self):
        self.x = 9

subnewed = SubNewedBaseCheck()
subnewed.x == 9
subnewed._from_base_class is False

class SubStandardBaseCheck(StandardBaseCheck):
    def __init__(self):
        self.x = 9

substandard_base_check = SubStandardBaseCheck()
substandard_base_check.x == 9
hasattr(substandard_base_check,"_from_base_class") is False

Because we failed to call super(...).__init__ in the constructors, _from_base_class is never set.

__new__ and __init__

Up until now, classes defining both __init__ and __new__ had no-argument constructors. Adding arguments has a few pitfalls to watch out for. We'll modify NewBaseCheck:

class NewedBaseCheck(object):
    def __new__(cls):
        obj = super(NewedBaseCheck,cls).__new__(cls)
        obj._from_base_class = type(obj) == NewedBaseCheck
        return obj

    def __init__(self, x):
        self.x = x

try:
    NewedBaseCheck(5)
except TypeError:
    print True

Instantiating a new NewedBaseCheck throws a TypeError. NewedBaseCheck(5) first calls NewBaseCheck.__new__(NewBaseCheck, 5). Since __new__ takes only one argument, Python complains. Let's fix this:

class NewedBaseCheck(object):
    def __new__(cls, x):
        obj = super(NewedBaseCheck,cls).__new__(cls)
        obj._from_base_class = type(obj) == NewedBaseCheck
        return obj

    def __init__(self, x):
        self.x = x

newed = NewedBaseCheck(5)
newed.x == 5

There are still problems with subclassing:

class SubNewedBaseCheck(NewedBaseCheck):
    def __init__(self, x, y):
        self.x = x
        self.y = y

try:
    SubNewedBaseCheck(5,6)
except TypeError:
    print True

We get the same TypeError as above; __new__ takes cls and x, and we're trying to pass in cls, x, and y. The generic fix is fairly simple:

class NewedBaseCheck(object):
    def __new__(cls, *args, **kwargs):
        obj = super(NewedBaseCheck,cls).__new__(cls)
        obj._from_base_class = type(obj) == NewedBaseCheck
        return obj

    def __init__(self, x):
        self.x = x

newed = NewedBaseCheck(5)
newed.x == 5

subnewed = SubNewedBaseCheck(5,6)
subnewed.x == 5
subnewed.y == 6

Unless you have a good reason otherwise, always define __new__ with *args and **kwargs.

The Real Power of __new__

__new__ is incredibly powerful (and dangerous) because you manually return an object. There are no limitations to the type of object you return.

class GimmeFive(object):
    def __new__(cls, *args, **kwargs)):
        return 5

GimmeFive() == 5

If __new__ doesn't return an instance of the class it's bound to (e.g. GimmeFive), it skips the Constructor Step entirely:

class GimmeFive(object):
    def __new__(cls, *args, **kwargs):
        return 5

    def __init__(self,x):
        self.x = x

five = GimmeFive()
five == 5
isinstance(five,int) is True
hasattr(five, "x") is False

That makes sense: __init__ will throw an error if passed anything but an instance of GimmeFive, or a subclass, for self. Knowing all this, we can easily define Python's object creation process:

def instantiate(cls, *args, **kwargs):
    obj = cls.__new__(cls, *args, **kwargs)
    if isinstance(obj,cls):
        cls.__init__(obj, *args, **kwargs)
    return obj

instantiate(GimmeFive) == 5
newed = instantiate(NewedBaseCheck, 5)
type(newed) == NewedBaseCheck
newed.x == 5

Don't Do This. Ever.

While experimenting for this post I created a monster that, like Dr. Frankenstein, I will share with the world. It is a greate example of how horrifically __new__ can be abused. (Seriously, don't ever do this.)

class A(object):
    def __new__(cls):
        return super(A,cls).__new__(B)
    def __init__(self):
        self.name = "A"

class B(object):
    def __new__(cls):
        return super(B,cls).__new__(A)
    def __init__(self):
        self.name = "B"

a = A()
b = B()
type(a) == B
type(b) == A
hasattr(a,"name") == False
hasattr(b,"name") == False

The point of the above code snippet: please use __new__ responsibly; everyone you code with will thank you.

__new__ and the new step, in the right hands and for the right task, are powerful tools. Conceptually, they neatly tie together object creation. Practically, they are a blessing when you need them. They also have a dark side. Use them wisely.

 

comments powered by Disqus