Sunday, 14 November 2010

Python metaclasses

A basic object relational design

In this article we will explore metaclasses as a tool to link up Python classes and database tables.

Based in part on the information illustrated in this Python article.

Metaclasses are a daunting concept at first, yet it is worthwhile to understand the idiom because in some circumstances they are vital in getting hard jobs done. In this short article we'll look into a rough draft of some code that creates a database table at the moment a class is created and alters the class definition in such a way that accessing instance variables results in calls to some database engine.

The first thing to understand is that Python metaclasses are not magic. Every time you define a new class you already rely on the built-in metaclass type. Metaclasses are subclasses of type, and regular classes are instances of a metaclass. This may sound weird, but remember that everything in Python is an object, even a class. And because every object is an instance of some class, this is just a logical extension of a general concept.
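To make this concrete, here is a small sketch showing that an ordinary class is an instance of type, and that calling type with three arguments builds a class by hand. The names Plain and Dynamic are made up for this illustration.

```python
class Plain:
    pass

# A class is itself an object, and the class of that object is `type`.
print(type(Plain) is type)        # the metaclass of an ordinary class
print(isinstance(Plain, type))    # so the class is an instance of type
print(type(Plain()) is Plain)     # while an instance's class is Plain

# The three-argument form of type() creates a class at run time:
# name, tuple of base classes, class dictionary.
Dynamic = type('Dynamic', (object,), {'answer': 42})
print(Dynamic().answer)
```

The class statement does roughly the same thing as the last call: it collects a name, the bases and a dictionary of attributes, and hands them to the metaclass.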

If you want your class to be created by a metaclass other than type, pass the metaclass keyword argument in the class definition:

class MyClass(metaclass=MyMetaClass):
 ...

Your metaclass must be a subclass of Python's built-in metaclass type and will usually override the __new__() method:

class MyMetaClass(type):
 def __new__(metaclass, classname, bases, classdict):
  # ... do stuff to classdict ...
  return type.__new__(metaclass, classname, bases, classdict)

This __new__() method should return a new class, for example by passing its arguments on to the __new__() method of type like we do in the example above. The real power is hidden in the arguments passed to the __new__() method: besides the name of the class we are building and the base classes it will be a subclass of, we also have access to the class dictionary.
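A minimal runnable sketch of such a metaclass is shown here. It only records the arguments it received so we can inspect them afterwards; the names Verbose, Base, Example and the seen attribute are invented for the illustration.

```python
class Verbose(type):
    def __new__(metaclass, classname, bases, classdict):
        # build the class exactly as type would ...
        cls = type.__new__(metaclass, classname, bases, classdict)
        # ... and remember what we were given (skipping dunder entries
        # such as __module__ that Python adds to the class dictionary)
        cls.seen = (classname, bases,
                    sorted(k for k in classdict if not k.startswith('__')))
        return cls

class Base:
    pass

class Example(Base, metaclass=Verbose):
    x = 1
    def hello(self):
        return 'hi'

print(Example.seen)
print(type(Example) is Verbose)
```

The class dictionary contains both the class variable x and the function hello, which is what allows a metaclass to inspect or replace either of them before the class exists.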

The class dictionary holds all class attributes, including class variables and methods (both class methods and instance methods). The beauty is that we can check and/or alter the contents of this class dictionary before we actually create a class. This could be used for example to check whether the class has some mandatory method definitions or overrides specific methods in its bases, something that is used to implement abstract base classes in Python.
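As a rough sketch of such a check, the hypothetical metaclass RequiresSave below refuses to create a class that does not define a save() method in its own body. It looks only at the class dictionary, not at inherited methods, and it is not how the standard library's abc module works internally (ABCMeta reports missing abstract methods when the class is instantiated).

```python
class RequiresSave(type):
    """Hypothetical metaclass: the class body must define a callable save()."""
    def __new__(metaclass, classname, bases, classdict):
        if not callable(classdict.get('save')):
            raise TypeError('%s must define a save() method' % classname)
        return type.__new__(metaclass, classname, bases, classdict)

class Good(metaclass=RequiresSave):
    def save(self):
        return 'saved'

# The check runs while the class statement executes, so the error
# appears at definition time, before any instance is created.
try:
    class Bad(metaclass=RequiresSave):
        pass
except TypeError as e:
    error = str(e)
    print(error)
```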

Another application of metaclasses is to bridge domains. In the example below there are two domains: the classes and instances of those classes in Python's runtime environment, and the tables and records in a relational database stored on disk. If we would like to map those classes to database tables, metaclasses provide us with some excellent tooling because they allow us to act upon the class definition before the class actually exists.

This means that when we define a class, our metaclass can check whether a suitable table is already defined in the database and make sure that class variables are mapped to columns in this table. It is also possible to change class variables into properties in such a way that accessing these properties from instances results in the appropriate SQL statements being issued to a database engine.
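The property mechanism can be tried in isolation first. In this small sketch a single getter and setter are shared by binding the column name with functools.partial; instead of talking to a database they append to a list. The names log, get, set_, Demo and colour are made up for the illustration.

```python
from functools import partial

log = []  # stands in for the database engine in this sketch

def get(self, name):
    log.append('select %s' % name)

def set_(self, value, name):
    log.append('update %s=%s' % (name, value))

class Demo:
    # property() calls fget(instance) and fset(instance, value);
    # partial() supplies the extra name argument
    colour = property(partial(get, name='colour'),
                      partial(set_, name='colour'))

d = Demo()
d.colour            # triggers get(d, name='colour')
d.colour = 'red'    # triggers set_(d, 'red', name='colour')
print(log)
```

A metaclass can install such properties automatically for every marked class variable, which is what the full example does.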

This may sound a bit abstract, so just take a look at the code below. It will not really interact with a database but just print the SQL it would have used. The comments should explain what is happening in a fairly detailed manner:

from functools import partial
from random import randint

class Attribute:
 """
 Attribute makes it possible to distinguish class
 variables that should be backed by a column definition
 for a table.
 """
 def __init__(self,constraints=''):
  self.constraints=constraints
  
class DBbackend(type):
 """
 A metaclass that will create a database table that
 contains columns for each class variable that is an
 instance of Attribute. It also replaces these class
 variables with properties that will retrieve or update
 the database value when the attribute is accessed.
 """
 def __new__(meta, classname, bases, classDict):
  attributes = {}
  # create a suitable table definition. Although we
  # do not do a complete implementation here, we have
  # sqlite in mind, so no explicit types.
  for attr in classDict:
   if isinstance(classDict[attr],Attribute):
    attributes[attr]=classDict[attr]
  sql = 'create table if not exists %s ( %s )' % ( classname, 
   ",".join(['id integer primary key autoincrement']
     +[ name+" "+a.constraints for name,a in attributes.items()]))
   
  print(sql)
  
  # we alter the __init__() method to create a database record
  def create(self,**kw):
   self.id=randint(1,1000000) # just for illustration, normally handled by autoincrement in the db
   sql='insert into %s (%s) values (%s) [%s]' % (self.__class__.__name__,
    ",".join(kw.keys()),
    ",".join(['?']*len(kw.keys())),
    ",".join(str(v) for v in kw.values())) # str() so non-string values work too
   print(sql)
   
  classDict['__init__']=create
  
  # functions that retrieve/update a column in a database table
  def get(self,name):
   print('select %s from %s where id = ? [%s]' % (name,
      classname,str(self.id)))
  
  def set(self,value,name):
   print('update %s set %s=? [%s] where id = ? [%s]' % (classname,
      name,str(value),str(self.id)))
  
  # change each class var that holds an Attribute object to a 
  # property that get/sets the appropriate column
  for attr in attributes:
   fget = partial(get,name=attr)
   fset = partial(set,name=attr)
   classDict[attr]=property(fget,fset)
  
  return type.__new__(meta, classname, bases, classDict)

if __name__ == "__main__":

 # example, create a class with three attributes/database columns
 class Car(metaclass=DBbackend):
  make = Attribute()
  model= Attribute()
  license=Attribute('unique')

 # create an instance
 mycar = Car(make='Volvo', model='C30', license='1-abc-23')

 # retrieve various attributes
 model = mycar.model
 make = mycar.make
 lic = mycar.license

 # set an attribute
 mycar.model='S40'
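For comparison, here is a rough sketch of how the same idea could be wired to a real engine using the sqlite3 module from the standard library. This is an assumption about how one might finish the draft, not a complete design: SqliteBackend, conn and set_ are names invented here, a single shared in-memory connection is used, and table and column names are pasted into the SQL text (they cannot be bound as parameters), so they must come from trusted code.

```python
import sqlite3
from functools import partial

conn = sqlite3.connect(':memory:')  # assumption: one shared in-memory database

class Attribute:
    def __init__(self, constraints=''):
        self.constraints = constraints

class SqliteBackend(type):
    def __new__(meta, classname, bases, classdict):
        # collect the class variables that should become columns
        attributes = {name: a for name, a in classdict.items()
                      if isinstance(a, Attribute)}
        columns = ['id integer primary key autoincrement']
        columns += ['%s %s' % (name, a.constraints)
                    for name, a in attributes.items()]
        conn.execute('create table if not exists %s ( %s )'
                     % (classname, ','.join(columns)))

        def create(self, **kw):
            # insert a record and remember the id sqlite assigned to it
            names = list(kw)
            cur = conn.execute(
                'insert into %s (%s) values (%s)'
                % (classname, ','.join(names), ','.join('?' * len(names))),
                [kw[n] for n in names])
            self.id = cur.lastrowid

        def get(self, name):
            row = conn.execute(
                'select %s from %s where id = ?' % (name, classname),
                (self.id,)).fetchone()
            return row[0]

        def set_(self, value, name):
            conn.execute(
                'update %s set %s = ? where id = ?' % (classname, name),
                (value, self.id))

        classdict['__init__'] = create
        for name in attributes:
            classdict[name] = property(partial(get, name=name),
                                       partial(set_, name=name))
        return type.__new__(meta, classname, bases, classdict)

class Car(metaclass=SqliteBackend):
    make = Attribute()
    model = Attribute()
    license = Attribute('unique')

mycar = Car(make='Volvo', model='C30', license='1-abc-23')
mycar.model = 'S40'
print(mycar.make, mycar.model, mycar.license)
```

Unlike the printing version, the getter here returns the stored value, so reading mycar.model after the assignment yields the updated column.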