typesafe objects in PHP

February 19, 2007

I have always disliked the way PHP handles objects. There is no way to assign a type to a property, validators have to be bolted onto the fields from the outside, and you can’t generate an object description (like WSDL) from an object either.

Usually you have DataObjects like:

/* a plain-old-php-object */

class Employee {
    var $employee_id;

    var $name;
    var $surname;

    var $since;
}

and as a human you immediately know how to use it:

$e = new Employee();

$e->name        = "Jan";
$e->surname     = "Kneschke";
$e->employee_id = 123;
$e->since       = mktime(0, 0, 0, 1, 1, 2005);

But you can also have a bad day and write something like:

$e->unknown = "value";
$e->since = "Monday";

The property unknown gets created automatically in the object, and since gets an invalid value. If you take a look at ActiveRecord in Rails you see how proper types can simplify your life. Rails also puts the validators directly into the data-object classes, because that’s the obvious place to guarantee the consistency of your data.

The approach we choose here changes the behaviour of PHP objects:

  • properties are always initialized with NULL
  • properties are NOT automatically declared
  • properties always have a type
  • at assignment the type is enforced, and the value is converted if that is possible without loss of data
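
The last point, converting only when nothing is lost, can be checked by converting the value and comparing the result back against the input. A minimal sketch, assuming a hypothetical helper named coerceToInt() that is not part of the code in this post:

```php
<?php
/**
 * Hypothetical helper: convert $v to int only if no information is lost.
 * Throws if the value cannot be represented exactly as an integer.
 */
function coerceToInt($v) {
    if (is_int($v)) {
        return $v;
    }
    // "123" -> 123 round-trips, "123abc" or "1.5" does not
    if (is_string($v) && (string)(int)$v === $v) {
        return (int)$v;
    }
    // 2.0 -> 2 round-trips, 2.5 does not
    if (is_float($v) && (float)(int)$v === $v) {
        return (int)$v;
    }
    throw new InvalidArgumentException("cannot convert to int without loss");
}

var_dump(coerceToInt("123")); // int(123)
var_dump(coerceToInt(2.0));   // int(2)
```

The round-trip comparison is deliberately strict: a string such as "0123" is rejected as well, because converting it back would not give the same string.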

As a side effect we gain:

  • WSDL generation
  • SQL which is automatically escaped correctly

Java has had annotations since 1.5, and we can get something similar in PHP:

class Employee {
    /** @var int */     
    var $employee_id;

    /** @var string */
    var $name;
    /** @var string */
    var $surname;

    /** @var timestamp */
    var $since;
}

Yes, doc-comments. Not only have we added documentation to our code (which is always a good thing), we can also use them for the type definition. With the help of Reflection we can get the DocComment of a property and extract the @var from it.

Similar to the term POJO we call our objects POPO (plain old php objects).

class POPO {
    function hasProperty($k) {
        $r = new ReflectionObject($this);

        return $r->hasProperty($k);
    }

    function getPropertyType($k) {
        $o = new ReflectionObject($this);

        $p = $o->getProperty($k);

        $dc = $p->getDocComment();

        if (!preg_match("#@var\s+([a-z]+)#", $dc, $a)) {
            return false;
        }
        return $a[1];
    }
    ...
}
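
To see what Reflection hands back, here is a small self-contained sketch. The class Demo and the helper propertyTypes() are made up for illustration; only ReflectionObject, getProperties(), getName() and getDocComment() are PHP's own API.

```php
<?php
class Demo {
    /** @var int */
    protected $id;

    /** @var string */
    protected $label;

    // no doc-comment on purpose
    protected $untyped;
}

/**
 * Hypothetical helper: map each property name to its @var type,
 * or to false if the property has no @var annotation.
 */
function propertyTypes($o) {
    $types = array();
    $r = new ReflectionObject($o);

    foreach ($r->getProperties() as $p) {
        // getDocComment() returns false when there is no doc-comment
        $dc = $p->getDocComment();

        if ($dc !== false && preg_match('#@var\s+([a-z]+)#', $dc, $m)) {
            $types[$p->getName()] = $m[1];
        } else {
            $types[$p->getName()] = false;
        }
    }

    return $types;
}

print_r(propertyTypes(new Demo()));
```

Only comments that start with /** count as doc-comments, and an opcode cache configured to discard comments (opcache.save_comments=0) would make getDocComment() return false for every property.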

The POPOTypeSafe class uses this information to implement the setters and getters:

class POPOTypeSafe extends POPO {
    function __get($k) {
        if (!$this->hasProperty($k)) {
            throw new Exception(sprintf("'%s' has no property '%s'", get_class($this), $k));
        }

        if (!isset($this->$k)) {
            return NULL;
        }

        return $this->$k;
    }

    function __set($k, $v) {
        if (!$this->hasProperty($k)) {
            throw new Exception(sprintf("'%s' has no property '%s'", get_class($this), $k));
        }

        if (!($type = $this->getPropertyType($k))) {
            throw new Exception(sprintf("'%s'.'%s' has no type set", get_class($this), $k));
        }

        if (!$this->isValid($k, $v, $type)) {
            throw new Exception(sprintf("'%s'.'%s' = %s is not valid for '%s'", get_class($this), $k, $v, $type));
        }

        $this->$k = $v;
    }

    function isValid($k, $v, $type) {
        if (!isset($v)) return false;
        if (is_null($v)) return false;

        switch ($type) {
        case "int":
        case "integer":
        case "timestamp":
            return (is_numeric($v));
        case "string":
            return true;
        default:
            throw new Exception(sprintf("'%s'.'%s' has invalid type: '%s'", get_class($this), $k, $type));
        }
    }
}

which our Employee class extends:

class Employee extends POPOTypeSafe {
    /** @var int */
    protected $employee_id;

    /** @var string */
    protected $name;
    /** @var string */
    protected $surname;

    /** @var timestamp */
    protected $since;
}

The properties have to be protected: that way any access from outside goes through __set() and __get(), while the parent class can still reach them.
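
PHP only calls __get() and __set() for properties that are inaccessible from the calling scope, which is why a public property would bypass the checks. A short sketch with a made-up class Box shows the difference:

```php
<?php
class Box {
    public    $open;    // accessible from outside: __set() is NOT called
    protected $sealed;  // inaccessible from outside: __set() IS called

    public $log = array();

    function __set($k, $v) {
        $this->log[] = $k;   // record that the setter ran
        $this->$k = $v;      // inside the class the property is accessible
    }
}

$b = new Box();
$b->open   = 1;
$b->sealed = 2;

print_r($b->log); // only "sealed" went through __set()
```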

Check out how the class behaves now:

>> r employee.php
...
>> $e = new Employee()
object(Employee)#12 (4) {
  ["employee_id:protected"]=>
  NULL
  ["name:protected"]=>
  NULL
  ["surname:protected"]=>
  NULL
  ["since:protected"]=>
  NULL
}
>> $e->name
NULL
>> $e->name = "Jan"
Jan
>> $e->name
Jan
>> $e->employee_id = "foobar"
Exception (code: 0) got thrown: 'Employee'.'employee_id' = foobar is not valid for 'int'
>> $e->employee_id = 1
1
>> $e->unknown = 1
Exception (code: 0) got thrown: 'Employee' has no property 'unknown'

Yep, this works nicely. As we now have the type information in our PHP objects, we can put it to use.

we can generate XML

We iterate over all the properties and generate XML based on the type of each one. In the case of NULL we just skip the tag, to distinguish it from an empty string.

class SerializeXML {
    static function toXML(POPO $o, SimpleXMLElement $parent = NULL) {
        if (is_null($parent)) {
            $parent = new SimpleXMLElement(sprintf("<?xml version=\"1.0\"?><%s/>", get_class($o)));
        }

        foreach ($o->getProperties() as $k) {
            $type = $o->getPropertyType($k);

            $v = $o->$k;

            if (is_null($v)) continue;

            switch ($type) {
            case "int":
                $parent->addChild($k, (int)$v);
                break;
            case "timestamp":
                // gmdate() instead of gmstrftime(), which is deprecated as of PHP 8.1
                $parent->addChild($k, gmdate('Y-m-d\TH:i:s\Z', $v));
                break;
            case "string":
                $parent->addChild($k, $v);
                break;
            }
        }

        return $parent;
    }
}

The PHP-Shell helps us to examine our new class:

>> $e = new Employee()
object(Employee)#12 (4) {
...
}
>>     $e->name        = "Jan";
Jan
>>     $e->employee_id = 123;
123
>>     $e->since       = mktime(0, 0, 0, 1, 1, 2005);
1104534000
>> SerializeXML::toXML($e)->asXML();
<?xml version="1.0"?>
<Employee><employee_id>123</employee_id><name>Jan</name><since>2004-12-31T23:00:00Z</since></Employee>
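
One caveat for string values: SimpleXMLElement::addChild() does not escape a bare & in the value, whereas assigning to a child through property syntax does. A minimal sketch, with a made-up value:

```php
<?php
$root = new SimpleXMLElement('<?xml version="1.0"?><Employee/>');

// property assignment creates the child element and escapes "&"
$root->name = "Tom & Jerry";

echo $root->asXML();
```

In toXML() this would mean writing $parent->$k = $v in the string case instead of calling addChild().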

Manfred Weber implemented the same idea with DocComments some time ago in http://pear.php.net/package/services_webservice/.

See:

for examples.

… and SQL too

Now we can do the same with SQL and get type-safe, encoded SQL statements:

class SerializeSQL {
    static function toINSERT(POPO $o, $table) {
        $fields = array();
        $values = array();

        foreach ($o->getProperties() as $k) {
            $type = $o->getPropertyType($k);

            $v = $o->$k;

            $fields[] = $k;

            switch ($type) {
            case "int":
                $values[] = is_null($v) ? "NULL" : (int)$v;
                break;
            case "timestamp":
                $values[] = is_null($v) ? "NULL" : "'" . gmdate('Y-m-d H:i:s', $v) . "'";
                break;
            case "string":
                // string literals need quotes around the escaped value;
                // a driver-specific escape function is safer than addslashes()
                $values[] = is_null($v) ? "NULL" : "'" . addslashes($v) . "'";
                break;
            }
        }

        return sprintf('INSERT INTO %s (%s) VALUES (%s)',
            $table,
            implode(",", $fields),
            implode(",", $values)
        );
    }
}

In the shell:

>> $e = new Employee();
object(Employee)#12 (4) {
....
}

>> $e->name        = "Jan";
Jan
>> $e->employee_id = 123;
123
>> $e->since       = mktime(0, 0, 0, 1, 1, 2005);
1104534000
>> SerializeSQL::toINSERT($e, "employees");
INSERT INTO employees (employee_id,name,surname,since) VALUES (123,'Jan',NULL,'2004-12-31 23:00:00')

This is what Rails does in ActiveRecord, and what any other OR mapper does as well.
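
Where the database driver supports it, escaping can be left to prepared statements, and the type information then only decides how each value is bound. A rough sketch, assuming PDO with the SQLite driver is available; the table layout and the $types and $row arrays are stand-ins for what getProperties() and getPropertyType() would deliver:

```php
<?php
// assumption: the pdo_sqlite extension is loaded
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE employees (employee_id INT, name TEXT, surname TEXT, since TEXT)');

// stand-ins for the typed properties of an Employee
$types = array('employee_id' => 'int', 'name' => 'string',
               'surname' => 'string', 'since' => 'timestamp');
$row   = array('employee_id' => 123, 'name' => "O'Brien",
               'surname' => NULL, 'since' => 1104534000);

$fields = array_keys($types);
$sql = sprintf('INSERT INTO employees (%s) VALUES (%s)',
    implode(',', $fields),
    implode(',', array_fill(0, count($fields), '?'))
);
$stmt = $db->prepare($sql);

$i = 1; // positional placeholders are 1-indexed
foreach ($types as $k => $type) {
    $v = $row[$k];
    if (is_null($v)) {
        $stmt->bindValue($i, NULL, PDO::PARAM_NULL);
    } elseif ($type == 'int') {
        $stmt->bindValue($i, (int)$v, PDO::PARAM_INT);
    } elseif ($type == 'timestamp') {
        $stmt->bindValue($i, gmdate('Y-m-d H:i:s', $v), PDO::PARAM_STR);
    } else {
        $stmt->bindValue($i, $v, PDO::PARAM_STR);
    }
    $i++;
}
$stmt->execute();

// the quote in the name survives without any manual escaping
echo $db->query('SELECT name FROM employees')->fetchColumn(), "\n";
```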

Annotations

Up to now we have used annotations only to assign a type to a property, but we can do more. I implemented a generic parser and handler for custom DocComment fields. As an example I use:

class Employee extends POPOTypeSafe {
    /**
    * @var      int
    * @length   10
    * @validate 1-
    * @is_required
    */
    protected $employee_id;
...
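
A generic parser for such fields can be quite small. The following is only a sketch of the idea, not the parser from the full code; the function name parseDocComment() and the convention that a tag without a value maps to true are assumptions:

```php
<?php
/**
 * Hypothetical parser: turn the "@key value" lines of a doc-comment
 * into an array; a tag without a value is stored as true.
 */
function parseDocComment($dc) {
    $fields = array();

    // one match per "@name", followed by optional text on the same line
    preg_match_all('#@([a-z_]+)[ \t]*([^\r\n*]*)#', $dc, $m, PREG_SET_ORDER);

    foreach ($m as $match) {
        $value = trim($match[2]);
        $fields[$match[1]] = ($value === '') ? true : $value;
    }

    return $fields;
}

$dc = <<<'EOT'
/**
 * @var      int
 * @length   10
 * @validate 1-
 * @is_required
 */
EOT;

print_r(parseDocComment($dc));
```

All values come back as strings, so a handler for @length would still have to cast "10" to an integer before using it.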

The length field can be used to add a toCREATE to our SerializeSQL class:

    static function toCREATE(POPO $o, $table) {
        $fields = array();
        $values = array();

        foreach ($o->getProperties() as $k) {
            $type = $o->getPropertyType($k);
            if ($o instanceof POPOTypeSafe) {
                $length = $o->getPropertyLength($k);
            } else {
                $length = NULL;
            }

            switch ($type) {
            case "int":
                $fields[] = sprintf('%s INT%s', self::escapePropertyName($k), is_null($length) ? "" : '('.$length.')');
                break;
            case "timestamp":
                $fields[] = sprintf('%s TIMESTAMP', self::escapePropertyName($k));
                break;
            case "string":
                if (is_null($length) || $length > 64000) {
                    $fields[] = sprintf('%s TEXT', self::escapePropertyName($k));
                } else {
                    $fields[] = sprintf('%s VARCHAR(%d)', self::escapePropertyName($k), $length);
                }
                break;
            default:
                throw new Exception(sprintf("unknown type '%s'", $type));
            }
        }

        return sprintf("CREATE TABLE %s (\n  %s)",
            self::escapePropertyName($table),
            implode(",\n  ", $fields)
        );
    }

… or we use the @validate to add more complex validators to the type. A collection can be found at

There is a lot of potential in this basic concept: it extends what your objects can do and makes your code more readable.

If you are interested, you can see the whole code, somewhat more advanced than the above, at annotations-php.html
