Named function parameters in PHP

Officially, PHP doesn’t support named function parameters (native named arguments only arrived with PHP 8.0). But there is one easy way to emulate this feature, so easy that it doesn’t really matter that the feature is missing. It’s actually a pretty old trick borrowed from JavaScript that I’ve been using since forever. I thought everyone knew it, but I’m still surprised by how many people come up with overly complicated solutions. Anyway, imagine you have a method that has a lot of optional parameters. Without named parameters, it goes like this:

// $queue is required, the rest is optional
function bind_queue($queue, $durable = false, $auto_delete = true, $name = false, $answer = null) { ... }

Later in your code, if you want to specify another value for $answer, you have to copy the value of every intermediate parameter in your function call. It’s time-consuming, error-prone and confusing.

// Good luck remembering which parameter is what.
bind_queue($my_queue, false, true, false, 42);

So instead, it’s much better to use an associative array, like this:

// leave the required parameter in the function definition, the options go into an array
function bind_queue($queue, array $opt = array())
{
  // we define the default values, and merge them with the user values at the same time
  $opt = array_merge(array(
    'durable' => false,
    'auto_delete' => true,
    'name' => false,
    'answer' => null
  ), $opt);

  // use $opt['answer'] to access the 'answer' optional parameter
}

Later in your code, if you want to override answer, you only need to write this:

bind_queue($my_queue, array('answer' => 42));

Much easier to read, isn’t it?
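One weakness of the array approach is that a misspelled key is silently ignored. Here is a minimal sketch of a stricter variant. The unknown-key check and the return value are my own additions for illustration, not part of the trick above:

```php
<?php
// Sketch only: bind_queue() comes from the example above;
// the unknown-key check is an added assumption.
function bind_queue($queue, array $opt = array())
{
	$defaults = array(
		'durable' => false,
		'auto_delete' => true,
		'name' => false,
		'answer' => null,
	);

	// catch typos such as 'anwser' early instead of silently ignoring them
	$unknown = array_diff_key($opt, $defaults);
	if ($unknown) {
		throw new InvalidArgumentException(
			'Unknown option(s): ' . implode(', ', array_keys($unknown))
		);
	}

	// user values override the defaults
	$opt = array_merge($defaults, $opt);

	// returned here only so the merged result can be inspected
	return $opt;
}

$opt = bind_queue('my_queue', array('answer' => 42));
var_dump($opt['answer']);      // int(42)
var_dump($opt['auto_delete']); // bool(true)
```

Keeping the defaults in one array also documents every available option in a single place.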

Fixing WordPress auto-update

Are you tired of WordPress failing to auto-update, and asking for a username for an FTP server even though every possible file and directory has the right permissions (i.e. is writeable by the web server)? Well, me too. The good news is, after countless hours of browsing and trying every possible solution, I stumbled upon one that works. Just add the following line to wp-config.php:

define('FS_METHOD', 'direct');

I tried this out of desperation, and what do you know, the auto-update started to work just fine with this line… Don’t get me started on WordPress’ code logic.

Bypassing PHP’s open_basedir with MySQL

PHP’s open_basedir feature is supposed to limit the files that can be opened by PHP to a specified directory tree (full doc is here). Functions like fopen or file_get_contents will return an error if the file is outside the allowed directory. So far, it sounds like a good protection.

However, it is also famous for being flawed by design and easy to circumvent (like safe_mode, by the way). Well, until now I didn’t realize just how easy it is to bypass. During a security audit, I had the opportunity to study a backdoor left there by some script kiddie (thanks to an outdated version of a web application). Here is just one interesting example that uses MySQL:

  1. create a temporary table
  2. use MySQL’s LOAD DATA INFILE command to read any file and load its content into the table
  3. select the content of the table

In clean PHP, the code would look like:

$filename = '/etc/passwd';

$pdo = new PDO($dsn, $username, $password);
$pdo->exec('CREATE TEMPORARY TABLE tmp_file ( content LONGBLOB NOT NULL)');
$pdo->exec(sprintf(
	'LOAD DATA INFILE %s INTO TABLE tmp_file',
	$pdo->quote($filename)
));
$content = $pdo->query('SELECT * FROM tmp_file')->fetchAll(PDO::FETCH_COLUMN);

To prevent this exploit in particular, it’s easy: just make sure that the MySQL user doesn’t have the FILE privilege. But open_basedir is definitely not safe. As Debian’s default php.ini puts it: “This is considered a “broken” security measure. Applications relying on this feature will not receive full support by the security team.”
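To check whether an account is exposed, you can look at the output of MySQL’s SHOW GRANTS statement. The sketch below parses those rows. has_file_privilege() is a hypothetical helper (not part of PDO or MySQL), and the sample grants are made up for the demonstration:

```php
<?php
// Sketch: has_file_privilege() is a hypothetical helper, not part of PDO or MySQL.
// It expects the rows returned by MySQL's SHOW GRANTS statement.
function has_file_privilege(array $grants)
{
	foreach ($grants as $grant) {
		// FILE is a global privilege, so only "ON *.*" grants matter
		if (!preg_match('/^GRANT (.+) ON \*\.\* TO /i', $grant, $m)) {
			continue;
		}
		$privileges = array_map('trim', explode(',', strtoupper($m[1])));
		if (in_array('FILE', $privileges, true) || in_array('ALL PRIVILEGES', $privileges, true)) {
			return true;
		}
	}
	return false;
}

// With a live connection, the rows would come from something like:
// $grants = $pdo->query('SHOW GRANTS FOR CURRENT_USER()')->fetchAll(PDO::FETCH_COLUMN);
$grants = array(
	"GRANT USAGE ON *.* TO 'webapp'@'localhost'",
	"GRANT SELECT, INSERT, UPDATE ON `shop`.* TO 'webapp'@'localhost'",
);
var_dump(has_file_privilege($grants)); // bool(false)
```

This only covers the FILE privilege discussed here; it says nothing about the other ways around open_basedir.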

Implementation of “tail -f” in PHP

This is a small algorithm to implement functionality similar to “tail -f” in PHP. The script is able to watch a file in real time and do something every time a new line is added (for example, to a log file). It doesn’t implement the “tail” functionality, however (outputting only the end of the file), and instead starts processing the file from the beginning.

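$file = fopen($filename, 'r');
if ($file === false) {
	die("Unable to open $filename\n");
}
$pos = 0;

while (true) {
	// seeking also clears the EOF flag set by the previous pass
	fseek($file, $pos);
	// compare with false explicitly, otherwise a line containing only "0" ends the loop
	while (($line = fgets($file)) !== false) {
		// do something with $line
	}
	$pos = ftell($file);
	sleep(1);
}
// never reached: the loop above only ends when the script is killed
fclose($file);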
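To try the idea without an endless loop, one pass of the algorithm can be wrapped in a function. This is only a sketch: read_new_lines() is a hypothetical helper name, and the temporary file stands in for a real log file.

```php
<?php
// Sketch: read_new_lines() is a hypothetical helper wrapping one pass of the loop above.
// It returns the lines appended since $pos and moves $pos forward.
function read_new_lines($file, &$pos)
{
	$lines = array();
	fseek($file, $pos); // also resets the EOF flag from the previous pass
	while (($line = fgets($file)) !== false) {
		$lines[] = rtrim($line, "\r\n");
	}
	$pos = ftell($file);
	return $lines;
}

// Demo on a temporary file standing in for a log file
$path = tempnam(sys_get_temp_dir(), 'tail');
file_put_contents($path, "first\n");

$file = fopen($path, 'r');
$pos = 0; // use filesize($path) instead to skip existing content, like "tail" does

print_r(read_new_lines($file, $pos)); // the existing line
file_put_contents($path, "second\n0\n", FILE_APPEND);
print_r(read_new_lines($file, $pos)); // only the two appended lines

fclose($file);
unlink($path);
```

Starting with $pos set to the current file size is one way to get the actual “tail” behaviour that the script above leaves out.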

Use destructor carefully

A couple of days ago I lost half a day of work on a stupid bug in a PHP CLI script. Basically, the script wouldn’t die. Instead it would hang forever, eating all the CPU time. Even explicitly calling die() (or exit(), since they are synonyms) wouldn’t terminate it. Yep, that’s right, die() didn’t work! Until then, I naively believed that die() was some failsafe language construct that would terminate the script no matter what. But as it turned out, that’s not the case…

According to the PHP documentation: “The destructor will be called even if script execution is stopped using exit().” So a destructor’s code will be executed even after die() is called. What happens if a destructor’s code is faulty and gets stuck in an infinite loop? The script will never end. In my story, I was using a third-party object-oriented library (for AMQP) that did some funky stuff in an object’s destructor, and then waited forever for an event on the network… Needless to say, it’s a BAD idea to write that much application logic in a destructor.

So the moral of the story is: (1) die() can fail and (2) use destructors with caution: only write code that will NEVER fail and that is STRICTLY necessary, like closing connections, closing file handles and such. Same goes for shutdown functions, by the way.

A quick example for the sake of demonstration:

class Evil
{
	public function __destruct()
	{
		while (true);
	}
}

$foo = new Evil();
die();
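For contrast, here is a sketch of a destructor that sticks to the rule. SafeLog is a hypothetical class made up for this example: the real work happens in regular methods, and the destructor only releases the file handle.

```php
<?php
// Sketch: SafeLog is a hypothetical class, shown only to illustrate the advice above.
class SafeLog
{
	private $handle;

	public function __construct($path)
	{
		$this->handle = fopen($path, 'a');
	}

	// application logic lives in regular methods, where failures can be handled
	public function write($message)
	{
		fwrite($this->handle, $message . "\n");
	}

	// idempotent: safe to call explicitly and again from the destructor
	public function close()
	{
		if (is_resource($this->handle)) {
			fclose($this->handle);
		}
		$this->handle = null;
	}

	// no loops, no network waits, nothing that can block
	public function __destruct()
	{
		$this->close();
	}
}

$path = tempnam(sys_get_temp_dir(), 'log');
$log = new SafeLog($path);
$log->write('hello');
unset($log); // the destructor runs here and closes the file
echo file_get_contents($path); // prints "hello" followed by a newline
unlink($path);
```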

One PHP error log per developer

At work we’re using a shared web development server, with a setup that allows every developer to have his own compartmentalized development space. It requires some tricky configuration, but in the end I find it way more efficient than the “every developer installs his own Apache/PHP/MySQL stack” policy. I already talked about the general ideas in a previous article (in French) two years ago, and so far it works like a charm. Our setup is constantly improving, and today’s big question is: how can every developer get his own PHP error log?

Apache config

First, a little insight into the Apache configuration. If we were using one virtual host directive per developer, this wouldn’t be an issue at all. Just set up the proper error_log path (using Apache’s php_value directive like so: php_value error_log /var/log/php/errors_username.log) for every virtual host, reload Apache and you’re done.

Our Apache, however, is using mod_vhost_alias. Basically we have “one virtual host to rule them all”, because I’m lazy and I don’t want to write one virtual host for every new dev, or modify a dozen virtual host directives every time I change a setting.

So how does it work? The first part of the host name is the developer’s username, the second part is the project’s folder name, and the remaining part is the dev server hostname. With the VirtualDocumentRoot directive, you use parts of the host name as variables to compute the correct document root path. (Actually, our setup includes a third part, which is the subfolder that separates the document root of a project from all the include files, but that’s not the point of this article.) Example:

<VirtualHost *:80>
	ServerName dev.domain.com
	ServerAlias *.dev.domain.com

	UseCanonicalName off
	VirtualDocumentRoot /home/%1/web/%2

	<Directory /home/*/web/*/*>
		Allow from all
		AllowOverride all
	</Directory>
</VirtualHost>

The address http://remi.awesome.dev.domain.com will automagically have its document root located in /home/remi/web/awesome. The benefit is obvious: simplicity and flexibility. I love the convention over configuration design principle.
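If the %1 and %2 interpolation feels abstract, the mapping can be mimicked in a few lines of PHP. This is only an illustration: resolve_document_root() is a hypothetical function, and Apache does all of this internally without calling any PHP.

```php
<?php
// Sketch: resolve_document_root() is a hypothetical helper that mimics
// what "VirtualDocumentRoot /home/%1/web/%2" does.
function resolve_document_root($host)
{
	// %1 is the first dot-separated part of the host name, %2 the second
	$parts = explode('.', $host);
	if (count($parts) < 2 || $parts[0] === '' || $parts[1] === '') {
		return null;
	}
	return sprintf('/home/%s/web/%s', $parts[0], $parts[1]);
}

echo resolve_document_root('remi.awesome.dev.domain.com'), "\n"; // /home/remi/web/awesome
```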

PHP config

Now the problem is that it’s impossible to tell PHP, directly in the Apache configuration, to log errors to a different file based on the first part of the host name, because the %1, %2,… variables are only valid inside the VirtualDocumentRoot directive.

So we need to do it inside PHP or inside an htaccess file in every folder. But doing it for every project is a pain, not to mention the SVN conflicts to come, because every developer would have his own path. That’s where PHP’s auto_prepend_file option comes into play (and honestly, I would never have thought it would be useful one day).

I just created a file named /etc/php5/prepend.php with this one-liner:

<?php ini_set('error_log',sprintf('/var/log/php/errors_%s.log',substr($_SERVER['SERVER_NAME'], 0, strpos($_SERVER['SERVER_NAME'],'.'))));

And added this setting to php.ini:

auto_prepend_file = /etc/php5/prepend.php

Note that the directory /var/log/php will not be created out of the blue, so you have to take care of it, and it must be writeable by www-data. Files will be named errors_(first part of the host name).log, so errors_remi.log in the previous example. Users belong to the www-data group, so as long as the logs are readable by the group (which they are by default) they’ll be able to read their logs.
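One caveat: with UseCanonicalName off, SERVER_NAME is derived from the Host header sent by the client, so a slightly more defensive prepend file may be worth it. Here is a sketch. error_log_path_for_host() is a hypothetical helper, and the allowed characters are an assumption about what a developer username looks like:

```php
<?php
// Sketch: error_log_path_for_host() is a hypothetical helper; the whitelist
// of characters is an assumption about what a developer username looks like.
function error_log_path_for_host($host, $dir = '/var/log/php')
{
	// keep only the first part of the host name (the developer username)
	$parts = explode('.', $host);
	$user = strtolower($parts[0]);

	// refuse anything that could escape the log directory
	if (!preg_match('/^[a-z0-9_-]+\z/', $user)) {
		return null;
	}
	return sprintf('%s/errors_%s.log', $dir, $user);
}

// In the prepend file, this would be used along these lines:
// $path = error_log_path_for_host($_SERVER['SERVER_NAME']);
// if ($path !== null) { ini_set('error_log', $path); }

echo error_log_path_for_host('remi.awesome.dev.domain.com'), "\n"; // /var/log/php/errors_remi.log
var_dump(error_log_path_for_host('../etc.dev.domain.com'));        // NULL
```

When the host name doesn’t look like a username, the function returns null and PHP simply keeps its default error log.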

And now the cherry on top of the cake, the logrotate configuration to rotate logs weekly, and keep one month (4 files) of archive:

/var/log/php/errors_*.log {
        weekly
        missingok
        rotate 4
        compress
        delaycompress
        notifempty
        create 640 www-data www-data
}

Et voilà!

Google Chart PHP Library 0.4

I released a new version of my PHP library for the Google Chart API. Remember it’s still in heavy development (well, depending on my free time and my motivation, so “heavy” is relative), so don’t expect anything bug-free or feature-complete.

Less rigid API

Until this version, I was focusing on strictly implementing the Google Chart API. However, I eventually realized that the API is sometimes too rigid. For example, say you want to hide an axis. You have to specify “_” (underscore) as the 5th value (axis_or_tick) in the chxs parameter (values are separated by commas). Looks easy, right? Except it is NOT ok to omit the first 4 values. So you have to specify the label_color, font_size and alignment first (after the axis index), in order to be able to hide your axis.

In version 0.3, you had to do exactly that, by using setStyle and specifying the 4th parameter ($axis_or_tick). If you wonder why it’s not the 5th, it’s because the “axis index” value is calculated at runtime. Fortunately, you could pass null as the value for a parameter, and the library would replace it with the default value. Example:

$axis = new GoogleChartAxis('x');
$axis->setStyle(null, null, null, '_');

In version 0.4, the setStyle method has been removed and split into multiple methods: setLabelColor, setFontSize, setLabelAlignment, setDrawLine, setDrawTickMarks and setTickColor. Now you don’t need to worry about how many parameters you have to set: just call the method you want and the library will take care of setting the appropriate intermediate values. Example:

$axis = new GoogleChartAxis('x');
$axis->setDrawLine(false)->setDrawTickMarks(false);
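To give an idea of what “setting the appropriate intermediate values” means, here is a rough sketch of how a chxs value could be assembled. It is not the library’s actual code: build_axis_style() is a hypothetical function and the default values are placeholders chosen for the illustration.

```php
<?php
// Sketch: build_axis_style() is a hypothetical function, not the library's code.
// The default values below are placeholders chosen for illustration.
function build_axis_style($axis_index, array $style = array())
{
	// order matters: it is the order expected by the chxs parameter
	$defaults = array(
		'label_color' => '666666',
		'font_size' => 11,
		'alignment' => 0,
		'axis_or_tick' => 'lt',
	);
	$style = array_merge($defaults, array_intersect_key($style, $defaults));

	// every intermediate value gets filled in, so none can be omitted by mistake
	return $axis_index . ',' . implode(',', $style);
}

echo build_axis_style(0, array('axis_or_tick' => '_')), "\n"; // 0,666666,11,0,_
```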

More abstraction

This version also comes with a set of features to simplify chart creation. For example, one of my favorites is the setBorder method for Shape Markers (GoogleChartShapeMarker).

To create a border around a shape with the Google Chart API, you need to create another similar marker below the first one (think z-order), with a different color and a slightly bigger size. Well, this can be done exactly this way in version 0.3. However, starting with version 0.4, the setBorder method does the job for you. Just specify a color and the size of the border, and it will create the second marker automatically. Not only is this more convenient and easier to write, but it is also faster and uses less memory.

New features

This version adds support for Dynamic Icons. Because icons can be either “freestanding” (used as a chart) or used as markers, I had to refactor the base class. Now the base class is GoogleChartApi, which holds the logic to query the API. GoogleChart and GoogleChartIcon extend this class, so you can use a GoogleChartIcon exactly the same way as a chart.

Example:

require '../lib/icons/GoogleChartIconNote.php';

$chart = new GoogleChartIconNote('Hello world');
$chart->setTitle('Example');
$chart->setTextColor('D01F3C');

header('Content-Type: image/png');
echo $chart;

To use an icon as a marker, use the new addDynamicMarker method. Example:

require '../lib/GoogleChart.php';
require '../lib/icons/GoogleChartIconNote.php';

$values = array();
for ($i = 0; $i <= 10; $i += 1) {
	$values[] = rand(20,80);
}

$chart = new GoogleChart('ls', 500, 200);
$data = new GoogleChartData($values);
$chart->addData($data);

$marker = new GoogleChartIconNote('Hello');
$marker->setData($data);
$chart->addDynamicMarker($marker);

header('Content-Type: image/png');
echo $chart;

For the moment, only “note” icons (aka Fun style notes with text and optional title) are supported, but I’m working on it.

Hey, the project has a new home!

Yes, the project is now hosted by Google Code. Because there is already a shitload of abandoned projects named using every possible combination of “google”, “chart” and “php”, I had to use the (rather long) name “googlechartphplib”. So the new home is here:
http://code.google.com/p/googlechartphplib. You’ll find source code, issue tracker, documentation and the new SVN access there.

Announcing GoogleChart PHP library 0.3

I’ve been playing around a lot with the Google Chart API lately, mainly for fun, to draw some charts based on my Last.fm profile (Last.fm provides a very nice API). The Google Chart API is very powerful, but quite harsh to work with, and unfortunately I didn’t find any good and easy-to-use PHP library for it. There are some, but most of them are either not maintained or not fully working. So I ended up writing my own library. A few weeks ago, I used it for a project at work, improved it a bit and it worked like a charm. Eventually I decided to release it as open source (MIT license). It’s provided “as is”, without warranty of any kind. I just hope that it might be useful to somebody else as well, who knows?

Quick introduction

The library’s goal is to provide an easy way to build requests to the Google Chart API, and especially to ease the painful string concatenation with commas, pipes, colons, etc. If you’ve already tried the Google Chart API, you know what I mean! :-) So I wrote a couple of classes that allow you to quickly create a chart (GoogleChart), add data series (GoogleChartData), axes (GoogleChartAxis) and markers (GoogleChartMarker), and compute a URL (for GET requests) or an array of parameters (for POST requests). It can even fetch the image for you (via GET or POST) so that you can display it directly (or cache it, or do whatever you want with it).
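For those who haven’t tried it, here is roughly what that string concatenation looks like when done by hand. This is a sketch, not code from the library: build_chart_query() is a hypothetical function that only assembles the query string for the chart type (cht), size (chs) and data (chd) parameters.

```php
<?php
// Sketch: build_chart_query() is a hypothetical function, not part of the library.
function build_chart_query($type, $width, $height, array $series)
{
	// values are separated by commas, series by pipes
	$encoded = array();
	foreach ($series as $values) {
		$encoded[] = implode(',', $values);
	}

	return http_build_query(array(
		'cht' => $type,
		'chs' => $width . 'x' . $height,
		'chd' => 't:' . implode('|', $encoded),
	));
}

echo build_chart_query('lc', 500, 200, array(array(10, 20, 30), array(5, 15, 25))), "\n";
// cht=lc&chs=500x200&chd=t%3A10%2C20%2C30%7C5%2C15%2C25
```

And that is only three parameters; axes, markers and styles each add their own separators on top of this.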


Indenting YAML files with spaces in SciTE

You use SciTE as your code editor, and you have configured it to indent with tabs as part of your coding style? Perfect, you are on the path of wisdom. But then one fine day, YAML files (YAML Ain’t Markup Language) barge into your quiet little developer life, a format that requires spaces for indentation, otherwise the file is invalid. Damn. Fortunately, a few lines of configuration are enough to make SciTE behave correctly with YAML files:

use.tabs.*.yml=0
indent.size.*.yml=2
tab.size.*.yml=2

Phew!

Trees in SQL: an approach based on materialized paths and normalization for MySQL

I’ve been working for a few months on how to store trees in MySQL. It started like “hey, let’s make a database of every genre of heavy metal music!”. So I began with the easy way: a tree structure in a genre table with id and parent_id columns. And then I came to the point where I wanted to display the entire tree. This is just impossible with a reasonable number of queries, as MySQL 5 has no recursive query syntax (standard SQL defines recursive common table expressions and Oracle has the CONNECT BY extension, but MySQL 5 supports neither). So I started researching the web.

There are basically three ways to store a tree in a relational database: adjacency list, nested set and materialized path, plus a few more like nested intervals. Let’s be clear: they all suck.

  • Adjacency list model is the one I used the first time (with the parent_id column). It’s easy to create but impossible to query deeply without recursion.
  • Nested set model is quite easy to query, but is indeed pretty fucked up (the tree is limited in size, it is very hard to read without maths, and almost all the rows of the table need to be updated each time a node is added, moved or deleted!). Nested intervals model tries to remove the size limitation of the nested set model, but involves way too much maths.
  • Materialized path is great and easy to understand, but very slow because it involves heavy use of the “LIKE” operator.

More on that here, here and here.

Then I found two very interesting articles. One is a reply on the MySQL Forums about the materialized path performance problem, recommending an approach based on a normalized schema to store paths. The other is an implementation using PostgreSQL by depesz that also uses a normalized schema. The point is to create a table dedicated to storing all the paths from every node to every node. This looks to me like a very good approach, as it’s easy to understand, easy to maintain, easy to query and performance-ok (with appropriate indexes). Unfortunately, the implementation I found relies too much on PostgreSQL and doesn’t work out-of-the-box with MySQL (because of the use of triggers and of some queries that need to be rewritten). So I reworked it to work with MySQL 5, and changed a few things. I strongly recommend that you read depesz’s post for a complete understanding of the approach before you continue.
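To make the idea of a dedicated path table more concrete, here is a sketch that derives its rows from an adjacency list in plain PHP. The (ancestor, descendant, depth) layout and the build_paths() name are generic illustrations, not the exact schema from either article, and the input is assumed to be a valid tree (no cycles, no missing parents).

```php
<?php
// Sketch: the (ancestor, descendant, depth) layout is a generic illustration,
// not the exact schema from either article.
function build_paths(array $parents)
{
	$rows = array();
	foreach ($parents as $id => $parent_id) {
		// every node has a zero-length path to itself
		$rows[] = array($id, $id, 0);

		// walk up the adjacency list, one ancestor at a time
		$depth = 1;
		for ($a = $parent_id; $a !== null; $a = $parents[$a]) {
			$rows[] = array($a, $id, $depth++);
		}
	}
	return $rows;
}

// id => parent_id, as stored in the adjacency list table
$genres = array(
	1 => null, // heavy metal
	2 => 1,    // thrash metal
	3 => 2,    // crossover thrash
);

foreach (build_paths($genres) as $row) {
	echo implode(' ', $row), "\n";
}
```

Once these rows sit in their own indexed table, fetching a whole subtree becomes a single lookup on the ancestor column instead of one query per level.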
