Named function parameters in PHP

Officially, PHP doesn’t support named function parameters (native named arguments only arrived in PHP 8.0). But there is an easy way to emulate the feature, so easy that it hardly matters that it is missing. It’s an old trick borrowed from JavaScript that I’ve been using forever. I thought everyone knew it, but I’m still surprised by how many people come up with overly complicated solutions. Anyway, imagine you have a function that takes a lot of optional parameters. Without named parameters, it goes like this:

// $queue is required, the rest is optional
function bind_queue($queue, $durable = false, $auto_delete = true, $name = false, $answer = null) { ... }

Later in your code, if you want to specify another value for $answer, you have to repeat the default value of every intermediate parameter in your function call. It’s time-consuming, error-prone and confusing.

// Good luck remembering which parameter is which.
bind_queue($my_queue, false, true, false, 42);

So instead, it’s much better to use an associative array, like this:

// keep the required parameter in the function signature; the options go into an array
function bind_queue($queue, array $opt = array())
{
  // define the default values and merge them with the user's values in one step
  $opt = array_merge(array(
    'durable' => false,
    'auto_delete' => true,
    'name' => false,
    'answer' => null
  ), $opt);

  // use $opt['answer'] to access the optional 'answer' parameter
}

Later in your code, if you want to override answer, you only have to write this:

bind_queue($my_queue, array('answer' => 42));

Much easier to read, isn’t it?
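One weakness of the array trick is that a misspelled key (say 'anwser') is silently ignored by array_merge. Here is a minimal sketch of one way to catch that. The resolve_options helper is hypothetical (it is not part of the original trick), and bind_queue returns the merged options only so that they can be inspected:

```php
<?php
// Hypothetical helper: merges the user's options over the defaults and
// rejects any key that has no default value.
function resolve_options(array $defaults, array $opt)
{
    $unknown = array_diff_key($opt, $defaults);
    if ($unknown) {
        throw new InvalidArgumentException(
            'Unknown option(s): ' . implode(', ', array_keys($unknown))
        );
    }
    return array_merge($defaults, $opt);
}

function bind_queue($queue, array $opt = array())
{
    $opt = resolve_options(array(
        'durable' => false,
        'auto_delete' => true,
        'name' => false,
        'answer' => null,
    ), $opt);

    // Returned here only so the merged values can be inspected.
    return $opt;
}
```

Calling bind_queue($my_queue, array('anwser' => 42)) then throws instead of quietly falling back to the default.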

Bypassing PHP’s open_basedir with MySQL

PHP’s open_basedir feature is supposed to limit the files that PHP can open to a specified directory tree (full doc is here). Functions like fopen or file_get_contents will return an error if the file is outside the allowed directory. So far, it sounds like good protection.

However, it is also well known for being flawed by design and easy to circumvent (like safe_mode, by the way). Still, until now I hadn’t realized just how easy it is to bypass. During a security audit, I had the opportunity to study a backdoor left there by some script kiddie (thanks to an outdated version of a web application). Here is just one interesting example that uses MySQL:

  1. create a temporary table
  2. use MySQL’s LOAD DATA INFILE command to read any file and load its content into the table
  3. select the content of the table

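In clean PHP, the code would look like this (the file is read by the MySQL server process, not by PHP, so open_basedir never comes into play):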

$filename = '/etc/passwd';

$pdo = new PDO($dsn, $username, $password);
$pdo->exec('CREATE TEMPORARY TABLE tmp_file ( content LONGBLOB NOT NULL)');
$pdo->exec(sprintf(
	'LOAD DATA INFILE %s INTO TABLE tmp_file',
	$pdo->quote($filename)
));
$content = $pdo->query('SELECT * FROM tmp_file')->fetchAll(PDO::FETCH_COLUMN);

To prevent this particular exploit, the fix is easy: just make sure that the MySQL user doesn’t have the FILE privilege. But open_basedir is definitely not safe. As seen in Debian’s default php.ini file: This is considered a “broken” security measure. Applications relying on this feature will not recieve full support by the security team
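To check whether an account is exposed, you can look at the output of SHOW GRANTS. The following is only a rough sketch: grants_include_file is a hypothetical helper, and it assumes the usual "GRANT ... ON *.* TO ..." shape of the rows MySQL returns.

```php
<?php
// Hypothetical helper: takes the rows returned by SHOW GRANTS and reports
// whether a global grant includes FILE (or ALL PRIVILEGES, which implies it).
function grants_include_file(array $grants)
{
    foreach ($grants as $grant) {
        // FILE is a global privilege, so only "ON *.*" grants matter.
        if (!preg_match('/^GRANT\s+(.+?)\s+ON\s+\*\.\*\s+TO\s/i', $grant, $m)) {
            continue;
        }
        $privileges = array_map('trim', explode(',', strtoupper($m[1])));
        if (in_array('FILE', $privileges, true)
            || in_array('ALL PRIVILEGES', $privileges, true)) {
            return true;
        }
    }
    return false;
}

// Usage, assuming the $pdo connection from the example above:
// $grants = $pdo->query('SHOW GRANTS FOR CURRENT_USER()')->fetchAll(PDO::FETCH_COLUMN);
// if (grants_include_file($grants)) { /* revoke FILE from this account */ }
```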

Implementation of “tail -f” in PHP

This is a small algorithm that implements something similar to “tail -f” in PHP. The script watches a file (a log file, for example) in real time and does something every time a new line is added. It doesn’t implement the “tail” functionality itself, however (outputting only the end of the file); instead, it starts processing the file from the beginning.

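$file = fopen($filename, 'r');
if ($file === false) {
	die("Unable to open $filename\n");
}
$pos = 0;

while (true) {
	// seeking also clears the end-of-file flag set by the previous pass
	fseek($file, $pos);
	// compare with false so that a line containing only "0" is not skipped
	while (($line = fgets($file)) !== false) {
		// do something with $line
	}
	$pos = ftell($file);
	sleep(1);
}
// never reached: the loop above only ends when the script is killed
fclose($file);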
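If you only care about lines added from now on, closer to what “tail -f” does, one option is to start at the end of the file instead of at position 0. A minimal sketch, with two hypothetical helpers (end_position and read_new_lines) that split the loop above into reusable pieces:

```php
<?php
// Hypothetical helper: returns the offset of the current end of the file,
// so that only lines added from now on are processed.
function end_position($file)
{
    fseek($file, 0, SEEK_END);
    return ftell($file);
}

// Hypothetical helper: returns the lines appended since $pos and moves
// $pos forward to the new end of the file.
function read_new_lines($file, &$pos)
{
    $lines = array();
    fseek($file, $pos); // also clears the end-of-file flag
    while (($line = fgets($file)) !== false) {
        $lines[] = $line;
    }
    $pos = ftell($file);
    return $lines;
}

// Usage:
// $file = fopen($filename, 'r');
// $pos = end_position($file);
// while (true) {
//     foreach (read_new_lines($file, $pos) as $line) { /* do something */ }
//     sleep(1);
// }
```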

Use destructors carefully

A couple of days ago I lost half a day of work to a stupid bug in a PHP CLI script. Basically, the script wouldn’t die. Instead it would hang forever, eating all the CPU time. Even explicitly calling die() (or exit(), since they are synonyms) wouldn’t terminate it. Yep, that’s right, die() didn’t work! Until then, I naively believed that die() was some failsafe language construct that would terminate the script no matter what. But as it turned out, that’s not the case…

According to the PHP documentation: “The destructor will be called even if script execution is stopped using exit().” So a destructor’s code will be executed even after die() is called. What happens if that code is faulty and gets stuck in an infinite loop? The script will never end. In my case, I was using a third-party object-oriented library (for AMQP) that did some funky stuff in an object’s destructor and then waited forever for an event on the network… Needless to say, it’s a BAD idea to put that much application logic in a destructor.

So the moral of the story is: (1) die() can fail and (2) use destructors with caution: only write code that will NEVER fail and that is STRICTLY necessary, like closing connections, closing file handles and such. The same goes for shutdown functions, by the way.

A quick example for the sake of demonstration:

class Evil
{
	public function __destruct()
	{
		while (true);
	}
}

$foo = new Evil();
die();
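For contrast, here is a minimal sketch of a destructor that sticks to the rule above. The LogWriter class is hypothetical: the real work lives in an explicit method, and the destructor only releases the file handle.

```php
<?php
// Hypothetical class for illustration only.
class LogWriter
{
    private $handle;

    public function __construct($path)
    {
        $this->handle = fopen($path, 'a');
    }

    public function write($message)
    {
        fwrite($this->handle, $message . "\n");
    }

    public function __destruct()
    {
        // No loops, no network waits: just release the resource.
        if (is_resource($this->handle)) {
            fclose($this->handle);
        }
    }
}
```

Anything that can block or fail (flushing to a remote server, waiting for an acknowledgement) would go into a method that the caller invokes explicitly before the script ends.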

One PHP error log per developer

At work we’re using a shared web development server, with a setup that gives every developer their own compartmentalized development space. It requires some tricky configuration, but in the end I find it far more efficient than the “every developer installs their own Apache/PHP/MySQL stack” policy. I already covered the general ideas in a previous article (in French) two years ago, and so far it works like a charm. Our setup is constantly improving, and today’s big question is: how do we give every developer their own PHP error log?

Apache config

First, a little insight into the Apache configuration. If we were using one virtual host directive per developer, this wouldn’t be an issue at all. Just set the proper error_log path for every virtual host (using Apache’s php_value directive, like so: php_value error_log /var/log/php/errors_username.log), reload Apache and you’re done.

Our Apache, however, is using mod_vhost_alias. Basically we have “one virtual host to rule them all”, because I’m lazy and I don’t want to write one virtual host for every new dev, or modify a dozen virtual host directives every time I change a setting.

So how does it work? The first part of the host name is the developer’s username, the second part is the project’s folder name, and the remainder is the dev server’s host name. With the VirtualDocumentRoot directive, you use parts of the host name as variables to compute the correct document root path. (Actually, our setup includes a third part, a subfolder that separates a project’s document root from all its include files, but that’s not the point of this article.) Example:

<VirtualHost *:80>
	ServerName dev.domain.com
	ServerAlias *.dev.domain.com

	UseCanonicalName off
	VirtualDocumentRoot /home/%1/web/%2

	<Directory /home/*/web/*/*>
		Allow from all
		AllowOverride all
	</Directory>
</VirtualHost>

The address http://remi.awesome.dev.domain.com will automagically have its document root located at /home/remi/web/awesome. The benefit is obvious: simplicity and flexibility. I love the convention-over-configuration design principle.
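To make the mapping concrete, here is a rough PHP sketch of what "VirtualDocumentRoot /home/%1/web/%2" computes for a given host name. The document_root_for function is hypothetical and only mirrors the convention described above; Apache does the real work.

```php
<?php
// Hypothetical helper mirroring "VirtualDocumentRoot /home/%1/web/%2":
// %1 is the first dot-separated part of the host name, %2 the second.
function document_root_for($host)
{
    $parts = explode('.', $host);
    if (count($parts) < 2 || $parts[0] === '' || $parts[1] === '') {
        return null; // not enough parts to build a path
    }
    return sprintf('/home/%s/web/%s', $parts[0], $parts[1]);
}
```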

PHP config

Now the problem is that it’s impossible to tell PHP, directly in the Apache configuration, to log errors to a different file based on the first part of the host name, because the %1, %2, … variables are only valid inside the VirtualDocumentRoot directive.

So we need to do it inside PHP, or inside an .htaccess file in every project folder. But doing it for every project is a pain, not to mention the SVN conflicts it would cause, because every developer would have their own path. That’s where PHP’s auto_prepend_file option comes into play (and honestly, I never thought it would be useful one day).

I just created a file named /etc/php5/prepend.php containing this one-liner:

<?php ini_set('error_log',sprintf('/var/log/php/errors_%s.log',substr($_SERVER['SERVER_NAME'], 0, strpos($_SERVER['SERVER_NAME'],'.'))));

And added this setting to php.ini:

auto_prepend_file = /etc/php5/prepend.php

Note that the directory /var/log/php will not be created automatically, so you have to create it yourself, and it must be writable by www-data. Files will be named errors_(first part of the host name).log, so errors_remi.log in the previous example. Users belong to the www-data group, so as long as the logs are readable by the group (which they are by default), they’ll be able to read their logs.
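With UseCanonicalName off, SERVER_NAME comes from the host name sent by the client, so a slightly more defensive prepend file may be worth the extra lines. This is only a sketch: error_log_path_for is a hypothetical helper and the allowed character set is an assumption about what our usernames look like.

```php
<?php
// Hypothetical helper: same logic as the one-liner, plus a sanity check
// on the part of the host name that ends up in the file name.
function error_log_path_for($serverName)
{
    $parts = explode('.', $serverName, 2);
    $user = $parts[0];
    // Assumption: usernames only contain letters, digits, "_" and "-".
    if (!preg_match('/^[a-z0-9_-]+$/i', $user)) {
        return null;
    }
    return sprintf('/var/log/php/errors_%s.log', $user);
}

// Usage in prepend.php:
// $path = error_log_path_for($_SERVER['SERVER_NAME']);
// if ($path !== null) {
//     ini_set('error_log', $path);
// }
```

When the host name doesn’t look right, the sketch leaves error_log untouched, so errors fall back to whatever php.ini defines.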

And now the cherry on the cake: the logrotate configuration to rotate the logs weekly and keep one month (4 files) of archives:

/var/log/php/errors_*.log {
        weekly
        missingok
        rotate 4
        compress
        delaycompress
        notifempty
        create 640 www-data www-data
}

Et voilà!

Google Chart PHP Library 0.4

I released a new version of my PHP library for the Google Chart API. Remember it’s still in heavy development (well, depending on my free time and my motivation, so “heavy” is relative), so don’t expect anything bug-free or feature-complete.

Less rigid API

Until this version, I focused on implementing the Google Chart API strictly. However, I eventually realized that the API is sometimes too rigid. For example, say you want to hide an axis. You have to specify “_” (underscore) as the 5th value (axis_or_tick) in the chxs parameter (values are separated by commas). Looks easy, right? Except it is NOT ok to omit the first 4 values. So you have to specify the label_color, font_size and alignment first in order to be able to hide your axis.

In version 0.3, you had to do exactly that, by using setStyle and specifying the 4th parameter ($axis_or_tick). If you wonder why it’s not the 5th, it’s because the “axis index” value is calculated at runtime. Fortunately, you could pass null for the other parameters, and the library would replace them with their default values. Example:

$axis = new GoogleChartAxis('x');
$axis->setStyle(null, null, null, '_');

In version 0.4, the setStyle method has been removed and split into several methods: setLabelColor, setFontSize, setLabelAlignment, setDrawLine, setDrawTickMarks and setTickColor. Now you don’t need to worry about how many parameters you have to set: just call the method you want and the library will take care of setting the appropriate intermediate values. Example:

$axis = new GoogleChartAxis('x');
$axis->setDrawLine(false)->setDrawTickMarks(false);
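To illustrate what “setting the appropriate intermediate values” means, here is a rough sketch of the idea. It is not the library’s actual code: build_positional_values is a hypothetical helper, and the default values used in the usage note are placeholders rather than Google’s real defaults.

```php
<?php
// Hypothetical helper, not the library's actual code: builds a
// comma-separated list of positional values where null means
// "not set by the user".
function build_positional_values(array $values, array $defaults)
{
    // Drop trailing nulls: nothing after the last explicit value is needed.
    while ($values && end($values) === null) {
        array_pop($values);
    }
    // Replace the remaining nulls with the default for that position.
    foreach ($values as $i => $value) {
        if ($value === null) {
            $values[$i] = $defaults[$i];
        }
    }
    return implode(',', $values);
}
```

With placeholder defaults array('COLOR', 'SIZE', 'ALIGN', 'TYPE'), the values array(null, null, null, '_') become "COLOR,SIZE,ALIGN,_", while array('ff0000', null, null, null) stays as short as "ff0000".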

More abstraction

This version also comes with a set of features that simplify chart creation. For example, one of my favorites is the setBorder method for shape markers (GoogleChartShapeMarker).

To create a border around a shape with the Google Chart API, you need to create another similar marker below the first one (think z-order), with a different color and a slightly bigger size. This can be done by hand in exactly this way in version 0.3. However, starting with version 0.4, the setBorder method does the job for you. Just specify the color and size of the border, and it will create the second marker automatically. Not only is this more convenient and easier to write, it is also faster and uses less memory.

New features

This version adds support for Dynamic Icons. Because icons can be either “freestanding” (used as a chart) or used as markers, I had to refactor the base class. The base class is now GoogleChartApi, which holds the logic to query the API. GoogleChart and GoogleChartIcon extend this class, so you can use a GoogleChartIcon exactly the same way as a chart.

Example:

require '../lib/icons/GoogleChartIconNote.php';

$chart = new GoogleChartIconNote('Hello world');
$chart->setTitle('Example');
$chart->setTextColor('D01F3C');

header('Content-Type: image/png');
echo $chart;

To use an icon as a marker, use the new addDynamicMarker method. Example:

require '../lib/GoogleChart.php';
require '../lib/icons/GoogleChartIconNote.php';

$values = array();
for ($i = 0; $i <= 10; $i += 1) {
	$values[] = rand(20,80);
}

$chart = new GoogleChart('ls', 500, 200);
$data = new GoogleChartData($values);
$chart->addData($data);

$marker = new GoogleChartIconNote('Hello');
$marker->setData($data);
$chart->addDynamicMarker($marker);

header('Content-Type: image/png');
echo $chart;

For the moment, only the “note” icon (aka “Fun style notes with text and optional title”) is supported, but I’m working on it.

Hey, the project has a new home!

Yes, the project is now hosted on Google Code. Because there is already a shitload of abandoned projects named with every possible combination of “google”, “chart” and “php”, I had to use the (rather long) name “googlechartphplib”. So the new home is here:
http://code.google.com/p/googlechartphplib. You’ll find source code, issue tracker, documentation and the new SVN access there.

Announcing GoogleChart PHP library 0.3

I’ve been playing around a lot with the Google Chart API lately, mainly for fun, drawing charts based on my Last.fm profile (Last.fm provides a very nice API). The Google Chart API is very powerful but quite harsh to work with, and unfortunately I didn’t find any good, easy-to-use PHP library for it. There are some, but most of them are either unmaintained or not fully working. So I ended up writing my own library. A few weeks ago, I used it for a project at work, improved it a bit, and it worked like a charm. Eventually I decided to release it as open source (MIT license). It’s provided “as is”, without warranty of any kind. I just hope it might be useful to somebody else as well, who knows?

Quick introduction

The library’s goal is to provide an easy way to build requests to the Google Chart API, and especially to ease the painful string concatenation with commas, pipes, colons, etc. If you’ve already tried the Google Chart API, you know what I mean! :-) So I wrote a couple of classes that let you quickly create a chart (GoogleChart), add data series (GoogleChartData), axes (GoogleChartAxis) and markers (GoogleChartMarker), and compute a URL (for GET requests) or an array of parameters (for POST requests). It can even fetch the image for you (via GET or POST) so that you can display it directly (or cache it, or do whatever you want with it).
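To give an idea of the concatenation involved, here is a small sketch of how the chd parameter is built by hand in the API’s text format ("t:", values separated by commas, series separated by pipes). The encode_text_data function is a hypothetical helper written for this example; it does not show the library’s API.

```php
<?php
// Hypothetical helper: encodes several data series in the text format
// of the chd parameter, e.g. "t:10,20,30|5,15,25".
function encode_text_data(array $series)
{
    $encoded = array();
    foreach ($series as $values) {
        $encoded[] = implode(',', $values);
    }
    return 't:' . implode('|', $encoded);
}

$params = array(
    'cht' => 'lc',      // chart type: line chart
    'chs' => '500x200', // size in pixels
    'chd' => encode_text_data(array(array(10, 20, 30), array(5, 15, 25))),
);

// Query string to append to the chart URL for a GET request.
$query = http_build_query($params);
```

And that is only the data: axes, labels and markers each come with their own separators and positional rules.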


PHP logs with syslog

For a long time I looked for a way to standardize the logs of my PHP applications, especially the many command-line scripts that run on a regular basis (yes, I’m a big fan of command-line PHP) and the web apps when they raise an error. To get straight to the point: I found one (but you had guessed that much), and now I use syslog everywhere. But be patient, I’ll get to it.

First, what do I mean by “logs”? There are of course all the error messages, whether they are errors generated by PHP (like “the database is no longer responding”) or errors raised manually (like “what the hell, I should never end up in this case”). But there are also the informational messages about the progress of a command-line script (which nobody reads but which may prove useful the day the script goes haywire), and the debug messages (handy during development). For web applications, every error produces an HTTP 500 code and a clean error page for the client, but I would still like to keep a trace of what went wrong.
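As a quick preview, here is a minimal sketch of how these three kinds of messages map onto PHP’s built-in syslog functions. The ident "my-script", the LOG_USER facility and the messages themselves are assumptions chosen for the example.

```php
<?php
// Assumptions for this sketch: the ident "my-script" and the LOG_USER facility.
openlog('my-script', LOG_PID, LOG_USER);

// One priority per kind of message: error, information, debug.
syslog(LOG_ERR, 'the database is no longer responding');
syslog(LOG_INFO, 'import finished');
syslog(LOG_DEBUG, 'entering the cleanup step');

closelog();
```

Filtering by severity and rotating the files then become the job of the syslog daemon’s configuration rather than of each script.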

For a command-line script

For this part I’ll try to go into as much detail as possible, but it’s better to be familiar with Unix systems, because it only concerns scripts run with the CLI SAPI (more info in the PHP docs).

Standard output

At first, out of PHP programmer’s reflex, I simply used echo (or printf), but filtering messages by severity quickly becomes a headache. And when the script is not launched from a terminal (for example when it is started by cron), these messages are no longer visible. It is possible, however, to archive them by redirecting the standard output stream to a file. But you have to remember to rotate that log file yourself (see logrotate), and you quickly end up with a multitude of log files scattered across the disk, which doesn’t make maintenance any easier.

Example, pouet.php contains:

echo "pouet\n";

Redirect the stream:

$ php pouet.php > /tmp/pouet


Internationalizing a database

Let’s say you are in charge of a web application written in PHP/MySQL, for example your company’s intranet, and that you suddenly need to internationalize it because your company is opening offices abroad and, unfortunately, our wonderful French language is not spoken in every country of the world.

For templates, no problem, gettext is there for that (I may come back to it in a future article, if I’m motivated). If you don’t feel like getting your hands dirty with the system, your framework surely offers a more or less efficient emulation in pure PHP or, at worst, a “homemade” solution. In short, that part is easy.

What causes more trouble is the data stored in the database. For example: the list of categories for customer incident tickets. It is stored in a table that contains, among other things, the label of each category. Yes, but that label has to be translated. And since categories are managed dynamically, using the gettext approach, which relies on static files, is not an option.

Transforming the model

We’ll start by adding a table to store the translations. If we think of the category as an element identified by an id only, the translations are linked to a category by a one-to-many relationship: 1 category has N translations. Each translation is identified by a language code. The idea is to remove the fields containing translatable text from the category and store them in the table that holds the translations.

[Figure: database schema]

So we create the following tables:

CREATE TABLE `category` (
	`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
	`created_at` DATETIME NOT NULL,
	`deleted_at` DATETIME NULL,
	PRIMARY KEY (`id`)
) ENGINE = InnoDB;

CREATE TABLE `category_i18n` (
	`category_id` INT UNSIGNED NOT NULL,
	`lang` VARCHAR(6) NOT NULL,
	`name` VARCHAR(45) NOT NULL,
	PRIMARY KEY (`lang`, `category_id`),
	INDEX `category_i18n_category_fk` (`category_id` ASC),
	CONSTRAINT `category_i18n_category_fk`
		FOREIGN KEY (`category_id` )
		REFERENCES `category` (`id` )
		ON DELETE CASCADE
) ENGINE = InnoDB;
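Reading the data back then means joining the two tables on the language code. Here is a sketch of what that could look like with PDO. The fetch_categories function and the fallback language are assumptions made for this example; nothing in the schema above imposes a fallback.

```php
<?php
// Hypothetical helper: returns an array of id => name for all categories
// that are not deleted, in the requested language, falling back to
// another language when a translation is missing.
function fetch_categories(PDO $pdo, $lang, $fallback)
{
    $stmt = $pdo->prepare(
        'SELECT c.id, COALESCE(t.name, f.name) AS name
         FROM category c
         LEFT JOIN category_i18n t ON t.category_id = c.id AND t.lang = :lang
         LEFT JOIN category_i18n f ON f.category_id = c.id AND f.lang = :fallback
         WHERE c.deleted_at IS NULL
         ORDER BY c.id'
    );
    $stmt->execute(array(':lang' => $lang, ':fallback' => $fallback));
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
}

// Usage, assuming an existing $pdo connection:
// $categories = fetch_categories($pdo, 'en', 'fr');
```

A category with no translation in either language comes back with a null name, which the calling code has to handle.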


Collaborative web development

I’m often asked how I organize web development at my company: how several of us manage to code on the same file without stepping on each other’s toes, how we archive versions, how we deploy to production, etc. It’s true that collaborative web development is often poorly organized, and it is common for the only tool to be a plain shared folder…

Constraints of collaborative web development

Unlike a classic project, which “only” needs a compiler and libraries, a web application is a complex assembly of several software components whose widely varying versions and configurations can affect whether the application works properly:

  • a web server (vhost, htaccess, mod_rewrite, etc.),
  • the PHP interpreter (php.ini options, additional modules such as pdo, gettext, etc.),
  • the DBMS (user configuration, etc.).

So the “classic” solution of one development workstation per developer is hard to maintain here: you would have to install all this software on every workstation and, above all, make sure the configuration is identical everywhere. And it gets complicated as soon as the configuration needs an update…

Another solution would be to have a single dev server (to avoid the maintenance problems mentioned above) and to work directly on it through file shares (Samba, NFS, etc.). But this solution raises several problems:

  • concurrent access to a file: there is a high risk of seeing your changes overwritten by another developer;
  • heavy changes (for example, a refactoring of the database): the other developers are blocked;
  • svn cannot be used properly: there is only one version of the files and no way to know who modified it.

So we built an intermediate solution: a single dev server (for easy maintenance) that provides virtual servers for each developer (for a clean separation of data and optimal use of svn).

The dev server runs Debian GNU/Linux, the client workstations run Windows XP.
