Salagir's Blog

Old name was: "Why do witches burn?"


URL rewriting is your friend

November 28th, 2008

URL rewriting means giving a web page a link that a human can read and that a search engine can interpret as keywords.

Many blogs do this by putting the title of the post in the URL.

Example:
Good: http://toto.com/articles/eat-creme-brulee.html
Bad: http://toto.com/2007/article.php?id=684

You English readers may not care about this, but the main problem is characters with accents: é, è… Most URL rewriters have trouble with them because, from their point of view, they are special characters.

Bad: http://toto.com/articles/eat cr%E8me brul%E9e.html
Bad: http://toto.com/articles/eat-crme-brule.html
Bad: http://toto.com/articles/eat-cr-me-brul-e.html
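To see where these bad names come from, here is a small sketch of three naive approaches next to a simple accent mapping. The title and variable names are only for illustration, and the text is assumed to be UTF-8 (so the percent-encoded bytes differ from the %E8 form shown above, which comes from a Latin-1 page).

```php
<?php
// Hypothetical post title, assumed to be UTF-8 encoded.
$title = 'eat crème brulée';

// Percent-encoding keeps every character but is unreadable.
$encoded = rawurlencode($title);
// "eat%20cr%C3%A8me%20brul%C3%A9e"

// Deleting everything that is not a-z or a space loses letters.
$deleted = str_replace(' ', '-', preg_replace('/[^a-z ]/u', '', $title));
// "eat-crme-brule"

// Replacing each run of non a-z characters with a dash splits words.
$dashed = preg_replace('/[^a-z]+/u', '-', $title);
// "eat-cr-me-brul-e"

// Mapping accented letters to plain ones first keeps the words whole.
$mapped = str_replace(' ', '-', strtr($title, array('è' => 'e', 'é' => 'e')));
// "eat-creme-brulee"
```

Only the last form stays readable, which is why the accents need to be transliterated before anything gets stripped.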

Many also have a problem with apostrophes:

Bad: http://toto.com/articles/i%27ll-kill-for-you.html
Bad: http://toto.com/articles/ill-kill-for-you.html
Good: http://toto.com/articles/i-ll-kill-for-you.html

And there are other problems… (I have seen some page names that begin or end with a dash; it's ugly.)
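The apostrophe and the stray dash can be illustrated with a short sketch. The title is made up, and the regular expressions are a simplification for the demo, not a complete rewriter.

```php
<?php
// Hypothetical title with an apostrophe and trailing punctuation.
$title = "I'll kill for you!";
$lower = strtolower($title);

// Deleting the apostrophe glues "i" and "ll" into another word.
$glued = trim(preg_replace('/[^a-z]+/', '-', str_replace("'", '', $lower)), '-');
// "ill-kill-for-you"

// Treating it as a separator keeps the words apart, but the final "!"
// also becomes a dash and is left hanging at the end.
$dangling = preg_replace('/[^a-z]+/', '-', $lower);
// "i-ll-kill-for-you-"

// Trimming the separator from both ends removes it.
$clean = trim($dangling, '-');
// "i-ll-kill-for-you"
```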

So here is my own URL rewriting algorithm. Use it well:

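//! Transform a text into a simple and readable filename.
//! Assumes UTF-8 input and a UTF-8 source file; mb_str_split() needs PHP 7.4+.
function text2filename($str, $spaceChar = '-') {
    // Remove punctuation outright (/u: match whole UTF-8 characters).
    $str = preg_replace('/[:;?!¡,~()=%"«»]/u', '', $str);
    // Map accented letters to their plain equivalent, one character each.
    // The array form of strtr() is needed: the string form works on bytes.
    $str = strtr($str, array_combine(
        mb_str_split('äàáâãåÀÁÂÃÅÇçèéêëÈÉÊËìíîïÌÍÎÏÑñÒÓÔÕòóôõÙÚÛùúûÝýÿÐ', 1, 'UTF-8'),
        str_split('aaaaaaAAAAACceeeeEEEEiiiiIIIINnOOOOooooUUUuuuYyyD')));
    // Letters that expand to two characters.
    $str = str_replace(
        array('Ä','Æ','æ','Ö','ö','ß','Ü','ü'),
        array('AE','AE','ae','OE','oe','ss','UE','ue'), $str);
    // Collapse runs of spaces, underscores, apostrophes, slashes, dots and dashes.
    $str = preg_replace('/[ _\'\/.-]+/', $spaceChar, $str);
    // No separator at the start or end of the name.
    $str = trim($str, $spaceChar);
    return strtolower($str);
}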
