HOWTO: Proper Links and URLs for XHTML

13 04 2007

Copyright (c) 2007 Elmar Hinz, elmar.hinz@team-red.net
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.

What a link for XHTML should look like

The raw URL:

links.php?märker[einEuro]=1€&märker[fünfEuro]=5€

For XHTML we need to URL-encode it, apart from the ampersands and the equals signs, which PHP uses to split the query string into variables. Each ampersand must be written as the entity &amp;.

links.php?m%C3%A4rker%5BeinEuro%5D=1%E2%82%AC&amp;m%C3%A4rker%5Bf%C3%BCnfEuro%5D=5%E2%82%AC

The square brackets can be encoded as well. PHP also accepts them unencoded, but RFC 1738 lists them as unsafe characters, so they should be encoded.
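To check that PHP treats both spellings of the brackets alike, a small sketch with parse_str can help. The parameter name is simplified to plain ASCII here, which is an assumption made only to keep the example short.

```php
<?php
// Same parameter, once with literal and once with encoded square brackets.
parse_str('marker[einEuro]=1', $literal);
parse_str('marker%5BeinEuro%5D=1', $encoded);

// Both query strings should be parsed into the same nested array.
var_dump($literal === $encoded);
print_r($encoded);
```

parse_str uses the same parsing rules PHP applies to an incoming query string, so it is a convenient way to try this out without a web server.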

In PHP the URL would be composed this way:

$url = 'links.php?'
    . rawurlencode('märker[einEuro]') . '=' . rawurlencode('1€')
    . '&amp;'
    . rawurlencode('märker[fünfEuro]') . '=' . rawurlencode('5€');
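For more than a handful of parameters the same composition can be wrapped in a loop. The following is only a sketch: the function name buildXhtmlUrl is made up for this example, it assumes PHP 7 or later and a UTF-8 encoded source file, and it handles flat arrays only.

```php
<?php
// Hypothetical helper, not part of PHP or TYPO3: builds an XHTML-ready URL
// from a script name and a flat array of parameter names and values.
function buildXhtmlUrl(string $script, array $params): string
{
    $pairs = [];
    foreach ($params as $key => $value) {
        // Encode key and value separately so the '=' between them stays literal.
        $pairs[] = rawurlencode((string) $key) . '=' . rawurlencode((string) $value);
    }
    // Join the pairs with the entity form of the ampersand.
    return $script . '?' . implode('&amp;', $pairs);
}

echo buildXhtmlUrl('links.php', [
    'märker[einEuro]'  => '1€',
    'märker[fünfEuro]' => '5€',
]), "\n";
```

The result can be placed directly into an href attribute. It should not be passed through htmlspecialchars again, because the ampersands are already entities.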

An alternative in two steps, using htmlspecialchars

Step 1:

$url = 'links.php?'
    . rawurlencode('märker[einEuro]') . '=' . rawurlencode('1€')
    . '&'
    . rawurlencode('märker[fünfEuro]') . '=' . rawurlencode('5€');

Step 2:

$url = htmlspecialchars($url);

This is possible because keys and values have already been URL-encoded. Apart from the separating ampersands, which are turned into entities, no characters remain that htmlspecialchars would alter.
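The claim can be checked with a few lines. This sketch repeats the two steps and compares the result with a plain replacement of the ampersands. The counter-example URL in the last line is invented for illustration.

```php
<?php
// Step 1: encode keys and values, join with a plain ampersand.
$url = 'links.php?'
    . rawurlencode('märker[einEuro]') . '=' . rawurlencode('1€')
    . '&'
    . rawurlencode('märker[fünfEuro]') . '=' . rawurlencode('5€');

// Step 2: escape the whole URL for XHTML.
$escaped = htmlspecialchars($url);

// The two strings differ only in '&' versus '&amp;'.
var_dump(str_replace('&', '&amp;', $url) === $escaped);

// Counter-example: without step 1 a double quote in a value is rewritten too.
echo htmlspecialchars('links.php?q="a"&r=1'), "\n";
```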

Quirks of Typolink

The typolink function already applies htmlspecialchars, but it does not URL-encode anything. htmlspecialchars is also applied inconsistently: it is used when generating link tags, but not when generating a plain URL (property: returnLast). In the latter case you have to post-process the generated URL with htmlspecialchars yourself, e.g. before you use it as a form action.
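A minimal sketch of that post-processing step follows. The URL is a hard-coded placeholder standing in for what typolink would return as a plain URL; the page id and the parameter name tx_example are invented.

```php
<?php
// Placeholder for a plain URL as returned by typolink via returnLast.
// In TYPO3 this value would come from the content object, not a literal.
$plainUrl = 'index.php?id=42&tx_example%5Baction%5D=save';

// Escape it once before it goes into an attribute value.
$action = htmlspecialchars($plainUrl);

echo '<form action="' . $action . '" method="post">', "\n";
```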

As we have seen above, processing with htmlspecialchars in the second step requires that keys and values have been URL-encoded beforehand. Because typolink does not do this, we URL-encode the keys and values for the additionalParams property ourselves, as described above.

These are a few lines from version 0.0.20 of tx_lib_link:


if(!is_array($value)) { // TODO handle arrays
    if($this->designatorString) {
        $conf['additionalParams'] .= '&' . $this->designatorString . '['
            . rawurlencode($key) . ']=' . rawurlencode($value);
    } else {
        $conf['additionalParams'] .= '&' .
            rawurlencode($key) . '=' . rawurlencode($value);
    }
}

As you can see, this version still leaves the square brackets around the key unencoded, so it does not yet comply with RFC 1738.
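One way the brackets could be encoded as well is to compose the full parameter name first and encode it as a whole. This is a stand-alone sketch, not code from tx_lib_link: the function name buildAdditionalParams and the designator tx_example are invented, and whether typolink and related extensions handle encoded brackets in additionalParams the same way has not been verified here.

```php
<?php
// Hypothetical stand-alone variant of the fragment above.
function buildAdditionalParams(array $parameters, string $designator = ''): string
{
    $additionalParams = '';
    foreach ($parameters as $key => $value) {
        if (is_array($value)) {
            continue; // nested arrays are skipped, as in the original fragment
        }
        // Compose the full parameter name, then encode it as a whole,
        // so the square brackets become %5B and %5D.
        $name = $designator !== '' ? $designator . '[' . $key . ']' : (string) $key;
        $additionalParams .= '&' . rawurlencode($name) . '=' . rawurlencode((string) $value);
    }
    return $additionalParams;
}

echo buildAdditionalParams(['fünfEuro' => '5€'], 'tx_example'), "\n";
```

The ampersand stays plain here on purpose, since typolink applies htmlspecialchars itself when it generates a link tag.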

