When translating a document name to a URL path segment, the following rules are used:
All rules and translation tables are applied, not just the first matching rule. So if one rule converts a character into a space, and another rule converts spaces into hyphens, then the character ends up as a hyphen.
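The chaining described above can be sketched as a series of passes over the name, where each pass sees the output of the previous one. This is only an illustration, not the actual implementation: `apply_rules`, `TABLE_1_EXCERPT` and `SPACE_RULE` are hypothetical names, the excerpt covers only a few entries of table 1, and the space-to-hyphen rule is assumed from the example in the text.

```python
# Illustrative excerpt of table 1; None means the character is removed.
TABLE_1_EXCERPT = {
    "!": None,
    "$": "usd",
    "*": " ",
    "/": "-",
    ":": " ",
    "@": "-at-",
}

# ASSUMPTION: a separate rule converts spaces into hyphens, as in the
# example given in the text. The exact rule is not shown in this section.
SPACE_RULE = {" ": "-"}


def apply_rules(name, rule_sets):
    """Apply every rule set in order; each pass works on the previous result."""
    for rules in rule_sets:
        name = name.translate(str.maketrans(rules))
    return name


# "*" becomes a space in the first pass and a hyphen in the second.
print(apply_rules("a*b", [TABLE_1_EXCERPT, SPACE_RULE]))
```

With only the first rule set applied, `a*b` would stop at `a b`; it is the second pass that produces `a-b`.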
The following translation tables for printable characters are used:

table 1: special handling of some regular printable characters

  Character   Translation
  !           removed
  "           removed
  #           removed
  $           usd
  %           removed
  &           removed
  '           removed
  (           removed
  )           removed
  *           converted into space
  +           converted into space
  ,           removed
  -           converted into hyphen
  .           removed
  /           converted into hyphen
  :           converted into space
  ;           converted into space
  <           removed
  =           converted into hyphen
  >           removed
  ?           removed
  @           -at-
  {           removed
  |           converted into hyphen
  }           removed
  ~           converted into hyphen

table 2: ISO 8859-1 special characters

  Character             Translation
  ¡                     removed
  ¢                     ct
  £                     gbp
  ¤                     removed
  ¥                     yen
  ¦                     -
  §                     removed
  ¨                     removed
  ©                     removed
  ª                     removed
  «                     removed
  ¬                     removed
  (soft hyphen, 0xAD)   -
  ®                     removed
  ¯                     -
  °                     removed
  ±                     -
  ²                     removed
  ³                     removed
  ´                     removed
  µ                     removed
  ¶                     removed
  ·                     removed
  ¸                     removed
  ¹                     removed
  º                     removed
  »                     removed
  ¼                     removed
  ½                     removed
  ¾                     removed
  Ð                     d
  Ø                     o
  Ù                     u
  Ú                     u
  Û                     u
  Ü                     u
  Ý                     y
  Þ                     y
  ß                     ss
  à                     a
  á                     a
  â                     a
  ã                     a
  ä                     a
  å                     a
  æ                     ae
  ç                     c
  è                     e
  é                     e
  ê                     e
  ë                     e
  ì                     i
  í                     i
  î                     i
  ï                     i
  ð                     d
  ñ                     n
  ò                     o
  ó                     o
  ô                     o
  õ                     o
  ö                     o
  ÷                     removed
  ø                     o
  ù                     u
  ú                     u
  û                     u
  ü                     u
  ý                     u
  þ                     y
  ÿ                     y

table 3: translation of UTF-8 Latin 1 characters above 0xc200

  UTF-8 bytes   Translation
  c2a1          removed
  c2a2          ct
  c2a3          gbp
  c2a4          removed
  c2a5          yen
  c2a6          removed
  c2a7          removed
  c2a8          removed
  c2a9          removed
  c2aa          removed
  c2ab          removed
  c2ac          removed
  c2ad          -
  c2ae          removed
  c2af          -
  c2b0          removed
  c2b1          removed
  c2b2          removed
  c2b3          removed
  c2b4          removed
  c2b5          removed
  c2b6          removed
  c2b7          removed
  c2b8          removed
  c2b9          removed
  c2ba          removed
  c2bb          removed
  c2bc          removed
  c2bd          removed
  c2be          removed
  c2bf          removed
  c380          a
  c381          a
  c382          a
  c383          a
  c384          a
  c385          a
  c386          ae
  c387          c
  c388          e
  c389          e
  c38a          e
  c38b          e
  c38c          i
  c38d          i
  c38e          i
  c38f          i
  c390          d
  c391          n
  c392          o
  c393          o
  c394          o
  c395          o
  c396          o
  c397          x
  c398          o
  c399          u
  c39a          u
  c39b          u
  c39c          u
  c39d          y
  c39e          y
  c39f          ss
  c3a0          a
  c3a1          a
  c3a2          a
  c3a3          a
  c3a4          a
  c3a5          a
  c3a6          ae
  c3a7          c
  c3a8          e
  c3a9          e
  c3aa          e
  c3ab          e
  c3ac          i
  c3ad          i
  c3ae          i
  c3af          i
  c3b0          d
  c3b1          n
  c3b2          o
  c3b3          o
  c3b4          o
  c3b5          o
  c3b6          o
  c3b7          removed
  c3b8          o
  c3b9          u
  c3ba          u
  c3bb          u
  c3bc          u
  c3bd          y
  c3be          y
  c3bf          y
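
The keys in table 3 are the two-byte UTF-8 encodings of Latin 1 characters, written in hex: for example, é encodes to the bytes c3 a9. The sketch below shows how such a lookup might work. It is not the actual implementation: `utf8_key`, `translate_latin1` and `TABLE_3_EXCERPT` are hypothetical names, the excerpt holds only a handful of entries from table 3, and passing unlisted characters through unchanged is an assumption.

```python
# Illustrative excerpt of table 3, keyed by the hex form of the UTF-8 bytes.
# An empty string stands for "removed".
TABLE_3_EXCERPT = {
    "c2a3": "gbp",  # pound sign
    "c2a9": "",     # copyright sign, removed
    "c39f": "ss",   # sharp s
    "c3a6": "ae",   # ae ligature
    "c3a9": "e",    # e with acute
    "c3b1": "n",    # n with tilde
    "c3bc": "u",    # u with diaeresis
}


def utf8_key(ch):
    """Return the UTF-8 encoding of one character as a lowercase hex string."""
    return ch.encode("utf-8").hex()


def translate_latin1(name):
    """Replace each character that has an entry in the excerpt."""
    out = []
    for ch in name:
        # ASSUMPTION: characters without an entry pass through unchanged.
        out.append(TABLE_3_EXCERPT.get(utf8_key(ch), ch))
    return "".join(out)


print(utf8_key("\u00e9"))               # key for e with acute
print(translate_latin1("caf\u00e9"))    # name containing e with acute
print(translate_latin1("stra\u00dfe"))  # name containing sharp s
```

The example assumes precomposed characters. A name stored in decomposed form (a base letter followed by a combining accent) would encode to different bytes and would not match these keys.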