Tutorials:Character Encoding and xajax
Introduction to Character Encoding with xajax
One of the ongoing struggles with Ajax technologies is dealing with different character encodings. The JavaScript XMLHttpRequest object, on which all of Ajax's asynchronous communication relies, sends data only in UTF-8, regardless of the character set of your HTML and headers.
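To make the mismatch concrete, here is a minimal sketch (not part of xajax) showing that the same character is stored as different bytes in UTF-8 and in ISO-8859-1. It assumes PHP's iconv extension is available; the variable names are only illustrative.

```php
<?php
// The same character, "é" (U+00E9), written as raw bytes in two encodings.
$utf8   = "\xC3\xA9"; // UTF-8 uses two bytes
$latin1 = "\xE9";     // ISO-8859-1 uses one byte

// iconv() converts a string from one encoding to another.
// Assumption: the iconv extension is enabled in this PHP installation.
$converted = iconv('UTF-8', 'ISO-8859-1', $utf8);

echo bin2hex($utf8), "\n";      // c3a9
echo bin2hex($converted), "\n"; // e9
```

A page served as ISO-8859-1 that receives the two-byte UTF-8 form without converting it will display two wrong characters instead of one correct one.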
Making xajax easily function with character sets other than UTF-8 has been an ongoing development problem. Starting in xajax version 0.2.3, we have implemented functionality that we hope will make working with xajax and character encoding a lot more abstract, easier to code, and easier to maintain.
The optimal solution for internationalization is for developers to design and create their websites and databases using only UTF-8, since no matter what language you need to use, UTF-8 should be able to display the needed characters. By default xajax uses UTF-8 encoding. However, we understand that despite this ideal, legacy code or other circumstances may require the use of another encoding.
There are three ways you can set the character encoding that xajax will use:
Changing the Default Encoding
In the xajax.inc.php file there is a define called XAJAX_DEFAULT_CHAR_ENCODING. Out of the box, this define is set to UTF-8. However, you can set it to whichever character encoding you need and xajax will use it instead.
//define ('XAJAX_DEFAULT_CHAR_ENCODING', 'utf-8' );
define ('XAJAX_DEFAULT_CHAR_ENCODING', 'ISO-8859-1' );
Setting Encoding During Instantiation
Both the xajax class constructor and the xajaxResponse class constructor accept parameters for the character encoding to use for the instance being constructed. They both default to the value of XAJAX_DEFAULT_CHAR_ENCODING.
xajax constructor parameters:
xajax($sRequestURI="",$sWrapperPrefix="xajax_",$sEncoding=XAJAX_DEFAULT_CHAR_ENCODING,$bDebug=false)
xajaxResponse constructor parameters:
xajaxResponse($sEncoding=XAJAX_DEFAULT_CHAR_ENCODING, $bOutputEntities=false)
You may tell xajax which encoding to use upon instantiation:
$xajax = new xajax("","xajax_",'ISO-8859-1');
However, if you instantiate xajax with a certain encoding, you must also set the encoding of your xajaxResponse instances to the same value:
$objResponse = new xajaxResponse('ISO-8859-1');
Setting Encoding After Instantiation
Both the xajax and xajaxResponse classes include a setCharEncoding() method that can be used to change the character encoding after constructing an instance of the class:
$xajax = new xajax();
$xajax->setCharEncoding('ISO-8859-1');
If you use the setCharEncoding() method to change the encoding of the xajax object, you must also set the encoding of your xajaxResponse instances to match; otherwise xajax will not function properly. You can set the xajaxResponse encoding during instantiation as described in the previous section, or you can use the setCharEncoding() method after instantiation:
$objResponse = new xajaxResponse();
$objResponse->setCharEncoding('ISO-8859-1');
If Encoding Still Does Not Work
If you are still getting invalid-XML error messages, and you only use your characters for display and not for comparison or manipulation (and you have the mbstring extension installed in PHP), you can try calling $xajax->outputEntitiesOn().
outputEntitiesOn() tells the response object to convert special characters to HTML entities automatically, so that characters that aren't in your current character set get converted. For example, é turns into an entity such as &eacute;, which renders the same in browsers.
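The idea behind entity output can be illustrated with plain PHP. The sketch below is not xajax's internal code, and xajax's own conversion may differ; toHtmlEntities() is a hypothetical helper built on PHP's standard htmlentities() function. It shows why entities help: the result contains only ASCII, so it survives any page encoding.

```php
<?php
// Hypothetical helper, not an xajax function: replace characters that have
// a named HTML entity with that entity. Note that htmlentities() also
// escapes <, >, & and quotes, so this suits text content, not markup.
function toHtmlEntities($utf8Text)
{
    // The third argument tells PHP the input string is UTF-8.
    return htmlentities($utf8Text, ENT_QUOTES, 'UTF-8');
}

$encoded = toHtmlEntities("Espa\xC3\xB1ol caf\xC3\xA9"); // "Español café" as UTF-8 bytes
echo $encoded, "\n"; // Espa&ntilde;ol caf&eacute;
```

Because the entities are resolved by the browser, the converted text is only suitable for display; comparing or manipulating it as a string will not behave like the original characters.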
If you are using a MySQL database and getting garbled characters in or out of it, run the following PHP commands after connecting:
mysql_query("SET NAMES 'utf8' COLLATE 'utf8_unicode_ci'");
mysql_query("SET CHARACTER SET utf8");
Also remember that you may need to declare the charset your HTML or XHTML document uses. In HTML, something like this should do it:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
and something like this in xhtml:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
You can also use PHP's header() function to make sure your document is delivered with the UTF-8 charset:
header("Content-type: text/html; charset=utf-8");
Decoding Issues
Even if you use the above methods to set a different character encoding, the XMLHttpRequest object always sends data in UTF-8, so the values of parameters passed from JavaScript to the PHP functions you have registered with xajax will still be encoded in UTF-8.
Starting with version 0.2.3, xajax has built-in functionality for decoding the data received from UTF-8 to whichever character encoding you are using. This decoding functionality is OFF by default. You can enable it by calling the decodeUTF8InputOn() method of the xajax class:
$xajax = new xajax();
$xajax->setCharEncoding('ISO-8859-1');
$xajax->decodeUTF8InputOn();
With the decoding functionality enabled, xajax will check for various modules in your PHP installation that may be capable of converting the character encoding, and use them if they are available. If none are available, you will receive an error message. If it works, the values of the parameters received by your functions should be properly encoded for whichever encoding you have set.
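The kind of check described above can be sketched as follows. This is not xajax's actual implementation: decodeFromUtf8() is a hypothetical helper, and the choice of iconv and mbstring as the candidate modules is an assumption made for illustration.

```php
<?php
// Hypothetical helper, not xajax's code: convert UTF-8 input to the target
// encoding with whichever conversion extension happens to be installed.
function decodeFromUtf8($value, $targetEncoding)
{
    if (function_exists('iconv')) {
        // iconv() returns false if a character cannot be represented.
        return iconv('UTF-8', $targetEncoding, $value);
    }
    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($value, $targetEncoding, 'UTF-8');
    }
    // Neither extension is available: report it and return the input as-is.
    trigger_error('No extension available to convert from UTF-8', E_USER_NOTICE);
    return $value;
}

$decoded = decodeFromUtf8("Espa\xC3\xB1ol", 'ISO-8859-1'); // "Español" as UTF-8 bytes
echo bin2hex($decoded), "\n"; // 45737061f16f6c
```

The two-byte UTF-8 sequence for ñ becomes the single byte F1, which is what a page declared as ISO-8859-1 expects.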
Quick Tips
- Try to use UTF-8 whenever possible.
- If you set the character encoding in the xajax constructor or using the $xajax->setCharEncoding() method, remember to set it for your xajaxResponse instances too. If you neglect to set the xajaxResponse encoding to the same set as the xajax instance, you will get an XML error in Internet Explorer.
- Regardless of the encoding you have set, the data will arrive from your JavaScript encoded in UTF-8. Use the $xajax->decodeUTF8InputOn() method to enable automatic decoding from UTF-8 to the character encoding you have set.
Example
<?php
require ('xajax.inc.php');

// Echo the submitted text back into div1, using the same encoding as the xajax instance.
function testEncoding($strText)
{
    $objResponse = new xajaxResponse('ISO-8859-1');
    $objResponse->addAssign("div1","innerHTML",$strText);
    return $objResponse;
}

$xajax = new xajax();
$xajax->setCharEncoding('ISO-8859-1');
$xajax->decodeUTF8InputOn(); // convert incoming UTF-8 parameters to ISO-8859-1
$xajax->registerFunction("testEncoding");
$xajax->processRequests();
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
    <?php $xajax->printJavascript(); ?>
</head>
<body>
    <div id="div1"></div>
    <form id="form1" onsubmit="return false;">
        <input type="text" value="Español" id="text1" name="text1" />
        <input type="submit" value="Submit" onclick="xajax_testEncoding(xajax.$('text1').value);" />
    </form>
</body>
</html>