There are several ways to set the character encoding of a database connection. The recommended method is appending ;charset=charsetName to the DSN string, but PHP versions earlier than 5.3.6 ignore this option. Here we’ll explore all the available options for setting the character encoding:
- By providing the charset parameter in the DSN string
- By executing SET NAMES utf8
- By providing an options array to the PDO class
Specifying character set in the DSN string
To use the UTF-8 character set, specify it in the DSN:
<?php
$dsn = 'mysql:host=localhost;dbname=test;charset=utf8';
$dbh = new PDO($dsn, $user, $password);
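As a minimal sketch of the DSN approach, the helper below assembles the DSN from its parts and rejects charset names that contain anything other than letters, digits, and underscores. The function names buildMysqlDsn and connectionCharset are hypothetical and are not part of PDO. The second helper assumes a MySQL server and simply asks it which character set the connection is using.

```php
<?php
// Hypothetical helper (not part of PDO): builds a MySQL DSN that includes a charset.
function buildMysqlDsn($host, $dbname, $charset = 'utf8')
{
    // Allow only letters, digits and underscores in the charset name.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        throw new InvalidArgumentException('Invalid charset name: ' . $charset);
    }

    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Hypothetical helper: returns the client character set reported by MySQL
// for an open connection, so you can check that the DSN option took effect.
function connectionCharset(PDO $dbh)
{
    return $dbh->query('SELECT @@character_set_client')->fetchColumn();
}

// Example usage (requires a reachable MySQL server and valid credentials):
// $dbh = new PDO(buildMysqlDsn('localhost', 'test'), $user, $password);
// echo connectionCharset($dbh);
```

Note that MySQL’s utf8 character set stores at most three bytes per character. If you need four-byte characters such as emoji, passing 'utf8mb4' as the third argument is the usual choice.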
Specifying character set by executing SET NAMES utf8
Prior to PHP 5.3.6, the charset option was ignored. For older versions, use the following method:
<?php
// For PHP versions earlier than 5.3.6
$dbh = new PDO($dsn, $user, $password);
$dbh->exec('SET NAMES utf8');
If you also need to set the collation, note that it must belong to the chosen character set:
<?php
// General form: SET NAMES charsetName COLLATE collationName
// For utf8 the default collation is utf8_general_ci
$dbh->exec('SET NAMES utf8 COLLATE utf8_unicode_ci');
Specifying character set in PDO constructor’s fourth parameter
You can also set the character encoding by passing an options array as the PDO constructor’s fourth parameter:
<?php
// For PHP versions prior to 5.3.6
// array() is used because the short [] syntax requires PHP 5.4 or later
$options = array(PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8');
$db = new PDO($dsn, $user, $password, $options);
Note: The above example only works with the MySQL database.
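The choice between the three methods depends on the PHP version, and that decision can be sketched in code. The helpers below are hypothetical names chosen for illustration. They build a SET NAMES statement, optionally with a collation, and return it only when the running PHP version would ignore the DSN charset option. This sketch assumes the MySQL driver and does not check that the collation belongs to the character set.

```php
<?php
// Hypothetical helper: builds a SET NAMES statement, optionally with a collation.
function buildSetNames($charset, $collation = null)
{
    // Allow only letters, digits and underscores in either identifier.
    foreach (array($charset, $collation) as $name) {
        if ($name !== null && !preg_match('/^[A-Za-z0-9_]+$/', $name)) {
            throw new InvalidArgumentException('Invalid identifier: ' . $name);
        }
    }

    $sql = 'SET NAMES ' . $charset;
    if ($collation !== null) {
        $sql .= ' COLLATE ' . $collation;
    }

    return $sql;
}

// Hypothetical helper: returns the statement to run on PHP versions that
// ignore the DSN charset option (earlier than 5.3.6), or null otherwise.
function fallbackCharsetCommand($phpVersion, $charset, $collation = null)
{
    if (version_compare($phpVersion, '5.3.6', '>=')) {
        return null; // The DSN charset option is honored; nothing extra to run.
    }

    return buildSetNames($charset, $collation);
}

// Example usage with an open connection in $dbh:
// $command = fallbackCharsetCommand(PHP_VERSION, 'utf8');
// if ($command !== null) {
//     $dbh->exec($command);
// }
```

The returned statement could equally be passed as the PDO::MYSQL_ATTR_INIT_COMMAND option instead of being run with exec().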
Output Multibyte (e.g. UTF-8) Data
When working with multibyte data, such as UTF-8, you must:
- Store data with the UTF-8 character encoding.
- Tell PHP to use the UTF-8 character encoding. It’s easiest to do this in your php.ini file like this:
default_charset = "UTF-8"
- Display data with the UTF-8 character encoding:
  - For PHP scripts and dynamically generated HTML files, send the Content-Type header:
    <?php
    header('Content-Type: text/html; charset=utf-8');
    Note: You must call the header() function before the script sends any output.
  - Include the <meta charset="UTF-8"> tag in HTML files:
    <!DOCTYPE html>
    <html lang="en">
    <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width">
    . . .
- Configure your code editor or IDE to use the UTF-8 character encoding. The following figure shows the UTF-8 configuration setting for the Notepad++ code editor:
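The display steps in this checklist can be combined into one short sketch. The three helper functions below are hypothetical names used for illustration only. They check that a string is well-formed UTF-8, escape it with an explicit charset, wrap it in a page that declares UTF-8, and send the Content-Type header only if no output has been sent yet.

```php
<?php
// Hypothetical helper: returns true when $text is well-formed UTF-8.
// With the "u" modifier, preg_match() fails on subjects that are not valid UTF-8.
function isValidUtf8($text)
{
    return preg_match('//u', $text) === 1;
}

// Hypothetical helper: wraps $text in a minimal HTML page that declares UTF-8.
function renderUtf8Page($text)
{
    if (!isValidUtf8($text)) {
        throw new InvalidArgumentException('Text is not valid UTF-8.');
    }

    // Pass the charset explicitly so escaping does not depend on php.ini.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n"
        . "<html lang=\"en\">\n"
        . "<head>\n<meta charset=\"UTF-8\">\n</head>\n"
        . "<body><p>" . $safe . "</p></body>\n"
        . "</html>\n";
}

// Hypothetical helper: sends the header while that is still possible, then the page.
function sendUtf8Page($text)
{
    $page = renderUtf8Page($text);
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=utf-8');
    }
    echo $page;
}
```

Rendering the page into a string before calling header() keeps accidental early output from blocking the header.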
Working with Databases: