PDO – Set character encoding of a database

The character set used by the HTML page must match the character set used by the database table. In this tutorial, we’ll specify UTF-8 encoding for PHP, the MySQL database, the HTML page, and the Notepad++ editor.

There are several ways to set the character encoding of a database connection. The recommended method is to append ;charset=charsetName to the DSN string, but PHP versions before 5.3.6 ignore this option. Here we’ll explore all the available options:

  1. By providing the charset parameter in the DSN string
  2. By executing SET NAMES utf8
  3. By providing an options array to the PDO class

Specifying character set in the DSN string

To use the UTF-8 character set, specify it in the DSN:

<?php
 $dsn = 'mysql:host=localhost; dbname=test; charset=UTF8';
 $dbh = new PDO($dsn, $user, $password);
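
If the host, database name, or character set come from configuration, the DSN can be assembled in one place. The following is a minimal sketch, not part of PDO itself: buildMysqlDsn() is a hypothetical helper, and its utf8mb4 default is an assumption rather than something this tutorial prescribes (in MySQL, utf8 stores at most three bytes per character, while utf8mb4 covers all of Unicode). The type declarations assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not a PDO function): builds a MySQL DSN with an explicit charset.
function buildMysqlDsn(string $host, string $dbName, string $charset = 'utf8mb4'): string
{
    // Accept only letters, digits and underscores so the value cannot add extra DSN parts.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        throw new InvalidArgumentException("Invalid charset name: $charset");
    }

    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbName, $charset);
}

$dsn = buildMysqlDsn('localhost', 'test');
echo $dsn, PHP_EOL; // prints "mysql:host=localhost;dbname=test;charset=utf8mb4"
```

The resulting string can be passed to new PDO($dsn, $user, $password) exactly as in the example above.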

Specifying character set by executing SET NAMES utf8

Prior to PHP 5.3.6, the charset option was ignored. For older versions use the following method:

<?php
 // For PHP versions earlier than 5.3.6
 $dbh = new PDO($dsn, $user, $password);
 $dbh->exec('set names utf8');

If you also need to set collation:

<?php
 $dbh->exec('set names utf8 COLLATE collateName');

 // For utf8 the default collation is utf8_general_ci.
 // The collation must belong to the same character set as the one named in SET NAMES.
 $dbh->exec('set names utf8 COLLATE utf8_unicode_ci');
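
When the character set or collation name comes from configuration instead of a literal string, checking it before it is placed into the statement is a reasonable precaution. The following is a minimal sketch under that assumption: buildSetNames() is a hypothetical helper, not a PDO or MySQL function, and it only validates the shape of the names, not whether the server knows them. It assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: builds a SET NAMES statement from validated identifiers.
function buildSetNames(string $charset, ?string $collation = null): string
{
    $identifier = '/^[A-Za-z0-9_]+$/';

    if (!preg_match($identifier, $charset)) {
        throw new InvalidArgumentException("Invalid charset name: $charset");
    }
    $sql = "SET NAMES $charset";

    if ($collation !== null) {
        if (!preg_match($identifier, $collation)) {
            throw new InvalidArgumentException("Invalid collation name: $collation");
        }
        $sql .= " COLLATE $collation";
    }

    return $sql;
}

echo buildSetNames('utf8'), PHP_EOL;                    // prints "SET NAMES utf8"
echo buildSetNames('utf8', 'utf8_unicode_ci'), PHP_EOL; // prints "SET NAMES utf8 COLLATE utf8_unicode_ci"
```

The returned string can then be passed to $dbh->exec() in place of the literal statements shown above.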

Specifying character set in PDO constructor’s fourth parameter

You can also set the character encoding by passing an options array as the PDO constructor’s fourth parameter:

<?php
 // For PHP versions prior to 5.3.6.
 // array() is used because the short [] syntax requires PHP 5.4 or later.
 $options = array(PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8');

 $db = new PDO($dsn, $user, $password, $options);

Note: The above example only works with the MySQL database.
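
Code that has to run on both old and new PHP versions can choose between the DSN parameter and the init command at runtime. The following is a minimal sketch of that idea: charsetOptions() is a hypothetical helper, and it is written in PHP 5.3-compatible syntax (no type declarations, array() instead of []) so that it can run on the old versions it targets.

```php
<?php
// Hypothetical helper: returns the PDO driver options needed for the given PHP version.
function charsetOptions($phpVersion, $charset = 'utf8')
{
    // From PHP 5.3.6 on, the charset parameter in the DSN is honoured,
    // so no extra driver option is required.
    if (version_compare($phpVersion, '5.3.6', '>=')) {
        return array();
    }

    // Older versions ignore the DSN charset, so fall back to an init command.
    // This constant is only available when the pdo_mysql driver is loaded.
    return array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES $charset");
}

$dsn = 'mysql:host=localhost;dbname=test;charset=utf8';
$options = charsetOptions(PHP_VERSION);
// $dbh = new PDO($dsn, $user, $password, $options); // needs a real server and credentials
```

Keeping charset in the DSN on every version is harmless, since versions before 5.3.6 simply ignore it.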

Output Multibyte (e.g. UTF-8) Data

When working with multibyte data, such as UTF-8, you must:

  1. Store data with the UTF-8 character encoding.
  2. Tell PHP to use UTF-8 character encoding. It’s easiest to do this in your php.ini file like this:
    default_charset = "UTF-8"
  3. Display data with the UTF-8 character encoding:
  • For PHP scripts and dynamically generated HTML, send a Content-Type header:
<?php
 header('Content-Type: text/html; charset=utf-8');

Note: You must call header() before the script sends any output.

  • For HTML files, include the <meta charset="UTF-8"> tag in the head:
<!DOCTYPE html>
<html lang="en">
 <head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width">
.
.
.
  4. Configure your code editor or IDE to use UTF-8 character encoding. The following figure shows the UTF-8 configuration setting for the Notepad++ code editor:

Working with Databases: