Search and Replace Unicode Accents

Many languages, including Turkish, Spanish, and most other European languages, use accented letters. In this tutorial, we’ll replace accented characters with their unaccented equivalents, for example, é with e.

Replace accented characters with str_replace()

In the following code snippets, we’ll convert a string containing diacritics (such as é) to readable plain ASCII. We’ll use PHP’s str_replace() and strtr() functions to replace each accented character with its unaccented counterpart.

Replace a single character with str_replace():

$text = "héllo";
$from = "é";
$to = "e";
$newText = str_replace($from, $to, $text);
echo $newText;
//Prints: hello

Example: Replacing accented characters with str_replace()

<?php
 $from = array(
  'à', 'á', 'â', 'ã', 'ä', 'å', 'æ',
  'À', 'Á', 'Â', 'Ã', 'Ä', 'Å', 'Æ',
  'ß', 'ç', 'Ç',
  'è', 'é', 'ê', 'ë',
  'È', 'É', 'Ê', 'Ë',
  'ì', 'í', 'î', 'ï',
  'Ì', 'Í', 'Î', 'Ï',
  'ñ', 'Ñ',
  'ò', 'ó', 'ô', 'õ', 'ö',
  'Ò', 'Ó', 'Ô', 'Õ', 'Ö',
  'š', 'Š',
  'ù', 'ú', 'û', 'ü',
  'Ù', 'Ú', 'Û', 'Ü',
  'ý', 'Ý', 'ž', 'Ž'
 );

 $to = array(
  'a', 'a', 'a', 'a', 'a', 'a', 'a', 
  'A', 'A', 'A', 'A', 'A', 'A', 'A',
  'B',  'c', 'C', // ß becomes B only so the demo string reads BrainBell; German text usually uses 'ss'
  'e', 'e', 'e', 'e',
  'E', 'E', 'E', 'E',
  'i', 'i', 'i', 'i',
  'I', 'I', 'I', 'I', 
  'n',  'N',
  'o', 'o', 'o', 'o', 'o',
  'O', 'O', 'O', 'O', 'O', 
  's',  'S', 
  'u', 'u', 'u', 'u', 
  'U', 'U', 'U', 'U', 
  'y',  'Y', 'z', 'Z'
 );

 $text = 'Hëllô wörld, ßræînßèll.çõm';
 echo str_replace($from, $to, $text);
 //Prints: Hello world, BrainBell.com
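
Two parallel arrays are easy to get out of step when an entry is added or removed. One possible way around this, sketched below, is to keep a single associative array and split it with array_keys() and array_values() before calling str_replace(). The function name replace_accents_with_map() and the short sample map are assumptions made for illustration; they are not part of the original example.

```php
<?php
 // Hypothetical helper: one associative map instead of two parallel arrays.
 function replace_accents_with_map(string $text, array $map): string
 {
  // array_keys() and array_values() return entries in the same order,
  // so every search string stays paired with its replacement.
  return str_replace(array_keys($map), array_values($map), $text);
 }

 // Small sample map, chosen only for this sketch.
 $map = array('é' => 'e', 'ö' => 'o', 'ñ' => 'n');

 echo replace_accents_with_map('El Niño héllö', $map);
 //Prints: El Nino hello
```

Because each key sits next to its value, a missing or misplaced replacement is much easier to spot than in two long lists.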

Replacing accents with strtr()

strtr() accepts an array of search => replacement pairs. For details, see How to use strtr().

<?php
 $text = 'héllö';
 $translation = array('é' => 'e', 'ö'=>'o');
 echo strtr($text, $translation);
 //Prints: hello
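
The two functions differ in how they apply several replacements at once, which matters whenever a replacement value is itself one of the search strings. The sketch below uses a deliberately artificial pair list (not taken from the tutorial) to make the difference visible: str_replace() works through the pairs one after another on the already modified string, while strtr() translates the original input in a single pass.

```php
<?php
 // Artificial pairs where one replacement ('b') is also a search string.
 $pairs = array('a' => 'b', 'b' => 'c');

 // str_replace() applies each pair in turn to the result of the previous
 // one: 'ab' becomes 'bb', and then 'bb' becomes 'cc'.
 $sequential = str_replace(array_keys($pairs), array_values($pairs), 'ab');

 // strtr() scans the input once and does not translate replaced text again.
 $singlePass = strtr('ab', $pairs);

 echo $sequential, "\n"; //Prints: cc
 echo $singlePass, "\n"; //Prints: bc
```

For the accent tables in this tutorial both functions give the same result, because none of the plain ASCII replacements appears in the search list.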

Example: Remove or replace accents with strtr()

<?php
 $trns = array(
 'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a', 'å'=>'a', 'æ'=>'a', 
 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A', 'Å'=>'A', 'Æ'=>'A',
 'ß'=>'B', 'ç'=>'c', 'Ç'=>'C', // ß becomes B only so the demo string reads BrainBell; German text usually uses 'ss'
 'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e',
 'È'=>'E', 'É'=>'E', 'Ê'=>'E', 'Ë'=>'E',
 'ì'=>'i', 'í'=>'i', 'î'=>'i', 'ï'=>'i',
 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I', 'Ï'=>'I',
 'ñ'=>'n', 'Ñ'=>'N',
 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o', 'ö'=>'o',
 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O',
 'š'=>'s', 'Š'=>'S', 
 'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'ü'=>'u',
 'Ù'=>'U', 'Ú'=>'U', 'Û'=>'U', 'Ü'=>'U',
 'ý'=>'y', 'Ý'=>'Y', 'ž'=>'z', 'Ž'=>'Z'
 );

 $str = 'Hëllô wörld, ßræînßèll.çõm';
 echo strtr($str, $trns);
 //Prints: Hello world, BrainBell.com
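
The table above maps every character to a single letter, which suits the demo string but not every language: ß is conventionally written ss, and æ is written ae. Since strtr() allows replacements of any length, a mapping along those lines could look like the sketch below. The function name transliterate_ligatures() and the choice of entries are assumptions for illustration only.

```php
<?php
 // Hypothetical helper: multi-letter replacements for characters that are
 // conventionally spelled with two ASCII letters.
 function transliterate_ligatures(string $text): string
 {
  $pairs = array(
   'ß' => 'ss',
   'æ' => 'ae', 'Æ' => 'AE',
   'œ' => 'oe', 'Œ' => 'OE',
  );

  // With an array argument, strtr() replaces whole substrings, so keys and
  // values do not need to have the same length.
  return strtr($text, $pairs);
 }

 echo transliterate_ligatures('Straße'), "\n";
 //Prints: Strasse
 echo transliterate_ligatures('Encyclopædia'), "\n";
 //Prints: Encyclopaedia
```

These pairs can be merged into the larger translation array when a more faithful spelling matters more than a one-to-one character mapping.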
