Split, tokenize, and iterate a string

This tutorial covers four functions: str_split() and mb_str_split(), which split a string into an array of pieces of a specified length; strtok(), which splits a string into smaller strings (tokens) one call at a time; and chunk_split(), which inserts a separator into a string at regular intervals.

  1. str_split()
  2. mb_str_split()
  3. strtok()
  4. chunk_split()

Convert a string to an array with str_split()

<?php
 //Syntax
 str_split(string $string, int $length = 1): array

The str_split() function takes two parameters:

  1. $string: the input string.
  2. $length (optional): maximum length of the chunk, default is 1.

This function splits a string into an array of chunks that are each $length bytes long. The last chunk may be shorter when the string length is not a multiple of $length.

<?php
 $str = 'abcdef';
 $ar1 = str_split($str);
 # ['a','b','c','d','e','f']

 $ar2 = str_split($str, 2);
 # ['ab','cd','ef']
 
 print_r($ar1);
 /*Array (
    [0] => a
    [1] => b
    [2] => c
    [3] => d
    [4] => e
    [5] => f )*/
 print_r($ar2);
 /*Array(
    [0] => ab
    [1] => cd
    [2] => ef)*/
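
The case where the string does not divide evenly can be sketched as follows. This is a minimal illustration with made-up variable names; it also iterates over the characters with foreach, which only works as expected for single-byte strings.

```php
<?php
 // Illustrative input: 7 characters, not a multiple of 3
 $str = 'abcdefg';

 $chunks = str_split($str, 3);
 print_r($chunks);
 # ['abc','def','g'] (the last chunk holds the remainder)

 // Iterate over the string one single-byte character at a time
 $seen = '';
 foreach (str_split($str) as $index => $char) {
  $seen .= $index . ':' . $char . ' ';
 }
 echo $seen;
 # 0:a 1:b 2:c 3:d 4:e 5:f 6:g
```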

Note: use mb_str_split() to deal with a multi-byte string.

mb_str_split()

<?php
//Syntax
mb_str_split(string $string, int $length = 1, ?string $encoding = null): array

This function takes three parameters:

  1. $string: the input string.
  2. $length (optional): maximum length of each chunk in characters, default is 1.
  3. $encoding (optional): the character encoding. If it is omitted or null, the internal character encoding is used.

The str_split() function works on bytes, so it handles single-byte characters correctly but breaks multi-byte characters apart. See the following example:

<?php
 $string = '€£Ͻڻ➿';
 $array = str_split($string);
 print_r($array);
 #Prints: [0] => � [1] => � [2] => � ...

The following example uses the mb_str_split() function that can handle multi-byte characters:

<?php
 $string = '€£Ͻڻ➿';
 $array  = mb_str_split($string);
 print_r($array);
/* Prints:
[0] => €
[1] => £
[2] => Ͻ
[3] => ڻ
[4] => ➿ */

 $array  = mb_str_split($string, 2, 'UTF-8');
 print_r($array);
/* Prints:
[0] => €£
[1] => Ͻڻ
[2] => ➿ */
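
The difference comes down to counting bytes versus counting characters. The sketch below assumes the mbstring extension is enabled and PHP 7.4 or later (where mb_str_split() is available); the characters are written as Unicode escapes so the result does not depend on how the file is saved.

```php
<?php
 // "\u{20AC}" is € (3 bytes in UTF-8), "\u{00A3}" is £ (2 bytes in UTF-8)
 $string = "\u{20AC}\u{00A3}";

 // strlen() counts bytes, mb_strlen() counts characters
 echo strlen($string), PHP_EOL;             # 5
 echo mb_strlen($string, 'UTF-8'), PHP_EOL; # 2

 // str_split() yields one element per byte, mb_str_split() one per character
 $bytes = str_split($string);
 $chars = mb_str_split($string, 1, 'UTF-8');
 echo count($bytes), PHP_EOL; # 5
 echo count($chars), PHP_EOL; # 2
```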

Tokenize string with strtok()

<?php
 //Syntax
 strtok(string $string, string $token): string|false

This function takes two parameters:

  1. $string: the input string.
  2. $token: one or more delimiter characters; the string is split at any of them.

The first call, strtok($string, $token), returns the first token of the string. Subsequent calls require only the $token parameter, so strtok($token) returns the next token of the same string. The function returns false when there are no more tokens. See the following code:

<?php
 $string = 'a.b,c.d';
 $token = '.,';
 // initialized
 echo strtok($string, $token); # a
 // jump to next token
 echo strtok($token); # b
 // jump to next token
 echo strtok($token); # c
 // jump to next token
 echo strtok($token); # d
 // jump to next token
 echo strtok($token); # false, prints nothing

You can use a while loop to make the subsequent calls until the function reaches the end of the string. See the example:

<?php
 $string = 'a.b.c.d';
 $token = '.';
 // initialized
 $tok = strtok($string, $token);
 while ($tok !== false) {
  echo $tok;

  // jump to next token
  $tok = strtok($token);
 }
//Prints: abcd
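
The loop above can be wrapped in a small function that collects the tokens into an array. The tokenize() helper below is hypothetical, written only for this illustration; it is not part of PHP. The sketch also shows that strtok() skips empty parts between consecutive delimiters, whereas explode() keeps them.

```php
<?php
 // Hypothetical helper for illustration; not a built-in function
 function tokenize(string $string, string $delimiters): array
 {
  $tokens = [];
  $tok = strtok($string, $delimiters);
  while ($tok !== false) {
   $tokens[] = $tok;
   // later calls pass only the delimiters
   $tok = strtok($delimiters);
  }
  return $tokens;
 }

 $input = 'a,,b;c';

 $tokens = tokenize($input, ',;');
 print_r($tokens);
 # ['a','b','c'] (either character delimits; the empty part is skipped)

 $parts = explode(',', $input);
 print_r($parts);
 # ['a','','b;c'] (explode() splits on the whole delimiter string and keeps empty parts)
```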

Split a string into smaller chunks using chunk_split()

<?php
 //Syntax
 chunk_split(string $string, int $length = 76, string $separator = "\r\n"): string

This function has three parameters:

  1. $string: The string to be chunked.
  2. $length: The chunk length, default value is 76.
  3. $separator: The line ending sequence, default value is \r\n.

By default, the chunk_split() function inserts a CRLF (\r\n) after every 76 characters, including after the last chunk. It returns the result as a new string, leaving the original string untouched.

Example: Split a string into smaller chunks

<?php
 $v = 'By default, the chunk_split() function returns a chunk length of 76 with a trailing CRLF';
 echo chunk_split($v, 15, '<br>');

Figure: Output of chunk_split($v, 15, '<br>')
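
One detail that is easy to miss is that the separator is also appended after the final chunk. The short sketch below uses an illustrative input and compares chunk_split() with the str_split() plus implode() combination, which is one possible way to get the same chunks without the trailing separator.

```php
<?php
 // The separator follows every chunk, including the last one
 $chunked = chunk_split('abcdefg', 3, '-');
 echo $chunked, PHP_EOL;
 # abc-def-g-

 // Same chunks joined with implode(): no trailing separator
 $joined = implode('-', str_split('abcdefg', 3));
 echo $joined, PHP_EOL;
 # abc-def-g
```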

The chunk_split() function is commonly used together with the base64_encode() function to produce output that matches RFC 2045 semantics, for example when sending an email attachment. See the example:

<?php
 $text = 'Welcome to BrainBell.com...'.
         'Welcome to BrainBell.com...'.
         'Welcome to BrainBell.com...'.
         'Welcome to BrainBell.com...'.
         'Welcome to BrainBell.com...'.
         'Welcome to BrainBell.com...';
 $encoded = base64_encode($text);
 $chunked = chunk_split($encoded);
 echo $chunked;

Figure: Base64 encoded text, chunked with the chunk_split() function, as printed in a web browser
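
As a rough check of the result, the sketch below splits the chunked output back into lines and confirms that none is longer than 76 characters. It also decodes the chunked string; this relies on base64_decode() in its default (non-strict) mode discarding the line breaks. The sample text is the same as above, built with str_repeat() for brevity.

```php
<?php
 // Same sample text as above, built with str_repeat()
 $text = str_repeat('Welcome to BrainBell.com...', 6);
 $chunked = chunk_split(base64_encode($text));

 // Drop the trailing CRLF, then split the output into its lines
 $lines = explode("\r\n", rtrim($chunked, "\r\n"));
 foreach ($lines as $line) {
  echo strlen($line), PHP_EOL; # no line is longer than 76 characters
 }

 // The default decoding mode ignores the inserted line breaks
 $decoded = base64_decode($chunked);
 var_dump($decoded === $text); # bool(true)
```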
