Regular expression to find in an array and to split a string

This tutorial shows how to use the preg_split() function to split a string with a regular expression, and how to use the preg_grep() function to search an entire array of strings for text that matches a regular expression.

preg_split() – Split string using regular expression

This function splits the input string into an array, breaking the string where the matching pattern is found.

<?php
 //Syntax
 preg_split(
  string $pattern, string $subject,
  int $limit = -1, int $flags = 0
 ): array|false

The preg_split() function takes four parameters:

  1. $pattern: The regular expression as a string to search for.
  2. $subject: The input string.
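  3. $limit (optional): The maximum number of substrings to return; any remaining text is placed in the last substring. Defaults to -1 (no limit); 0 also means no limit.
  4. $flags (optional): Any combination of the following flags, combined with the bitwise OR operator (|):
    1. PREG_SPLIT_NO_EMPTY – return only non-empty pieces.
    2. PREG_SPLIT_DELIM_CAPTURE – also return the text matched by parenthesized subpatterns in the delimiter pattern.
    3. PREG_SPLIT_OFFSET_CAPTURE – also return the offset of each piece in the input string.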
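The effect of $limit is easiest to see on a short list. The following is a minimal sketch; the sample string and the variable name $parts are made up for illustration. With a limit of 3, at most three pieces are returned and the unsplit remainder stays in the last one.

```php
<?php
// Split on a comma followed by optional whitespace, but return at most 3 pieces.
$parts = preg_split('/,\s*/', 'red, green, blue, yellow', 3);
print_r($parts);
/* Prints:
Array
(
    [0] => red
    [1] => green
    [2] => blue, yellow
)
*/
```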

This function performs a similar task to the explode() function. When you don’t need a complex pattern to break a string into an array, explode() is the better choice.
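As a rough comparison, the sketch below splits the same string both ways and then shows a case where a fixed delimiter is not enough. The sample data and variable names are assumptions chosen for illustration.

```php
<?php
$csv = 'apple,banana,cherry';

// Fixed delimiter: explode() is sufficient.
$fromExplode = explode(',', $csv);

// Same result with preg_split(), at the cost of running the regex engine.
$fromPregSplit = preg_split('/,/', $csv);

var_dump($fromExplode === $fromPregSplit); // bool(true)

// Variable delimiters (comma or semicolon, with optional spaces) need a pattern.
$mixed = preg_split('/\s*[,;]\s*/', 'apple , banana;cherry');
print_r($mixed);
```

Here explode() cannot handle the last case in one call, because the delimiter is not a single fixed string.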

Example: Split a string on any whitespace character (i.e. space, newline, tab, etc.)

<?php
 $pattern = '/\s/';
 $subject = 'one two 	three
 		four	five';
 $array = preg_split($pattern, $subject);
 print_r($array);

The script prints the following output:

Array
(
    [0] => one
    [1] => two
    [2] => 
    [3] => three
    [4] => 
    [5] => 
    [6] => 
    [7] => four
    [8] => five
)

The result above contains empty elements because consecutive whitespace characters produce empty strings between them. Use the PREG_SPLIT_NO_EMPTY flag to remove empty values from the output array, as in the following example:

Example: Using PREG_SPLIT_NO_EMPTY flag for non-empty result:

<?php
 $pattern = '/\s/';
 $subject = 'one two 	three
 		four	five';
 $array = preg_split($pattern, $subject, -1, PREG_SPLIT_NO_EMPTY);
 print_r($array);
 /* Prints:
Array(
 [0] => one
 [1] => two
 [2] => three
 [3] => four
 [4] => five ) */

Example: Using PREG_SPLIT_OFFSET_CAPTURE flag to return the offset position of each element:

The PREG_SPLIT_OFFSET_CAPTURE flag works the same way as the preg_match() flag PREG_OFFSET_CAPTURE. It returns an array of arrays, where each inner array contains two elements: the split text and its offset in the input string. The following example uses two flags, combined with the bitwise OR operator.

<?php
 $pattern = '/\s/';
 $subject = 'one two 	three
 		four	five';
 $array = preg_split($pattern, $subject, -1, PREG_SPLIT_NO_EMPTY|PREG_SPLIT_OFFSET_CAPTURE);

 print_r($array);
 /* Prints:
Array (
 [0] => Array ([0] => one [1] => 0 )
 [1] => Array ([0] => two [1] => 4 )
 [2] => Array ([0] => three [1] => 9)
 [3] => Array ([0] => four [1] => 18)
 [4] => Array ([0] => five [1] => 23)
)
*/
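The reported offsets are byte positions in the input string, so they can be passed straight to substr(). The sketch below, which uses a made-up subject string and variable names, checks that each piece is found again at its reported offset.

```php
<?php
$subject = 'alpha beta  gamma';
$chunks = preg_split('/\s/', $subject, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_OFFSET_CAPTURE);

foreach ($chunks as [$text, $offset]) {
    // Reading strlen($text) bytes at the reported offset yields the same piece.
    $same = substr($subject, $offset, strlen($text)) === $text;
    printf("%-5s starts at byte %d (%s)\n", $text, $offset, $same ? 'ok' : 'mismatch');
}
```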

Example: Using PREG_SPLIT_DELIM_CAPTURE flag:

The following example demonstrates how the text matched by the parenthesized subpattern (here, the punctuation mark) is returned in the array along with the split pieces.

<?php
 $pattern = '/([\.,;:])/';
 $subject = 'Start small, Lose fluff words: Review your work.';
 $array = preg_split($pattern, $subject, -1, PREG_SPLIT_DELIM_CAPTURE);
 print_r($array);
/* Prints:
Array (
 [0] => Start small
 [1] => ,
 [2] =>  Lose fluff words
 [3] => :
 [4] =>  Review your work
 [5] => .
 [6] => 
) */
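The last element in the output above is empty because the subject ends with a delimiter. One way to drop it, sketched here with the same pattern and subject, is to add PREG_SPLIT_NO_EMPTY; the array_map('trim', ...) call is an optional extra step, not part of the original example, that strips the leading spaces from each piece.

```php
<?php
$pattern = '/([.,;:])/';
$subject = 'Start small, Lose fluff words: Review your work.';

// Keep the delimiters, but discard empty pieces such as the trailing one.
$parts = preg_split($pattern, $subject, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

// Optional: remove surrounding whitespace from every piece.
$parts = array_map('trim', $parts);
print_r($parts);
```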

preg_grep() – Search arrays using regular expression

The preg_match() and preg_match_all() functions search individual strings of text. To search an entire array of strings, use the preg_grep() function.

<?php
//Syntax
preg_grep(string $pattern, array $array, int $flags = 0): array|false

The preg_grep() function takes three parameters:

  1. $pattern: The regular expression as a string to search for.
  2. $array: The input array to search through.
  3. $flags (optional): By default, the function returns array entries that match the pattern. Use the PREG_GREP_INVERT flag if you want entries that do not match the pattern.

The preg_grep() function works like the Unix grep command: it searches an array and returns a new array containing the values that match the pattern. The keys of the input array are preserved in the result.

<?php
 $pattern = '/\$[0-9]+/';
 $array = [5,'$2','$'];
 $narray= preg_grep($pattern, $array);
 print_r($narray);
/* Prints: Array(
  [1] => $2
)*/
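Note that the result above keeps the original key 1 rather than starting at 0. If sequential keys are needed, array_values() can reindex the result. A minimal sketch, with a slightly longer made-up input array:

```php
<?php
$prices = [5, '$2', '$', '$40'];

// preg_grep() preserves the input keys: the matches sit at keys 1 and 3.
$matches = preg_grep('/\$[0-9]+/', $prices);

// array_values() renumbers them from 0.
$reindexed = array_values($matches);
print_r($reindexed);
```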

Similar to the Unix command grep -v, the PREG_GREP_INVERT flag inverts the search: the function returns an array of all elements that do not match the pattern.

<?php
 $pattern = '/\$[0-9]+/';
 $array = [5,'$2','$'];
 $narray= preg_grep($pattern, $array, PREG_GREP_INVERT);
 print_r($narray);
/* Prints: Array(
    [0] => 5
    [2] => $) */
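A normal call and an inverted call with the same pattern split an array into two complementary groups. The sketch below illustrates this with a hypothetical list of file names; the names and variables are assumptions for illustration only.

```php
<?php
$files = ['index.php', 'style.css', 'README.md', 'Config.PHP'];
$pattern = '/\.php$/i'; // ends with ".php", case-insensitive

$phpFiles   = preg_grep($pattern, $files);                   // entries that match
$otherFiles = preg_grep($pattern, $files, PREG_GREP_INVERT); // entries that do not

print_r($phpFiles);
print_r($otherFiles);

// Every input entry lands in exactly one of the two groups.
var_dump(count($phpFiles) + count($otherFiles) === count($files)); // bool(true)
```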
