- extract phone numbers
- to locate all files with digits in the start, middle or end of their names
- to find all repeated words in a block of text
- to replace all URLs in a page with actual HTML links to those same URLs
- and much more
Like any language, regular expressions have a special syntax and instructions that you must learn.
So what does this have to do with MySQL? As already explained, all regular expressions do is match text, comparing a pattern (the regular expression) with a string of text. MySQL provides rudimentary support for regular expressions with WHERE
clauses, allowing you to specify regular expressions that are used to filter data retrieved using SELECT
.
This will all become much clearer with some examples.
Basic Character Matching
We'll start with a very simple example. The following statement retrieves all rows where column prod_name
contains the text 1000
:
SELECT prod_name FROM products WHERE prod_name REGEXP '1000' ORDER BY prod_name;
The above statement displayed the following output:
+--------------+ | prod_name | +--------------+ | product 1000 | +--------------+
This statement looks much like the ones that used LIKE
except that the keyword LIKE
has been replaced with REGEXP
. This tells MySQL that what follows is to be treated as a regular expression (one that just matches the literal text 1000
).
So, why bother using a regular expression? Well, in the example just used, regular expressions really add no value (and probably hurt performance), but consider this next example:
SELECT prod_name FROM products WHERE prod_name REGEXP '.000' ORDER BY prod_name;
The above statement displayed the following output:
+--------------+ | prod_name | +--------------+ | product 1000 | | product 2000 | +--------------+
Here the regular expression .000
was used. .
is a special character in the regular expression language. It means match any single character, and so both 1000
and 2000
matched and were returned.
Of course, this particular example could also have been accomplished using LIKE
and wildcards.