Finding Tags with Regular Expressions

preg_match_all('/<.*?>/', $string, $matches);

One of the advantages of PCRE over POSIX is that some special constructs are supported. For instance, regular expressions are usually matched greedily. Take this regular expression:

<.*>

When trying to match this in the following string:

<p>Text, html and <b>PHP</b>.</p>

what do you get? You get the complete string. Of course, the pattern also matches <p>, but regular expressions try to match as much as possible. Therefore, you usually have to resort to a clumsy workaround, such as <[^>]*>. However, it can be done more easily: put the ? modifier after the * quantifier to activate nongreedy matching.

Finding All Tags Using Non-greedy PCRE

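  $string = '<p>Text, html and <b>PHP</b>.</p>';
  // .*? is nongreedy, so each match stops at the first ">"
  preg_match_all('/<.*?>/', $string, $matches);
  // $matches[0] holds the full-pattern matches
  foreach ($matches[0] as $match) {
    echo htmlspecialchars("$match ");
  }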

Which outputs:

<p> <b> </b> </p>
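
To make the difference concrete, the following sketch runs the greedy pattern, the nongreedy pattern, and the negated-character-class workaround against the sample string from above. The variable names $greedy, $lazy, and $negated are chosen here for illustration only.

```php
<?php
// Sample string taken from the text above.
$string = '<p>Text, html and <b>PHP</b>.</p>';

// Greedy: .* grabs as much as possible, so the whole string is one match.
preg_match_all('/<.*>/', $string, $greedy);

// Nongreedy: .*? stops at the first closing angle bracket.
preg_match_all('/<.*?>/', $string, $lazy);

// Workaround: a negated character class cannot run past a ">".
preg_match_all('/<[^>]*>/', $string, $negated);

echo count($greedy[0]), "\n";       // 1
echo count($lazy[0]), "\n";         // 4
var_dump($lazy[0] === $negated[0]); // bool(true)
```

For this input, the nongreedy pattern and the workaround find the same four tags. The two patterns are not equivalent in general, however: the . in <.*?> does not match a line feed unless the s modifier is set, whereas [^>] does.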

Validating Mandatory Input

function checkNotEmpty($s) {
   return (trim($s) !== '');
}

When validating form fields (see tutorial 4 for more about HTML forms), several checks can be done. However, you should test as little as possible. For instance, when I recently tried to order tickets for a U.S. concert, the order always failed because the form expected a U.S. telephone number, which I could not provide.

The most basic check is whether there is any input at all. However, what is considered to be "any input"? If someone enters just whitespace (that is, space characters and other nontext characters), is the form field filled out correctly?

The best way is to use trim() before checking whether there is anything inside the variable or expression. The function trim() removes whitespace from the beginning and the end of a string; by default, this covers the space character, horizontal and vertical tabs, carriage returns, line feeds, and NUL bytes. If, after that, the string is not equal to an empty string, the (mandatory) field has been filled out.
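
As a short illustration of this check, the sketch below feeds a few inputs to checkNotEmpty(). The sample values in $samples are hypothetical and chosen here only to show the edge cases.

```php
<?php
// Same check as in the text: trim first, then compare with the empty string.
function checkNotEmpty($s) {
    return (trim($s) !== '');
}

// Hypothetical sample inputs, for illustration only:
// regular text, nothing at all, whitespace only, and a padded "0".
$samples = ['PHP', '', "  \t\r\n", ' 0 '];

foreach ($samples as $sample) {
    // json_encode() makes the invisible whitespace characters readable.
    echo json_encode($sample), ' => ',
         var_export(checkNotEmpty($sample), true), "\n";
}
```

Comparing against the empty string with !== is deliberate. A test with empty() would reject the input "0", because PHP treats that string as empty, although a user who typed 0 has clearly filled out the field.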

The file check.php contains sample calls and all following calls to validation functions in the file