
Escaping Special Characters

So far, we've seen that the \ character introduces shorthand character classes, such as \d, \w, and \s. It's not the only character with a special meaning. Other special characters include [ ] { } ( ) ^ $ . | ? * +. Each of them changes how a regular expression behaves.

That said, we may want to match these characters literally, without the regex engine treating them as special. We can avoid the special treatment of a character by escaping it.

Escaping a character in a regular expression means placing a backslash \ before a character that has a special meaning in regex. Doing so tells the regex engine to treat that character as a literal character rather than as a special symbol with a specific function.

Escaping Example

One example where escaping would be necessary is when we're trying to match a literal dot. Not "any character", but just a dot.

We can match a literal . character by placing a backslash in front of it: \. (a backslash followed by a dot).

$string1 = "Chapter 5.1";
 
if (preg_match("/\d\.\d/", $string1, $matches1)) {
  echo $matches1[0]; // 5.1 (match!)
} else {
  echo "null"; // This part won't be reached in this example
}
 
$string2 = "Chapter 511";
 
if (preg_match("/\d\.\d/", $string2, $matches2)) {
  echo $matches2[0]; // This part won't be reached in this example
} else {
  echo "null"; // null (looking for a real dot \.)
}

The pattern /\d\.\d/ is looking for a digit, followed by an actual dot, followed by another digit. If a match is found, it is printed. If no match is found, "null" is printed.
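
To see why the backslash matters here, the sketch below runs both strings against the pattern with and without the escape. The $subjects and $results variable names are only illustrative. Without the backslash, the dot matches any character, so "511" is accepted as well.

```php
<?php
// Compare an unescaped dot (any character) with an escaped dot (literal dot only).
$subjects = ["Chapter 5.1", "Chapter 511"];
$results = [];

foreach ($subjects as $subject) {
    $results[$subject] = [
        'unescaped' => preg_match('/\d.\d/', $subject),  // . matches any character except a newline
        'escaped'   => preg_match('/\d\.\d/', $subject), // \. matches only a literal dot
    ];
    echo $subject, ": unescaped=", $results[$subject]['unescaped'],
         ", escaped=", $results[$subject]['escaped'], "\n";
}
// Chapter 5.1: unescaped=1, escaped=1
// Chapter 511: unescaped=1, escaped=0
```

The patterns are written in single quotes so that PHP passes the backslashes to the regex engine unchanged.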

Parentheses are also special characters in regular expressions. If we want to match them literally, we must escape each one: \( and \).

$string = "function g()";
 
if (preg_match("/g\(\)/", $string, $matches)) {
  echo $matches[0]; // "g()"
} else {
  echo "null"; // This part won't be reached in this example
}

The pattern /g\(\)/ is looking for the literal string "g()". The parentheses ( and ) have special meanings in regex, so we need to escape them with a backslash \ to treat them as literal characters.
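
When the literal text contains several special characters, escaping each one by hand gets error-prone. PHP's built-in preg_quote() function adds the backslashes for you. The following is a minimal sketch; the $needle and $subject values are made up for illustration.

```php
<?php
// Hypothetical literal text to search for; it contains several special characters.
$needle  = "g() + h[0]";
$subject = "result = g() + h[0];";

// preg_quote() puts a backslash in front of every regex special character.
// The second argument names the delimiter so that it gets escaped as well.
$quoted = preg_quote($needle, '/');

echo $quoted, "\n"; // g\(\) \+ h\[0\]

if (preg_match('/' . $quoted . '/', $subject, $matches)) {
    echo $matches[0], "\n"; // g() + h[0]
} else {
    echo "null\n";
}
```

This is mostly useful when the text to match comes from a variable or from user input, where you cannot know in advance which special characters it contains.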

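Lastly, you may sometimes need to match a literal \ character. In the regular expression itself, this is written as two backslashes: \\. Because PHP string literals also use the backslash as an escape character, the pattern in the code below contains three backslashes: PHP turns "/\\\/" into /\\/ before the regex engine sees it.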

$string = "1\\2";
 
if (preg_match("/\\\/", $string, $matches)) {
  echo $matches[0]; // '\'
} else {
  echo "null"; // This part won't be reached in this example
}
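
The backslash passes through two layers of escaping: first the PHP string literal, then the regex engine. One way to make both layers explicit, shown here as a sketch rather than the only correct spelling, is to use a single-quoted string with four backslashes.

```php
<?php
// In single quotes, \2 is not an escape sequence, so this string is: 1, one backslash, 2.
$string = '1\2';

// Layer 1 (PHP string): '/\\\\/' becomes the four characters /\\/
// Layer 2 (regex engine): \\ means "one literal backslash"
$pattern = '/\\\\/';

echo strlen($string), "\n";  // 3
echo strlen($pattern), "\n"; // 4

if (preg_match($pattern, $string, $matches)) {
    echo strlen($matches[0]), "\n"; // 1 (the match is a single backslash)
} else {
    echo "null\n";
}
```

Printing the pattern with echo before passing it to preg_match() is a quick way to check what the regex engine will actually receive.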

Changing the Delimiter

The / character has no special meaning in a regular expression itself. In these examples it only acts as the delimiter that marks where the pattern starts and ends. Because of that, a / inside the pattern must be escaped if you want to match it:

$string = "/";
 
if (preg_match("/\//", $string, $matches)) {
  echo $matches[0]; // '/'
} else {
  echo "null"; // This part won't be reached in this example
}

Alternatively, you can simply change the delimiter.

$string = "/";
 
if (preg_match("#/#", $string, $matches)) {
  echo $matches[0]; // '/'
} else {
  echo "null"; // This part won't be reached in this example
}

In this example, the delimiter was changed from / to #.
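
The benefit is easier to see with a pattern that contains several slashes. The sketch below writes the same pattern with both delimiters; the URL is a placeholder chosen for illustration.

```php
<?php
// Placeholder URL, used only for illustration.
$url = "https://example.com/docs/regex";

// With / as the delimiter, every slash inside the pattern must be escaped.
$withSlashes = '/^https:\/\/[^\/]+\/docs\//';

// With # as the delimiter, the slashes can be written as-is.
$withHash = '#^https://[^/]+/docs/#';

$resultSlashes = preg_match($withSlashes, $url);
$resultHash    = preg_match($withHash, $url);

echo $resultSlashes, " ", $resultHash, "\n"; // 1 1
```

The trade-off is that the new delimiter takes over the restriction: with # as the delimiter, a literal # inside the pattern would need to be escaped instead.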

Key Takeaways

  • In regular expressions, certain characters have special meanings, such as ., *, +, ?, |, ^, $, [, ], {, }, (, ), and \. To match these characters literally, you need to escape them with a backslash \. The / character only needs escaping when it is used as the delimiter.
  • In PHP, you can use different delimiters for your regex pattern. If your pattern includes a lot of forward slashes, using an alternative delimiter like ~ can make the pattern more readable and eliminate the need to escape the slashes.
  • Escaping makes your intent explicit: an escaped character is matched literally, while an unescaped one acts as a special symbol.
