
Quantifiers

Quantifiers in regular expressions specify how many times the preceding character or pattern must occur for a match. They are useful when you want to match repeated characters or patterns in a string.

Here's a simple breakdown:

  • *: Matches 0 or more occurrences of the preceding character or pattern.
  • +: Matches 1 or more occurrences.
  • ?: Matches 0 or 1 occurrence.
  • {n}: Matches exactly n occurrences.
  • {n,m}: Matches between n and m occurrences, inclusive.
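To see the list above in action, here is a minimal sketch that runs one pattern per quantifier against the same input. The sample string and the `$patterns` and `$results` names are made up for illustration.

```php
<?php
// Hypothetical sample input: "b" preceded by zero, one, two, and three "a"s.
$samples = 'b ab aab aaab';

$patterns = [
    '*'     => '/a*b/',     // zero or more "a", then "b"
    '+'     => '/a+b/',     // one or more "a", then "b"
    '?'     => '/a?b/',     // zero or one "a", then "b"
    '{2}'   => '/a{2}b/',   // exactly two "a", then "b"
    '{1,2}' => '/a{1,2}b/', // one or two "a", then "b"
];

$results = [];
foreach ($patterns as $label => $pattern) {
    preg_match_all($pattern, $samples, $found);
    $results[$label] = $found[0]; // index 0 holds the full matches
}

print_r($results);
// *     => b, ab, aab, aaab
// +     => ab, aab, aaab
// ?     => b, ab, ab, ab
// {2}   => aab, aab
// {1,2} => ab, aab, aab
```

Note that the stricter patterns still find a match inside "aaab": the search can start at a later "a", so a{2}b matches the last three characters.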

Basic example

To find a sequence of exactly five digits in a row, you can use the pattern /\d{5}/.

$string = "I'm 12345 years old";
 
preg_match('/\d{5}/', $string, $matches);
 
print_r($matches); // Array: 12345

So, this pattern matches "12345" in the string "I'm 12345 years old".

Here's what each part means:

  • \d: Matches any digit.
  • {5}: Specifies that the preceding pattern (a digit in this case) must occur exactly 5 times.
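One caveat is worth a short sketch: {5} counts repetitions inside the match, not the length of the whole run of digits. Assuming two hypothetical inputs, one with a seven-digit number and one with a four-digit number, the pattern still matches the first five digits of the longer run and fails on the shorter one:

```php
<?php
// Hypothetical inputs, not taken from the example above.
$long  = 'Order 1234567 shipped';
$short = 'Order 1234 shipped';

// preg_match() returns 1 when the pattern matches and 0 when it does not.
$foundLong  = preg_match('/\d{5}/', $long, $longMatch);
$foundShort = preg_match('/\d{5}/', $short, $shortMatch);

echo $foundLong, ' ', $longMatch[0], "\n"; // 1 12345
echo $foundShort, "\n";                    // 0
```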

Configuring the Limit

To find numbers from 3 to 5 digits, you can use the pattern \d{3,5}:

$string = "I'm not 12, but 1234 years old";
 
preg_match('/\d{3,5}/', $string, $matches);
 
print_r($matches[0]); // "1234"

If you omit the upper limit, the pattern \d{3,} looks for sequences of digits of length 3 or more:

$string = "I'm not 12, but 345678 years old";
 
preg_match('/\d{3,}/', $string, $matches);
 
print_r($matches[0]); // "345678"

To match a sequence of one or more digits in a row, like in the string "+7(903)-123-45-67", you can use the pattern \d{1,}:

$str = "+7(903)-123-45-67";
 
preg_match_all('/\d{1,}/', $str, $numbers);
 
print_r($numbers[0]); // Array: 7, 903, 123, 45, 67

These examples illustrate how you can use quantifiers with curly braces to specify exact, minimum, or ranged quantities of characters or patterns you want to match. It's a powerful way to write more flexible and precise patterns in regular expressions.
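As a side-by-side comparison, the following sketch applies the exact, minimum, and ranged forms to one hypothetical string of digit runs. The input and variable names are chosen only for illustration.

```php
<?php
// Hypothetical input: digit runs of length 1 through 5.
$string = '1 22 333 4444 55555';

preg_match_all('/\d{2}/', $string, $exact);    // exactly two digits per match
preg_match_all('/\d{2,}/', $string, $minimum); // two or more digits per match
preg_match_all('/\d{2,4}/', $string, $ranged); // two to four digits per match

print_r($exact[0]);   // Array: 22, 33, 44, 44, 55, 55
print_r($minimum[0]); // Array: 22, 333, 4444, 55555
print_r($ranged[0]);  // Array: 22, 333, 4444, 5555
```

Longer runs are split into several matches by {2} and truncated by {2,4}, because each match takes as many digits as the quantifier allows and the search then continues after it.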

Shorthand Syntax

As mentioned earlier, quantifiers are not limited to the {} form. The most common ranges also have single-character shorthands.

One or more

The + quantifier means "one or more," the same as {1,}.

For instance, \d+ looks for numbers:

$str = "+7(903)-123-45-67";
 
preg_match_all('/\d+/', $str, $matches);
 
print_r($matches[0]); // Array: 7, 903, 123, 45, 67

The pattern \d+ will match one or more consecutive digits in the given string, returning an array with all the matched sequences.
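A quick way to check the equivalence of + and {1,} is to run both forms on the same input and compare the results. This is a minimal sketch; the variable names are illustrative.

```php
<?php
$str = '+7(903)-123-45-67';

preg_match_all('/\d+/', $str, $shorthand);  // shorthand form
preg_match_all('/\d{1,}/', $str, $braces);  // curly-brace form

print_r($shorthand[0]);                    // Array: 7, 903, 123, 45, 67
var_dump($shorthand[0] === $braces[0]);    // bool(true)
```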

Zero or one

The ? quantifier means "zero or one," the same as {0,1}. In other words, it makes the preceding character or pattern optional.

For instance, the pattern ou?r looks for 'o' followed by zero or one 'u', and then 'r'.

So, colou?r finds both "color" and "colour":

$str = "Should I write color or colour?";
 
preg_match_all('/colou?r/', $str, $matches);
 
print_r($matches[0]); // Array: color, colour

This pattern demonstrates how you can use the ? quantifier to match variations in spelling or optional parts of a pattern.
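The ? quantifier can also make a whole group optional, not just a single character. The sketch below assumes a non-capturing group, written (?:...), which has not been introduced in this lesson; the input string is hypothetical.

```php
<?php
// Hypothetical input with a short and a long month name.
$str = 'Nov 5 and November 6';

// (?:ember)? makes the whole "ember" suffix optional.
preg_match_all('/Nov(?:ember)?/', $str, $matches);

print_r($matches[0]); // Array: Nov, November
```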

Zero or more

The * quantifier in regular expressions means "zero or more," the same as {0,}. Essentially, this means that the character may repeat any number of times or not be present at all.

Here, * is used to find a digit followed by any number of zeroes (many or none):

$string = "100 10 1";
 
preg_match_all('/\d0*/', $string, $matches);
 
print_r($matches[0]); // Array: 100, 10, 1

Compare this with +, which requires at least one zero:

$string = "100 10 1";
 
preg_match_all('/\d0+/', $string, $matches);
 
print_r($matches[0]); // Array: 100, 10
// 1 is not matched, as 0+ requires at least one zero

The * quantifier provides flexibility in matching patterns where a specific character can be present in various quantities or not at all, whereas the + quantifier requires at least one occurrence of the specified character.
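One consequence of "zero or more" is that a pattern consisting only of a starred element can succeed without consuming any characters. The following sketch, using a hypothetical one-character input, shows 0* matching an empty string where 0+ finds nothing:

```php
<?php
// Hypothetical input that contains no zeroes at all.
$string = '1';

$starFound = preg_match('/0*/', $string, $star); // succeeds with an empty match
$plusFound = preg_match('/0+/', $string, $plus); // fails: needs at least one zero

var_dump($starFound, $star[0]); // int(1) string(0) ""
var_dump($plusFound);           // int(0)
```

This is why * usually appears next to a required element, as in \d0* above, rather than on its own.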

Exercises

Exercise #1

Create a regular expression to find an ellipsis: 3 (or more) dots in a row.

Here's some starter code to help you out:

$regexp = "";
 
$test = "Hello!... How goes?.....";
 
preg_match_all($regexp, $test, $matches);
 
print_r( $matches );

Exercise #2

Create a regular expression to search for valid CSS colors in hexadecimal format. This regular expression only needs to work with 6 hexadecimal characters. For example, #FFFFFF would be a valid result. The expression should also capture the # character.

Here's some starter code:

$regexp = "";
 
$test = "color:#121212; background-color:#AA00ef bad-colors:f#fddee #fd2 #12345678";
 
preg_match_all($regexp, $test, $matches);
 
print_r( $matches );

Key Takeaways

  • Quantifiers define how many times a character, group, or character class must occur in a pattern.
  • {n} matches exactly n occurrences. {n,} matches n or more occurrences. {n,m} matches between n and m occurrences.
  • One or More (+): Matches one or more occurrences of the preceding element. Equivalent to {1,}.
  • Zero or More (*): Matches zero or more occurrences of the preceding element. Equivalent to {0,}.
  • Zero or One (?): Makes the preceding element optional, matching either zero or one occurrence. Equivalent to {0,1}.
