Quantifiers
Quantifiers in regular expressions are symbols that tell you how many times a certain pattern or character must appear. They are useful when you want to match repeated characters or patterns in a string.
Here's a simple breakdown:
- *: Matches 0 or more occurrences of the preceding character or pattern.
- +: Matches 1 or more occurrences.
- ?: Matches 0 or 1 occurrence.
- {n}: Matches exactly n occurrences.
- {n,m}: Matches between n and m occurrences.
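To see the five quantifiers side by side, here is a minimal sketch that runs each one against the same set of strings. The test strings and the $patterns/$results names are made up for illustration, and the ^ and $ anchors (not covered in this section) are assumed here only to force each pattern to match the whole string.

```php
<?php
// Hypothetical test strings: 'a', then zero to three 'b's, then 'c'.
$subjects = ['ac', 'abc', 'abbc', 'abbbc'];

// One pattern per quantifier, each applied to the letter 'b'.
// The ^ and $ anchors make the pattern match the whole string.
$patterns = [
    'b*'     => '/^ab*c$/',     // zero or more 'b'
    'b+'     => '/^ab+c$/',     // one or more 'b'
    'b?'     => '/^ab?c$/',     // zero or one 'b'
    'b{2}'   => '/^ab{2}c$/',   // exactly two 'b'
    'b{2,3}' => '/^ab{2,3}c$/', // two or three 'b'
];

$results = [];
foreach ($patterns as $label => $pattern) {
    // preg_grep() keeps the subjects that match; array_values() reindexes them.
    $results[$label] = array_values(preg_grep($pattern, $subjects));
    echo $label, ' matches: ', implode(', ', $results[$label]), PHP_EOL;
}
```

Only b* and b? accept "ac", because they are the only two quantifiers here that allow zero occurrences.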
Basic example
If you want to find five digits in a row, you can use the pattern /\d{5}/.
$string = "I'm 12345 years old";
preg_match('/\d{5}/', $string, $matches);
print_r($matches); // Array: 12345
So, this pattern matches "12345" in the string "I'm 12345 years old".
Here's what each part means:
- \d: Matches any digit.
- {5}: Specifies that the preceding pattern (a digit in this case) must occur exactly 5 times.
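One subtlety is worth a quick check: {5} fixes how many digits the pattern itself consumes, but it does not stop the match from sitting inside a longer run of digits. The following is a small sketch with a made-up input string. The variable names are illustrative only.

```php
<?php
// Hypothetical input containing a seven-digit run.
$string = "Order 1234567 shipped";

// \d{5} consumes exactly five digits, so it matches the first five of the run.
preg_match('/\d{5}/', $string, $matches);
$firstFive = $matches[0];
echo $firstFive, PHP_EOL;

// \d{8} needs eight digits in a row; the run has only seven, so nothing matches.
$found8 = preg_match('/\d{8}/', $string, $tooLong);
echo $found8 === 0 ? 'no 8-digit match' : 'matched', PHP_EOL;
```

If the surrounding digits matter, the pattern needs extra context around the quantifier, which is outside the scope of this section.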
Configuring the Limit
To find numbers that are 3 to 5 digits long, you can use the pattern \d{3,5}:
$string = "I'm not 12, but 1234 years old";
preg_match('/\d{3,5}/', $string, $matches);
print_r($matches[0]); // "1234"
If you omit the upper limit, the pattern \d{3,} looks for sequences of digits of length 3 or more:
$string = "I'm not 12, but 345678 years old";
preg_match('/\d{3,}/', $string, $matches);
print_r($matches[0]); // "345678"
To match a sequence of one or more digits in a row, like in the string "+7(903)-123-45-67", you can use the pattern \d{1,}:
$str = "+7(903)-123-45-67";
preg_match_all('/\d{1,}/', $str, $numbers);
print_r($numbers[0]); // Array: 7, 903, 123, 45, 67
These examples illustrate how you can use quantifiers with curly braces to specify exact, minimum, or ranged quantities of characters or patterns you want to match. It's a powerful way to write more flexible and precise patterns in regular expressions.
Shorthand Syntax
As mentioned earlier, we're not limited to setting quantifiers with {}. There are also shorthand characters to represent limits.
One or more
The + quantifier means "one or more," the same as {1,}.
For instance, \d+ looks for numbers:
$str = "+7(903)-123-45-67";
preg_match_all('/\d+/', $str, $matches);
print_r($matches[0]); // Array: 7, 903, 123, 45, 67
The pattern \d+ will match one or more consecutive digits in the given string, returning an array with all the matched sequences.
Zero or one
The ? quantifier means "zero or one," the same as {0,1}. In other words, it makes the preceding character optional.
For instance, the pattern ou?r looks for 'o' followed by zero or one 'u', and then 'r'.
So, colou?r finds both "color" and "colour":
$str = "Should I write color or colour?";
preg_match_all('/colou?r/', $str, $matches);
print_r($matches[0]); // Array: color, colour
This pattern demonstrates how you can use the ? quantifier to match variations in spelling or optional parts of a pattern.
Zero or more
The * quantifier in regular expressions means "zero or more," the same as {0,}. Essentially, this means that the character may repeat any number of times or not be present at all.
Using * to look for a digit followed by any number of zeroes (may be many or none):
$string = "100 10 1";
preg_match_all('/\d0*/', $string, $matches);
print_r($matches[0]); // Array: 100, 10, 1
Compare it with +, which requires one or more of the specified character:
$string = "100 10 1";
preg_match_all('/\d0+/', $string, $matches);
print_r($matches[0]); // Array: 100, 10
// 1 is not matched, as 0+ requires at least one zero
The * quantifier provides flexibility in matching patterns where a specific character can be present in various quantities or not at all, whereas the + quantifier requires at least one occurrence of the specified character.
Exercises
Exercise #1
Create a regular expression to find an ellipsis: 3 (or more) dots in a row.
Here's some starter code to help you out:
$regexp = "";
$test = "Hello!... How goes?.....";
preg_match_all($regexp, $test, $matches);
print_r( $matches );
Exercise #2
Create a regular expression to search for valid CSS colors in hexadecimal format. This regular expression only needs to work with 6 hexadecimal characters. For example, #FFFFFF would be a valid result. The expression should also capture the # character.
Here's some starter code:
$regexp = "";
$test = "color:#121212; background-color:#AA00ef bad-colors:f#fddee #fd2 #12345678";
preg_match_all($regexp, $test, $matches);
print_r( $matches );
Key Takeaways
- Quantifiers define how many times a character, group, or character class must occur in a pattern.
- {n} matches exactly n occurrences.
- {n,} matches n or more occurrences.
- {n,m} matches between n and m occurrences.
- One or More (+): Matches one or more occurrences of the preceding element. Equivalent to {1,}.
- Zero or More (*): Matches zero or more occurrences of the preceding element. Equivalent to {0,}.
- Zero or One (?): Makes the preceding element optional, matching either zero or one occurrence. Equivalent to {0,1}.
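The shorthand equivalences can be checked directly. The sketch below reuses the sample strings from this lesson and compares each shorthand pattern with its curly-brace form. The all_matches() helper and the $pairs/$allEqual names are hypothetical, introduced only for this illustration, and the array destructuring in foreach assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper for this sketch: returns every full match of $pattern in $subject.
function all_matches(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $matches);
    return $matches[0];
}

// Each row: shorthand pattern, equivalent curly-brace pattern, sample string.
$pairs = [
    ['/\d+/',     '/\d{1,}/',      '+7(903)-123-45-67'],
    ['/\d0*/',    '/\d0{0,}/',     '100 10 1'],
    ['/colou?r/', '/colou{0,1}r/', 'Should I write color or colour?'],
];

$allEqual = true;
foreach ($pairs as [$short, $long, $subject]) {
    // Both forms should produce identical lists of matches.
    $same = all_matches($short, $subject) === all_matches($long, $subject);
    $allEqual = $allEqual && $same;
    echo $short, ' vs ', $long, ': ', $same ? 'same matches' : 'different matches', PHP_EOL;
}
```

Since both forms behave the same on these inputs, the shorthand is usually the more readable choice, with curly braces reserved for limits that have no shorthand.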