PCRE Documentation and Change Log


As a convenience to PCRE users, and with the permission of Philip Hazel, I aim to mirror the latest PCRE documentation whenever a new version is released. To download the latest PCRE, see pcre.org.

Apart from links to various versions of the PCRE documentation, this page presents a curated list of new feature introductions in PCRE's pattern syntax, as well as links to other PCRE-related material on RexEgg.


Index

For easy navigation, here are some jumping points to various sections of the page:

Change Log
Documentation
Feature Additions to the PCRE Pattern Syntax
When PCRE precedes Perl
Links to other PCRE-related Material on RexEgg



Change Log

✽ For the latest official PCRE2 revision history (ChangeLog), follow the link, which should remain the same when new versions are released. For the official "PCRE 1" revision history (ChangeLog), follow the link, which shows all changes up to the latest version of PCRE1.

✽ For a brief, curated history of additions to the syntax, see Additions to PCRE further down.



Documentation

✽ Versions 10.0 and higher are called PCRE2. PCRE2 contains a new API, which includes a replacement function: pcre2_substitute(). The latest PCRE2 documentation should always be available at this link. If you are mostly interested in PCRE's regex syntax, the most important file in the PCRE2 documentation is the pcre2pattern man page. The pcre2api file describes the replacement syntax.

✽ Versions below 10.0, sometimes known as "PCRE 1", are the original PCRE library, still widely used but now in bug-fix mode only (no new features will be introduced). The latest "PCRE 1" documentation should always live at this link. If you are mostly interested in PCRE's regex syntax, the most important file in the "PCRE 1" documentation is the pcrepattern man page.

PCRE 10.39 documentation
PCRE 10.38 documentation
PCRE 10.37 documentation
PCRE 10.36 documentation
PCRE 10.35 documentation
PCRE 10.34 documentation
PCRE 10.33 documentation
PCRE 10.32 documentation
PCRE 10.31 documentation
PCRE 10.30 documentation
PCRE 10.23 documentation
PCRE 10.22 documentation
PCRE 10.21 documentation
PCRE 10.20 documentation
PCRE 10.10 documentation
PCRE 10.00 documentation

PCRE 8.45 documentation
PCRE 8.44 documentation
PCRE 8.43 documentation
PCRE 8.42 documentation
PCRE 8.41 documentation
PCRE 8.40 documentation
PCRE 8.39 documentation
PCRE 8.38 documentation
PCRE 8.37 documentation
PCRE 8.36 documentation
PCRE 8.35 documentation
PCRE 8.34 documentation
PCRE 8.33 documentation
PCRE 8.32 documentation
PCRE 8.31 documentation
PCRE 8.30 documentation
PCRE 8.21 documentation
PCRE 8.13 documentation
PCRE 8.02 documentation
PCRE 7.90 documentation
PCRE 6.70 documentation
PCRE 5.00 documentation
PCRE 4.50 documentation
PCRE 3.90 documentation



Feature Additions to the PCRE Pattern Syntax

This section is not the full PCRE change log. Instead, it presents the version and date when new features were added to the pattern syntax. This is a curated collection that does not claim to be exhaustive. For the full story, see the change log for PCRE and the change log for PCRE2.

Version  Date         Change
10.38    1 Oct 2021   Added PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK option to allow \K in lookarounds
10.36    4 Dec 2020   Added CET_CFLAGS option for Intel CET
10.35    9 May 2020   Added PCRE2_SUBSTITUTE_LITERAL option to turn off the interpretation of the replacement string
10.35    9 May 2020   Added PCRE2_SUBSTITUTE_MATCHED option
10.35    9 May 2020   Added PCRE2_SUBSTITUTE_REPLACEMENT_ONLY option
10.35    9 May 2020   Added (?* and (?<* as synonyms for (*napla: and (*naplb: to match another regex engine
10.34    21 Nov 2019  Added non-atomic positive lookaround via (*non_atomic_positive_lookahead:…) or (*napla:…), (*non_atomic_positive_lookbehind:…) or (*naplb:…)
10.34    21 Nov 2019  (*ACCEPT) can now be quantified: an ungreedy quantifier with a zero minimum is potentially useful
10.34    21 Nov 2019  Added pcre2_get_match_data_size() to the API
10.34    21 Nov 2019  Added pcre2_maketables_free() to the API
10.33    16 Apr 2019  Added Perl "script run" features (*script_run:…) a.k.a. (*sr:…), and (*atomic_script_run:…) a.k.a. (*asr:…)
10.33    16 Apr 2019  Added Perl 5.28 experimental alphabetic names for atomic groups and lookaround assertions, such as (*pla:…) and (*atomic:…)
10.33    16 Apr 2019  Added PCRE2_EXTRA_ESCAPED_CR_IS_LF option
10.33    16 Apr 2019  Added PCRE2_COPY_MATCHED_SUBJECT option
10.33    16 Apr 2019  Added PCRE2_EXTRA_ALT_BSUX option to support the ECMAScript 6 \u{hhh} construct
10.33    16 Apr 2019  In DOTALL mode, \p{Any} is now the same as .
10.32    10 Sep 2018  (?^) unsets all imnsx options
10.32    10 Sep 2018  (*ACCEPT:ARG), (*FAIL:ARG), and (*COMMIT:ARG) are now supported
10.30    14 Aug 2017  Added the PCRE2_LITERAL option, telling the compiler to treat the entire pattern as a literal string, including what would normally be metacharacters
10.30    14 Aug 2017  Added the PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL option, telling the compiler to treat an escaped character which isn't a proper token (such as \j) as a literal (in this case the letter j) rather than an error
10.30    14 Aug 2017  Added the PCRE2_NEWLINE_NUL option, which adds the NUL character (binary zero) to the list of characters that can be recognized as newlines, set using pcre2_set_newline()
10.30    14 Aug 2017  Added the PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES option, giving finer control over the treatment of Unicode surrogate code points
10.30    14 Aug 2017  Added the (?n) inline option to disable auto-capture, in the same way as the PCRE2_NO_AUTO_CAPTURE API option
10.30    14 Aug 2017  Added the (?xx) inline option and the PCRE2_EXTENDED_MORE API option to ignore all unescaped whitespace, including in a character class
10.30    14 Aug 2017  Added the PCRE2_ENDANCHORED option, telling the engine that the pattern can only match at the end of the subject
10.30    14 Aug 2017  Added pcre2_pattern_convert() to the API, an experimental foreign pattern conversion function
10.30    14 Aug 2017  Added pcre2_code_copy_with_tables() to the API
10.23    14 Feb 2017  Allow back-references in lookbehind so long as group names or numbers are unambiguous
10.23    14 Feb 2017  Added forward relative back-reference syntax: \g{+2} (mirroring the existing \g{-2})
10.22    29 Jul 2016  Added pcre2_code_copy() to the API
10.21    12 Jan 2016  Added the PCRE2_SUBSTITUTE_EXTENDED option to enhance replacement syntax
10.21    12 Jan 2016  Added the ${*MARK} facility to pcre2_substitute()
10.21    12 Jan 2016  Added the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option to tweak what happens during replacements when the output buffer is too small
10.21    12 Jan 2016  Added the PCRE2_SUBSTITUTE_UNKNOWN_UNSET and PCRE2_SUBSTITUTE_UNSET_EMPTY options to fine-tune how empty capture groups are treated in replacements
10.21    12 Jan 2016  Added the PCRE2_ALT_VERBNAMES option to subtly modify marked names that can be used with backtracking control verbs
10.21    12 Jan 2016  Added pcre2_set_max_pattern_length() to the API, allowing programs to restrict the size of patterns they are prepared to handle
10.20    30 Jun 2015  Added the PCRE2_ALT_CIRCUMFLEX option to allow ^ to assert position after any newline including a terminating newline
10.20    30 Jun 2015  Added the PCRE2_NEVER_BACKSLASH_C option to disable \C
10.20    30 Jun 2015  Added pcre2_callout_enumerate() to the API
10.10    6 Mar 2015   Serialization functions were added to the API
10.0     5 Jan 2015   Version check available via patterns such as (?(VERSION>=x)…)
10.0     5 Jan 2015   PCRE2_NO_DOTSTAR_ANCHOR tells the engine not to automatically anchor patterns that start with .*
10.0     5 Jan 2015   (*NOTEMPTY) and (*NOTEMPTY_ATSTART) tell the engine not to return empty matches
10.0     5 Jan 2015   By default, PCRE2 builds with Unicode support
10.0     5 Jan 2015   Name switch to PCRE2 and new API, which includes a replacement function: pcre2_substitute()
*********
8.41     5 Jul 2017   Inline comments can now be inserted within the ++ and +? quantifiers, as in a+(?# make it possessive)+ or a+(?# up to b)?b
8.34     15 Dec 2014  Added support for the POSIX [[:<:]] and [[:>:]] (left- and right-of-word boundaries), which are converted to \b(?=\w) and \b(?<=\w) internally
8.34     15 Dec 2014  Added \o{…} to specify code points in octal
8.33     28 May 2014  Added \p{Xuc} (PCRE-specific) to match characters that can be expressed using Universal Character Names
8.10     25 Jun 2010  Added PCRE-specific Unicode properties: \p{Xan} (alphanumeric), \p{Xsp} (Perl space), \p{Xps} (POSIX space) and \p{Xwd} (word)
8.10     25 Jun 2010  Added support for (*MARK:ARG) and for ARG additions to PRUNE, SKIP, and THEN
8.10     25 Jun 2010  Added \N (any character that is not a line break)
8.10     25 Jun 2010  Added the (*UCP) start of pattern modifier, which affects \b, \d, \s and \w
7.90     11 Apr 2009  Added the (*UTF8) start of pattern modifier
7.70     7 May 2008   Added Ruby-style subroutine call syntax: \g<2>, \g'name', \g'2'
7.30     28 Aug 2007  Added backtracking control verbs (*SKIP), (*FAIL), (*F), (*PRUNE), (*THEN), (*COMMIT), (*ACCEPT)
7.30     28 Aug 2007  Added the (*CR) start of pattern modifier
7.20     19 Jun 2007  Added (?-2) and (?+2) syntax for relative subroutine calls
7.20     19 Jun 2007  Added (?(-2)…) and (?(+2)…) conditional syntax to check if a relative capture group has been set
7.20     19 Jun 2007  Added \K to drop what has been matched so far from the match to be returned
7.20     19 Jun 2007  Added named back-reference synonyms: \k{foo} and \g{foo}
7.20     19 Jun 2007  Added branch reset syntax (?|…)
7.20     19 Jun 2007  Added \h and \v (and their counterclasses \H and \V) to match horizontal and vertical whitespace
7.00     19 Dec 2006  Added \R to match any Unicode newline sequence
7.00     19 Dec 2006  Added named group synonyms (?<foo>…) and (?'foo'…)
7.00     19 Dec 2006  Added named subroutine call synonym (?&foo)
7.00     19 Dec 2006  Added named back-reference synonyms \k<foo> and \k'foo'
7.00     19 Dec 2006  Added named conditional synonyms (?(<foo>)…), (?('foo')…) and (?(foo)…)
7.00     19 Dec 2006  Added pre-defined subroutines (?(DEFINE)…)
7.00     19 Dec 2006  Added conditional syntax to check if a subroutine or recursion level has been reached: (?(R2)…), (?(R&foo)…) and (?(R)…)
7.00     19 Dec 2006  Added \g2 and \g{-2} for relative back-references
6.70     4 Jul 2006   Added named groups in conditionals: (?(foo)…)
6.50     1 Feb 2006   Added support for Unicode script names via \p{Arabic}
6.00     7 Jun 2005   Added pcre_dfa_exec() to the API
6.00     7 Jun 2005   Added pcre_refcount() to the API
6.00     7 Jun 2005   Added pcre_compile2() to the API
5.00     13 Sep 2004  Added support for Unicode categories such as \p{L} and negated Unicode categories such as \P{Nd}
5.00     13 Sep 2004  Added \X Unicode grapheme token
4.00     17 Feb 2003  Added [:blank:] to match ASCII space character and tab
4.00     17 Feb 2003  Added \Q…\E escape sequence
4.00     17 Feb 2003  Added possessive quantifiers: ?+, *+, ++ and {…,…}+
4.00     17 Feb 2003  Added \C to match a single byte, even in UTF-8 mode
4.00     17 Feb 2003  Added the \G continuation anchor
4.00     17 Feb 2003  Added callouts (?C), (?C2) etc. which can be used in C but not PHP
4.00     17 Feb 2003  Added named groups (?P<foo>…), back-references (?P=foo) and subroutine calls (?P>foo)
3.30     1 Aug 2000   Added pcre_free_substring() and pcre_free_substring_list() to the API
3.00     1 Feb 2000   Added recursion (?R)
3.00     1 Feb 2000   Added POSIX classes such as [:alpha:]
3.00     1 Feb 2000   Added pcre_fullinfo() to the API
2.00     24 Sep 1998  Atomic groups (?>) can now be quantified
2.00     24 Sep 1998  Added positive lookbehind (?<=…)
2.00     24 Sep 1998  Added negative lookbehind (?<!…)
2.00     24 Sep 1998  Added non-capturing groups with inline modifiers (?imsx-imsx:)
2.00     24 Sep 1998  Added unsetting of inline modifiers: (?-imsx)
2.00     24 Sep 1998  Added conditional pattern matching (?(cond)re|re)
1.08     27 Mar 1998  Added PCRE_UNGREEDY to invert the greediness of quantifiers
1.08     27 Mar 1998  Added the inline modifier (?U) to turn on ungreedy mode
1.08     27 Mar 1998  Added the inline modifier (?X) to turn on extras mode
0.99     27 Oct 1997  Added atomic groups (?>…)
0.96     16 Oct 1997  Added DOTALL mode, including inline modifier (?s)
0.93     15 Sep 1997  Added pcre_study() to the API
0.92     11 Sep 1997  Added multiline mode via inline modifier (?m) and PCRE_MULTILINE
0.92     11 Sep 1997  Added pcre_info() to the API (removed in 8.30)
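
PCRE is a C library, so trying the entries above normally means building a program against it. As a rough sketch only, a few of the older constructs in the list (lookbehind, conditionals and scoped inline modifiers, all from version 2.00) also happen to be accepted by Python's standard re module. Python's re is a different engine from PCRE, so details may differ between the two; the variable names and sample strings below are illustrative assumptions.

```python
import re

# Positive lookbehind (?<=…): digits that directly follow a dollar sign.
price = re.compile(r"(?<=\$)\d+")

# Negative lookbehind (?<!…): digits preceded by neither "$" nor a digit.
plain = re.compile(r"(?<![$\d])\d+")

# Conditional (?(1)…): demand ")" only when group 1 captured "(".
paren = re.compile(r"^(\()?\d+(?(1)\))$")

# Scoped inline modifier (?i:…): only "pcre" ignores case (Python 3.6+).
scoped = re.compile(r"(?i:pcre)2")

text = "cost $40 for 3 items"
print(price.findall(text))              # ['40']
print(plain.findall(text))              # ['3']
print(bool(paren.match("(42)")))        # True
print(bool(paren.match("(42")))         # False
print(bool(scoped.fullmatch("PcRe2")))  # True
```

Constructs from later rows, such as \K, recursion or backtracking control verbs, are not available in Python's re, so they would need an actual PCRE build to try out.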




When PCRE precedes Perl

For the most part, PCRE tries to stay in step with Perl regex syntax, but the two engines' behaviors are not always identical. As is bound to happen in communities with many active users, an idea sometimes makes it into the PCRE engine before Perl adopts it. This kind of friendly exchange is a good thing for all regexers. Parochial "not invented here" postures wouldn't serve us: we just want the best regex engines.

Here are examples of features where PCRE preceded Perl:

✽ Recursion was first implemented in PCRE by a contributor and appeared in version 3.0 (February 2000). Perl introduced recursion in version 5.10 (officially released in December 2007), which explains why certain details function differently in the two engines.

✽ PCRE implemented Python's named group syntax (?P<foo>…) in version 4.0 (February 2003). Perl started supporting named groups in version 5.10 (officially released in December 2007).
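
Because the (?P<foo>…) syntax comes from Python, it can be tried directly in Python's standard re module. The sketch below shows a named group together with the matching named back-reference (?P=foo). It runs on Python's engine rather than on PCRE, and the pattern, the DATE name and the parse_date helper are illustrative assumptions.

```python
import re

# Named groups (?P<name>…) plus a named back-reference (?P=sep), which
# forces the second separator to be the same character as the first.
DATE = re.compile(
    r"(?P<year>\d{4})(?P<sep>[-/])(?P<month>\d{2})(?P=sep)(?P<day>\d{2})"
)

def parse_date(text):
    """Return the named groups as a dict, or None when nothing matches."""
    m = DATE.search(text)
    return m.groupdict() if m else None

print(parse_date("released 2007-12-18"))
# {'year': '2007', 'sep': '-', 'month': '12', 'day': '18'}
print(parse_date("mixed 2007-12/18"))
# None (the separators differ, so the back-reference fails)
```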



Links to other PCRE-related Material on RexEgg

PCRE-related material is peppered throughout the site. Below, I try to maintain a list of the most significant "PCRE pockets" on the site.

✽ Reducing (?…) Syntax Confusion explains all the (?…) syntax. Other points of PCRE syntax can be found on the pages about anchors, boundaries, capture groups and others (see the "Black Belt Program") in the left-side menu at the top of the page.

✽ The page on flags and modifiers has a section about PCRE's Special Start-of-Pattern Modifiers.

✽ I've implemented an infinite lookbehind demo for PCRE.

✽ pcregrep and pcretest presents two PCRE-specific tools and includes the latest Windows binaries.

✽ My page on backtracking control verbs shows useful constructs such as (*SKIP)(*FAIL).

✽ The PHP regex page shows the PHP interface to the PCRE engine.

✽ The trick about matching line numbers shows an interesting example of self-referencing groups and of recursion.

✽ The trick about matching numbers in plain English shows a full-scale example of how (?(DEFINE)…) can be used to produce modular, maintainable patterns.




