Changes to grep/manual/html_node/Problematic-Expressions.html,v


From: Jim Meyering
Subject: Changes to grep/manual/html_node/Problematic-Expressions.html,v
Date: Wed, 22 Mar 2023 22:55:26 -0400 (EDT)

CVSROOT:        /webcvs/grep
Module name:    grep
Changes by:     Jim Meyering <meyering> 23/03/22 22:55:22

Index: html_node/Problematic-Expressions.html
===================================================================
RCS file: /webcvs/grep/grep/manual/html_node/Problematic-Expressions.html,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -b -r1.1 -r1.2
--- html_node/Problematic-Expressions.html      3 Sep 2022 19:33:14 -0000       1.1
+++ html_node/Problematic-Expressions.html      23 Mar 2023 02:55:21 -0000      1.2
@@ -1,11 +1,11 @@
-<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<!DOCTYPE html>
 <html>
-<!-- Created by GNU Texinfo 6.8, https://www.gnu.org/software/texinfo/ -->
+<!-- Created by GNU Texinfo 7.0dev, https://www.gnu.org/software/texinfo/ -->
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
 <!-- This manual is for grep, a pattern matching engine.
 
-Copyright (C) 1999-2002, 2005, 2008-2022 Free Software Foundation,
+Copyright © 1999-2002, 2005, 2008-2023 Free Software Foundation,
 Inc.
 
 Permission is granted to copy, distribute and/or modify this document
@@ -14,10 +14,10 @@
 Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
 Texts.  A copy of the license is included in the section entitled
 "GNU Free Documentation License". -->
-<title>Problematic Expressions (GNU Grep 3.8)</title>
+<title>Problematic Expressions (GNU Grep 3.10)</title>
 
-<meta name="description" content="Problematic Expressions (GNU Grep 3.8)">
-<meta name="keywords" content="Problematic Expressions (GNU Grep 3.8)">
+<meta name="description" content="Problematic Expressions (GNU Grep 3.10)">
+<meta name="keywords" content="Problematic Expressions (GNU Grep 3.10)">
 <meta name="resource-type" content="document">
 <meta name="distribution" content="global">
 <meta name="Generator" content="makeinfo">
@@ -31,21 +31,9 @@
 <link href="Basic-vs-Extended.html" rel="prev" title="Basic vs Extended">
 <style type="text/css">
 <!--
-a.copiable-anchor {visibility: hidden; text-decoration: none; line-height: 0em}
-a.summary-letter {text-decoration: none}
-blockquote.indentedblock {margin-right: 0em}
-div.display {margin-left: 3.2em}
-div.example {margin-left: 3.2em}
-kbd {font-style: oblique}
-pre.display {font-family: inherit}
-pre.format {font-family: inherit}
-pre.menu-comment {font-family: serif}
-pre.menu-preformatted {font-family: serif}
-span.nolinebreak {white-space: nowrap}
-span.roman {font-family: initial; font-weight: normal}
-span.sansserif {font-family: sans-serif; font-weight: normal}
-span:hover a.copiable-anchor {visibility: visible}
-ul.no-bullet {list-style: none}
+a.copiable-link {visibility: hidden; text-decoration: none; line-height: 0em}
+span:hover a.copiable-link {visibility: visible}
+ul.mark-bullet {list-style-type: disc}
 -->
 </style>
 <link rel="stylesheet" type="text/css" href="https://www.gnu.org/software/gnulib/manual.css">
@@ -54,139 +42,139 @@
 </head>
 
 <body lang="en">
-<div class="section" id="Problematic-Expressions">
-<div class="header">
+<div class="section-level-extent" id="Problematic-Expressions">
+<div class="nav-panel">
 <p>
 Next: <a href="Character-Encoding.html" accesskey="n" rel="next">Character Encoding</a>, Previous: <a href="Basic-vs-Extended.html" accesskey="p" rel="prev">Basic vs Extended Regular Expressions</a>, Up: <a href="Regular-Expressions.html" accesskey="u" rel="up">Regular Expressions</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
 </div>
 <hr>
-<span id="Problematic-Regular-Expressions"></span><h3 class="section">3.7 Problematic Regular Expressions</h3>
+<h3 class="section" id="Problematic-Regular-Expressions"><span>3.7 Problematic Regular Expressions<a class="copiable-link" href="#Problematic-Regular-Expressions"> &para;</a></span></h3>
 
-<span id="index-invalid-regular-expressions"></span>
-<span id="index-unspecified-behavior-in-regular-expressions"></span>
-<p>Some strings are <em>invalid regular expressions</em> and cause
-<code>grep</code> to issue a diagnostic and fail.  For example, &lsquo;<samp>xy\1</samp>&rsquo;
+<a class="index-entry-id" id="index-invalid-regular-expressions"></a>
+<a class="index-entry-id" id="index-unspecified-behavior-in-regular-expressions"></a>
+<p>Some strings are <em class="dfn">invalid regular expressions</em> and cause
+<code class="command">grep</code> to issue a diagnostic and fail.  For example, &lsquo;<samp class="samp">xy\1</samp>&rsquo;
 is invalid because there is no parenthesized subexpression for the
-back-reference &lsquo;<samp>\1</samp>&rsquo; to refer to.
+back-reference &lsquo;<samp class="samp">\1</samp>&rsquo; to refer to.
 </p>
-<p>Also, some regular expressions have <em>unspecified behavior</em> and
-should be avoided even if <code>grep</code> does not currently diagnose
-them.  For example, &lsquo;<samp>xy\0</samp>&rsquo; has unspecified behavior because
-&lsquo;<samp>0</samp>&rsquo; is not a special character and &lsquo;<samp>\0</samp>&rsquo; is not a special
-backslash expression (see <a href="Special-Backslash-Expressions.html">Special Backslash Expressions</a>).
+<p>Also, some regular expressions have <em class="dfn">unspecified behavior</em> and
+should be avoided even if <code class="command">grep</code> does not currently diagnose
+them.  For example, &lsquo;<samp class="samp">xy\0</samp>&rsquo; has unspecified behavior because
+&lsquo;<samp class="samp">0</samp>&rsquo; is not a special character and &lsquo;<samp class="samp">\0</samp>&rsquo; is not a special
+backslash expression (see <a class="pxref" href="Special-Backslash-Expressions.html">Special Backslash Expressions</a>).
 Unspecified behavior can be particularly problematic because the set
 of matched strings might be only partially specified, or not be
 specified at all, or the expression might even be invalid.
 </p>
 <p>The following regular expression constructs are invalid on all
 platforms conforming to POSIX, so portable scripts can assume that
-<code>grep</code> rejects these constructs:
+<code class="command">grep</code> rejects these constructs:
 </p>
-<ul>
-<li> A basic regular expression containing a back-reference &lsquo;<samp>\<var>n</var></samp>&rsquo;
-preceded by fewer than <var>n</var> closing parentheses.  For example,
-&lsquo;<samp>\(a\)\2</samp>&rsquo; is invalid.
-
-</li><li> A bracket expression containing &lsquo;<samp>[:</samp>&rsquo; that does not start a
-character class; and similarly for &lsquo;<samp>[=</samp>&rsquo; and &lsquo;<samp>[.</samp>&rsquo;.  For
-example, &lsquo;<samp>[a[:b]</samp>&rsquo; and &lsquo;<samp>[a[:ouch:]b]</samp>&rsquo; are invalid.
+<ul class="itemize mark-bullet">
+<li>A basic regular expression containing a back-reference &lsquo;<samp class="samp">\<var class="var">n</var></samp>&rsquo;
+preceded by fewer than <var class="var">n</var> closing parentheses.  For example,
+&lsquo;<samp class="samp">\(a\)\2</samp>&rsquo; is invalid.
+
+</li><li>A bracket expression containing &lsquo;<samp class="samp">[:</samp>&rsquo; that does not start a
+character class; and similarly for &lsquo;<samp class="samp">[=</samp>&rsquo; and &lsquo;<samp class="samp">[.</samp>&rsquo;.  For
+example, &lsquo;<samp class="samp">[a[:b]</samp>&rsquo; and &lsquo;<samp class="samp">[a[:ouch:]b]</samp>&rsquo; are invalid.
 </li></ul>
 
-<p>GNU <code>grep</code> treats the following constructs as invalid.
-However, other <code>grep</code> implementations might allow them, so
+<p>GNU <code class="command">grep</code> treats the following constructs as invalid.
+However, other <code class="command">grep</code> implementations might allow them, so
 portable scripts should not rely on their being invalid:
 </p>
-<ul>
-<li> Unescaped &lsquo;<samp>\</samp>&rsquo; at the end of a regular expression.
+<ul class="itemize mark-bullet">
+<li>Unescaped &lsquo;<samp class="samp">\</samp>&rsquo; at the end of a regular expression.
 
-</li><li> Unescaped &lsquo;<samp>[</samp>&rsquo; that does not start a bracket expression.
+</li><li>Unescaped &lsquo;<samp class="samp">[</samp>&rsquo; that does not start a bracket expression.
 
-</li><li> A &lsquo;<samp>\{</samp>&rsquo; in a basic regular expression that does not start an
+</li><li>A &lsquo;<samp class="samp">\{</samp>&rsquo; in a basic regular expression that does not start an
 interval expression.
 
-</li><li> A basic regular expression with unbalanced &lsquo;<samp>\(</samp>&rsquo; or &lsquo;<samp>\)</samp>&rsquo;,
-or an extended regular expression with unbalanced &lsquo;<samp>(</samp>&rsquo;.
+</li><li>A basic regular expression with unbalanced &lsquo;<samp class="samp">\(</samp>&rsquo; or &lsquo;<samp class="samp">\)</samp>&rsquo;,
+or an extended regular expression with unbalanced &lsquo;<samp class="samp">(</samp>&rsquo;.
 
-</li><li> In the POSIX locale, a range expression like &lsquo;<samp>z-a</samp>&rsquo; that
-represents zero elements.  A non-GNU <code>grep</code> might treat it as
+</li><li>In the POSIX locale, a range expression like &lsquo;<samp class="samp">z-a</samp>&rsquo; that
+represents zero elements.  A non-GNU <code class="command">grep</code> might treat it as
 a valid range that never matches.
 
-</li><li> An interval expression with a repetition count greater than 32767.
+</li><li>An interval expression with a repetition count greater than 32767.
 (The portable POSIX limit is 255, and even interval expressions with
 smaller counts can be impractically slow on all known implementations.)
 
-</li><li> A bracket expression that contains at least three elements, the first
-and last of which are both &lsquo;<samp>:</samp>&rsquo;, or both &lsquo;<samp>.</samp>&rsquo;, or both
-&lsquo;<samp>=</samp>&rsquo;.  For example, a non-GNU <code>grep</code> might treat
-&lsquo;<samp>[:alpha:]</samp>&rsquo; like &lsquo;<samp>[[:alpha:]]</samp>&rsquo;, or like &lsquo;<samp>[:ahlp]</samp>&rsquo;.
+</li><li>A bracket expression that contains at least three elements, the first
+and last of which are both &lsquo;<samp class="samp">:</samp>&rsquo;, or both &lsquo;<samp class="samp">.</samp>&rsquo;, or both
+&lsquo;<samp class="samp">=</samp>&rsquo;.  For example, a non-GNU <code class="command">grep</code> might treat
+&lsquo;<samp class="samp">[:alpha:]</samp>&rsquo; like &lsquo;<samp class="samp">[[:alpha:]]</samp>&rsquo;, or like &lsquo;<samp class="samp">[:ahlp]</samp>&rsquo;.
 </li></ul>
 
 <p>The following constructs have well-defined behavior in GNU
-<code>grep</code>.  However, they have unspecified behavior elsewhere, so
+<code class="command">grep</code>.  However, they have unspecified behavior elsewhere, so
 portable scripts should avoid them:
 </p>
-<ul>
-<li> Special backslash expressions like &lsquo;<samp>\b</samp>&rsquo;, &lsquo;<samp>\&lt;</samp>&rsquo;, and &lsquo;<samp>\]</samp>&rsquo;.
-See <a href="Special-Backslash-Expressions.html">Special Backslash Expressions</a>.
+<ul class="itemize mark-bullet">
+<li>Special backslash expressions like &lsquo;<samp class="samp">\b</samp>&rsquo;, &lsquo;<samp class="samp">\&lt;</samp>&rsquo;, and &lsquo;<samp class="samp">\]</samp>&rsquo;.
+See <a class="xref" href="Special-Backslash-Expressions.html">Special Backslash Expressions</a>.
 
-</li><li> A basic regular expression that uses &lsquo;<samp>\?</samp>&rsquo;, &lsquo;<samp>\+</samp>&rsquo;, or &lsquo;<samp>\|</samp>&rsquo;.
+</li><li>A basic regular expression that uses &lsquo;<samp class="samp">\?</samp>&rsquo;, &lsquo;<samp class="samp">\+</samp>&rsquo;, or &lsquo;<samp class="samp">\|</samp>&rsquo;.
 
-</li><li> An extended regular expression that uses back-references.
+</li><li>An extended regular expression that uses back-references.
 
-</li><li> An empty regular expression, subexpression, or alternative.  For
-example, &lsquo;<samp>(a|bc|)</samp>&rsquo; is not portable; a portable equivalent is
-&lsquo;<samp>(a|bc)?</samp>&rsquo;.
+</li><li>An empty regular expression, subexpression, or alternative.  For
+example, &lsquo;<samp class="samp">(a|bc|)</samp>&rsquo; is not portable; a portable equivalent is
+&lsquo;<samp class="samp">(a|bc)?</samp>&rsquo;.
 
-</li><li> In a basic regular expression, an anchoring &lsquo;<samp>^</samp>&rsquo; that appears
-directly after &lsquo;<samp>\(</samp>&rsquo;, or an anchoring &lsquo;<samp>$</samp>&rsquo; that appears
-directly before &lsquo;<samp>\)</samp>&rsquo;.
+</li><li>In a basic regular expression, an anchoring &lsquo;<samp class="samp">^</samp>&rsquo; that appears
+directly after &lsquo;<samp class="samp">\(</samp>&rsquo;, or an anchoring &lsquo;<samp class="samp">$</samp>&rsquo; that appears
+directly before &lsquo;<samp class="samp">\)</samp>&rsquo;.
 
-</li><li> In a basic regular expression, a repetition operator that
+</li><li>In a basic regular expression, a repetition operator that
 directly follows another repetition operator.
 
-</li><li> In an extended regular expression, unescaped &lsquo;<samp>{</samp>&rsquo;
+</li><li>In an extended regular expression, unescaped &lsquo;<samp class="samp">{</samp>&rsquo;
 that does not begin a valid interval expression.
-GNU <code>grep</code> treats the &lsquo;<samp>{</samp>&rsquo; as an ordinary character.
+GNU <code class="command">grep</code> treats the &lsquo;<samp class="samp">{</samp>&rsquo; as an ordinary character.
 
-</li><li> A null character or an encoding error in either pattern or input data.
-See <a href="Character-Encoding.html">Character Encoding</a>.
+</li><li>A null character or an encoding error in either pattern or input data.
+See <a class="xref" href="Character-Encoding.html">Character Encoding</a>.
 
-</li><li> An input file that ends in a non-newline character,
-where GNU <code>grep</code> silently supplies a newline.
+</li><li>An input file that ends in a non-newline character,
+where GNU <code class="command">grep</code> silently supplies a newline.
 </li></ul>
 
 <p>The following constructs have unspecified behavior, in both GNU
-and other <code>grep</code> implementations.  Scripts should avoid
+and other <code class="command">grep</code> implementations.  Scripts should avoid
 them whenever possible.
 </p>
-<ul>
-<li> A backslash escaping an ordinary character, unless it is a
-back-reference like &lsquo;<samp>\1</samp>&rsquo; or a special backslash expression like
-&lsquo;<samp>\&lt;</samp>&rsquo; or &lsquo;<samp>\b</samp>&rsquo;.  See <a href="Special-Backslash-Expressions.html">Special Backslash Expressions</a>.  For
-example, &lsquo;<samp>\x</samp>&rsquo; has unspecified behavior now, and a future version
-of <code>grep</code> might specify &lsquo;<samp>\x</samp>&rsquo; to have a new behavior.
+<ul class="itemize mark-bullet">
+<li>A backslash escaping an ordinary character, unless it is a
+back-reference like &lsquo;<samp class="samp">\1</samp>&rsquo; or a special backslash expression like
+&lsquo;<samp class="samp">\&lt;</samp>&rsquo; or &lsquo;<samp class="samp">\b</samp>&rsquo;.  See <a class="xref" href="Special-Backslash-Expressions.html">Special Backslash Expressions</a>.  For
+example, &lsquo;<samp class="samp">\x</samp>&rsquo; has unspecified behavior now, and a future version
+of <code class="command">grep</code> might specify &lsquo;<samp class="samp">\x</samp>&rsquo; to have a new behavior.
 
-</li><li> A repetition operator that appears directly after an anchor, or at the
+</li><li>A repetition operator that appears directly after an anchor, or at the
 start of a complete regular expression, parenthesized subexpression,
-or alternative.  For example, &lsquo;<samp>+|^*(+a|?-b)</samp>&rsquo; has unspecified
-behavior, whereas &lsquo;<samp>\+|^\*(\+a|\?-b)</samp>&rsquo; is portable.
+or alternative.  For example, &lsquo;<samp class="samp">+|^*(+a|?-b)</samp>&rsquo; has unspecified
+behavior, whereas &lsquo;<samp class="samp">\+|^\*(\+a|\?-b)</samp>&rsquo; is portable.
 
-</li><li> A range expression outside the POSIX locale.  For example, in some
-locales &lsquo;<samp>[a-z]</samp>&rsquo; might match some characters that are not
+</li><li>A range expression outside the POSIX locale.  For example, in some
+locales &lsquo;<samp class="samp">[a-z]</samp>&rsquo; might match some characters that are not
 lowercase letters, or might not match some lowercase letters, or might
-be invalid.  With GNU <code>grep</code> it is not documented whether
+be invalid.  With GNU <code class="command">grep</code> it is not documented whether
 these range expressions use native code points, or use the collating
-sequence specified by the <code>LC_COLLATE</code> category, or have some
+sequence specified by the <code class="env">LC_COLLATE</code> category, or have some
 other interpretation.  Outside the POSIX locale, it is portable to use
-&lsquo;<samp>[[:lower:]]</samp>&rsquo; to match a lower-case letter, or
-&lsquo;<samp>[abcdefghijklmnopqrstuvwxyz]</samp>&rsquo; to match an ASCII lower-case
+&lsquo;<samp class="samp">[[:lower:]]</samp>&rsquo; to match a lower-case letter, or
+&lsquo;<samp class="samp">[abcdefghijklmnopqrstuvwxyz]</samp>&rsquo; to match an ASCII lower-case
 letter.
 
 </li></ul>
 
 </div>
 <hr>
-<div class="header">
+<div class="nav-panel">
 <p>
 Next: <a href="Character-Encoding.html">Character Encoding</a>, Previous: <a href="Basic-vs-Extended.html">Basic vs Extended Regular Expressions</a>, Up: <a href="Regular-Expressions.html">Regular Expressions</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
 </div>
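
Not part of the commit: a minimal shell sketch of three points from the manual text in the diff above. The manual says a back-reference such as \(a\)\2 is invalid on all POSIX-conforming platforms, that (a|bc)? is the portable form of (a|bc|), and that [[:lower:]] is portable where [a-z] is not. The helper names bre_status and ere_status are hypothetical, introduced only for illustration. The sketch assumes a grep that follows the POSIX exit-status convention (0 for a match, 1 for no match, greater than 1 for an error).

```shell
#!/bin/sh
# Hypothetical helpers (not part of grep): feed one line to grep and
# print grep's exit status instead of its output.
#   0  = the line matched
#   1  = the line did not match
#   >1 = grep reported an error, such as an invalid pattern

# bre_status PATTERN LINE: treat PATTERN as a basic regular expression.
bre_status() {
    printf '%s\n' "$2" | grep -e "$1" >/dev/null 2>&1
    echo "$?"
}

# ere_status PATTERN LINE: treat PATTERN as an extended regular expression.
ere_status() {
    printf '%s\n' "$2" | grep -E -e "$1" >/dev/null 2>&1
    echo "$?"
}

# Back-reference \2 preceded by only one closing parenthesis:
# the manual lists this as invalid on all POSIX-conforming platforms.
bre_status '\(a\)\2' 'aa'

# Portable replacement for the empty alternative in '(a|bc|)'.
ere_status '^(a|bc)?x$' 'x'

# Character class instead of the locale-dependent range '[a-z]'.
bre_status '[[:lower:]]' 'q'
```

Checking the exit status rather than the diagnostic text keeps the sketch independent of the wording of any particular grep's error messages.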


