bug-gnu-utils

Re: diff style regexps?


From: Eric Blake
Subject: Re: diff style regexps?
Date: Thu, 19 Jun 2008 18:29:26 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080421 Thunderbird/2.0.0.14 Mnenhy/0.7.5.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to address@hidden on 6/19/2008 2:45 PM:
| In (info "(diff)Comparing Directories"):
|   Unlike in the shell, a period at the start of the base of a file name
|   matches a wildcard at the start of a pattern.
|
| OK, but in my brain there are
| 1. shell glob REGEXPs

Globs are not regexps.  They serve a similar function, providing a
pattern that can match multiple strings, but globs are much more limited
in what they can match when compared to regular expressions.  Globs are
often referred to as wildcards or file name patterns.  And while all globs
can be rewritten as regexps, not all regexps can be written as globs.
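
As a rough illustration of that one-way relationship, here is a minimal
Python sketch (Python is used only for convenience; it is not what the shell
or diff does internally).  The standard fnmatch.translate function rewrites
a glob as a regexp; the pattern and file names are made up for the example.

```python
import fnmatch
import re

# Rewrite a glob as an equivalent regular expression.
glob_pattern = "*.txt"
regex = re.compile(fnmatch.translate(glob_pattern))

# The translated regexp accepts exactly the names the glob accepts.
for name in ["notes.txt", "notes.text", ".hidden.txt"]:
    glob_says = fnmatch.fnmatchcase(name, glob_pattern)
    regex_says = regex.match(name) is not None
    print(name, glob_says, regex_says)

# The reverse does not hold: a plain glob has no repetition operator,
# so "one or more digits" can only be written as a regexp.
digits = re.compile(r"[0-9]+")
print(bool(digits.fullmatch("2008")), bool(digits.fullmatch("20a8")))
```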

| 2. perl/egrep/sed REGEXPs (perl needs less backslashes that the other two)

There are two basic styles of true regexps standardized (and then various
tweaks and extensions that differ among various apps that do regexp).  BRE
(basic regular expression) is the kind used by grep and sed, and needs
more backslashes to access some of the useful operators (and even then,
operators like \? are extensions not required by POSIX).  ERE (extended
regular expression) is the kind used by egrep and perl, and has more standard
operators (for example, ?).  In both flavors of regexp, the term wildcard
is often used to refer to the operator ., which matches any single character
(yes, it is confusing that wildcard can mean both an entire glob and a
single regexp operator).

| Are you saying you have now invented a third hybrid? Please clarify
| the sentence.

When diff refers to "wildcards", it is referring to globs.  This is not a
different hybrid of regular expression, since globs aren't regular
expressions.  What IS being said is that while the shell's file name matching
against the glob "*" generally ignores files with a leading dot (and you
have to use ".*" instead), diff's file name matching happens to match
files with a leading dot, without you having to specify an explicit
leading . in the glob.
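
The difference can be sketched in Python, under the assumption that its
standard modules are fair stand-ins: glob.glob follows the shell convention
of skipping leading-dot names, while fnmatch.fnmatch gives a leading dot no
special treatment, which is analogous to the diff behavior described above.
This is only an analogy, not diff's actual code, and the file names are
hypothetical.

```python
import fnmatch
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one ordinary file and one file with a leading dot.
    for name in ("visible.txt", ".hidden"):
        open(os.path.join(tmp, name), "w").close()

    # Shell-like matching: "*" skips names that start with a dot.
    shell_like = sorted(
        os.path.basename(path) for path in glob.glob(os.path.join(tmp, "*"))
    )

    # diff-like matching (by analogy): "*" also matches a leading dot.
    diff_like = sorted(
        name for name in os.listdir(tmp) if fnmatch.fnmatch(name, "*")
    )

print(shell_like)
print(diff_like)
```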

| It seems you are hinting that you are using perl REGEXPs when maybe
| you only mean for character 1.
| Better say explicitly what . will do in other positions, else that's
| what we think you are hinting.

diff uses globs, not regular expressions.  In other positions within a
glob, . always matches itself.  When dealing with globs, the only special
treatment of . is whether or not it matches leading dots by default.
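
To make the contrast concrete, a small Python sketch (again only an
illustration, with made-up names) treats the same string "a.c" first as a
glob and then as a regexp:

```python
import fnmatch
import re

pattern = "a.c"
names = ["a.c", "abc"]

# As a glob, "." is an ordinary character and matches only itself.
glob_matches = [n for n in names if fnmatch.fnmatchcase(n, pattern)]

# As a regexp, "." is an operator that matches any single character.
regex_matches = [n for n in names if re.fullmatch(pattern, n)]

print(glob_matches)
print(regex_matches)
```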

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkha+eYACgkQ84KuGfSFAYDjpQCgtZjPYJblPiAAKUTCeYQKk7uL
AtMAoLRlM/M7nPkXpSTJhSghEf97eXFt
=0Rv8
-----END PGP SIGNATURE-----



