Re: Feature to add

sed-devel

From:	Assaf Gordon
Subject:	Re: Feature to add
Date:	Thu, 19 Jul 2018 05:44:06 -0600
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

(adding sed-devel@ mailing list, please use reply-all to keep the thread
public and archived).



Hello Russell,


On 19/07/18 04:18 AM, Russell Harper wrote:

I'm not writing specifically about parsing floating point numbers or
factoring integers, these are just examples to illustrate. You can
substitute anything else instead.
What I'm proposing is an x flag for substitutions to indicate that the
substitution is obtained by running an executable and inserting its output.
     's/<reg-exp>/<executable> <argument>*/x'

Some examples:
     's/UUID/uuidgen/gx'                # replaces instances of "UUID" with output from uuidgen
     's/([0-9]+)/factor \1/gx'          # replaces integers with output from factor <integer>
     's~(http://[A-Za-z.]+)~wget \1~x'  # replaces URL with output from wget <URL>
     's~([a-z]+)~./pluralize \1~gix'    # custom utility to pluralize words
Currently there is no easy and robust way to do this in any of the core
utilities.


Thank you for expanding on and explaining your request.

This indeed seems like a specialized feature, perhaps a bit out of scope
for sed. GNU sed does have the "s///e" extension ("e" for "eval"),
but that runs a shell command on the entire pattern space once,
and not on every matched group as in your examples.
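
To illustrate how the existing "e" flag behaves, here is a small sketch.
It assumes GNU sed (the flag is a GNU extension) and the standard "expr"
utility; the incr_line name and the "number N" input format are made up
for this example.

```shell
# Hypothetical helper: rewrite a whole line "number N" into the command
# "expr N + 1", then let the "e" flag execute the resulting pattern space.
incr_line() {
  sed -E 's/^number ([0-9]+)$/expr \1 + 1/e'
}

# The entire pattern space is run as one command and replaced by its output.
echo 'number 42' | incr_line

# Lines where no substitution happens are printed unchanged.
echo 'other' | incr_line
```

Because the whole pattern space becomes the command, this only works when
the line holds nothing but the text that is turned into a command. It
cannot replace several matches inside a longer line one by one.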

However Perl can easily do exactly what you ask for (and in a robust way).

First, Perl's regex substitution also has an "e" flag, but it is more
powerful than sed's: it evaluates the replacement as a Perl expression
(for example, a function call) for every match.


In the following example, every number (matching the regex /(\d+)/ )
is transformed using perl's built-in hex() function:

  $ echo 230 19 FOO 40 BAR 50 | perl -np -e 's/(\d+)/hex($1)/ge'
  560 25 FOO 64 BAR 80

(That is: 0x230 is 560 in decimal, 0x19 is 25 in decimal, etc.).

Similarly, we can define our own perl function to do any transformation
we'd like. The following example increments any matched number by 1:


  $ echo 230 19 FOO 40 BAR 50 \
        | perl -np -e 'sub f($) { return $_[0] + 1 ; }' \
                   -e 's/(\d+)/f($1)/ge'
  231 20 FOO 41 BAR 51


Lastly,
Perl excels at text processing and evaluating external commands,
so we modify our function to execute "factor" on any matched
number:

  $ echo 230 19 FOO 40 BAR 50 \
        | perl -np -e 'sub f($) { return `factor $_[0]` ; }' \
                   -e 's/(\d+)/f($1)/ge'
  230: 2 5 23
   19: 19
   FOO 40: 2 2 2 5
   BAR 50: 2 5 5


And an example with UUID:

  $ echo UUID FOO UUID BAR UUID \
      | perl -np -e 'sub f($) { $t = `uuidgen` ; chomp $t ; $t }' \
                 -e 's/(UUID)/f($1)/ge'

  4a64a434-73b2-47f9-985f-2eff776b981d FOO fc7f3796-cfed-4850-a363-a70edfceee1b BAR de65fe02-96fd-436e-ae2b-66127c438702



Of course, when commands are executed through the shell like this (perl's
backticks pass the string to the shell), extra care must be taken to
ensure that malicious input can't cause unintended consequences with
shell escaping tricks.

=======

As for adding a new feature to sed:

There is always a trade-off between adding more and more specialized
features to sed and using existing solutions, even if they are
a bit more verbose (i.e. my perl examples are much longer than the
hypothetical s///x sed feature).

I don't think we can/should modify sed's existing s///e flag (that would
break existing scripts), but we could perhaps consider adding a new
flag.

What do others think - is it worth it, or better just stick with perl?
(Jim?)


The semantics of such a flag must be carefully defined, e.g.
what's the interplay with grouping, with the global flag, with other flags?

regards,
 - assaf
