Re: Regexp Functions


From: Aaron Hill
Subject: Re: Regexp Functions
Date: Mon, 15 Jun 2020 19:32:51 -0700

On 2020-06-15 5:57 pm, Freeman Gilmore wrote:
> Thank you Aaron.
>
> In order to ask my question, not knowing how to ask, I simplified it
> too much. One or both of the first two below may work, but I do
> not know how to apply them.

This is a meta problem I have often struggled with. While trying to solve problem A on my own, I encounter problem B, for which I need help. So I seek guidance on problem B while omitting the context of problem A. The answers I get might address problem B but not work well in the greater context. The philosophical question is whether I should have framed my question about problem B within the scope of problem A from the start, or simply asked about problem A and allowed the natural discussion to bring up problem B on its own.

Seeing more of what you have in mind below, I am confused about its connection with LilyPond. On the one hand, I think I can help you wrangle your text in the way you want; but I am worried that we are fighting a bit of the meta problem above. What is this text intended to represent with regard to music? What are the semantics of the transformation you want to apply? Could there be another way to express things that might be easier to work with or that might better fit the patterns of LilyPond?

Mind you, the answers to those questions would be taking us from the narrow problem into the broader scope. And perhaps only you truly benefit from answering them.


> Say I have "-y -ax3 +rx2 -stx2 t". I wanted, if "-" followed "x"
> before a space, then replace the "x" with " -x" for each. If I use
> "(^-.*)x" "-y -ax3 +rx2 -stx2 t"  'pre 1 " -x" 'post) , from what I
> have read, this would happen at the first "x" from the right. So
> that will not work.

Ah, there are at least two critical pieces of information I overlooked. The prefix in question does not necessarily occur at the beginning of the string. Also, the string we are looking for might occur multiple times so we want the nearest match.

We can address this as follows by being more precise in our regular expression.

/^/ becomes /(^|\s)/

Here we are using alternation to assert that we are either at the beginning of the string or that we have just seen a whitespace character.

/.*/ becomes /[^x]*/

All quantifiers in POSIX ERE (and thus GNU ERE) are greedy, so care must be taken to limit what can be matched so we do not grab too much. We do not want an "x" to be matched, so we use a negated character class.
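
To make the greedy behavior concrete, here is a small sketch comparing what the two forms actually match. It assumes plain Guile with the (ice-9 regex) module loaded; the names sample, greedy, limited, and pieces are only for illustration.

```scheme
(use-modules (ice-9 regex))

(define sample "-y -ax3 +rx2 -stx2 t")

;; Greedy /.*/ runs to the end of the string and backs up to the LAST "x".
(define greedy
  (match:substring (string-match "^-.*x" sample)))
;; greedy => "-y -ax3 +rx2 -stx"

;; The negated class cannot cross an "x", so the match stops at the FIRST one.
(define limited
  (match:substring (string-match "^-[^x]*x" sample)))
;; limited => "-y -ax"

;; With the alternation, a match may also begin right after whitespace.
(define pieces
  (map match:substring (list-matches "(^|\\s)-[^x]*x" sample)))
;; pieces => ("-y -ax" " -stx")
```

Note that the first piece already spans "-y -a", which hints that the negated class is still more permissive than it should be.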

Let us see this working:

;;;;
;; Needed in plain Guile; harmless if the module is already loaded.
(use-modules (ice-9 regex))

(regexp-substitute/global #f
  "((^|\\s)-[^x]*)(x)"
  "-y -ax3 +rx2 -stx2 t"
  'pre 1 " -" 3 'post)
;;;;
====
"-y -a -x3 +rx2 -st -x2 t"
====

Seems good. But therein lies the problem with test-driven development (TDD). Our code will only be as good as our test cases, which is why we are instructed to "test everything that can possibly break". There is a hidden problem above due to using /[^x]*/. This matches more than what we want. In particular, it matches whitespace.

The first substitution in the above example was not effectively operating on "-ax3" but rather on "-y -ax3". The end result looks correct, but what if we had said "-y +ax3"? This would have been matched because we included whitespace, resulting in "-y +a -x3", which is presumably incorrect.
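
For reference, the failing case can be reproduced directly. This is only a sketch of the scenario described above, assuming (ice-9 regex) is available; the name leaky is made up for the example.

```scheme
(use-modules (ice-9 regex))

;; Same expression as before, but the "x" now belongs to a "+" term.
;; Because /[^x]*/ may cross the space, "-y +a" is treated as one prefix.
(define leaky
  (regexp-substitute/global #f
    "((^|\\s)-[^x]*)(x)"
    "-y +ax3"
    'pre 1 " -" 3 'post))
;; leaky => "-y +a -x3"
```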

The fix is simple: add [:space:] to the negated character class. Below I have amended the test string to include "-z +bx4" which should remain unchanged.

;;;;
(regexp-substitute/global #f
  "((^|\\s)-[^x[:space:]]*)(x)"
  "-y -ax3 +rx2 -stx2 t -z +bx4"
  'pre 1 " -" 3 'post)
;;;;
====
"-y -a -x3 +rx2 -st -x2 t -z +bx4"
====

What about "-x"? Our expression will match that because we are using the zero-or-more quantifier (*). The resulting substitution, "- -x", is probably not what we want. We should ensure that there is at least one non-"x" character:

;;;;
(regexp-substitute/global #f
  "((^|\\s)-[^x[:space:]]+)(x)"
  "-y -ax3 +rx2 -stx2 t -z +bx4 -x"
  'pre 1 " -" 3 'post)
;;;;
====
"-y -a -x3 +rx2 -st -x2 t -z +bx4 -x"
====
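
Since the expression keeps growing as new cases turn up, it may be worth keeping the cases around as a small table. The following is a minimal sketch of that idea, not a proper test framework; separate-x and cases are hypothetical names, and the expected strings are simply the results discussed above.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical helper wrapping the final expression from above.
(define (separate-x str)
  (regexp-substitute/global #f
    "((^|\\s)-[^x[:space:]]+)(x)" str
    'pre 1 " -" 3 'post))

;; Each entry is (input . expected).
(define cases
  '(("-ax3"    . "-a -x3")      ; basic split
    ("-y +ax3" . "-y +ax3")     ; "x" in a "+" term stays put
    ("-x"      . "-x")          ; nothing between "-" and "x"
    ("t -stx2" . "t -st -x2"))) ; prefix after whitespace

;; Print each case and whether the result agrees with the expectation.
(for-each
  (lambda (c)
    (let ((actual (separate-x (car c))))
      (format #t "~s => ~s ~a\n" (car c) actual
              (if (string=? actual (cdr c)) "ok" "MISMATCH"))))
  cases)
```

Adding a new row whenever a surprising input comes up is cheaper than rediscovering the same hole later.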


> My next step was to convert the string to a list of strings. So if I
> convert first, "-y -ax3 +rx2 -stx2 t" => ("-y" "-ax3" "+rx2" "-stx2"
> "t"). I would guess that one or both of the first two below could
> be applied to the list of strings, but I do not have a clue how.
> Starting with "-y -ax3 +rx2 -stx2 t", ending with ("-y" "-a" "-x3"
> "+rx2" "-st" "-x2" "t")

This approach could be made to work as well. Having split the original string by whitespace, we no longer have to worry about its impact in our expression. We trade off having a simpler regular expression while requiring more logic outside.

;;;;
(map
  (lambda (str)
    (regexp-substitute/global #f
      "(^-[^x]+)(x)" str
      'pre 1 " -" 2 'post))
  (string-split "-y -ax3 +rx2 -stx2 t -z +bx4 -x" #\sp))
;;;;
====
("-y" "-a -x3" "+rx2" "-st -x2" "t" "-z" "+bx4" "-x")
====

Close. Where we go from here depends on what you actually need. Do you want to reconstitute the original string with spaces? If so, just use string-join. It will not matter that some elements have spaces already.

;;;;
(string-join
  (map
    (lambda (str)
      (regexp-substitute/global #f
        "(^-[^x]+)(x)" str
        'pre 1 " -" 2 'post))
    (string-split "-y -ax3 +rx2 -stx2 t -z +bx4 -x" #\sp))
  " ")
;;;;
====
"-y -a -x3 +rx2 -st -x2 t -z +bx4 -x"
====

Let us say you want a list, not a string. Do you want items like "-a -x3" to become separate elements? Then an additional usage of string-split could work. Although then you would have a list of lists. So in that case, you need to append the individual lists into one:

;;;;
(apply append
  (map
    (lambda (str)
      (string-split
        (regexp-substitute/global #f
          "(^-[^x]+)(x)" str
          'pre 1 " -" 2 'post)
        #\sp))
    (string-split "-y -ax3 +rx2 -stx2 t -z +bx4 -x" #\sp)))
;;;;
====
("-y" "-a" "-x3" "+rx2" "-st" "-x2" "t" "-z" "+bx4" "-x")
====
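
As a variation, the map followed by (apply append ...) can be folded into a single step with append-map from SRFI-1. This is just a sketch under the assumption that (srfi srfi-1) is available in your environment; split-terms is a name I made up, and I wrote #\space instead of #\sp only for readability.

```scheme
(use-modules (ice-9 regex) (srfi srfi-1))

;; Hypothetical helper: split on spaces, separate the "x" part of each
;; "-" term, then split again so every piece is its own list element.
(define (split-terms str)
  (append-map
    (lambda (item)
      (string-split
        (regexp-substitute/global #f
          "(^-[^x]+)(x)" item
          'pre 1 " -" 2 'post)
        #\space))
    (string-split str #\space)))

(define terms (split-terms "-y -ax3 +rx2 -stx2 t"))
;; terms => ("-y" "-a" "-x3" "+rx2" "-st" "-x2" "t")
```

That is the list you said you wanted to end with.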


-- Aaron Hill


