bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: getusershell: Split file by lines instead of spaces.


From: Bruno Haible
Subject: Re: getusershell: Split file by lines instead of spaces.
Date: Tue, 21 May 2024 14:32:25 +0200

Hi Collin,

> I tried to do that originally but I don't think you
> would have liked the patch. :)
> 
> At the start of 'readname' there is a loop to skip spaces:
> 
>      while ((c == getc (stream)) != EOF
>             && isspace ((unsigned char) c))
>        /* Do nothing.  */;
> 
> Essentially in the main for (;;) loop you would have to compare every
> character for '#'. If one is found then you have to do a loop similar
> to that and then break afterwards.

Yep, I agree such code would not easy to understand.

> I've attached a patch moving it to the end of getusershell. Makes
> things easier to understand IMO.

It looks better, but there are two problems:

* If the file ends with a non-empty line without a newline, getline()
  returns a string that does not end in a newline.
  Quoting 
https://pubs.opengroup.org/onlinepubs/9699919799/functions/getline.html:
   "including the delimiter character if one was encountered before EOF".

  In the lines

          char *end = comment ? comment : start + nread - 1;

          /* Trim rightmost comment marker or newline.  */
          *end-- = '\0';

  it looks like you are assuming a newline at the end of the string.

* The following code does not work with tokens that consist of a single
  character:

          while (start < end && isspace ((unsigned char) *end))

          if (start < end && IS_ABSOLUTE_FILE_NAME (start))

  The condition "start < end" here tests whether the token has at least
  two characters. Which is probably not what you intended. (In this case,
  it is not fatal, since absolute file names of non-directories always
  consist of at least 2 characters. But anyway.)

  The cause is that you are working with start and end both being
  inclusive, that is, with a string of length end-start+1. There are 4
  possible intervals for two variables start and length:
    [start,end]
    [start,end)
    (start,end]
    (start,end)
  If you use sometimes [start,end], sometimes [start,end), you cannot
  remember working idioms and you will regularly make mistakes.
  The best way to avoid such mistakes is to work with [start,end)
  intervals each time you work with pointers into arrays.
  Then the condition "start < end" tests for a non-empty subsequence,
  and you can remember and reuse idioms.

> Do most libc implementations keep track of if a stream needs to be
> locked? I'm thinking about single-threaded programs that don't define
> GNULIB_GETUSERSHELL_SINGLE_THREAD. I imagine the stream doesn't locked
> x10 more, but a status flag is checked x10 more or something.

No, from the libc perspective, all streams are safe to access from
multiple threads, therefore all streams need locking. It's only the
application programmer who can use *_unlocked functions when they
know that a stream is being used from a single thread only.

Bruno






reply via email to

[Prev in Thread] Current Thread [Next in Thread]