Re: [lmi] [lmi-commits] master c98c00d 10/33: Say "basename" rather than


From: Vadim Zeitlin
Subject: Re: [lmi] [lmi-commits] master c98c00d 10/33: Say "basename" rather than "leaf"
Date: Tue, 11 May 2021 23:18:56 +0200

On Tue, 11 May 2021 19:14:28 +0000 Greg Chicares <gchicares@sbcglobal.net> 
wrote:

GC> On 5/9/21 9:41 PM, Vadim Zeitlin wrote:
GC> > On Mon,  3 May 2021 08:15:52 -0400 (EDT) Greg Chicares <gchicares@sbcglobal.net> wrote:
GC> > 
GC> > GC> branch: master
GC> > GC> commit c98c00dbe5559b76a22b502dcce471f347514b3c
GC> > GC> Author: Gregory W. Chicares <gchicares@sbcglobal.net>
GC> > GC> Commit: Gregory W. Chicares <gchicares@sbcglobal.net>
GC> > GC> 
GC> > GC>     Say "basename" rather than "leaf"
GC> > GC>     
GC> > GC>     The POSIX dirname/basename nomenclature is universally understood.
GC> > 
GC> >  I must be an exception, but I never know if "basename" is supposed to
GC> > include the extension or not.
GC> 
GC> For basename(3), it does:
GC> 
GC>   https://pubs.opengroup.org/onlinepubs/9699919799/functions/basename.html
GC> | The basename() function shall take the pathname pointed to by path and
GC> | return a pointer to the final component of the pathname, deleting any
GC> | trailing '/' characters.

 Oh, indeed, the C function is completely unambiguous, but I just didn't
think about it because I don't think I've ever used it, while I do use
basename(1) from time to time in scripts.
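
 To make the basename(3) behaviour quoted above concrete, here is a minimal
C++ sketch. The wrapper names posix_basename() and posix_dirname() are
hypothetical (they are not lmi functions); only basename() and dirname()
from <libgen.h> are real. Because POSIX allows both functions to modify
their argument, the sketch calls them on a private copy.

```cpp
#include <libgen.h>   // POSIX basename(3) and dirname(3)

#include <cassert>
#include <string>
#include <vector>

// Hypothetical wrapper: basename(3) may modify its argument, so it is
// called on a private, NUL-terminated copy of the path.
std::string posix_basename(std::string const& path)
{
    std::vector<char> buf(path.begin(), path.end());
    buf.push_back('\0');
    return std::string(basename(buf.data()));
}

// Hypothetical wrapper around dirname(3), with the same precaution.
std::string posix_dirname(std::string const& path)
{
    std::vector<char> buf(path.begin(), path.end());
    buf.push_back('\0');
    return std::string(dirname(buf.data()));
}

// Illustrative checks: the extension stays in the final component.
void check_posix_examples()
{
    assert(posix_basename("/home/greg/foo.txt") == "foo.txt");
    assert(posix_dirname ("/home/greg/foo.txt") == "/home/greg");
    assert(posix_basename("/home/greg/")        == "greg");
}
```

 Note that dirname(3) returns '/home/greg' without the trailing slash.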

GC> Looking further into this now, I note that basename(1) optionally
GC> removes a "suffix", which seems to be much like the shell parameter
GC> expansion "${parameter%%word}". Perhaps that's most commonly used
GC> to remove a file extension, so that "suffix" comes to be seen as
GC> implying "extension"; I'm just not aware whether it's conventional
GC> to think of it that way.

 Me neither, but this is what I personally find confusing: I only ever use
basename to remove the extension and, hence, leave the "base name" of the
file (and the reason I don't use "%%" for this is because I still confuse
it with "##" and with "%" and have to stop and think and maybe look it up
every time -- basename is both much more readable and easier to write in
comparison). It could perfectly well be just my own idio{syncras,c}y.
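
 For illustration, the suffix step of basename(1) can be modelled roughly
as below. This is only a sketch under stated assumptions: strip_suffix() is
a hypothetical name, it expects a final path component that has already
been extracted, and it treats the suffix as a literal string that is left
in place when it is identical to the whole name, as POSIX describes for the
utility.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper modelling the suffix operand of basename(1):
// remove 'suffix' from the end of 'name' if it matches there, but leave
// 'name' unchanged if the suffix is empty, absent, or the entire name.
std::string strip_suffix(std::string const& name, std::string const& suffix)
{
    if (suffix.empty() || name.size() <= suffix.size())
    {
        return name;
    }
    std::size_t const offset = name.size() - suffix.size();
    if (0 == name.compare(offset, suffix.size(), suffix))
    {
        return name.substr(0, offset);
    }
    return name;
}

// Illustrative checks.
void check_strip_suffix_examples()
{
    assert(strip_suffix("foo.txt",    ".txt") == "foo");
    assert(strip_suffix("foo.tar.gz", ".gz")  == "foo.tar");
    assert(strip_suffix("foo.txt",    ".cpp") == "foo.txt");
    assert(strip_suffix(".txt",       ".txt") == ".txt");
}
```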


GC> > don't have any real objections to using "stem", but doing it while not
GC> > using "filename" for the "stem + extension" part is 50% consistent with
GC> > std::fs, which is arguably the worst kind of (in)consistency.
GC> 
GC> Wait...are we talking past each other? Let's establish some
GC> definitions first. In '/home/greg/foo.txt':
GC>   let the HEAD be '/home/greg/'
GC>   let the TAIL be 'foo.txt'
GC> All I really want is an unambiguous name for each of those.
GC> I've proposed
GC>   dirname = HEAD
GC>   basename = TAIL
GC> 
GC> Although I consider this much less important, we could add
GC>   let the CEDILLA be '.txt'
GC> and, least important of all, there's 'foo', which IMO doesn't
GC> need a name, but for completeness we could say
GC>   let the TAIL_SANS_CEDILLA be 'foo', or
GC>   let the THING_THAT_DESERVES_NO_SHORT_NAME be 'foo'
GC> I think we can agree on
GC>   extension = CEDILLA
GC> 
GC> Then do we agree that 'dirname' is good nomenclature for HEAD?

 Yes.

GC> And, where I propose 'basename' as nomenclature for TAIL, do you
GC> propose 'stem'? It seems not, if you speak of "stem + extension";
GC> for that, do you propose 'filename'?

 As I said, I use "fullname". But I'm pretty sure this is not a common name
for it either.

GC> The problem I have with 'filename' is that I never know what it
GC> means, even when I'm the one who's written it: is it the same as
GC> 'pathname', i.e., '/home/greg/foo.txt' above, or is it 'foo.txt',
GC> which I propose to call 'basename'?

 Yes, this is the reason I use "fullname" rather than "filename".

GC> >  FWIW personally I find the terms "base name" for "stem" and "full name"
GC> > for "stem + extension" the most clear, but I'm not going to claim that
GC> > there is anything universal about this terminology either.
GC> 
GC> I'm averse to 'full name' for two reasons. First, in a definition like
GC>   std::string foo
GC>     (std::string const& full_name
GC>     ,std::string const& file_name
GC>     )
GC> I'm certain to get the "full" and "file" confused. Second, for the
GC> '/home/greg/foo.txt' example above, I'd say the "full" name is the
GC> whole thing: '/home/greg/foo.txt'.

 I call this "full path". Again, it's not ideal, but it's at least
consistent with the other parts of my nomenclature.

GC> And I wouldn't know what the "filename" is.

 We either have to accept that it doesn't mean anything, or that it's
ambiguous. I don't know what's best to be honest.


GC> For me, the most important thing by far is to devise unambiguous
GC> names for what I called HEAD and TAIL above. I had thought this:
GC>   dirname = HEAD
GC>   basename = TAIL
GC> was posixy and unambiguous, but what other names might we choose?
GC> 
GC> Would
GC>   dirname = HEAD
GC>   filename = TAIL
GC>   pathname = HEAD+TAIL
GC> seem good to you instead?

 I've never seen "pathname" and so it seems rather strange to me, but maybe
it's just a question of habit. The overlap between it and fs::path might be
more concerning.
 
GC> Does std::filesystem have any generic name for HEAD, or must we devise
GC> one ourselves? (And if we're going to follow std::filesystem, we can
GC> use "stem" on the rare occasions when we actually want stem().)

 IME it's not rare to need it at all: many programs working with files take
a file.in and produce a file.out, i.e. they need to get the stem of the
input file in order to form the full name of the output one.
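
 A minimal sketch of that file.in to file.out pattern with std::filesystem
follows. The function name output_path_for() is hypothetical; parent_path(),
stem() and operator+= are standard C++17 members of std::filesystem::path.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: build the output path from the input's directory
// and stem, then append the new extension (which includes its dot).
fs::path output_path_for(fs::path const& input, std::string const& new_extension)
{
    fs::path result = input.parent_path() / input.stem();
    result += new_extension; // plain concatenation, no separator added
    return result;
}

// Illustrative checks.
void check_output_path_examples()
{
    assert(output_path_for("/home/greg/file.in", ".out") == fs::path("/home/greg/file.out"));
    assert(output_path_for("file.in", ".out")            == fs::path("file.out"));
}
```

 The member function replace_extension() achieves much the same on a copy
of the path; the explicit version above is shown only because it makes the
role of the stem visible.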

GC> I guess that "filename = TAIL" might be as good as we're going
GC> to do. It corresponds to the standard. My objection to 'filename'
GC> is that it has been used in multiple or ambiguous ways heretofore,
GC> so that the term is spoiled; but we can unspoil it today by fiat,
GC> for lmi at least, endowing it with a righteous meaning forevermore.
GC> What do you think?

 It could work, but I think that between this and "basename", I finally do
prefer "basename". It goes well with "dirname" and, again, my confusion
about it is probably my own problem that I'll just have to deal with
because POSIX functions are indeed completely unambiguous (unlike POSIX
commands...).

 So

        path     ::= dirname / basename
        basename ::= stem . extension

is probably the best. Sorry for starting the discussion just to get back to
what you originally did, but your starting argument has changed my mind.
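
 As a rough illustration of how this nomenclature could map onto
std::filesystem, here is a sketch. The struct path_parts and the function
split_path() are hypothetical names; the mapping of dirname to
parent_path() and of basename to filename() is an assumption made for the
example, not something lmi necessarily does.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical aggregate using the names agreed on in this thread.
struct path_parts
{
    std::string dirname;   // fs::path::parent_path()
    std::string basename;  // fs::path::filename()
    std::string stem;      // fs::path::stem()
    std::string extension; // fs::path::extension(), including the dot
};

// Hypothetical helper: decompose a path into the four named parts.
path_parts split_path(fs::path const& p)
{
    return
        { p.parent_path().string()
        , p.filename().string()
        , p.stem().string()
        , p.extension().string()
        };
}

// Illustrative checks on the example used in this thread.
void check_split_path_examples()
{
    path_parts const parts = split_path("/home/greg/foo.txt");
    assert(parts.dirname   == "/home/greg");
    assert(parts.basename  == "foo.txt");
    assert(parts.stem      == "foo");
    assert(parts.extension == ".txt");
    assert(parts.stem + parts.extension == parts.basename);
}
```

 In std::filesystem the extension carries its own leading dot, so stem and
extension concatenate directly to give the basename.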

VZ
