lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] PATCH: Switch to using std::filesystem


From: Vadim Zeitlin
Subject: Re: [lmi] PATCH: Switch to using std::filesystem
Date: Mon, 10 May 2021 16:28:33 +0200

On Mon, 3 May 2021 12:21:29 +0000 Greg Chicares <gchicares@sbcglobal.net> wrote:

GC> On 4/26/21 9:53 PM, Vadim Zeitlin wrote:
GC> > 
GC> >  The branch "std-fs" of https://github.com/vadz/lmi.git repository, also
GC> > visible at https://github.com/let-me-illustrate/lmi/pull/180, contains the
GC> > long, long promised changes replacing the use of Boost.Filesystem with
GC> > std::filesystem.
GC> 
GC> I've reviewed all these changes, line by line, and pushed some
GC> modifications of my own, most notably rewriting some of the
GC> 'path_utility*.?pp' code, originally implemented in terms of
GC> boost::filesystem and now of std::filesystem.

 Thanks again for reviewing and pushing this so quickly, it's really a huge
relief for me to finally have this in lmi.

GC> > - Fix strings not using UTF-8 encoding
[...]
GC> Agreed.
GC> 
GC> > - Pretty minor, but now that we require C++20, we really ought to define
GC> >   operator<=>() rather than 6 relational operators for fs::path
[...]
GC> Agreed.

 My short-term TODO list is then:

0. Fix the CI builds (I feel acutely uncomfortable without the safety they
   provide, so for me this is the highest priority).

1. Use operator<=>() for fs::path -- and maybe other classes? Do you think
   I should restrict my patch to fs::path only or also use it for the other
   classes, such as:
 - calendar_date (which, surprisingly, has 4 out of 6 operators right now)
 - currency
 - minmax (which defines 2 of 6 comparison operators)
 - soa_v3_format::table::Number (you probably won't mind me changing this one)
 - TypeInfo (which is admittedly a rather special case)
 - tn_range (which defines 3 of 6 comparison operators)

2. UTF-8 support for paths.


GC> > - This is more speculative, but we could implement our own validation of
GC> >   paths for MSW, now that fs::path doesn't do it any longer. I'm not quite
GC> >   sure if we really need to do it, so I'd prefer to discuss (and do, if
GC> >   necessary), this later too.
GC> 
GC> Yes, that's perhaps the biggest difference between the boost original
GC> and the standard version. When users enter paths using OS-standard
GC> dialog controls, those controls reject names that the OS forbids, so
GC> maybe boost's validation didn't accomplish much.

 FWIW I've never seen its effect or maybe missed it if I did.

GC> OTOH, the rules are not very complicated, and Beman Dawes has set them
GC> out clearly:
GC>   
https://www.boost.org/doc/libs/1_33_1/libs/filesystem/doc/portability_guide.htm

 I think I've already given the rules, but the official ones are at

        https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file

To summarize, the file name must consist of Unicode characters outside the
range [0,31] and excluding <>:"/\|?* and the whole name must not end with a
space or a period and can't be any of the string CON, PRN, AUX, NUL, COMn,
LPTn (where n is 1..9).

GC> so maybe we should implement them in lmi, in order to make validity
GC> an explicit precondition.

 I wonder what do we gain by doing it. In the worst case, we're just going
to get an error when trying to actually use the file. Is it really much
worse than giving a (possibly spurious!) error before trying to use it?

GC> In addition to the two sets of {posix, msw} OS-specific requirements,
GC> that page also makes portability recommendations, which are at least
GC> partly the same as what lmi's portable_filename() does. I think it's
GC> a good idea to enforce at least some rules, lest an end user choose
GC> file names such as:
GC>   C:\My Stuff\An XYZ Corp Census.cns
GC>   C:\My Stuff\...and another XYZ Corp Census.cns

 Sorry, why is this a problem? Periods are not forbidden in the file names
and if the user really wants to use a file called like this, why shouldn't
they be able to do it?

GC> Here's one concrete example that just occurred to me:
GC>   File | New | Census
GC>   File | Save as...
GC>   enter "Yet another new census?" without the double-quotes
GC>   click the Save button
GC> At least with wine-5.x, nothing seems to happen: no file is saved,
GC> and the dialog does not close. But if that happens entirely within
GC> a standard dialog, there's probably nothing lmi can do about it
GC> anyway.

 Yes, this does happen entirely within the standard dialog. The behaviour
is confusing, but makes sense: it actually interprets the string as a
wildcard pattern (because of "?") and shows you all the files matching it
(probably none). This is actually pretty useful, at least with "*", if you
are unsure about the exact file name you want to use and the directory
contains many files, as by typing a string with wildcards you can restrict
the display to just the files containing the one you want.

 FWIW if you use any other invalid character, such as "<", in the file name
in the dialog, you'd get an error message, which seems good enough too.

GC> OTOH, testing the triple-dot example above gives
GC>   Warning: Unable to save 'Z:\opt\lmi.and another XYZ Corp Census.cns'.
GC>   [census_document.cpp : 94]
GC> and that's probably something lmi should trap.

 No, I think it's just a bug. But the problem is that I couldn't reproduce
it here with my existing MSVC build, i.e. the file gets saved without any
errors. I'll retest with MinGW binary slightly later, but I think this
might be actually something Wine-specific rather than a compiler-dependent
problem. And if this is really true, the impetus for doing anything here
becomes even smaller and, IMO, non-existent.

 To summarize, right now I don't see any reason to implement file path
validation, but I could well be missing something, of course, please let me
know if/what I do.

 Thanks,
VZ

Attachment: pgpsisIkqhbqw.pgp
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]