lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] [PATCH] Fix trailing whitespace detection in stream_cast()


From: Vadim Zeitlin
Subject: Re: [lmi] [PATCH] Fix trailing whitespace detection in stream_cast()
Date: Wed, 29 Jan 2020 13:06:28 +0100

On Tue, 28 Jan 2020 18:46:15 +0000 Greg Chicares <address@hidden> wrote:

GC> On 2020-01-27 17:13, Vadim Zeitlin wrote:
GC> > 
GC> >  I'd like to return to the issue first discussed in this message:
GC> > 
GC> >   https://lists.nongnu.org/archive/html/lmi/2018-11/msg00034.html
GC> > 
GC> > This might look like a minor problem, but it's really annoying to have to
GC> > juggle with the additional patch when testing lmi either with clang or
GC> > MSVC, so I'd like to propose the patch at
GC> >  
GC> >   https://github.com/vadz/lmi/pull/136
GC> > 
GC> > for inclusion in lmi master.
GC> > 
GC> >  This patch simply exchanges the order of ws() and eof() checks
GC> 
GC> The first thing I wonder about is how this became wrong.

 Unfortunately I can't answer this because the wrong code was already there
in the commit e8caef817ffa9765924302e0fcff91bdf85733a8 from 15 years ago
which added it to Git.

GC> I thought the idiom was ancient and generally accepted--cf. this from
GC> late 2000:
GC> 
GC> http://www.two-sdg.demon.co.uk/curbralan/code/lexical_cast/lexical_cast.hpp
GC> 
GC> | template<typename Target, typename Source>
GC> | Target lexical_cast(Source arg)
GC> | {
GC> |     std::strstream interpreter; // for out-of-the-box g++
GC> |     Target result;
GC> | 
GC> |     if(!(interpreter << arg) || !(interpreter >> result) ||
GC> |        !(interpreter >> std::ws).eof())
GC> |         throw bad_lexical_cast();
GC> |
GC> |     return result;
GC> | }

 This may well be idiomatic, but this code doesn't do the same thing as the
code in stream_cast does, or at least attempts to do. The code above checks
that there is nothing but whitespace remaining after the actual value and
is perfectly happy to ignore whitespace -- which, BTW, looks like very
reasonable behaviour to me. And the code in lmi actively tries to detect
trailing whitespace and give an error if it is there.

 As an aside, I'm really not sure if it's a good idea in the first place.
It's pretty easy to end up with a trailing space when entering the value
in the UI or a trailing new line when reading it from file and it's not
clear at all to me that this should produce an error instead of being
silently ignored. Generally speaking silently ignoring input data is a bad
idea, of course, but what harm could any trailing spaces possibly do in
this particular case? But this is just an aside, I don't really argue for
changing this, fixing the current bug is all I'm after here.


GC> But now I wonder whether I introduced the error, locally, when I split
GC>     !(interpreter >> std::ws).eof())
GC> into
GC>     else if(!(interpreter >> std::ws))
GC>         {
GC>         err << "Trailing whitespace remains ";
GC>         }
GC>     else if(!interpreter.eof())
GC>         {
GC>         err << "Unconverted data remains ";
GC>         }
GC> ?

 This version doesn't do the same thing as the original one, so I guess we
can indeed say that this is where the error was introduced, but it's a bit
misleading to say that it was due just to splitting the line: you split the
line _and_ changed the (intended) code behaviour.

GC> The problem is that
GC>   std::istringstream interpreter("123");
GC>   int n = -1;
GC>   interpreter >> n;
GC> leaves the stream at EOF, and subsequently, therefore,
GC>   interpreter >> std::ws
GC> is supposed to set failbit, so the standard requires that this code:
GC> 
GC>     else if(!(interpreter >> std::ws))
GC>         {
GC>         err << "Trailing whitespace remains ";
GC>         }
GC> 
GC> write the "Trailing whitespace remains " message into 'err'. Right?

 Right, but this doesn't work with libstdc++ due to a bug in it.

GC> But, given the same initial steps:
GC>   std::istringstream interpreter("123");
GC>   int n = -1;
GC>   interpreter >> n;
GC> what is this code
GC>   !(interpreter >> std::ws).eof())
GC> supposed to do?
GC>  - it invokes the std::ws extractor; in this case, eofbit was set
GC> by the ctor, so the effect is to set failbit; then
GC>  - eof() is called; it ignores failbit, inspects eofbit, and
GC> returns true. Right?

 Yes, except for "set by the ctor" part which I don't understand. But the
important thing is that the state of the stream after extracting ws() is
completely ignored here and the exclamation sign tests the state of the
stream after calling eof() on it.

GC> Do you concur with my conclusion that I introduced this defect by
GC> splitting the
GC>   !(interpreter >> std::ws).eof())
GC> line incorrectly--in particular, by invoking
GC>   std::ios_base::operator!()
GC> in my version, when Henney's original did not invoke operator!()
GC> (even though it would leave the stream flags in the same state)?

 Again, yes, the problem definitely comes from this change, but it was more
than just splitting the line.

GC> BTW, why add the 'calendar_date' TU to this unit test? Is that only
GC> because your original report arose on this line:
GC> 
GC> input.cpp:      ,ListBillDate                     {"2440588"} // Assume no 
inforce so old

 Yes, I wanted to test for the case that really was a problem in practice.

 Regards,
VZ

Attachment: pgpMu8iR2J6Jc.pgp
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]