Re: [lmi] Help modifying 'rate

lmi

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Help modifying 'rate_table.cpp'

From:	Greg Chicares
Subject:	Re: [lmi] Help modifying 'rate_table.cpp'
Date:	Thu, 15 Dec 2016 06:02:29 +0000
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.4.0

On 2016-12-14 18:16, Vadim Zeitlin wrote:
> On Wed, 14 Dec 2016 14:50:23 +0000 Greg Chicares <address@hidden> wrote:
> 
> GC> The patch below [0] shows what I'm trying to accomplish: given a flawed
> GC> historical binary file whose values are inconsistent with its stated
> GC> number of decimals, write a text file that states the number of decimals
> GC> that seems to have been intended and formats values to that corrected
> GC> precision.

[...and fix the CRC too...]

> [...] I'd actually prefer to do this on reading the table,
> not when writing it. For me logically adjusting the number of decimals used
> is part of validation and so should be done as early as possible, i.e. in
> validate().

Thanks, I did it that way. Pushed 82b89079ffcdd3702b4c09b3c5455e4a7a3650e0:
    Detect and repair decimal-precision inconsistencies in rate tables

> GC> Or
> GC> maybe the right place for this change is
> GC>     soa_v3_format::table_impl::write_as_text()
> GC> so that it would be excluded from
> GC>     soa_v3_format::table_impl::write_as_binary()
> GC> but that breaks the implicit assumption that those two functions
> GC> share the same implementation.
> 
>  I think it's nice to reuse the same implementation for both of them. The
> idea was that we could add write_as_xml() (or maybe write_as_json() or
> whatever) later relatively easily, if needed.

Yes, I realize that now. At first, I wondered why functions like
writer::write_values() were implemented in each of two namespaces.
That does make it harder to search for a particular one. I guess
I'm used to it now, but for a while I was thinking life would be
easier if the implementations had a redundant tag, thus:
  binary_format::writer::write_values()
    text_format::writer::write_values()

I initially thought I'd find a common base class, with these
two functions being virtual. But maybe the things in different
namespaces aren't similar enough for polymorphism.

It still seems odd, though, to have do_write() but not do_read().

> GC> But there's another possible problem. I had imagined that changing the
> GC> number of decimals could affect only write operations, but its value
> GC> is used in some read operations as well. In parse_single_value(), it's
> GC> used to set 'num_spaces_allowed', which is used only for validation;
> GC> there, I'd hesitate to "fix" it, because that might prevent reading
> GC> a table. It's also used here:
> GC> 
> GC>                 // After the max select age only the last column remains, 
> just
> GC>                 // skip the spaces until it.
> GC>                 skip_spaces
> GC>                     
> (*select_period_*text_format::get_value_width(*num_decimals_)
> GC>                     ,start
> GC>                     ,current
> GC>                     ,line_num
> GC>                     );
> GC> 
> GC> and would the same reasoning apply there? I suppose that serves a
> GC> validation purpose, too: if the stated number of decimals is not
> GC> consistent with the format of the text table being read, then
> GC> that might indicate a serious problem (a different problem than the
> GC> one addressed by the patch below, which tries to fix an inconsistency
> GC> between the stated number of decimals and binary values).
> 
>  If we assume that the number of decimals in the text format always
> corresponds to the reality, then it shouldn't matter because it's never
> going to be adjusted anyhow in this case then. If we do encounter a text
> file with mismatching number of decimals, then it will result in an error
> when trying to read it -- but this would seem like the right thing to do,
> wouldn't it?

Sounds good. As the commit message explains in detail, that should be
the ultimate design, but for now I've made the code issue a mere warning
and then correct the error--so that defective old tables can readily be
fixed. I hope to get them all fixed soon, because it's tedious to clean
up all these megabytes of old data and I'd like to get past that.

> The only problem is that the error message won't clearly say
> that the problem is due to a mismatch between the number of decimals
> specified and the precision actually used for the values, but as long as we
> suppose that this is never going to happen anyhow, the quality of error
> message is probably not very important.

Take a look--I think the code I just committed manages to do that.

> GC> Vadim, where do you think I should make this change?
> 
>  In validate().

Doing it there, instead of in writer::write_values() (in the "text"
namespace IIRC) as in my original sketch, also brought the very
welcome incidental benefit that I didn't have to deal with objects
of type 'boost::optional const&'. I realize it's awfully late for
code review, but what was the rationale for 'optional'? I realize
that some fields such as "Construction method:" may be absent, but
for text fields like that, wouldn't an empty string do as well? Or
were any non-text fields missing--e.g., "Maximum select age:" for
a table that is not select (so that the field wouldn't really have
any meaning--but I thought it had a required value nonetheless)?

BTW, I looked over my latest change before committing it, and saw
that I had written
  *hash_value_ = compute_hash_value();
with an asterisk, where a few lines lower you had
  hash_value_ = correct_hash_value;
with no asterisk. Now, hash_value_ is 'boost::optional<uint32_t>',
so '*hash_value_' invokes 'T& operator*()' and we can assign a T
through that reference. But how does the line with no asterisk
work? The RHS of the assignment is 'correct_hash_value':
  auto const correct_hash_value = compute_hash_value();
and because it's auto we look up the function:
  unsigned long table_impl::compute_hash_value() const
and find that the type is 'unsigned long'. Oh, wait, I see:
  optional& operator=(T const&)
makes it work without an asterisk. But...was it really a good
design decision for boost to provide that assignment operator?
Even though it seems convenient, isn't it astonishing that these
two statements:
  *p = 42;
   p = 42;
are equivalent?

[Prev in Thread]

Current Thread

[Next in Thread]

[lmi] Help modifying 'rate_table.cpp', Greg Chicares, 2016/12/14
- Re: [lmi] Help modifying 'rate_table.cpp', Vadim Zeitlin, 2016/12/14
  - Re: [lmi] Help modifying 'rate_table.cpp', Greg Chicares <=
    - Re: [lmi] Help modifying 'rate_table.cpp', Vadim Zeitlin, 2016/12/15
    - Re: [lmi] Help modifying 'rate_table.cpp', Greg Chicares, 2016/12/15
    - Re: [lmi] Help modifying 'rate_table.cpp', Vadim Zeitlin, 2016/12/15

Prev by Date: Re: [lmi] Help modifying 'rate_table.cpp'
Next by Date: Re: [lmi] Help modifying 'rate_table.cpp'
Previous by thread: Re: [lmi] Help modifying 'rate_table.cpp'
Next by thread: Re: [lmi] Help modifying 'rate_table.cpp'
Index(es):
- Date
- Thread