Re: [lmi] [lmi-commits] master bbcdaf5 1/2: Update commentary
From: Greg Chicares
Subject: Re: [lmi] [lmi-commits] master bbcdaf5 1/2: Update commentary
Date: Tue, 5 Jun 2018 03:17:21 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0
On 2018-06-04 23:35, Vadim Zeitlin wrote:
> On Mon, 4 Jun 2018 19:11:49 -0400 (EDT) Greg Chicares <address@hidden> wrote:
>
> GC> branch: master
> GC> commit bbcdaf540882fd8947381904ceb31ea25f458c97
[...]
> GC> Update commentary
> GC> ---
> GC> actuarial_table.cpp | 4 ++--
[...]
> GC> @@ -280,7 +280,7 @@ void actuarial_table::find_table()
> GC> /// The record types of interest here are coded as:
> GC> /// 9999 end of table
> GC> /// 2 4-byte integer: Table number
> GC> -/// 3 [unsigned] char: Table type: {A, D, S} --> {age, duration, select}
> GC> +/// 3 1-byte char : Table type: {A, D, S} --> {age, duration, select}
> GC> /// 12 2-byte integer: Minimum age
> GC> /// 13 2-byte integer: Maximum age
[...]
> GC> - case 3: // [unsigned] char: Table type.
> GC> + case 3: // char: Table type.
> GC> {
> GC> // Meaning: {A, D, S} --> {age, duration, select}.
> GC> // SOA apparently permits upper or lower case.
>
> This is pretty minor, but I think this change is not right: in this case,
> it's really a byte, i.e. an unsigned char. We don't perform any arithmetic
> operations on it, so it doesn't really matter, but conceptually it still is
> an unsigned char.
I think of this commit as merely updating the documentation to reflect
the code change in commit f05c8c53.
How can we decide whether a char should be unsigned? I would declare it
unsigned iff we apply bit-wise operations to it. All we do with this
variable is convert it to uppercase and compare it to {A,D,S}.
Conversion with std::toupper() requires the value to be *representable*
as an unsigned char (unless it's EOF). It doesn't require the char
to *be* unsigned. All the allowable values for this datum are in the
'basic execution character set' [C99 5.2.1/3] and must fit in one byte,
which doesn't guarantee that the high bit isn't used--but ASCII isn't
going to change, and the highest code we use here is 0x73 = 's'.
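To make that concrete, here's a minimal sketch of the idiomatic call
(the helper function and its name are hypothetical, not lmi code):
cast to unsigned char first, while the variable itself stays plain char:

    #include <cctype>

    // Hypothetical helper, for illustration only.
    bool is_table_type(char c)
    {
        // The cast is what makes the call well defined even if plain
        // char is signed and c happens to hold a negative value.
        int const u = std::toupper(static_cast<unsigned char>(c));
        return 'A' == u || 'D' == u || 'S' == u;
    }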
I'd happily agree that this entity is an '8-bit integer' or a 'byte',
but if we call it a 'char' of some sort, I don't see how we can say it's
signed, unsigned, or signless. Maybe I should just change the first
comment thus:
-/// 3 1-byte char : Table type: {A, D, S} --> {age, duration, select}
+/// 3 1-byte integer: Table type: {A, D, S} --> {age, duration, select}
For wider fields, we use std::int32_t and std::int16_t; following that
pattern, this byte would be std::int8_t (i.e., a signed type).
But that's just nomenclature; I have a substantive concern in this file:
static_assert(8 == CHAR_BIT);
static_assert(4 == sizeof(int));
static_assert(2 == sizeof(short int));
The "short int" assertion is needless because this TU doesn't use that
type. It's worrisome, though, because it has never yet been observed to
throw, yet elsewhere we have:
typedef unsigned short int md5_uint32;
and 32/8 is 4, which seems to contradict the assertion.
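If the 32-bit intent is real, one way to surface the contradiction at
compile time (a sketch only, not a proposed patch) is to pin the typedef
to a fixed-width type where it's declared, instead of asserting about
short int in an unrelated TU:

    #include <cstdint>

    typedef std::uint32_t md5_uint32;        // exactly 32 bits, by definition
    static_assert(4 == sizeof(md5_uint32));  // redundant given 8 == CHAR_BIT,
                                             // but documents the intent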
The "CHAR_BIT" assertion seems unlikely to be violated, unless we try
porting to some DSP; yet POSIX requires a char to be an octet.
The middle "int" assertion would fail on an ILP64 machine: a Cray today
perhaps, but decades from now--who really knows? My concern comes from
analyzing lmi's use of 'long int'. Aside from library and OS APIs that
require it, we use long integers mainly in two places:
- rate-table numbers (2^15 is probably enough, but not certainly so)
- calendar dates (JDN 2^16 ~ 4500 BC; 2^31 is way past the year 9595)
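As a back-of-the-envelope check of those JDN bounds (assuming the usual
epoch of JDN 0 in 4713 BC):

    static_assert(65536 / 365 < 180);          // 2^16 days < 180 years,
                                               // so JDN 2^16 is ~4500 BC
    static_assert(2147483647 / 365 > 5000000); // 2^31 days > five million
                                               // years: far past 9595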
In rate-table code, we use std::int32_t and so on; clients use long int
table numbers, but should probably use int32_t. But for calendar dates,
there's a conflict: class calendar_date uses plain int, which isn't
guaranteed by the standard to be sufficient, yet some of its clients
are more conservative--e.g., 'census_view.cpp':
static long int const jdn_min = calendar_date::gregorian_epoch_jdn;
What should we do for JDN?
- specify 'long int', so that the standard guarantees correctness?
- specify 'int' and assert that it's a 32-bit type?
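If we take the second option, the assertion might look like this
(a sketch, placed wherever class calendar_date is defined):

    #include <limits>

    // Plain int must have at least 31 value bits to hold any plausible
    // JDN; the standard alone guarantees only 15 (INT_MAX >= 32767).
    static_assert(31 <= std::numeric_limits<int>::digits);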
Furthermore, in 'census_view.cpp' we have
variant = static_cast<long int>(1 + row);
and it does make sense to allow more than 32767 rows, or even 65535,
but
unsigned int wxDataViewListCtrl::GetItemCount()
seems to constrain that to the range of unsigned int (in practice, 2^32 - 1),
so I guess we should use plain 'int' there. And we also have
wxSpinCtrl(...long int min, long int max, long int value);
which AFAICT should be changed to plain int to match the wx API.
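Where a long int value must still be passed to or received from such an
API, a bounds-checked narrowing helper keeps the conversion honest. This
one is hypothetical (name and all), just to illustrate the idea:

    #include <limits>
    #include <stdexcept>

    // Hypothetical helper: convert long int to int, or throw if the
    // value doesn't fit. Not actual lmi code.
    inline int checked_int_cast(long int z)
    {
        if(z < std::numeric_limits<int>::min() || std::numeric_limits<int>::max() < z)
            {
            throw std::runtime_error("Value does not fit in int.");
            }
        return static_cast<int>(z);
    }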