groff
Re: [PATCH v1 1/2] [troff]: Add lengthof() macro.


From: Alejandro Colomar
Subject: Re: [PATCH v1 1/2] [troff]: Add lengthof() macro.
Date: Sun, 27 Aug 2023 02:00:01 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.1

Hi Lennart, Branden,

On 2023-08-26 21:36, Lennart Jablonka wrote:
> Quoth Alejandro Colomar:
>> The good news is that I like the implementation.  I just don't like the
>> name.  I have a sizeof_array() macro that does
>>
>> #define sizeof_array(a)  (sizeof(a) + must_be_array(a))
>>
>> That is, it calculates the size in bytes that the array takes up in memory.
>>
>> <https://github.com/shadow-maint/shadow/pull/762/commits/8d06d849dcb5f7041e048d866ec7ce6c1853245b>
>>
>> For a macro that returns the number of elements in an array, I'd like a
>> name that cannot be confused with that at all.  NELEMS(), NITEMS(),
>> array_count(), or lengthof() all seem better than array_size().
>>
>> I'm not a fan of lengthof(), even if it's a proposal to ISO C, as so far
>> the term "length" was only the number of non-zero characters in a string,
>> and overloading it to mean the number of elements in an array would
>> similarly be a bad thing.  At least it's not so confusing as size, though.
>>
>> NITEMS() or NELEMS() seems the best choice to me.
> 
> I like both lengthof (as “length” is commonly used for the number 
> of elements in an array) and nelem (which is what Plan 9 uses).   
> I do wanna note that the term used by the C standard is “size” and 
> the term used by the C++ standard is “bound.”   How about boundof?

A quotation of ISO C would be interesting here.  Preferably C17.

I certainly remember ISO C talking about the number of elements in an
array.  While I wouldn't be surprised to learn that it uses the term
'size' when it's being more specific, I don't recall it off the top
of my head.


On 2023-08-26 18:43, G. Branden Robinson wrote:
> Hi Alex,
> 
> At 2023-08-26T00:55:16+0200, Alejandro Colomar wrote:
>> The good news is that I like the implementation.  I just don't like the
>> name.  I have a sizeof_array() macro that does
>>
>> #define sizeof_array(a)  (sizeof(a) + must_be_array(a))
>>
>> That is, it calculates the size in bytes that the array takes up in
>> memory.
>>
>> <https://github.com/shadow-maint/shadow/pull/762/commits/8d06d849dcb5f7041e048d866ec7ce6c1853245b>
>>
>> For a macro that returns the number of elements in an array, I'd like
>> a name that cannot be confused with that at all.  NELEMS(), NITEMS(),
>> array_count(), or lengthof() all seem better than array_size().
> 
> This is a sensible enough objection.  I'm renaming it to "array_length".

That's a fair enough name to me.

Acked-by: Alejandro Colomar <alx@kernel.org>

Thanks!

> I want to keep "array_" in the name because it works only with array
> types.

I don't think that's necessarily bad, although I wonder whether it's a
bit redundant.  I assume you want it so that compiler errors are
clear.  Otherwise, someone calling this template with a non-array may
be confused by the errors.

In my C version, must_be_array() (and its internal helpers) is the one
that clarifies the warnings, as it has _array() in its name.  Also, C
having far fewer kinds of types, there's less chance to confuse
programmers.
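
For readers who haven't seen the idiom, here is a minimal sketch of how
such a check can be built.  It assumes GNU C (GCC or Clang) for
__typeof__ and __builtin_types_compatible_p.  The helper names
is_same_type(), is_array(), must_be(), nitems(), and the demo functions
are picked for illustration only; the code at the commit linked above
is the reference and may differ.

```c
#include <assert.h>
#include <stddef.h>

/* GNU C extensions; not ISO C. */
#define is_same_type(a, b)                                            \
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

/* An array has a different type than a pointer to its first element;
   a pointer p has the same type as &p[0]. */
#define is_array(a)  (!is_same_type((a), &(a)[0]))

/* Evaluates to (int) 0 if e is true; otherwise fails to compile. */
#define must_be(e)                                                    \
	(0 * (int) sizeof(struct {                                    \
		_Static_assert((e), "must_be() condition is false");  \
		char  unused_;                                        \
	}))

#define must_be_array(a)  must_be(is_array(a))

/* Size in bytes of an array; rejects pointers at compile time. */
#define sizeof_array(a)   (sizeof(a) + must_be_array(a))

/* Number of elements of an array (illustrative lowercase name). */
#define nitems(a)         (sizeof_array(a) / sizeof((a)[0]))

static inline size_t
demo_nitems(void)
{
	int  a[7];

	return nitems(a);        /* number of elements */
}

static inline size_t
demo_sizeof_array(void)
{
	int  a[7];

	return sizeof_array(a);  /* size in bytes */
}
```

Passing a pointer instead of an array, as in 'int *p; nitems(p);', would
trip the _Static_assert, and the diagnostic would point at
must_be_array().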

[...]

>> I'm not a fan of lengthof(), even if it's a proposal to ISO C, as so
>> far the term "length" was only the number of non-zero characters in a
>> string,
> 
> I submit to you that this is an example of C programmers' parochialism.
> Character arrays have outsized importance to practitioners of that
> language, in part because C was applied to the problem domain of text
> processing before it had to mature features to support that application.

Pipes probably helped or were helped by this, so I guess overall it was
a good thing that it happened this way.

[...]

> 
> Further, you and I have spoken before of the Shlemiel the Painter
> problems to which users of string.h are prone.

Good compilers these days optimize strcat(3) and strlcat(3) out, so
it's not so bad.  And in source code they are readable.  I'm starting
to like them.  Some programmers may argue that GCC and Clang are not
representative of C, and that the ISO C (and K&R C before it)
language allows doing all kinds of braindamaged stuff, so the language
is broken.
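
For context, the pattern in question looks roughly like the sketch
below.  The function names are made up for illustration, and the
snippet says nothing about whether a given compiler actually rewrites
the first form into something like the second.

```c
#include <assert.h>
#include <string.h>

/* Shlemiel's way: every strcat(3) call walks dst from the beginning
   to find the terminating null byte before appending. */
static inline void
join_rescan(char *dst, const char *const *words, size_t n)
{
	dst[0] = '\0';
	for (size_t i = 0; i < n; i++)
		strcat(dst, words[i]);
}

/* Remembers where the string currently ends, so nothing is rescanned. */
static inline void
join_tracked(char *dst, const char *const *words, size_t n)
{
	char  *end = dst;

	for (size_t i = 0; i < n; i++) {
		size_t  len = strlen(words[i]);

		memcpy(end, words[i], len);
		end += len;
	}
	*end = '\0';
}

/* Both approaches must produce the same string. */
static inline int
joins_agree(void)
{
	static const char *const  words[] = { "foo", "bar", "baz" };
	char  a[16], b[16];

	join_rescan(a, words, 3);
	join_tracked(b, words, 3);
	return strcmp(a, "foobarbaz") == 0 && strcmp(a, b) == 0;
}
```

The callers are responsible for making dst large enough; neither
function checks for overflow.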

> 
> I submit that a language with even a moderately strong type system
> ineluctably implies the existence and application of collections of
> objects of those type.  "Length" is not a concept that `char[]` should
> be permitted to monopolize from a bunker.

But then we have a problem.  Consider the following declarations of
local variables in a function.

size_t   len;
wchar_t  ws[10];

I'd like these declarations alone to tell you what kind of code to
expect in the function.  With present-day conventions, we will agree
that len will hold at most the value 9, most likely (the longest string
that fits in ws, leaving room for the terminating null wide character).
But if we allow length to be the number of elements in an array, then
I don't know if len will be 9 or 10, which could incite off-by-one bugs.

I chose wide chars on purpose, to also note how sizeof() is often
abused as nelems() when we deal with a char array, as sizeof(char)
is 1 by definition.

If there were also a 'size_t  size;' declaration, I guess we'd also
agree that it would likely contain 'sizeof(wchar_t[10])', whatever
that is.
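
To make the three quantities concrete, here is a minimal sketch.  The
nelems() macro, the struct, and the function name are illustrative
assumptions, not code from the patch; the string literal is chosen so
that it exactly fills the array.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Illustrative only: number of elements, with no array type check. */
#define nelems(a)  (sizeof(a) / sizeof((a)[0]))

struct ws_counts {
	size_t  len;   /* wide characters before the terminating L'\0' */
	size_t  n;     /* number of elements in the array */
	size_t  size;  /* size of the array in bytes */
};

static inline struct ws_counts
ws_counts_demo(void)
{
	/* 9 wide characters plus the terminator fill all 10 elements. */
	wchar_t           ws[10] = L"123456789";
	struct ws_counts  c;

	c.len  = wcslen(ws);   /* length of the string: 9 */
	c.n    = nelems(ws);   /* elements in the array: 10 */
	c.size = sizeof(ws);   /* bytes: 10 * sizeof(wchar_t) */
	return c;
}
```

Only for char arrays do the last two values coincide, which is why
sizeof() so often gets away with standing in for nelems().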

Considering how intrinsic strings are to C, I'd say setting the
meaning of length in stone would be beneficial.  I tried to
do it in string_copying(7).

> 
>> NITEMS() or NELEMS() seems the best choice to me.
> 
> I don't like these very much because (a) they're not English words and

Would number_of_items() (or _elements()) read better to you?  The
problem would be breaking long lines.  Indenting with 8-wide tabs
and having 80-col terminals don't fit well with such long names.

I think NITEMS() and NELEMS() are well-understood abbreviations.

> (b) they shout.  I concede that there are reasons, in C (cf. C++) to
> shout function-like macro names.

Maybe we could make them lowercase, since macro magic ensures that
there's no danger of misuse.  Uppercase was used as a "here be
dragons" notice to the programmer, but having made the macros safe
with magic, I think lowercase names would be fair.

>  Rust's `!` suffix is a nice idea.
> 
>> I had a similar complaint to this one to kernel people, which were
>> even more evil than you,
> 
> I am wounded that you regard my capacity for evil as meager. :P

Oh, no, I didn't intend to offend you.  It was only meager in this
macro.  In the code I'm patching in patch 2/2, you were quite evil!  :P


[...]

>>> At 2023-08-04T15:40:30+0200, Alejandro Colomar wrote:
>>>> In C++17, I'd just call std::size().
>>>>
>>>> In C11 (or C++11), I'd add a static_assert(3) to that macro to make
>>>> it safer (but compiler warnings already make it reasonably
>>>> safe[1]).
>>>>
>>>> In a mix of C++98 / C99, there's nothing I know of.  We could use
>>>> templates, which is how I bet std::size() is implemented, but I
>>>> don't have enough experience with them to do this kind of magic.
> 
> Incidentally, this was a valuable summary of the language's history.
> I have a pending commit to the "HACKING" file informed by this.

Please ping me when there's something that I can have a look at :)

> 
>>> But I think want to understand the C++98 ships in the harbor a bit
>>> before I burn the fleet.
>>
>> I'll send some C99 ships to try to convince you to get on board with
>> their brand new sails --pure cotton, or 99%--.
> 
> I like C99 fine.  It's just not the language that (most of) groff is
> written in, and were I to undertake a ground-up replacement, I'd use a
> language I like more (albeit with an FFI with C/C++).

The good thing about moving to C is that you can do it opportunistically,
as you can compile most of it as C++.

> 
> Regards,
> Branden

Cheers,
Alex

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
