[Nmh-workers] m_getfld() and Friends.
From: Ralph Corderoy
Subject: [Nmh-workers] m_getfld() and Friends.
Date: Tue, 23 May 2017 18:51:59 +0100
Hi,
I've been poking m_getfld() a bit, trying to get a firm understanding of
what all its callers demand. Thought I'd pick the list's brains.
int m_getfld(m_getfld_state_t *gstate,
char name[NAMESZ],
char *buf, int *bufsz,
FILE *iob)
On entry, *bufsz is the size of buf. I've been using buf[7] just to
have it be too small quite easily. m_getfld() returns one of
#define LENERR (-2) /* Name too long error from getfld */
#define FMTERR (-3) /* Message Format error */
#define FLD 0 /* Field returned */
#define FLDPLUS 1 /* Field returned with more to come */
#define BODY 3 /* Body returned with more to come */
#define FILEEOF 5 /* Reached end of input file */
to indicate the type of buf's contents.
I temporarily modified uip/scan.c so it just loops until FILEEOF is
returned. Here's the small test email I use.
$ wc -c email
75 email
$
$ cat email
a: A
ab: A
abc: A
abcd: A
f: ABCDEFGHIJKLMNOPQRSTUVWXYZ

body1
body2
body3
$
And the output.
state: field read: 5 0- 75 name: 'a' buf: ' A\n\0' =4
state: field read: 6 75- 75 name: 'ab' buf: ' A\n\0' =4
state: field read: 7 75- 75 name: 'abc' buf: ' A\n\0' =4
state: field read: 8 75- 75 name: 'abcd' buf: ' A\n\0' =4
state: field-plus read: 7 75- 75 name: 'f' buf: ' ABCDE\0' =7
state: field-plus read: 6 75- 75 name: 'f' buf: 'FGHIJK\0' =7
state: field-plus read: 6 75- 75 name: 'f' buf: 'LMNOPQ\0' =7
state: field-plus read: 6 75- 75 name: 'f' buf: 'RSTUVW\0' =7
state: field read: 5 75- 75 name: 'f' buf: 'XYZ\n\0' =5
state: body read: 7 75- 75 name: '' buf: 'body1\n\0' =7
state: body read: 6 75- 75 name: '' buf: 'body2\n\0' =7
state: body read: 6 75- 75 name: '' buf: 'body3\n\0' =7
state: eof read: 0 75- 75 name: '' buf: '\0' =1
`read' is the value of *bufsz after the call. It seems to be telling me
how many bytes of input have been processed?
The `0- 75' is ftello(3)'s result before and after m_getfld(). This
email is small enough that m_getfld() reads the whole file in one go on
the first call, into some buffer of its own since buf[7] is too small.
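To illustrate why the position jumps from 0 to 75 and then stays put, here
is a minimal sketch, not m_getfld()'s actual code. slurp() is a
hypothetical helper; it assumes only that the reader pulls as much as it
can into a private buffer with fread(3), which is my guess at the cause.

```c
#define _POSIX_C_SOURCE 200809L   /* for ftello(3) and off_t */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not part of nmh: read up to cap bytes of fp into
 * a private buffer and report ftello(3) before and after the read.
 */
size_t slurp(FILE *fp, char *priv, size_t cap, off_t *before, off_t *after)
{
    assert(fp != NULL && priv != NULL && before != NULL && after != NULL);

    *before = ftello(fp);
    size_t got = fread(priv, 1, cap, fp);   /* takes whatever is available */
    *after = ftello(fp);
    return got;
}
```

Called on a 75-byte file with a cap larger than 75, the first call would
report 0 before and 75 after; a second call reads nothing and reports 75
both times, however little of the private buffer has been handed to the
caller so far.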
For field and field-plus state return values, `name' is the header's
name. A sequence of field-plus is terminated by a field.
I print buf[]'s contents up to and including the NUL; the `=4' is how
many bytes were printed. Note, it does not tally with `read'; that's fine.
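For reference, the buf notation could be produced by something like the
following. This is a sketch only; show_buf() is a hypothetical name, not
the debug code actually used for the output above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write buf in the quoted notation used above, up
 * to and including the terminating NUL, and return how many bytes that
 * covered (the `=N' figure).
 */
size_t show_buf(FILE *out, const char *buf)
{
    assert(out != NULL && buf != NULL);

    size_t n = strlen(buf) + 1;          /* the NUL itself is counted */

    fputc('\'', out);
    for (size_t i = 0; i < n; i++) {
        switch (buf[i]) {
        case '\n': fputs("\\n", out); break;
        case '\0': fputs("\\0", out); break;
        default:   fputc(buf[i], out); break;
        }
    }
    fprintf(out, "' =%zu\n", n);
    return n;
}
```

So " A\n" counts as 4, and a full seven-byte buf such as " ABCDE" counts
as 7, because the NUL is included each time.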
The sum of read's `5 6 7 8 7 6 6 6 5 7 6 6 0' is 75, matching wc(1)
above.
Unlike field, the body state has no `plus' variation, despite the
`with more to come' part of its comment.
#define FLDPLUS 1 /* Field returned with more to come */
#define BODY 3 /* Body returned with more to come */
Looking more closely at the read values,
state: field read: 5 0- 75 name: 'a' buf: ' A\n\0' =4
state: field read: 6 75- 75 name: 'ab' buf: ' A\n\0' =4
state: field read: 7 75- 75 name: 'abc' buf: ' A\n\0' =4
state: field read: 8 75- 75 name: 'abcd' buf: ' A\n\0' =4
The 5 is `f: A\n'. 6, 7, and 8 are similar with growing header names.
So far, 5 6 7 8 sum to 26, and that checks out.
$ od -Ad -cN26 email
0000000 a : A \n a b : A \n a b c :
0000016 A \n a b c d : A \n
Next,
state: field-plus read: 7 75- 75 name: 'f' buf: ' ABCDE\0' =7
`f: ABCDE' is eight, but read is 7.
$ od -Ad -cj26 -N7 email
0000026 f : A B C D
sizeof buf is 7 so ' ABCDE\0' =7 above is correct; buf has been fully
utilised. Should read be 8, or have I misunderstood its intent?
state: field-plus read: 6 75- 75 name: 'f' buf: 'FGHIJK\0' =7
state: field-plus read: 6 75- 75 name: 'f' buf: 'LMNOPQ\0' =7
state: field-plus read: 6 75- 75 name: 'f' buf: 'RSTUVW\0' =7
Next three field-plus are back on track; six read each time.
state: field read: 5 75- 75 name: 'f' buf: 'XYZ\n\0' =5
The end of the `f' header, `XYZ\n', is four bytes, but I think read=5
because it's including the `\n' that ends the headers' section? Let's
assume that.
sum 5 6 7 8 7 6 6 6 5 is 56.
$ od -Ad -cN56 email
0000000 a : A \n a b : A \n a b c :
0000016 A \n a b c d : A \n f : A B C
0000032 D E F G H I J K L M N O P Q R S
0000048 T U V W X Y Z \n
Still one adrift after the earlier problem; header-ending `\n' not included.
state: body read: 7 75- 75 name: '' buf: 'body1\n\0' =7
`body1\n' is only six but read=7, so is this also counting the `\n' that
never ends up in buf, but precedes the body? That means it features
twice in `read' as an extra, but never in buf.
The double counting here "fixes" the shortage earlier at the first
field-plus. reads of 5 6 7 8 7 6 6 6 5 7 sum to 63.
$ od -Ad -cN63 email
0000000 a : A \n a b : A \n a b c :
0000016 A \n a b c d : A \n f : A B C
0000032 D E F G H I J K L M N O P Q R S
0000048 T U V W X Y Z \n \n b o d y 1 \n
We're back in sync!
state: body read: 6 75- 75 name: '' buf: 'body2\n\0' =7
state: body read: 6 75- 75 name: '' buf: 'body3\n\0' =7
These correctly have a read of 6.
state: eof read: 0 75- 75 name: '' buf: '\0' =1
eof state neatly says read=0 and makes sure nothing is in buf.
m_getfld() has another mode where it keeps the FILE's position set.
With that, ftello(3) before and after show different positions. Nothing
else changes, including the `read's.
state: field read: 5 0- 5 name: 'a' buf: ' A\n\0' =4
state: field read: 6 5- 11 name: 'ab' buf: ' A\n\0' =4
state: field read: 7 11- 18 name: 'abc' buf: ' A\n\0' =4
state: field read: 8 18- 26 name: 'abcd' buf: ' A\n\0' =4
state: field-plus read: 7 26- 33 name: 'f' buf: ' ABCDE\0' =7
state: field-plus read: 6 33- 39 name: 'f' buf: 'FGHIJK\0' =7
state: field-plus read: 6 39- 45 name: 'f' buf: 'LMNOPQ\0' =7
state: field-plus read: 6 45- 51 name: 'f' buf: 'RSTUVW\0' =7
state: field read: 5 51- 56 name: 'f' buf: 'XYZ\n\0' =5
state: body read: 7 56- 63 name: '' buf: 'body1\n\0' =7
state: body read: 6 63- 69 name: '' buf: 'body2\n\0' =7
state: body read: 6 69- 75 name: '' buf: 'body3\n\0' =7
state: eof read: 0 75- 75 name: '' buf: '\0' =1
The cumulation of the `read's matches the `after' file position.
$ dc <<<'0 5+p 6+p 7+p 8+p 7+p 6+p 6+p 6+p 5+p 7+p 6+p 6+p 0+p' | fmt
5 11 18 26 33 39 45 51 56 63 69 75 75
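The same running-total check in C, should it be wanted in a test. This
is a sketch; running_totals() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: store in total[i] the sum of reads[0..i] and
 * return the grand total, so each entry can be compared with the
 * ftello(3) position seen after the corresponding call.
 */
long running_totals(const int *reads, size_t n, long *total)
{
    assert(reads != NULL && total != NULL);

    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += reads[i];
        total[i] = sum;
    }
    return sum;
}
```

Fed the thirteen `read' values above, each total[i] should equal the
`after' position on the corresponding line of output.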
Reading those ranges of positions gives
0000000 a : A \n
0000005 a b : A \n
0000011 a b c : A \n
0000018 a b c d : A \n
0000026 f : A B C D
0000033 E F G H I J
0000039 K L M N O P
0000045 Q R S T U V
0000051 W X Y Z \n
0000056 \n b o d y 1 \n
0000063 b o d y 2 \n
0000069 b o d y 3 \n
0000075
This matches the above account; out of sync at the `E'. The separating
`\n' is in the range for the first `body'.
Questions:
Should the file position always be just after what's returned in buf?
And cumulative `read's to that point match the position?
buf should never have the separating `\n', but the `read' that skipped
it for the first `body' return will be one higher to keep the cumulation
in sync.
I think that makes the desired output
state: field read: 5 0- 5 name: 'a' buf: ' A\n\0' =4
state: field read: 6 5- 11 name: 'ab' buf: ' A\n\0' =4
state: field read: 7 11- 18 name: 'abc' buf: ' A\n\0' =4
state: field read: 8 18- 26 name: 'abcd' buf: ' A\n\0' =4
state: field-plus read: 8¹ 26- 34 name: 'f' buf: ' ABCDE\0' =7
state: field-plus read: 6 34- 40 name: 'f' buf: 'FGHIJK\0' =7
state: field-plus read: 6 40- 46 name: 'f' buf: 'LMNOPQ\0' =7
state: field-plus read: 6 46- 52 name: 'f' buf: 'RSTUVW\0' =7
state: field read: 4² 52- 56 name: 'f' buf: 'XYZ\n\0' =5
state: body read: 7³ 56- 63 name: '' buf: 'body1\n\0' =7
state: body read: 6 63- 69 name: '' buf: 'body2\n\0' =7
state: body read: 6 69- 75 name: '' buf: 'body3\n\0' =7
state: eof read: 0 75- 75 name: '' buf: '\0' =1
where
1. read=8 not 7 to include the `E' in buf.
2. read=4 not 5 to exclude the separating `\n' not in buf.
state returned is `field', not `field-last', so I don't think
read should deviate.
3. read=7 still to cumulate the separating `\n' not in buf.
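To pin down the accounting I'm proposing, here is a toy model working on
an in-memory message. It is not m_getfld(): toy_getfld() and struct
toy_state are hypothetical, it ignores continuation lines and LENERR, and
it reports the count through a separate nread rather than *bufsz. The only
point is that nread always equals how far the position moved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Values copied from the defines quoted above. */
enum { TOY_FMTERR = -3, TOY_FLD = 0, TOY_FLDPLUS = 1, TOY_BODY = 3, TOY_FILEEOF = 5 };

struct toy_state {
    const char *in;   /* whole message, held in memory */
    size_t len;       /* number of bytes in `in' */
    size_t pos;       /* next unread byte; stands in for the file position */
    int in_value;     /* non-zero while part-way through a header's value */
    int in_body;      /* non-zero once the separating '\n' has been skipped */
};

/* Hypothetical stand-in for m_getfld(); *nread is how far pos advanced. */
int toy_getfld(struct toy_state *t, char *name, size_t namesz,
               char *buf, size_t bufsz, size_t *nread)
{
    assert(namesz >= 1 && bufsz >= 2);
    size_t start = t->pos, n = 0;
    int state;

    buf[0] = '\0';
    *nread = 0;
    if (t->pos == t->len) {
        name[0] = '\0';
        return TOY_FILEEOF;
    }
    if (!t->in_body && !t->in_value && t->in[t->pos] == '\n') {
        t->pos++;                 /* separator: never copied, but counted */
        t->in_body = 1;
    }
    if (t->in_body) {
        name[0] = '\0';
        while (n < bufsz - 1 && t->pos < t->len)
            buf[n++] = t->in[t->pos++];
        state = TOY_BODY;
    } else {
        if (!t->in_value) {       /* start of a header: take its name */
            const char *p = t->in + t->pos;
            const char *colon = memchr(p, ':', t->len - t->pos);
            const char *nl = memchr(p, '\n', t->len - t->pos);
            if (colon == NULL || (nl != NULL && nl < colon))
                return TOY_FMTERR;
            size_t namelen = (size_t)(colon - p);
            if (namelen > namesz - 1)
                namelen = namesz - 1;   /* truncated; the real code has LENERR */
            memcpy(name, p, namelen);
            name[namelen] = '\0';
            t->pos = (size_t)(colon - t->in) + 1;
            t->in_value = 1;
        }
        state = TOY_FLDPLUS;      /* until the value's '\n' has been copied */
        while (n < bufsz - 1 && t->pos < t->len) {
            buf[n] = t->in[t->pos++];
            if (buf[n++] == '\n') {
                state = TOY_FLD;
                break;
            }
        }
        if (t->pos == t->len)
            state = TOY_FLD;      /* input ended without a final '\n' */
        if (state == TOY_FLD)
            t->in_value = 0;
    }
    buf[n] = '\0';
    *nread = t->pos - start;
    return state;
}
```

Looping over the test email with a buf[7] until TOY_FILEEOF should give
nread values of 5 6 7 8 8 6 6 6 4 7 6 6 0, as in the desired output, and
they still sum to 75.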
--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy