Re: [Question] How the sed deal with the '\0' embedded in string?
From: Assaf Gordon
Subject: Re: [Question] How the sed deal with the '\0' embedded in string?
Date: Tue, 13 Sep 2016 10:49:25 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0

Hello,
On 09/13/2016 12:11 AM, Du Dengke wrote:
> I am learning the sed source code, I want to know how the sed deal with the
> '\0' embedded in string.
> I can't find the *.c deal with the '\0', could anybody tell me where?

I assume you're asking about NUL characters in the input file, and not in the
sed program.
The flow is:
  execute.c:read_pattern_space()
    (using function-pointer input->read_fn)
    execute.c:read_file_line()
      utils.c:ck_getdelim()
        getdelim(3)
      execute.c:str_append()

getdelim(3) is a standard POSIX function that reads from a stream (e.g. stdin)
until the specified delimiter is encountered, or EOF.
If the delimiter is '\n' (the default in sed),
then NULs in the input are read as-is without special treatment.
All these functions use an explicit length (the number of bytes read)
rather than strlen(), so embedded NULs are not an issue.
The following demonstrates getdelim(3) (error checking omitted for brevity):
$ cat getdelimtest.c
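#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <sys/types.h>

int main(void)
{
  char *buf = NULL;   /* getdelim allocates (and grows) this buffer */
  size_t l = 0;       /* allocated size of buf, not the bytes read */
  /* getdelim returns ssize_t: the byte count, or -1 on error/EOF */
  ssize_t n = getdelim(&buf, &l, '\n', stdin);
  printf("getdelim read %zd bytes\n", n);
  for (ssize_t i = 0; i < n; i++) {
    unsigned char c = buf[i];   /* isprint needs a non-negative value */
    printf("[%zd] = ", i);
    if (isprint(c))
      printf("'%c'\n", c);
    else
      printf("0x%02x\n", c);
  }
  free(buf);
  return 0;
}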
$ gcc -g -Wall -o getdelimtest getdelimtest.c
$ printf 'a\0b\n' | ./getdelimtest
getdelim read 4 bytes
[0] = 'a'
[1] = 0x00
[2] = 'b'
[3] = 0x0a
regards,
- assaf