coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] stat: don't explicitly request file size for filenames


From: Pádraig Brady
Subject: Re: [PATCH] stat: don't explicitly request file size for filenames
Date: Fri, 12 Jul 2019 15:31:17 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0

On 09/07/19 22:43, Andreas Dilger wrote:
> On Jul 5, 2019, at 04:18, Jeff Layton <address@hidden> wrote:
>> 
>> On Thu, 2019-07-04 at 21:18 +0000, Andreas Dilger wrote:
>>> On Jul 4, 2019, at 04:43, Jeff Layton <address@hidden> wrote:
>>>> On Thu, 2019-07-04 at 06:37 -0400, Jeff Layton wrote:
>>>>> On Wed, 2019-07-03 at 20:24 +0000, Andreas Dilger wrote:
>>>>>> When calling 'stat -c %N' to print the filename, don't explicitly
>>>>>> request the size of the file via statx(), as it may add overhead on
>>>>>> some filesystems.  The size is only needed to optimize an allocation
>>>>>> for the relatively rare case of reading a symlink name, and the worst
>>>>>> effect is a somewhat-too-large temporary buffer may be allocated for
>>>>>> areadlink_with_size(), or internal retries if buffer is too small.
>>>>>> 
>>>>>> The file size will be returned by statx() on most filesystems, even
>>>>>> if not requested, unless the filesystem considers this to be too
>>>>>> expensive for that file, in which case the tradeoff is worthwhile.
>>>>>> 
>>>>>> * src/stat.c: Don't explicitly request STATX_SIZE for filenames.
>>>>>> Start with a 1KB buffer for areadlink_with_size() if st_size unset.
>>>>>> ---
>>>>>> src/stat.c | 6 ++++--
>>>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>>>> 
>>>>>> diff --git a/src/stat.c b/src/stat.c
>>>>>> index ec0bb7d..c887013 100644
>>>>>> --- a/src/stat.c
>>>>>> +++ b/src/stat.c
>>>>>> @@ -1282,7 +1282,7 @@ fmt_to_mask (char fmt)
>>>>>>  switch (fmt)
>>>>>>    {
>>>>>>    case 'N':
>>>>>> -      return STATX_MODE|STATX_SIZE;
>>>>>> +      return STATX_MODE;
>>>>>>    case 'd':
>>>>>>    case 'D':
>>>>>>      return STATX_MODE;
>>>>>> @@ -1491,7 +1491,9 @@ print_stat (char *pformat, size_t prefix_len, 
>>>>>> unsigned int m,
>>>>>>      out_string (pformat, prefix_len, quoteN (filename));
>>>>>>      if (S_ISLNK (statbuf->st_mode))
>>>>>>        {
>>>>>> -          char *linkname = areadlink_with_size (filename, 
>>>>>> statbuf->st_size);
>>>>>> +          /* if statx() didn't set size, most symlinks are under 1KB */
>>>>>> +          char *linkname = areadlink_with_size (filename, 
>>>>>> statbuf->st_size ?:
>>>>>> +                                                1023);
>>>>>>          if (linkname == NULL)
>>>>>>            {
>>>>>>              error (0, errno, _("cannot read symbolic link %s"),
>>>>> 
>>>>> Looks reasonable to me.
>>>>> 
>>>>> Reviewed-by: Jeff Layton <address@hidden>
>>>> 
>>>> Actually, I'll retract that just yet...
>>>> 
>>>> I'm not sure that statbuf->st_size will be initialized to 0 if statx
>>>> didn't fill out the stx_size field. You may need to accompany this patch
>>>> with one that zeroes out the stx struct in do_stat.
>>> 
>>> The kernel will explicitly zero out struct kstat at the beginning of the 
>>> callchain
>>> so that only valid values filled in by the filesystem are returned, in 
>>> order to
>>> avoid leaking any kernel memory to userspace:
>>> 
>>>    int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
>>>                          u32 request_mask, unsigned int query_flags)
>>>    {
>>>            struct inode *inode = d_backing_inode(path->dentry);
>>> 
>>>            memset(stat, 0, sizeof(*stat));
>>>            stat->result_mask |= STATX_BASIC_STATS;
>>>            request_mask &= STATX_ALL;
>>>            query_flags &= KSTAT_QUERY_FLAGS;
>>> 
>>> 
>>> Both the kernel generic_fillattr() and statx_to_stat() copy all of the 
>>> fields,
>>> regardless of whether the corresponding bits are set in the request_mask or 
>>> not,
>>> so this should be totally safe and there is no benefit to zero the struct in
>>> userspace first.
>> 
>> I don't think it's wise to rely on undocumented behavior. That could
>> easily change in the future, and what if (e.g.) BSD grows a statx
>> implementation that doesn't do that?
>> 
>> I'd suggest at least zeroing out the stx_size field first.
> 
> I looked at this, but there isn't much value to zero out the field since it
> is unconditionally copied from the kernel upon return.  For the unusual case
> that the kernel doesn't touch the stx_size (or other) field then I've added
> a zero initializer for the statx struct.  Updated patch attached.
> 
> Even so, it isn't a huge deal if the stx_size is random garbage.  At worst it
> will be too small and need a few readlink() iterations to get the right size,
> and if too large it will be capped by areadlink_with_size(), so no harm is
> done.  With the separate changes to areadlink_with_size() to shrink the buffer
> after allocation even this would not be significant.

Given the changes to areadlink_with_size() I've removed the 0 -> 1023 change
in this patch, and just let areadlink_with_size() handle a 0 size appropriately.

I'll push this later today.

thanks,
Pádraig




reply via email to

[Prev in Thread] Current Thread [Next in Thread]