[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] stat: don't explicitly request file size for filenames
From: |
Jeff Layton |
Subject: |
Re: [PATCH] stat: don't explicitly request file size for filenames |
Date: |
Wed, 10 Jul 2019 07:21:35 -0400 |
User-agent: |
Evolution 3.32.3 (3.32.3-1.fc30) |
On Tue, 2019-07-09 at 21:43 +0000, Andreas Dilger wrote:
> On Jul 5, 2019, at 04:18, Jeff Layton <address@hidden> wrote:
> >
> > On Thu, 2019-07-04 at 21:18 +0000, Andreas Dilger wrote:
> >> On Jul 4, 2019, at 04:43, Jeff Layton <address@hidden> wrote:
> >>> On Thu, 2019-07-04 at 06:37 -0400, Jeff Layton wrote:
> >>>> On Wed, 2019-07-03 at 20:24 +0000, Andreas Dilger wrote:
> >>>>> When calling 'stat -c %N' to print the filename, don't explicitly
> >>>>> request the size of the file via statx(), as it may add overhead on
> >>>>> some filesystems. The size is only needed to optimize an allocation
> >>>>> for the relatively rare case of reading a symlink name, and the worst
> >>>>> effect is a somewhat-too-large temporary buffer may be allocated for
> >>>>> areadlink_with_size(), or internal retries if buffer is too small.
> >>>>>
> >>>>> The file size will be returned by statx() on most filesystems, even
> >>>>> if not requested, unless the filesystem considers this to be too
> >>>>> expensive for that file, in which case the tradeoff is worthwhile.
> >>>>>
> >>>>> * src/stat.c: Don't explicitly request STATX_SIZE for filenames.
> >>>>> Start with a 1KB buffer for areadlink_with_size() if st_size unset.
> >>>>> ---
> >>>>> src/stat.c | 6 ++++--
> >>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/src/stat.c b/src/stat.c
> >>>>> index ec0bb7d..c887013 100644
> >>>>> --- a/src/stat.c
> >>>>> +++ b/src/stat.c
> >>>>> @@ -1282,7 +1282,7 @@ fmt_to_mask (char fmt)
> >>>>> switch (fmt)
> >>>>> {
> >>>>> case 'N':
> >>>>> - return STATX_MODE|STATX_SIZE;
> >>>>> + return STATX_MODE;
> >>>>> case 'd':
> >>>>> case 'D':
> >>>>> return STATX_MODE;
> >>>>> @@ -1491,7 +1491,9 @@ print_stat (char *pformat, size_t prefix_len,
> >>>>> unsigned int m,
> >>>>> out_string (pformat, prefix_len, quoteN (filename));
> >>>>> if (S_ISLNK (statbuf->st_mode))
> >>>>> {
> >>>>> - char *linkname = areadlink_with_size (filename,
> >>>>> statbuf->st_size);
> >>>>> + /* if statx() didn't set size, most symlinks are under 1KB */
> >>>>> + char *linkname = areadlink_with_size (filename,
> >>>>> statbuf->st_size ?:
> >>>>> + 1023);
> >>>>> if (linkname == NULL)
> >>>>> {
> >>>>> error (0, errno, _("cannot read symbolic link %s"),
> >>>>
> >>>> Looks reasonable to me.
> >>>>
> >>>> Reviewed-by: Jeff Layton <address@hidden>
> >>>
> >>> Actually, I'll retract that just yet...
> >>>
> >>> I'm not sure that statbuf->st_size will be initialized to 0 if statx
> >>> didn't fill out the stx_size field. You may need to accompany this patch
> >>> with one that zeroes out the stx struct in do_stat.
> >>
> >> The kernel will explicitly zero out struct kstat at the beginning of the
> >> callchain
> >> so that only valid values filled in by the filesystem are returned, in
> >> order to
> >> avoid leaking any kernel memory to userspace:
> >>
> >> int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
> >> u32 request_mask, unsigned int query_flags)
> >> {
> >> struct inode *inode = d_backing_inode(path->dentry);
> >>
> >> memset(stat, 0, sizeof(*stat));
> >> stat->result_mask |= STATX_BASIC_STATS;
> >> request_mask &= STATX_ALL;
> >> query_flags &= KSTAT_QUERY_FLAGS;
> >>
> >>
> >> Both the kernel generic_fillattr() and statx_to_stat() copy all of the
> >> fields,
> >> regardless of whether the corresponding bits are set in the request_mask
> >> or not,
> >> so this should be totally safe and there is no benefit to zero the struct
> >> in
> >> userspace first.
> >
> > I don't think it's wise to rely on undocumented behavior. That could
> > easily change in the future, and what if (e.g.) BSD grows a statx
> > implementation that doesn't do that?
> >
> > I'd suggest at least zeroing out the stx_size field first.
>
> I looked at this, but there isn't much value to zero out the field since it
> is unconditionally copied from the kernel upon return. For the unusual case
> that the kernel doesn't touch the stx_size (or other) field then I've added
> a zero initializer for the statx struct. Updated patch attached.
>
> Even so, it isn't a huge deal if the stx_size is random garbage. At worst it
> will be too small and need a few readlink() iterations to get the right size,
> and if too large it will be capped by areadlink_with_size(), so no harm is
> done. With the separate changes to areadlink_with_size() to shrink the buffer
> after allocation even this would not be significant.
>
Fair enough. This patch looks fine to me.
If we were concerned about it, the cleaner fix would probably be to have
statx_to_stat zero out st_size if STATX_SIZE wasn't set in the returned
mask. This should be fine though.
Reviewed-by: Jeff Layton <address@hidden>
- [PATCH] stat: don't explicitly request file size for filenames, Andreas Dilger, 2019/07/03
- Re: [PATCH] stat: don't explicitly request file size for filenames, Jeff Layton, 2019/07/04
- Re: [PATCH] stat: don't explicitly request file size for filenames, Jeff Layton, 2019/07/04
- Re: [PATCH] stat: don't explicitly request file size for filenames, Andreas Dilger, 2019/07/04
- Re: [PATCH] stat: don't explicitly request file size for filenames, Jeff Layton, 2019/07/05
- Re: [PATCH] stat: don't explicitly request file size for filenames, Jeff Layton, 2019/07/05
- Re: [PATCH] stat: don't explicitly request file size for filenames, Andreas Dilger, 2019/07/09
- Re: [PATCH] stat: don't explicitly request file size for filenames,
Jeff Layton <=
- Re: [PATCH] stat: don't explicitly request file size for filenames, Pádraig Brady, 2019/07/12
Re: [PATCH] stat: don't explicitly request file size for filenames, Pádraig Brady, 2019/07/04
Re: [PATCH] stat: don't explicitly request file size for filenames, Paul Eggert, 2019/07/05