Re: [PATCH v6 3/5] 9pfs: add new function v9fs_co_readdir_many()
From: Greg Kurz
Subject: Re: [PATCH v6 3/5] 9pfs: add new function v9fs_co_readdir_many()
Date: Mon, 4 May 2020 11:18:34 +0200
On Fri, 01 May 2020 16:04:41 +0200
Christian Schoenebeck <address@hidden> wrote:
> On Donnerstag, 30. April 2020 15:30:49 CEST Greg Kurz wrote:
> > > > I agree that a client that issues concurrent readdir requests on the
> > > > same fid is probably asking for trouble, but this is permitted by the
> > > > spec. Whether we should detect such conditions and warn or even fail
> > > > is discussion for another thread.
> > > >
> > > > The locking is only needed to avoid concurrent accesses to the dirent
> > > > structure returned by readdir(), otherwise we could return partially
> > > > overwritten file names to the client. It must be done for each
> > > > individual
> > > > call to readdir(), but certainly not for multiple calls.
> > >
> > > Yeah, that would resolve this issue more appropriately for 9p2000.L, since
> > > Treaddir specifies an offset, but for 9p2000.u the result of a concurrent
> > > read on a directory (9p2000.u) would still be undefined.
> >
> > The bad client behavior you want to tackle has nothing to do with
> > the locking itself. Since all the code in 9p.c runs serialized in
> > the context of the QEMU main loop, concurrent readdir requests could
> > easily be detected up-front with a simple flag in the fid structure.
>
> Well, it's fine with me. I don't really see an issue here right now. But the
> claim that all the code is serialized is not fully true. Most of the 9p.c
> code is still heavily dispatching between main thread and worker threads
> back and forth. And for that reason the order of request processing might
> change quite arbitrarily in between. Just keep that in mind.
>
Just to make things clear. The code in 9p.c is ALWAYS exclusively run by
the main thread. Only the code called under v9fs_co_run_in_worker() is
dispatched on worker threads. So, yes the order of individual backend
operations may change, but the start of a new client request is necessarily
serialized with the completion of pending ones, which is the only thing
we care for actually.
> > > > > +
> > > > > + /* save the directory position */
> > > > > + saved_dir_pos = s->ops->telldir(&s->ctx, &fidp->fs);
> > > > > + if (saved_dir_pos < 0) {
> > > > > + err = saved_dir_pos;
> > > > > + goto out;
> > > > > + }
> > > > > +
> > > > > + while (true) {
> > > > > + /* get directory entry from fs driver */
> > > > > + err = do_readdir(pdu, fidp, &dent);
> > > > > + if (err || !dent) {
> > > > > + break;
> > > > > + }
> > > > > +
> > > > > + /*
> > > > > + * stop this loop as soon as it would exceed the allowed maximum
> > > > > + * response message size for the directory entries collected so far,
> > > > > + * because anything beyond that size would need to be discarded by
> > > > > + * 9p controller (main thread / top half) anyway
> > > > > + */
> > > > > + v9fs_string_init(&name);
> > > > > + v9fs_string_sprintf(&name, "%s", dent->d_name);
> > > > > + len = v9fs_readdir_response_size(&name);
> > > > > + v9fs_string_free(&name);
> > > > > + if (size + len > maxsize) {
> > > > > + /* this is not an error case actually */
> > > > > + break;
> > > > > + }
> > > > > +
> > > > > + /* append next node to result chain */
> > > > > + if (!e) {
> > > > > + *entries = e = g_malloc0(sizeof(V9fsDirEnt));
> > > > > + } else {
> > > > > + e = e->next = g_malloc0(sizeof(V9fsDirEnt));
> > > > > + }
> > > > > + e->dent = g_malloc0(sizeof(struct dirent));
> > > >
> > > > So we're allocating a bunch of stuff here...
> > > >
> > > > > + memcpy(e->dent, dent, sizeof(struct dirent));
> > > > > +
> > > > > + /* perform a full stat() for directory entry if requested by caller */
> > > > > + if (dostat) {
> > > > > + err = s->ops->name_to_path(
> > > > > + &s->ctx, &fidp->path, dent->d_name, &path
> > > > > + );
> > > > > + if (err < 0) {
> > > > > err = -errno;
> > > > > - } else {
> > > > > - *dent = entry;
> > > > > - err = 0;
> > > > > + break;
> > > >
> > > > ... but we're erroring out there and it seems that we're leaking
> > > > all the entries that have been allocated so far.
> > >
> > > No, they are not leaking actually.
> > >
> > > You are right that they are not deallocated in do_readdir_many(), but
> > > that's intentional: in the new implementation of v9fs_do_readdir() you
> > > see that v9fs_free_dirents(entries) is *always* called at the very end of
> > > the function, no matter if success or any error. That's one of the
> > > measures to simplify overall code as much as possible.
> >
> > Hmm... I still don't quite like the idea of having an erroring function
> > asking for extra cleanup. I suggest you come up with an idempotent version
> > of v9fs_free_dirents(), move it to codir.c (I also prefer locality of calls
> > to g_malloc and g_free in the same unit), make it extern and call it
> > both on the error path of v9fs_co_readdir_many() and in v9fs_do_readdir().
>
> I understand your position of course, but I still don't find that to be a
> good move.
>
> My veto here has a reason: your requested change would prevent an
> application that I had in mind for a future purpose: allowing "greedy"
> fetching of
Are you saying that this series has some kind of hidden agenda related to
a possible future change?!
> directory entries, that is returning all successfully read entries while
> some of the entries might have errored for some reason. Think about a
> directory where one entry is another device which errors for some reason;
> then a user probably still would want to see the other entries at least.
> I know that
Hrm... if dostat is such a weak requirement that callers might want to
simply ignore the missing stat, then readdir_many() shouldn't error out
in the first place.
> purpose would still need some minor adjustments, but no changes to the
> current structure of this function actually.
>
> But to also make this clear: there is no "extra" cleanup right now.
> v9fs_free_dirents(entries) is *always* called by the caller, no matter if
> success, error, allocated list or NULL. It couldn't be more simple. A user
> of this function must call v9fs_free_dirents(entries) at a certain point
> anyway.
>
> If you have a bad feeling about it, I can make it more clear by an API doc
> comment on v9fs_co_readdir_many() if you want, like e.g. "You @b MUST
> ALWAYS call v9fs_free_dirents() after using this function, both on success
> and error."
>
No, I just cannot ack the case of a function that returns partially valid
data and an error at the same time. Doesn't look sane to me.
> > > I think you mean the case when there's an error inside the if (dostat) {}
> > > block: The comments on struct V9fsDirEnt already suggest that the "st"
> > > member is optional and may be NULL. So if there's an error inside if
> > > (dostat) {} then caller still has a valid "dent" field at least and it's
> > > up to caller whether or not it's a problem for its purpose that "st" is
> > > empty. For that reason I would not move the block forward.
> >
> > Hrm... the comment is:
> >
> > /*
> > * optional (may be NULL): A full stat of each directory entry is just
> > * done if explicitly told to fs driver.
> > */
> >
> > I don't read that it is optional for the fs driver to populate "st"
> > if this was required by the caller.
>
> Well, if you find that ambiguous, I could add an additional sentence: "It
> might also be NULL if stat failed."
>
> > Also, looking at the next patch
> > I see that the condition for calling stat() is V9FS_REMAP_INODES and
> > the code explicitly requires "st" to be available in this case.
>
> Yes, but there is also an if (!st) { ... } in the subsequent patch already.
> So I don't see an issue here.
>
> Changing the order in readdir_many() would prevent the future applications
> of that function that I described. Or let's say, the order would need to be
> reverted again later on.
>
> Best regards,
> Christian Schoenebeck