qemu-devel

Re: [PATCH v6 4/5] 9pfs: T_readdir latency optimization


From: Greg Kurz
Subject: Re: [PATCH v6 4/5] 9pfs: T_readdir latency optimization
Date: Wed, 1 Jul 2020 17:12:40 +0200

On Wed, 01 Jul 2020 13:47:12 +0200
Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:

> On Wednesday, 1 July 2020 12:09:24 CEST Greg Kurz wrote:
> > No I'm talking about code that isn't changed by this series:
> > 
> >     if (initial_offset == 0) {
> >         v9fs_co_rewinddir(pdu, fidp);
> >     } else {
> >         v9fs_co_seekdir(pdu, fidp, initial_offset);
> >     }
> >     count = v9fs_do_readdir(pdu, fidp, max_count);
> > 
> > Leaving these outside the critical section seems to negate
> > your arguments... please clarify.
> 
> Yeah, I admit I have neglected this issue, as concurrent readdir requests
> with the same FID have always been a rather theoretical issue. But yes, you
> are right,

It's perfectly fine to miss things, that's what reviews are made for :)

> that should be addressed by 
> 
> 1. entirely removing the rewinddir/seekdir code here
> 
> 2. passing initial_offset to v9fs_do_readdir(), which in turn would
>    pass it to readdir_many()
> 
> 3. readdir_many() would handle rewinddir/seekdir exclusively in its critical
>    section.
> 

Sounds good. v7 ?
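
For concreteness, a minimal sketch of what steps 1-3 could look like. The
prototype below is an assumption for illustration (the offset parameter and
the call into do_readdir_many() follow the wording above, not the final v7
code), and locking is omitted here since its placement is discussed further
down:

static int coroutine_fn v9fs_co_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp,
                                             struct V9fsDirEnt **entries,
                                             off_t offset, int32_t maxsize,
                                             bool dostat)
{
    int err = 0;
    V9fsState *s = pdu->s;

    if (v9fs_request_cancelled(pdu)) {
        return -EINTR;
    }
    v9fs_co_run_in_worker({
        /* Step 3: rewind/seek happens here, in the same critical
         * section as the reads, instead of in the top half. */
        if (offset == 0) {
            s->ops->rewinddir(&s->ctx, &fidp->fs);
        } else {
            s->ops->seekdir(&s->ctx, &fidp->fs, offset);
        }
        err = do_readdir_many(pdu, fidp, entries, maxsize, dostat);
    });
    return err;
}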

> > There are indeed several issues with the current readdir:
> > 
> > 1) potential inconsistency on concurrent Treaddir requests
> > 
> > 2) excessive dispatching to worker threads
> > 
> > So we agreed that 1) should never happen with a regular linux
> > client (we could even dump the lock actually) but we want to
> > make it clear in the code anyway that actions on the directory
> > stream are serialized.
> 
> My point is you are trying to fix a merely theoretical issue in code that is
> conceptually, inherently wrong and dead-end code in the first place. Top half
> code

I'm not trying to fix anything. We already have locking in place to
deal with this theoretical issue and it interferes with your changes.
I don't care that much if a silly guest shoots itself in the foot
actually, so it'll be fine with me if you just dump the lock, as
long as it doesn't cause QEMU to hang or crash.

> should always be designed to be as thin as possible. The old readdir code 
> though is the complete opposite.
> 

It isn't readdir only; most requests span multiple v9fs_co_*() calls...

> > 2) basically means moving some logic from the frontend to the
> > backend, ie. called from v9fs_co_run_in_worker(). This implies
> > that a very long request (ie. one that would span many calls
> > to readdir()) cannot be interrupted by the client anymore, as
> > opposed to what we have now BTW.
> 

... most likely to allow clients to interrupt a long request with a
Tflush. Have you considered this aspect in your changes ?
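
One conceivable way to keep Tflush effective once entries are fetched in bulk
is to bound each worker round and re-check for cancellation in between. This
is a hedged sketch only: CHUNK_BYTES and marshal_dirents() are hypothetical
names, v9fs_free_dirents() is assumed to accept NULL, and the fetch helper is
assumed to continue from the current directory position:

#define CHUNK_BYTES (64 * 1024)   /* hypothetical per-round bound */

static int coroutine_fn v9fs_do_readdir_chunked(V9fsPDU *pdu,
                                                V9fsFidState *fidp,
                                                int32_t max_count)
{
    struct V9fsDirEnt *entries;
    int32_t written = 0;
    int len;

    while (written < max_count) {
        /* Re-check between worker rounds so a Tflush can still win. */
        if (v9fs_request_cancelled(pdu)) {
            return -EINTR;
        }
        entries = NULL;
        len = v9fs_co_readdir_many(pdu, fidp, &entries,
                                   MIN(CHUNK_BYTES, max_count - written),
                                   false);
        if (len <= 0) {
            v9fs_free_dirents(entries);
            return len < 0 ? len : written;   /* error or end of dir */
        }
        written += marshal_dirents(pdu, entries);  /* encode Rreaddir data */
        v9fs_free_dirents(entries);
    }
    return written;
}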

> 3) The complexity of the old readdir code is so much higher than that of the
>    new one that the probability of additional issues is also significantly
>    higher.
> 
> > I tend to think that addressing both issues in one "rush" is
> > just as "error prone".
> 
> No it's not. The optimized readdir implementation is quantifiably,
> significantly less complex than the old implementation (i.e. a significantly
> smaller number of branches and error paths, and a deterministic, clear
> separation of the threads' task assignments, which includes a much smaller
> number of thread hops). Less complexity and increased determinism
> consequently mean a reduced chance of misbehaviour. So that's not a
> subjective, but rather a quantifiable, proven statement.
> 
> You can't switch from the old (wrong) concept to the new concept without some
> minimum amount of changes, which yes, are not small, but I don't see any way
> to make the change set smaller without introducing new negative side effects.
> 
> I am now using words that I heard from your side before: please be realistic.
> Currently, man power on 9p code is extremely limited, to put it mildly. Yes,
> we could spend time on fixing this (theoretical) issue in the old readdir
> code. But what would it buy? Due to the limited man power we can only move
> forward with compromises; in this case you are fighting for code that's DOA.
> The only thing you achieve by still sticking to the old readdir code is that
> you prevent us from moving forward at all. Remember: I originally sent this
> patch almost 7 months ago.
> 
> > > Also: it does not make sense to move locking in this series from the fs
> > > driver back to the main I/O thread. Atomicity must be retained on the
> > > driver side, not in the top half.
> > 
> > Then the whole seekdir/rewinddir/telldir/readdir sequence should
> > be called with a single v9fs_co_run_in_worker() invocation, in
> > which case the locking could just be something like:
> > 
> > int coroutine_fn v9fs_co_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp,
> >                                       struct V9fsDirEnt **entries,
> >                                       int32_t maxsize, bool dostat)
> > {
> >     int err = 0;
> > 
> >     if (v9fs_request_cancelled(pdu)) {
> >         return -EINTR;
> >     }
> > 
> >     v9fs_readdir_lock(&fidp->fs.dir);
> > 
> >     v9fs_co_run_in_worker({
> >         err = do_readdir_many(pdu, fidp, entries, maxsize, dostat);
> >     });
> > 
> >     v9fs_readdir_unlock(&fidp->fs.dir);
> >     return err;
> > }
> 
> That's exactly what should be prevented. The lock must be on the driver
> thread side, not on the main thread side. The goal is to reduce the time
> slice of individual locks. If the lock is on the main thread, the time slice
> of a lock (even on huge directories with thousands of entries) is multiple
> factors larger than if the lock is on the driver thread side. So that's not
> just a few percent or so, it is huge.
> 

Sorry I don't get it... In a contention-less situation, which is the case we
really care about, qemu_co_mutex_lock() has no overhead.
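
For comparison, the driver-side placement argued for above would look roughly
like this. This sketch assumes v9fs_readdir_lock() wraps a plain mutex again,
since a CoMutex cannot be taken from a worker thread (which ties into the
CoMutex question below):

int coroutine_fn v9fs_co_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp,
                                      struct V9fsDirEnt **entries,
                                      int32_t maxsize, bool dostat)
{
    int err = 0;

    if (v9fs_request_cancelled(pdu)) {
        return -EINTR;
    }

    v9fs_co_run_in_worker({
        /* The lock is held only for the actual directory I/O, not for
         * the coroutine hops to and from the worker thread. */
        v9fs_readdir_lock(&fidp->fs.dir);
        err = do_readdir_many(pdu, fidp, entries, maxsize, dostat);
        v9fs_readdir_unlock(&fidp->fs.dir);
    });
    return err;
}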

> > This would technically leave the locking in the main I/O thread,
> > but it isn't conceptually different from locking at the beginning
> > of do_readdir_many() and unlocking before returning. My concern
> > is that I don't know how coroutine mutexes behave with multiple
> > worker threads...
> 
> Well, your Mutex -> CoMutex change was intended to fix a deadlock with the
> old readdir implementation. That deadlock could not happen with the new
> readdir implementation, which suggests that it probably makes sense to revert
> this change (i.e. CoMutex -> Mutex) for this new implementation.
> 

No we can't, because it is also used with 9p2000.u, which you said you
don't want to fix.
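
For the record, the deadlock a CoMutex avoids here is a scheduling one, not a
data race. A minimal illustration, not code from the series: two request
coroutines share the main I/O thread, so blocking that thread on a plain
mutex would also prevent the lock holder from ever resuming:

static CoMutex readdir_mutex;   /* qemu_co_mutex_init() at setup time */

static void coroutine_fn treaddir_handler(void *opaque)
{
    /* With a plain qemu_mutex_lock(), a second coroutine contending
     * here would block the whole main I/O thread, so the first one
     * could never resume to unlock: deadlock. qemu_co_mutex_lock()
     * merely yields this coroutine until the holder unlocks. */
    qemu_co_mutex_lock(&readdir_mutex);
    /* ... seekdir/readdir sequence spanning several coroutine yields,
     * as the 9p2000.u path still does ... */
    qemu_co_mutex_unlock(&readdir_mutex);
}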

> Best regards,
> Christian Schoenebeck
> 
> 

Cheers,

--
Greg


