Re: [PATCH v6 4/5] 9pfs: T_readdir latency optimization


From: Christian Schoenebeck
Subject: Re: [PATCH v6 4/5] 9pfs: T_readdir latency optimization
Date: Wed, 01 Jul 2020 13:47:12 +0200

On Wednesday, 1 July 2020 12:09:24 CEST Greg Kurz wrote:
> No I'm talking about code that isn't changed by this series:
> 
>     if (initial_offset == 0) {
>         v9fs_co_rewinddir(pdu, fidp);
>     } else {
>         v9fs_co_seekdir(pdu, fidp, initial_offset);
>     }
>     count = v9fs_do_readdir(pdu, fidp, max_count);
> 
> Leaving these outside the critical section seems to negate
> your arguments... please clarify.

Yeah, I admit I have neglected this issue, as concurrent readdir requests on 
the same FID have always been a rather theoretical issue. But yes, you are 
right, that should be addressed by:

1. entirely removing the rewinddir/seekdir code here

2. passing initial_offset to v9fs_do_readdir(), which in turn would
   pass it to readdir_many()

3. readdir_many() would handle rewinddir/seekdir exclusively in its critical
   section (rough sketch below).
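
Roughly like this (just a sketch to illustrate the idea, not what the series 
currently does; the extra 'offset' parameter and the do_readdir_many_locked() 
helper name are made up here for illustration):

    /* Driver (worker) thread side; called from v9fs_co_run_in_worker().
     * Sketch only: the 'offset' parameter and do_readdir_many_locked()
     * are hypothetical. */
    static int do_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp,
                               struct V9fsDirEnt **entries, off_t offset,
                               int32_t maxsize, bool dostat)
    {
        int err;

        v9fs_readdir_lock(&fidp->fs.dir);

        /* reposition the directory stream inside the same critical
         * section that collects the entries, instead of in the top half
         * (directly via POSIX calls here, or via the backend's
         * ->rewinddir/->seekdir ops) */
        if (offset == 0) {
            rewinddir(fidp->fs.dir.stream);
        } else {
            seekdir(fidp->fs.dir.stream, offset);
        }

        /* existing loop: readdir()/telldir() and optional stat per entry */
        err = do_readdir_many_locked(pdu, fidp, entries, maxsize, dostat);

        v9fs_readdir_unlock(&fidp->fs.dir);
        return err;
    }

The rewinddir/seekdir block in the top half quoted above would then simply 
disappear, and v9fs_do_readdir() would just pass initial_offset down.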

> There are indeed several issues with the current readdir:
> 
> 1) potential inconsistency on concurrent Treaddir requests
> 
> 2) excessive dispatching to worker threads
> 
> So we agreed that 1) should never happen with a regular linux
> client (we could even dump the lock actually) but we want to
> make it clear in the code anyway that actions on the directory
> stream are serialized.

My point is that you are trying to fix a merely theoretical issue in code 
that is conceptually, inherently wrong and dead-end code in the first place. 
Top half code should always be designed to be as thin as possible. The old 
readdir code, though, is the complete opposite.

> 2) basically means moving some logic from the frontend to the
> backend, ie. called from v9fs_co_run_in_worker(). This implies
> that a very long request (ie. one that would span on many calls
> to readdir()) cannot be interrupted by the client anymore, as
> opposed to what we have now BTW.

3) The complexity of the old readdir code is so much higher than that of the
   new one that the probability of additional issues is also significantly
   higher.

> I tend to think that addressing both issues in one "rush" is
> as much "error prone".

No, it's not. The optimized readdir implementation is quantifiably, 
significantly less complex than the old implementation (i.e. a significantly 
smaller number of branches and error paths, and a deterministic, clear 
separation of the threads' task assignments, which includes far fewer thread 
hops). Less complexity and increased determinism consequently mean a reduced 
chance of misbehaviour. So that's not a subjective statement, but a 
quantifiable, provable one.

You can't switch from the old (wrong) concept to the new concept without some 
minimum amount of change, which, yes, is not small, but I don't see any way 
to make the change set smaller without introducing new negative side effects.

I am now using words that I have heard from your side before: please be 
realistic. Currently, manpower on the 9p code is extremely limited, to put it 
mildly. Yes, we could spend time fixing this (theoretical) issue in the old 
readdir code. But what would it buy? Due to the limited manpower we can only 
move forward with compromises; in this case you are fighting for code that's 
DOA. The only thing you achieve by still sticking to the old readdir code is 
preventing us from moving forward at all. Remember: I originally sent this 
patch almost 7 months ago.

> > Also: it does not make sense to move locking on this series from fs driver
> > back to main I/O thread. Atomicity must retain on driver side, not on top
> > half.
> 
> Then the whole seekdir/rewinddir/telldir/readdir sequence should
> be called with a single v9fs_co_run_in_worker() invocation, in
> which case the locking could just be something like:
> 
> int coroutine_fn v9fs_co_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp,
>                                       struct V9fsDirEnt **entries,
>                                       int32_t maxsize, bool dostat)
> {
>     int err = 0;
> 
>     if (v9fs_request_cancelled(pdu)) {
>         return -EINTR;
>     }
> 
>     v9fs_readdir_lock(&fidp->fs.dir);
> 
>     v9fs_co_run_in_worker({
>         err = do_readdir_many(pdu, fidp, entries, maxsize, dostat);
>     });
> 
>     v9fs_readdir_unlock(&fidp->fs.dir);
>     return err;
> }

That's exactly what should be prevented. The lock must be on the driver 
thread side, not on the main thread side. The goal is to reduce the time 
slice of individual locks. If the lock is held on the main thread, the time a 
lock is held (even more so on huge directories with thousands of entries) is 
multiple factors larger than if the lock is held on the driver thread side. 
So that's not just a few percent or so, it is huge.
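
In other words, compared to your snippet above, v9fs_co_readdir_many() itself 
would not lock at all; the lock would be taken and released by 
do_readdir_many() on the driver thread (again, just a sketch):

    int coroutine_fn v9fs_co_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp,
                                          struct V9fsDirEnt **entries,
                                          int32_t maxsize, bool dostat)
    {
        int err = 0;

        if (v9fs_request_cancelled(pdu)) {
            return -EINTR;
        }

        /* no locking here: do_readdir_many() takes and drops the lock
         * itself, on the worker thread, so the lock is only held for the
         * duration of the actual readdir work, not across the coroutine
         * hops and request dispatch */
        v9fs_co_run_in_worker({
            err = do_readdir_many(pdu, fidp, entries, maxsize, dostat);
        });

        return err;
    }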

> This would technically leave the locking in the main I/O thread,
> but it isn't conceptually different from locking at the beginning
> of do_readdir_lock() and unlocking before returning. My concern
> is that I don't know how coroutine mutexes behave with multiple
> worker threads...

Well, your Mutex -> CoMutex change was intended to fix a deadlock with the old 
readdir implementation. That deadlock could not happen with the new readdir 
implementation, which suggests that it probably makes sense to revert this 
change (i.e. CoMutex -> Mutex) for this new implementation.
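
For illustration, assuming V9fsDir in hw/9pfs/9p.h still wraps the lock the 
way it does today, that revert would roughly look like this (a plain 
QemuMutex is sufficient because with readdir_many() the lock is only ever 
taken on a worker thread and never held across a coroutine yield):

    /* hw/9pfs/9p.h (sketch of the revert) */
    typedef struct V9fsDir {
        DIR *stream;
        QemuMutex readdir_mutex;     /* was: CoMutex readdir_mutex; */
    } V9fsDir;

    static inline void v9fs_readdir_lock(V9fsDir *dir)
    {
        qemu_mutex_lock(&dir->readdir_mutex);    /* was: qemu_co_mutex_lock() */
    }

    static inline void v9fs_readdir_unlock(V9fsDir *dir)
    {
        qemu_mutex_unlock(&dir->readdir_mutex);  /* was: qemu_co_mutex_unlock() */
    }

    static inline void v9fs_readdir_init(V9fsDir *dir)
    {
        qemu_mutex_init(&dir->readdir_mutex);    /* was: qemu_co_mutex_init() */
    }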

Best regards,
Christian Schoenebeck




