qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 for-6.0? 0/3] qcow2: fix parallel rewrite and discard (rw-


From: Max Reitz
Subject: Re: [PATCH v4 for-6.0? 0/3] qcow2: fix parallel rewrite and discard (rw-lock)
Date: Tue, 30 Mar 2021 14:51:06 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.0

On 30.03.21 12:51, Vladimir Sementsov-Ogievskiy wrote:
30.03.2021 12:49, Max Reitz wrote:
On 25.03.21 20:12, Vladimir Sementsov-Ogievskiy wrote:
ping. Do we want it for 6.0?

I’d rather wait.  I think the conclusion was that guests shouldn’t hit this because they serialize discards?

I think, that we never had bugs, so we of course can wait.


There’s also something Kevin wrote on IRC a couple of weeks ago, for which I had hoped he’d sent an email but I don’t think he did, so I’ll try to remember and paraphrase as well as I can...

He basically asked whether it wouldn’t be conceptually simpler to take a reference to some cluster in get_cluster_offset() and later release it with a to-be-added put_cluster_offset().

He also noted that reading is problematic, too, because if you read a discarded and reused cluster, this might result in an information leak (some guest application might be able to read data it isn’t allowed to read); that’s why making get_cluster_offset() the point of locking clusters against discarding would be better.

Yes, I thought about read too, (RFCed in cover letter of [PATCH v5 0/6] qcow2: fix parallel rewrite and discard (lockless))


This would probably work with both of your solutions.  For the in-memory solutions, you’d take a refcount to an actual cluster; in the CoRwLock solution, you’d take that lock.

What do you think?


Hmm. What do you mean? Just rename my qcow2_inflight_writes_inc() and qcow2_inflight_writes_dec() to get_cluster_offset()/put_cluster_offset(), to make it more native to use for read operations as well?

Hm.  Our discussion wasn’t so detailed.

I interpreted it to mean all qcow2 functions that find an offset to a qcow2 cluster, namely qcow2_get_host_offset(), qcow2_alloc_host_offset(), and qcow2_alloc_compressed_cluster_offset().

When those functions return an offset (in)to some cluster, that cluster (or the image as a whole) should be locked against discards. Every offset received this way would require an accompanying qcow2_put_host_offset().

Or to update any kind of "getting cluster offset" in the whole qcow2 driver to take a kind of "dynamic reference count" by get_cluster_offset() and then call corresponding put() somewhere? In this case I'm afraid it's a lot more work..

Hm, really? I would have assumed we need to do some locking in all functions that get a cluster offset this way, so it should be less work to take the lock in the functions they invoke to get the offset.

It would be also the problem that a lot of paths in qcow2 are not in coroutine and don't even take s->lock when they actually should.

I’m not sure what you mean here, because all functions that invoke any of the three functions I listed above are coroutine_fns (or, well, I didn’t look it up, but they all have *_co_* in their name).

This will also mean that we do same job as normal qcow2 refcounts already do: no sense in keeping additional "dynamic refcount" for L2 table cluster while reading it, as we already have non-zero qcow2 normal refcount for it..

I’m afraid I don’t understand how normal refcounts relate to this. For example, qcow2_get_host_offset() doesn’t touch refcounts at all.

Max




reply via email to

[Prev in Thread] Current Thread [Next in Thread]