[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784
From: |
Peter Lieven |
Subject: |
Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784 |
Date: |
Wed, 12 Jan 2022 12:55:39 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 |
Am 12.01.22 um 10:59 schrieb Ilya Dryomov:
> On Mon, Jan 10, 2022 at 12:43 PM Peter Lieven <pl@kamp.de> wrote:
>> librbd had a bug until early 2022 that affected all versions of ceph that
>> supported fast-diff. This bug results in reporting of incorrect offsets
>> if the offset parameter to rbd_diff_iterate2 is not object aligned.
>> Work around this bug by rounding down the offset to object boundaries.
>>
>> Fixes: https://tracker.ceph.com/issues/53784
> I don't think the Fixes tag is appropriate here. Linking librbd
> ticket is fine but this patch doesn't really fix anything.
Okay, I will change that to See:
>
>> Cc: qemu-stable@nongnu.org
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>> block/rbd.c | 17 ++++++++++++++++-
>> 1 file changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/rbd.c b/block/rbd.c
>> index 5e9dc91d81..260cb9f4b4 100644
>> --- a/block/rbd.c
>> +++ b/block/rbd.c
>> @@ -1333,6 +1333,7 @@ static int coroutine_fn
>> qemu_rbd_co_block_status(BlockDriverState *bs,
>> int status, r;
>> RBDDiffIterateReq req = { .offs = offset };
>> uint64_t features, flags;
>> + int64_t head;
>>
>> assert(offset + bytes <= s->image_size);
>>
>> @@ -1360,6 +1361,19 @@ static int coroutine_fn
>> qemu_rbd_co_block_status(BlockDriverState *bs,
>> return status;
>> }
>>
>> + /*
>> + * librbd had a bug until early 2022 that affected all versions of ceph
>> that
>> + * supported fast-diff. This bug results in reporting of incorrect
>> offsets
>> + * if the offset parameter to rbd_diff_iterate2 is not object aligned.
>> + * Work around this bug by rounding down the offset to object
>> boundaries.
>> + *
>> + * See: https://tracker.ceph.com/issues/53784
>> + */
>> + head = offset & (s->object_size - 1);
>> + offset -= head;
>> + req.offs -= head;
>> + bytes += head;
> So it looks like the intention is to have more or less a permanent
> workaround since all librbd versions are affected, right? For that,
> I think we would need to also reject custom striping patterns and
> clones. For the above to be reliable, the image has to be standalone
> and have a default striping pattern (stripe_unit == object_size &&
> stripe_count == 1). Otherwise, behave as if fast-diff is disabled or
> invalid.
Do you have a fealing how many users use a different striping pattern than
default?
What about EC backed pools?
Do you have another idea how we can detect if the librbd version is broken?
>
>> +
> Nit: I'd replace { .offs = offset } initialization at the top with {}
> and assign to req.offs here, right before calling rbd_diff_iterate2().
>
>> r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true,
>> qemu_rbd_diff_iterate_cb, &req);
>> if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) {
>> @@ -1379,7 +1393,8 @@ static int coroutine_fn
>> qemu_rbd_co_block_status(BlockDriverState *bs,
>> status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID;
>> }
>>
>> - *pnum = req.bytes;
>> + assert(req.bytes > head);
> I'd expand the workaround comment with an explanation of why it's OK
> to round down the offset -- because rbd_diff_iterate2() is called with
> whole_object=true. If that wasn't the case, on top of inconsistent
> results for different offsets within an object, this assert could be
> triggered.
Sure, you are right. I had this in mind. This also does not change complexity
since we stay with the offset in the same object. I will mention both.
Peter