[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v8 9/9] qapi: query-blockstat: add driver specif
From: |
Anton Nefedov |
Subject: |
Re: [Qemu-devel] [PATCH v8 9/9] qapi: query-blockstat: add driver specific file-posix stats |
Date: |
Wed, 21 Aug 2019 12:22:05 +0000 |
On 21/8/2019 2:21 PM, Max Reitz wrote:
> On 21.08.19 13:00, Anton Nefedov wrote:
>> On 12/8/2019 10:04 PM, Max Reitz wrote:
>>> On 16.05.19 16:33, Anton Nefedov wrote:
>>>> A block driver can provide a callback to report driver-specific
>>>> statistics.
>>>>
>>>> file-posix driver now reports discard statistics
>>>>
>>>> Signed-off-by: Anton Nefedov <address@hidden>
>>>> Reviewed-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>>>> Acked-by: Markus Armbruster <address@hidden>
>>>> ---
>>>> qapi/block-core.json | 38 ++++++++++++++++++++++++++++++++++++++
>>>> include/block/block.h | 1 +
>>>> include/block/block_int.h | 1 +
>>>> block.c | 9 +++++++++
>>>> block/file-posix.c | 38 +++++++++++++++++++++++++++++++++++---
>>>> block/qapi.c | 5 +++++
>>>> 6 files changed, 89 insertions(+), 3 deletions(-)
>>>
>>>
>>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>>> index 55194f84ce..368e09ae37 100644
>>>> --- a/qapi/block-core.json
>>>> +++ b/qapi/block-core.json
>>>> @@ -956,6 +956,41 @@
>>>> '*wr_latency_histogram': 'BlockLatencyHistogramInfo',
>>>> '*flush_latency_histogram': 'BlockLatencyHistogramInfo' } }
>>>>
>>>> +##
>>>> +# @BlockStatsSpecificFile:
>>>> +#
>>>> +# File driver statistics
>>>> +#
>>>> +# @discard-nb-ok: The number of successful discard operations performed by
>>>> +# the driver.
>>>> +#
>>>> +# @discard-nb-failed: The number of failed discard operations performed by
>>>> +# the driver.
>>>> +#
>>>> +# @discard-bytes-ok: The number of bytes discarded by the driver.
>>>> +#
>>>> +# Since: 4.1
>>>> +##
>>>> +{ 'struct': 'BlockStatsSpecificFile',
>>>> + 'data': {
>>>> + 'discard-nb-ok': 'uint64',
>>>> + 'discard-nb-failed': 'uint64',
>>>> + 'discard-bytes-ok': 'uint64' } }
>>>> +
>>>> +##
>>>> +# @BlockStatsSpecific:
>>>> +#
>>>> +# Block driver specific statistics
>>>> +#
>>>> +# Since: 4.1
>>>> +##
>>>> +{ 'union': 'BlockStatsSpecific',
>>>> + 'base': { 'driver': 'BlockdevDriver' },
>>>> + 'discriminator': 'driver',
>>>> + 'data': {
>>>> + 'file': 'BlockStatsSpecificFile',
>>>> + 'host_device': 'BlockStatsSpecificFile' } }
>>>
>>> I would like to use these chance to complain that I find this awkward.
>>> My problem is that I don’t know how any management application is
>>> supposed to reasonably consume this. It feels weird to potentially have
>>> to recognize the result for every block driver.
>>>
>>> I would now like to note that I’m clearly not in a position to block
>>> this at this point, because I’ve had a year to do so, I didn’t, so it
>>> would be unfair to do it now.
>>>
>>> (Still, I feel like if I have a concern, I should raise it, even if it’s
>>> too late.)
>>>
>>> I know Markus has proposed this, but I don’t understand why. He set
>>> ImageInfoSpecific as a precedence, but that has a different reasoning
>>> behind it. The point for that is that it simply doesn’t work any other
>>> way, because it is clearly format-specific information that cannot be
>>> shared between drivers. Anything that can be shared is put into
>>> ImageInfo (like the cluster size).
>>>
>>> We have the same constellation here, BlockStats contains common stuff,
>>> and BlockStatsSpecific would contain driver-specific stuff. But to me,
>>> BlockStatsSpecificFile doesn’t look very special. It looks like it just
>>> duplicates fields that already exist in BlockDeviceStats.
>>>
>>>
>>> (Furthermore, most of ImageInfoSpecific is actually not useful to
>>> management software, but only as an information for humans (and having
>>> such a structure for that is perfectly fine). But these stats don’t
>>> really look like something for immediate human consumption.)
>>>
>>>
>>> So I wonder why you don’t just put this information into
>>> BlockDeviceStats. From what I can tell looking at
>>> bdrv_query_bds_stats() and qmp_query_blockstats(), the @stats field is
>>> currently completely 0 if @query-nodes is true.
>>>
>>> (Furthermore, I wonder whether it would make sense to re-add
>>> BlockAcctStats to each BDS and then let the generic block code do the
>>> accounting on it. I moved it to the BB in 7f0e9da6f13 because we didn’t
>>> care about node-level information at the time, but maybe it’s time to
>>> reconsider.)
>>>
>>>
>>> Anyway, as I’ve said, I fully understand that complaining about a design
>>> decision is just unfair at this point, so this is not a veto.
>>>
>>
>> hi!
>>
>> Having both "unmap" and "discard" stats in the same list feels weird.
>> The intention is that unmap belongs to the device level, and discard is
>> from the driver level.
>
> Sorry, what I meant wasn’t adding a separate “discard” group, but just
> filling in the existing “unmap” fields. As far as I understand, if we
> had BlockAcctStats for each BDS, the file node’s unmap stats would be
> the same as its discard stats, wouldn’t it?
>
So, you mean count it all on BDS level _instead_ of SCSI/IDE level?
Now it's:
"device": "drive-scsi0-0-0-0",
"node-name": "#block151",
"stats": {
"unmap_operations": 0, <--- filled by SCSI
[..]
"parent": {
"node-name": "#block056",
"stats": {
"unmap_operations": 0, <--- not filled
"driver-stats": { <--- filled by file-posix driver
"type": "file",
"data": {
"discard_bytes_ok": 0,
"discard_nb_failed": 0,
"discard_nb_ok": 0
}
}
},
Every level may drop some requests (i.e. qcow2 won't pass requests
smaller than a cluster size) i.e.
BlockBackend stats >= BDS format driver stats >= BDS protocol driver
stats
and the difference between them (mostly between the top and the bottom
ones) is interesting here too; good to know whether it's a guest who
doesn't send requests, or QEMU that limits them.
>> Now we have a separate structure named "driver-
>> specific". Could also be called "driver-stats".
>>
>> We could make this structure non-optional, present for all driver
>> types, as indeed there is nothing special about discard stats. But then
>> we need some way to distinguish
>> - discard_nb_ok == 0 as no request reached the driver level
>> - discard_nb_ok == 0 as the driver does not support the accounting
>
> You can simply make the fields optional. (Then the first case is
> “present, but 0”, and the second is “not present”.)
>
>> Yes, putting the accounting in the generic code would help, but do we
>> really want to burden it with accounting too? Tracking that each and
>> every case is covered with all those block_acct_done() invalid() and
>> failed() can really be a pain.
>
> That’s indeed a problem, yes. :-)
>
>> And what accounting should be there? All the operations? Measuring
>> discards at both device and BDS level is useful since discards are
>> optional. Double-measuring reads&writes is probably not so useful (RMW
>> accounting? Read stats for the backing images?)
>
> Yes, if we put BlockAcctStats at the node level, we should track all
> operations, I suppose. That would require adding accounting code in
> many places, so it wouldn’t be easy, correct.
>
> I think it would be the better solution, but you’re right in that it’s
> probably not worth it.
>
> But I do think it would be good if we could get away from a
> driver-specific structure (unless we really need it; and I don’t think
> we do if we just make the stats fields optional).
>