qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v8 9/9] qapi: query-blockstat: add driver specif


From: Anton Nefedov
Subject: Re: [Qemu-devel] [PATCH v8 9/9] qapi: query-blockstat: add driver specific file-posix stats
Date: Wed, 21 Aug 2019 12:22:05 +0000

On 21/8/2019 2:21 PM, Max Reitz wrote:
> On 21.08.19 13:00, Anton Nefedov wrote:
>> On 12/8/2019 10:04 PM, Max Reitz wrote:
>>> On 16.05.19 16:33, Anton Nefedov wrote:
>>>> A block driver can provide a callback to report driver-specific
>>>> statistics.
>>>>
>>>> file-posix driver now reports discard statistics
>>>>
>>>> Signed-off-by: Anton Nefedov <address@hidden>
>>>> Reviewed-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>>>> Acked-by: Markus Armbruster <address@hidden>
>>>> ---
>>>>    qapi/block-core.json      | 38 ++++++++++++++++++++++++++++++++++++++
>>>>    include/block/block.h     |  1 +
>>>>    include/block/block_int.h |  1 +
>>>>    block.c                   |  9 +++++++++
>>>>    block/file-posix.c        | 38 +++++++++++++++++++++++++++++++++++---
>>>>    block/qapi.c              |  5 +++++
>>>>    6 files changed, 89 insertions(+), 3 deletions(-)
>>>
>>>
>>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>>> index 55194f84ce..368e09ae37 100644
>>>> --- a/qapi/block-core.json
>>>> +++ b/qapi/block-core.json
>>>> @@ -956,6 +956,41 @@
>>>>               '*wr_latency_histogram': 'BlockLatencyHistogramInfo',
>>>>               '*flush_latency_histogram': 'BlockLatencyHistogramInfo' } }
>>>>    
>>>> +##
>>>> +# @BlockStatsSpecificFile:
>>>> +#
>>>> +# File driver statistics
>>>> +#
>>>> +# @discard-nb-ok: The number of successful discard operations performed by
>>>> +#                 the driver.
>>>> +#
>>>> +# @discard-nb-failed: The number of failed discard operations performed by
>>>> +#                     the driver.
>>>> +#
>>>> +# @discard-bytes-ok: The number of bytes discarded by the driver.
>>>> +#
>>>> +# Since: 4.1
>>>> +##
>>>> +{ 'struct': 'BlockStatsSpecificFile',
>>>> +  'data': {
>>>> +      'discard-nb-ok': 'uint64',
>>>> +      'discard-nb-failed': 'uint64',
>>>> +      'discard-bytes-ok': 'uint64' } }
>>>> +
>>>> +##
>>>> +# @BlockStatsSpecific:
>>>> +#
>>>> +# Block driver specific statistics
>>>> +#
>>>> +# Since: 4.1
>>>> +##
>>>> +{ 'union': 'BlockStatsSpecific',
>>>> +  'base': { 'driver': 'BlockdevDriver' },
>>>> +  'discriminator': 'driver',
>>>> +  'data': {
>>>> +      'file': 'BlockStatsSpecificFile',
>>>> +      'host_device': 'BlockStatsSpecificFile' } }
>>>
>>> I would like to use these chance to complain that I find this awkward.
>>> My problem is that I don’t know how any management application is
>>> supposed to reasonably consume this.  It feels weird to potentially have
>>> to recognize the result for every block driver.
>>>
>>> I would now like to note that I’m clearly not in a position to block
>>> this at this point, because I’ve had a year to do so, I didn’t, so it
>>> would be unfair to do it now.
>>>
>>> (Still, I feel like if I have a concern, I should raise it, even if it’s
>>> too late.)
>>>
>>> I know Markus has proposed this, but I don’t understand why.  He set
>>> ImageInfoSpecific as a precedence, but that has a different reasoning
>>> behind it.  The point for that is that it simply doesn’t work any other
>>> way, because it is clearly format-specific information that cannot be
>>> shared between drivers.  Anything that can be shared is put into
>>> ImageInfo (like the cluster size).
>>>
>>> We have the same constellation here, BlockStats contains common stuff,
>>> and BlockStatsSpecific would contain driver-specific stuff.  But to me,
>>> BlockStatsSpecificFile doesn’t look very special.  It looks like it just
>>> duplicates fields that already exist in BlockDeviceStats.
>>>
>>>
>>> (Furthermore, most of ImageInfoSpecific is actually not useful to
>>> management software, but only as an information for humans (and having
>>> such a structure for that is perfectly fine).  But these stats don’t
>>> really look like something for immediate human consumption.)
>>>
>>>
>>> So I wonder why you don’t just put this information into
>>> BlockDeviceStats.  From what I can tell looking at
>>> bdrv_query_bds_stats() and qmp_query_blockstats(), the @stats field is
>>> currently completely 0 if @query-nodes is true.
>>>
>>> (Furthermore, I wonder whether it would make sense to re-add
>>> BlockAcctStats to each BDS and then let the generic block code do the
>>> accounting on it.  I moved it to the BB in 7f0e9da6f13 because we didn’t
>>> care about node-level information at the time, but maybe it’s time to
>>> reconsider.)
>>>
>>>
>>> Anyway, as I’ve said, I fully understand that complaining about a design
>>> decision is just unfair at this point, so this is not a veto.
>>>
>>
>> hi!
>>
>> Having both "unmap" and "discard" stats in the same list feels weird.
>> The intention is that unmap belongs to the device level, and discard is
>> from the driver level.
> 
> Sorry, what I meant wasn’t adding a separate “discard” group, but just
> filling in the existing “unmap” fields.  As far as I understand, if we
> had BlockAcctStats for each BDS, the file node’s unmap stats would be
> the same as its discard stats, wouldn’t it?
> 

So, you mean count it all on BDS level _instead_ of SCSI/IDE level?

Now it's:

       "device": "drive-scsi0-0-0-0",
       "node-name": "#block151",
       "stats": {
         "unmap_operations": 0, <--- filled by SCSI
       [..]

       "parent": {
         "node-name": "#block056",
         "stats": {
           "unmap_operations": 0, <--- not filled

         "driver-stats": { <--- filled by file-posix driver
           "type": "file",
           "data": {
             "discard_bytes_ok": 0,
             "discard_nb_failed": 0,
             "discard_nb_ok": 0
           }
         }
       },


Every level may drop some requests (i.e. qcow2 won't pass requests
smaller than a cluster size) i.e.
BlockBackend stats >= BDS format driver stats >= BDS protocol driver
stats

and the difference between them (mostly between the top and the bottom 
ones) is interesting here too; good to know whether it's a guest who
doesn't send requests, or QEMU that limits them.

>> Now we have a separate structure named "driver-
>> specific". Could also be called "driver-stats".
>>
>> We could make this structure non-optional, present for all driver
>> types, as indeed there is nothing special about discard stats. But then
>> we need some way to distinguish
>>    - discard_nb_ok == 0 as no request reached the driver level
>>    - discard_nb_ok == 0 as the driver does not support the accounting
> 
> You can simply make the fields optional.  (Then the first case is
> “present, but 0”, and the second is “not present”.)
> 
>> Yes, putting the accounting in the generic code would help, but do we
>> really want to burden it with accounting too? Tracking that each and
>> every case is covered with all those block_acct_done() invalid() and
>> failed() can really be a pain.
> 
> That’s indeed a problem, yes. :-)
> 
>> And what accounting should be there? All the operations? Measuring
>> discards at both device and BDS level is useful since discards are
>> optional. Double-measuring reads&writes is probably not so useful (RMW
>> accounting? Read stats for the backing images?)
> 
> Yes, if we put BlockAcctStats at the node level, we should track all
> operations, I suppose.  That would require adding accounting code in
> many places, so it wouldn’t be easy, correct.
> 
> I think it would be the better solution, but you’re right in that it’s
> probably not worth it.
> 
> But I do think it would be good if we could get away from a
> driver-specific structure (unless we really need it; and I don’t think
> we do if we just make the stats fields optional).
> 


reply via email to

[Prev in Thread] Current Thread [Next in Thread]