[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v3 10/10] qapi: scripts: add a generator for qapi's examples
From: |
Markus Armbruster |
Subject: |
Re: [PATCH v3 10/10] qapi: scripts: add a generator for qapi's examples |
Date: |
Thu, 19 Oct 2023 08:26:58 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) |
Victor Toso <victortoso@redhat.com> writes:
> Hi Markus,
>
> Sorry the delay on reply here.
>
> On Thu, Sep 21, 2023 at 01:06:01PM +0200, Markus Armbruster wrote:
>> Victor Toso <victortoso@redhat.com> writes:
>>
>> > This generator has two goals:
>> > 1. Mechanical validation of QAPI examples
>> > 2. Generate the examples in a JSON format to be consumed for extra
>> > validation.
>> >
>> > The generator iterates over every Example section, parsing both server
>> > and client messages. The generator prints any inconsistency found, for
>> > example:
>> >
>> > | Error: Extra data: line 1 column 39 (char 38)
>> > | Location: cancel-vcpu-dirty-limit at qapi/migration.json:2017
>> > | Data: {"execute": "cancel-vcpu-dirty-limit"},
>> > | "arguments": { "cpu-index": 1 } }
>>
>> What language does it parse?
>
> It parsers the Example section. Fetches client and server JSON
> messages. Validate them.
>
>> Can you give a grammar?
>
> Not sure if needed? I just fetch the json from the example
> section and validate it.
The C++ parser just fetches the code from the text file and compiles it
:)
There are way too many parsers out there that show signs of "I'm not
parsing / I don't to nail down the language" delusions.
Let's start with an informal description of the language.
An example is a sequence of QMP input, QMP output, and explanatory text.
QMP input / output is a sequence of lines where the first one starts
with "<- " / "-> ", and the remaining ones start with " ".
Lines that are not QMP input / output are explanatory text.
>> Should the parser be integrated into the doc parser, i.e. class QAPIDoc?
>
> Yes, that would be better. I'll try that in the next iteration.
>
>> > The generator will output other JSON file with all the examples in the
>>
>> Another JSON file, or other JSON files?
>
> JSON files. QEMU'S QAPI qapi/migration.json will generate a
> migration.json with the format mentioned bellow.
>>
>> > QAPI module that they came from. This can be used to validate the
>> > introspection between QAPI/QMP to language bindings, for example:
>> >
>> > | { "examples": [
>> > | {
>> > | "id": "ksuxwzfayw",
>> > | "client": [
>> > | {
>> > | "sequence-order": 1
>>
>> Missing comma
>>
>> > | "message-type": "command",
>> > | "message":
>> > | { "arguments":
>> > | { "device": "scratch", "size": 1073741824 },
>> > | "execute": "block_resize"
>> > | },
>> > | } ],
>> > | "server": [
>> > | {
>> > | "sequence-order": 2
>>
>> Another one.
>>
>> > | "message-type": "return",
>> > | "message": { "return": {} },
>>
>> Extra comma.
>>
>> > | } ]
>> > | }
>> > | ] }
>>
>> Indentation is kind of weird.
>
> The missing comma was likely my fault on copy & paste. The
> indentation should be what json.dumps() provides, but I think I
> changed somehow in the git log too.
>
>> The JSON's Valid structure and semantics are not documented.
>> We've developed a way specify that, you might've heard of it,
>> it's called "QAPI schema" ;-P
>>
>> Kidding aside, we've done this before. E.g. docs/interop/firmware.json
>> specifies firmware descriptors. We have some in pc-bios/descriptors/.
>
> Oh, that's neat. You meant to use QAPI schema as a way to
> document output from examples.py?
Exactly!
>> > Note that the order matters, as read by the Example section and
>> > translated into "sequence-order". A language binding project can then
>> > consume this files to Marshal and Unmarshal, comparing if the results
>> > are what is to be expected.
>> >
>> > RFC discussion:
>> > https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg04641.html
>> >
>> > Signed-off-by: Victor Toso <victortoso@redhat.com>
>> > ---
>> > scripts/qapi/dumpexamples.py | 208 +++++++++++++++++++++++++++++++++++
>> > scripts/qapi/main.py | 3 +-
>> > 2 files changed, 210 insertions(+), 1 deletion(-)
>> > create mode 100644 scripts/qapi/dumpexamples.py
>> >
>> > diff --git a/scripts/qapi/dumpexamples.py b/scripts/qapi/dumpexamples.py
>> > new file mode 100644
>> > index 0000000000..55d9f13ab7
>> > --- /dev/null
>> > +++ b/scripts/qapi/dumpexamples.py
>>
>> Let's name this examples.py. It already does a bit more than
>> dump (namely validate), and any future code dealing with
>> examples will likely go into this file.
>
> Ack.
>
>> > @@ -0,0 +1,208 @@
>> > +"""
>> > +Dump examples for Developers
>> > +"""
>> > +# Copyright (c) 2023 Red Hat Inc.
>> > +#
>> > +# Authors:
>> > +# Victor Toso <victortoso@redhat.com>
>> > +#
>> > +# This work is licensed under the terms of the GNU GPL, version 2.
>>
>> We should've insisted on v2+ for the QAPI generator back when we had a
>> chance. *Sigh*
>
> :)
>
>> > +# See the COPYING file in the top-level directory.
>> > +
>> > +# Just for type hint on self
>> > +from __future__ import annotations
>> > +
>> > +import os
>> > +import json
>> > +import random
>> > +import string
>> > +
>> > +from typing import Dict, List, Optional
>> > +
>> > +from .schema import (
>> > + QAPISchema,
>> > + QAPISchemaType,
>> > + QAPISchemaVisitor,
>> > + QAPISchemaEnumMember,
>> > + QAPISchemaFeature,
>> > + QAPISchemaIfCond,
>> > + QAPISchemaObjectType,
>> > + QAPISchemaObjectTypeMember,
>> > + QAPISchemaVariants,
>>
>> pylint warns
>>
>> scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaEnumMember
>> imported from schema (unused-import)
>> scripts/qapi/dumpexamples.py:22:0: W0611: Unused
>> QAPISchemaObjectTypeMember imported from schema (unused-import)
>> scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaVariants
>> imported from schema (unused-import)
>
> Yes, now I learned a few tricks around python linters and
> formatting. Next iteration will be better.
>
>> > +)
>> > +from .source import QAPISourceInfo
>> > +
>> > +
>> > +def gen_examples(schema: QAPISchema,
>> > + output_dir: str,
>> > + prefix: str) -> None:
>> > + vis = QAPISchemaGenExamplesVisitor(prefix)
>> > + schema.visit(vis)
>> > + vis.write(output_dir)
>>
>> The other backends have this at the end of the file. Either order
>> works, but consistency is nice.
>
> Ok.
>
>> > +
>> > +
>> > +def get_id(random, size: int) -> str:
>>
>> pylint warns
>>
>> scripts/qapi/dumpexamples.py:44:11: W0621: Redefining name
>> 'random' from outer scope (line 17) (redefined-outer-name)
>>
>> > + letters = string.ascii_lowercase
>> > + return ''.join(random.choice(letters) for i in range(size))
>> > +
>> > +
>> > +def next_object(text, start, end, context) -> (Dict, bool):
>> > + # Start of json object
>> > + start = text.find("{", start)
>> > + end = text.rfind("}", start, end+1)
>>
>> Single quotes, please, for consistency with quote use elsewhere.
>>
>> Rules of thumb:
>>
>> * Double quotes for english text (e.g. error messages), so we don't have
>> to escape apostrophes.
>>
>> * Double quotes when they reduce the escaping
>>
>> * Else single quotes
>>
>> More elsewhere, not flagged.
>>
>> > +
>> > + # try catch, pretty print issues
>>
>> Is this comment useful?
>
> I'll remove.
>
>> > + try:
>> > + ret = json.loads(text[start:end+1])
>> > + except Exception as e:
>>
>> pylint warns
>>
>> scripts/qapi/dumpexamples.py:57:11: W0703: Catching too general
>> exception Exception (broad-except)
>>
>> Catch JSONDecodeError?
>
> Ack.
>
>> > + print("Error: {}\nLocation: {}\nData: {}\n".format(
>> > + str(e), context, text[start:end+1]))
>>
>> Errors need to go to stderr.
>>
>> Have you considered using QAPIError to report these errors?
>
> Did not cross my mind, no. I'll take a look.
>
>> > + return {}, True
>> > + else:
>> > + return ret, False
>> > +
>> > +
>> > +def parse_text_to_dicts(text: str, context: str) -> (List[Dict], bool):
>>
>> Before I review the parser, I'd like to know the (intended) language
>> being parsed.
>
> Ok, I'll add documentation about it in the next iteration.
>
>> > + examples, clients, servers = [], [], []
>> > + failed = False
>> > +
>> > + count = 1
>> > + c, s = text.find("->"), text.find("<-")
>> > + while c != -1 or s != -1:
>> > + if c == -1 or (s != -1 and s < c):
>> > + start, target = s, servers
>> > + else:
>> > + start, target = c, clients
>> > +
>> > + # Find the client and server, if any
>> > + if c != -1:
>> > + c = text.find("->", start + 1)
>> > + if s != -1:
>> > + s = text.find("<-", start + 1)
>> > +
>> > + # Find the limit of current's object.
>> > + # We first look for the next message, either client or server. If
>> > none
>> > + # is avaible, we set the end of the text as limit.
>> > + if c == -1 and s != -1:
>> > + end = s
>> > + elif c != -1 and s == -1:
>> > + end = c
>> > + elif c != -1 and s != -1:
>> > + end = (c < s) and c or s
>>
>> pylint advises
>>
>> scripts/qapi/dumpexamples.py:91:12: R1706: Consider using ternary (c if
>> c < s else s) (consider-using-ternary)
>>
>> > + else:
>> > + end = len(text) - 1
>> > +
>> > + message, error = next_object(text, start, end, context)
>> > + if error:
>> > + failed = True
>> > +
>> > + if len(message) > 0:
>> > + message_type = "return"
>> > + if "execute" in message:
>> > + message_type = "command"
>> > + elif "event" in message:
>> > + message_type = "event"
>> > +
>> > + target.append({
>> > + "sequence-order": count,
>> > + "message-type": message_type,
>> > + "message": message
>> > + })
>> > + count += 1
>> > +
>> > + examples.append({"client": clients, "server": servers})
>> > + return examples, failed
>> > +
>> > +
>> > +def parse_examples_of(self: QAPISchemaGenExamplesVisitor,
>>
>> Uh, shouldn't this be a method of QAPISchemaGenExamplesVisitor?
>
> Indeed.
>
>> > + name: str):
>> > +
>> > + assert(name in self.schema._entity_dict)
>>
>> Makes both pycodestyle and pylint unhappy. Better:
>>
>> assert name in self.schema._entity_dict
>>
>> pylint warns
>>
>> scripts/qapi/dumpexamples.py:120:19: W0212: Access to a protected member
>> _entity_dict of a client class (protected-access)
>>
>> here and two more times below.
>
> Thanks, I'll have all of those fixed.
>
>> > + obj = self.schema._entity_dict[name]
>> > +
>> > + if not obj.info.pragma.doc_required:
>> > + return
>> > +
>> > + assert((obj.doc is not None))
>>
>> Better:
>>
>> assert obj.doc is not None
>>
>> > + module_name = obj._module.name
>> > +
>> > + # We initialize random with the name so that we get consistent example
>> > + # ids over different generations. The ids of a given example might
>> > + # change when adding/removing examples, but that's acceptable as the
>> > + # goal is just to grep $id to find what example failed at a given test
>> > + # with minimum chorn over regenerating.
>>
>> churn from?
>
> There is one id per example section. The idea of having an id is
> that, if a test failed, we can easily find what test failed.
>
> The comment tries to clarify that the goal is the id to be kept
> intact (hence, we seed from its name), reducing the churn over
> regenerating the output.
I'm not sure "over" is the correct preposition here.
[...]
>> I think before I dig deeper, we should discuss my findings so far.
>
> Yes, I think I agreed with mostly of your suggestions. I'm
> learning how to write proper python, so sorry that we saw many
> basic lint failures here.
No problem at all! Such things only become a problem when people refuse
/ are unable to learn ;)
>> Here's my .pylintrc, in case you want to run pylint yourself:
>>
>> disable=
>> consider-using-f-string,
>> fixme,
>> invalid-name,
>> missing-docstring,
>> too-few-public-methods,
>> too-many-arguments,
>> too-many-branches,
>> too-many-instance-attributes,
>> too-many-lines,
>> too-many-locals,
>> too-many-statements,
>> unused-argument,
>> unused-wildcard-import,
>
> Thanks again for the review, I really appreciate it.
You're welcome!