qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 03/16] qapi/expr.py: constrain incoming expression types


From: Markus Armbruster
Subject: Re: [PATCH v3 03/16] qapi/expr.py: constrain incoming expression types
Date: Tue, 02 Mar 2021 06:23:16 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)

John Snow <jsnow@redhat.com> writes:

> On 2/25/21 6:56 AM, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>> 
>>> On 2/24/21 5:01 AM, Markus Armbruster wrote:
>>>> John Snow <jsnow@redhat.com> writes:
>>>>
>>>>> mypy does not know the types of values stored in Dicts that masquerade
>>>>> as objects. Help the type checker out by constraining the type.
>>>>>
>>>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>>>> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
>>>>> Reviewed-by: Cleber Rosa <crosa@redhat.com>
>>>>> ---
>>>>>    scripts/qapi/expr.py | 25 ++++++++++++++++++++++---
>>>>>    1 file changed, 22 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/scripts/qapi/expr.py b/scripts/qapi/expr.py
>>>>> index 5694c501fa3..783282b53ce 100644
>>>>> --- a/scripts/qapi/expr.py
>>>>> +++ b/scripts/qapi/expr.py
>>>>> @@ -15,9 +15,17 @@
>>>>>    # See the COPYING file in the top-level directory.
>>>>>    
>>>>>    import re
>>>>> +from typing import MutableMapping, Optional
>>>>>    
>>>>>    from .common import c_name
>>>>>    from .error import QAPISemError
>>>>> +from .parser import QAPIDoc
>>>>> +from .source import QAPISourceInfo
>>>>> +
>>>>> +
>>>>> +# Expressions in their raw form are JSON-like structures with arbitrary 
>>>>> forms.
>>>>> +# Minimally, their top-level form must be a mapping of strings to values.
>>>>> +Expression = MutableMapping[str, object]
>>>>
>>>> MutableMapping, fancy.  It's only ever dict.  Why abstract from that?
>> 
>> OrderedDict, actually.  MutableMapping is misleading, because it doesn't
>> specify "orderedness".
>> 
>
> Yeah, I am realizing that Dict helps imply that constraint on 3.6+ but 
> that MutableMapping doesn't.
>
> I am worried about how hard it's gonna hurt when I remember why I wanted 
> MutableMapping.
>
>  >:|
>
> For now, I'll go back to Dict.
>
>>> I don't know! I referenced this in the cover letter. I cannot remember
>>> the reason anymore. It had R-Bs on it so I left it alone.
>>>
>>> There are some differences, but I no longer remember why I thought they
>>> applied. Maybe some of my more exploratory work wanted it. Dunno.
>> 
>> Happens.  It's a long patch queue you're trying to flush.
>> 
>>>> The use of object is again owed to mypy's inability to do recursive
>>>> types.  What we really have here is something like
>>>>
>>>>      Expression = Union[bool, str, dict[str, Expression], list[Expression]]
>>>>
>>>> with the root further constrained to the Union's dict branch.  Spell
>>>> that out in a bit more detail, like you did in introspect.py?
>>>>
>>>
>>> Terminology problem?
>>>
>>> I am using "Expression" to mean very specifically a top-level object as
>>> returned from parser.py, which *must* be an Object, so it *must* be a
>>> mapping of str => yaddayadda.
>> 
>> Aha!
>> 
>> We'll talk some more about naming of type aliases in review of PATCH 08.
>> 
>>> The type as I intended it is Expression = Dict[str, yaddayadda]
>>>
>>> where yaddayadda is
>>> Union[int, str, bool, List[yaddayadda], Dict[str, yaddayadda]]
>> 
>> Yes.
>> 
>> As qapi-code-gen.txt explains, we have two layers of syntax:
>> 
>> * The bottom layer is (heavily bastardized) JSON.  qapi-code-gen.txt
>>    specifies it by listing the differences to RFC 8259.  parser.py parses
>>    it into abstract syntax trees.
>> 
>
> Aside: A new realization about a deviation from JSON: objects are 
> inherently unordered collections.

For a value of "new" :)

In JSON *syntax* (the thing defined by its grammar), order matters.
It doesn't in *semantics*.  Except when it does:

    JSON parsing libraries have been observed to differ as to whether or
    not they make the ordering of object members visible to calling
    software.  Implementations whose behavior does not depend on member
    ordering will be interoperable in the sense that they will not be
    affected by these differences.

This is RFC 8259 section 4, Objects.

qapi-code-gen.txt spells it out for the QAPI schema language.  Section
"Schema syntax":

    The order of members within JSON objects does not matter unless
    explicitly noted.

Later sections note explicitly.

>> * The upper layer recognizes the abstract syntax trees that are valid as
>>    QAPI schema.  qapi-code-gen.txt specifies it with a context-free
>>    grammar.  expr.py checks the ASTs against that grammar.  It also
>>    expands shorthand forms into longhand.
>> 
>> Detail not documented in qapi-code-gen.txt: parser.py rejects non-object
>> at the top-level, so expr.py doesn't have to.
>> 
>
> Yep.
>
>>> expr.py is what validates the yaddayadda, so there's no point in trying
>>> to type it further, I think.
>> 
>> If mypy could do recursive types, typing it further would be a
>> no-brainer: just state what is.
>> 
>> Since it can't, we need to stop typing / start cheating at some point.
>> Where exactly is not obvious.  Your idea is at least as good as mine.
>> 
>>> Probably worth a better comment.
>> 
>> Yes :)
>> 
>
> I'll look at Patch 8 and then revisit, but I will attempt to make a 
> better comment. I think there are bits of part 5 that makes it a bit 
> more obvious, because I create a Real Type :tm: to pass each 
> "Expression" as a whole over to to expr.py.
>
> (This is just kind of an intermediate form and as such it's not 
> necessarily something I gave tremendous thought to.)
>
>>>> Hmm, there you used Any, not object.  I guess that's because mypy lets
>>>> you get away with object here, but not there.  Correct?
>>>>
>>>
>>> Yep. I can get away with the stricter type here because of how we use
>>> it, so I did. That's an artifact of it not being recursive and how
>>> expr.py's entire raison d'etre is using isinstance() checks to
>>> effectively downcast for us everywhere already.
>>>
>>>> Also, PEP 8 comment line length.
>>>>
>>>
>>> Augh.
>>>
>>> Is there a way to set emacs mode highlighting in Python such that it
>>> highlights when I run past the 72-col margin, but only for comments?
>>>
>>> I have the general-purpose highlighter on for the 80-col margin.
>> 
>> Got a .emacs snippet for me?
>> 
>
> I use only these bits:
>
>   ;; Reflow-width defaults to 72.
>   '(fill-column 72)
>
>   ;; Highlight past column 80
>   '(whitespace-line-column 80)
>
>   ;; Theme whitespace highlighting as such:
>   '(whitespace-style '(face empty tabs lines-tail trailing))
>
>   ;; Don't insert tabs for spaces
>   '(indent-tabs-mode nil)
>
>>> I'm not familiar with any setting like this for any of the linters or
>>> pycharm right away either.
>> 
>> Hmm, ... okay, TIL from pycodestyle(1):
>> 
>>              --max-line-length=n  set maximum allowed line length (default: 
>> 79)
>>              --max-doc-length=n   set maximum allowed doc line length and 
>> perform these
>>                                   checks (unchecked if not set)
>> 
>> Let me know whether --max-doc-length=72 fits the bill.
>> 
>
> It does.
>
> I'd need to send a fixup patch in order to enable it, but I am not 
> thrilled with the idea of having to squabble with you over how to break 
> lines that are just barely overlong.
>
> Least annoying for me: I write a draft patch to get the flake8 baseline 
> for local testing, you copy-edit the patch to your stylistic liking, 
> I'll ACK the edits.

Go right ahead.

>>>
>>>>>    
>>>>>    
>>>>>    # Names must be letters, numbers, -, and _.  They must start with 
>>>>> letter,
>>>>> @@ -287,9 +295,20 @@ def check_event(expr, info):
>>>>>    
>>>>>    def check_exprs(exprs):
>>>>>        for expr_elem in exprs:
>>>>> -        expr = expr_elem['expr']
>>>>> -        info = expr_elem['info']
>>>>> -        doc = expr_elem.get('doc')
>>>>> +        # Expression
>>>>> +        assert isinstance(expr_elem['expr'], dict)
>> 
>> Must be an *ordered* mapping, actually.  It's only ever OrderedDict.
>> 
>
> Allegedly. Lawsuit pending appeal.
>
>>>>> +        for key in expr_elem['expr'].keys():
>>>>> +            assert isinstance(key, str)
>>>>> +        expr: Expression = expr_elem['expr']
>> 
>> *Unchecked* way to tell the type checker (I think):
>> 
>>               expr = cast(Expression, expr_elem['expr']
>> 
>> I like checking in general, but is it worth the bother here?
>> 
>
> It all goes away in the first half of part 5, where I create an 
> Expression object that has typed fields for its components, and mypy's 
> static checker does the rest of the lifting.
>
> Could do casts, yeah, but I suppose I liked the assert to let me know 
> that the types I saw on the back of my eyelids were the real ones.
>
>>>>
>>>> You're fine with repeating exp_elem['expr'] here, and ...
>>>>
>>>>> +
>>>>> +        # QAPISourceInfo
>>>>> +        assert isinstance(expr_elem['info'], QAPISourceInfo)
>>>>> +        info: QAPISourceInfo = expr_elem['info']
>>>>
>>>> ... expr_elem['info'] here, but ...
>>>>
>>>>> +
>>>>> +        # Optional[QAPIDoc]
>>>>> +        tmp = expr_elem.get('doc')
>>>>> +        assert tmp is None or isinstance(tmp, QAPIDoc)
>>>>> +        doc: Optional[QAPIDoc] = tmp
>>>>
>>>> ... you avoid repetition of expr_elem.get('doc') here.  Any particular
>>>> reason?
>>>>
>>>
>>> Because this looks like garbage written by a drunkard:
>>>
>>> assert expr_elem.get('doc') is None or isinstance(expr_elem.get('doc'),
>>> QAPIDoc)
>>> doc: Optional[QAPIDoc] = expr_elem.get('doc')
>> 
>> Unchecked way:
>> 
>>               doc = cast(Optional[QAPIDoc], expr_elem.get('doc'))
>> 
>>>>>    
>>>>>            if 'include' in expr:
>>>>>                continue
>
> I'll see if I can clean it up a little. I will take your suggestion of 
> casts to mean that you'd be okay with actually not checking the form at 
> runtime.

I am.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]