guile-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: vectors are something else


From: Mark H Weaver
Subject: Re: vectors are something else
Date: Mon, 15 Apr 2013 22:00:55 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Daniel Llorens <address@hidden> writes:

> Date: Fri, 12 Apr 2013 17:43:14 -0400
> From: Mark H Weaver <address@hidden>
>
>> I've not yet had time to carefully read this thread, but I wanted to say
>> that I think we *should* prohibit passing arrays to the vector
>> interfaces.  When we have native code generation, then it will be
>> possible to make the vector procedures extremely simple machine code
>> sequences.  We must avoid adding any complications to this.  Even a
>> single additional conditional branch will be painful, in both runtime
>> overhead and code size.
>> 
>> If we were to support a subset of arrays to the vector interface, the
>> way it should be done is have the array constructors produce actual
>> vector objects when possible, or at least something that can be used by
>> the simple vector code sequences without any additional conditional
>> branches.
>> 
>> Does this make sense?
>
> Two things about this.
>
> First, arrays cannot remain opaque types if we want any kind of
> performance. The compiler should know about the storage and about the
> array descriptor as independent objects. Then it's possible to inline
> the index computations, or to eliminate the array descriptor whenever
> it is not used or when its fields are known at compile time, or to
> eliminate type checks.

Unfortunately, this is rarely possible in a language like Scheme, where
calls to procedures bound by mutable top-level variables are frequent.
We cannot fix this without making most commonly-used top-level bindings
immutable.  Last I checked, there was strong resistance to this idea.

> Second, the array operations don't involve any more branching than the
> vector operations, only an extra indirection that can be amortized
> over the size of the vector.

The cost can only be amortized in cases where the entire loop over the
vector is done without any calls to procedures bound by mutable
variables.

> That's why I think that there're two options:
>
> [1] have vector- ops only accept vector- types. This removes container
> type dispatch from the vector- ops but not from the array- ops. The
> last patch set I've sent to the list goes in this direction. It
> matches the behavior of other vector-like types such as bytevectors
> and strings.

I very much favor this approach, but unfortunately I don't think we can
do it in 2.0.  It'll have to wait until 2.2.

> [2] force all vector objects into arrays. This removes container type
> dispatch from both the vector- and the array- ops, but it adds the
> cost of indirection to the vector- ops. Which is, I think, a good
> tradeoff, because that cost is smaller, there's some hope that the
> compiler will remove it in many cases, and we retain the flexibility
> in the use of vectors that we have now.

I consider this option completely unacceptable.  Arrays are much more
complex than vectors, and will inevitably be *much* slower in the common
case where the compiler doesn't know much.  Most egregiously,
'make-shared-array' requires that array references apply an arbitrary
affine transformation that's unknown at compile time, which means
multiplications and additions for each index.

Don't get me wrong: I'm all in favor of providing flexible, generic,
extensible procedures.  They have their place.  I think that's why
there's still a justification for keeping 'generalized-vectors' around.
However, efficiency demands that we also have simple, non-generic,
non-extensible procedures as well.  IMO, we should keep the vector,
bitvector, and bytevector procedures as simple as possible.

Does this make sense?  More thoughts?

Thanks for your work on this.

      Mark



reply via email to

[Prev in Thread] Current Thread [Next in Thread]