Re: propose deprecation of generalized-vector-*

On Thu, Feb 28, 2013 at 9:42 PM, Noah Lavine <address@hidden> wrote:

Hello,

On Thu, Feb 28, 2013 at 2:10 PM, Daniel Llorens <address@hidden> wrote:

On Feb 22, 2013, at 01:22, Noah Lavine wrote:

> I agree about the speed issue, but I hope it will get better soon. The RTL VM will fix some of it, and native compilation will fix more.

That's on Scheme, but there are also many optimization issues related to array operations. Temporaries, order of traversal, etc.

Yes, you're right. This will be a long-term project.

> I'm actually not very enthusiastic about this, not because you shouldn't be able to do this, but because in order to enable the automatic de-ranking, you have to have Guile assume which dimensions you want to map over. That's how C, C++ and Fortran do it because that's how arrays are actually stored in memory, so maybe that is the right way. It just seems too low-level for me - I'd rather see an array-slice function that can split along any dimensions.

enclosed-array also let you pick what axes you wanted for the cell. You needed to specify those axes every time. That feels /more/ low level to me.

The memory order of arrays in Guile is absolutely a low-level detail, especially since it can change at any time. However, the "logical" order of the axes is not. It's simpler to define the looping operation so that the frame (the axes one loops over) consists of the axes that come first. It plays well with the rank extension / matching mechanism that I show at the end and with the view of an array as a list.

Yes, I think you've persuaded me that there is a "natural" order to axes. There should still be an operator that splits in other ways, but I agree that we can shortcut that in many cases.
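To make the "frame comes first" idea concrete, here is a rough sketch of looping over the leading axis and handing each remaining cell to a procedure. The name map-leading-axis is made up for illustration and is not part of Guile or guile-ploy. It assumes the first axis has lower bound 0 and that the array has rank 2 or more.

```scheme
;; Hypothetical helper: treat the first axis as the frame and the
;; remaining axes as the cell, like viewing the array as a list of cells.
(define (map-leading-axis proc a)
  (let ((n (car (array-dimensions a)))             ; frame length (lower bound 0 assumed)
        (cell-bounds (cdr (array-dimensions a))))  ; bounds of each cell
    (map (lambda (i)
           ;; Each cell is a shared view onto A, not a copy.
           (proc (apply make-shared-array a
                        (lambda rest (cons i rest))
                        cell-bounds)))
         (iota n))))

(map-leading-axis array->list #2((1 2) (3 4)))  ; => ((1 2) (3 4))
```

Splitting along some other axis would need a different index mapping in the lambda, which is where a general slicing operator would come in.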

> This gets at the heart of my issue with the array functionality. As far as I can tell, in Guile, there is no way to figure out what the rank of a function is. That's why you have to be explicit about what you're mapping over.
>
> I suppose the Common Lisp-y approach would be to make an object property called 'rank', set it for all of the built-in arithmetic functions, and maybe have some way to infer the rank of new functions. That might be interesting, but I'm skeptical.

Exactly, this is what is needed. Then you can write array functions that can be extended for arguments of higher rank without the function itself having to deal with those extra axes that are none of its concern. Otherwise you need to give axis indications left and right. I've suffered this in numpy. This information belongs with the function.
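As a rough illustration of keeping rank information with the function, one could store it with Guile's existing procedure properties. This is only a sketch: the property key 'rank and the helpers set-rank! and rank-of are made up here, and it is not how guile-ploy does it.

```scheme
;; Sketch: attach a rank to a procedure via procedure properties.
;; The 'rank key is an assumption; Guile does not define it.
(define (set-rank! proc rank)
  (set-procedure-property! proc 'rank rank)
  proc)

(define (rank-of proc)
  ;; procedure-property returns #f when the key was never set.
  (procedure-property proc 'rank))

;; A procedure that expects a rank-1 argument (a vector).
(define norm
  (set-rank! (lambda (v)
               (let ((sum 0))
                 (array-for-each (lambda (x) (set! sum (+ sum (* x x)))) v)
                 (sqrt sum)))
             1))

(rank-of norm)             ; => 1
(rank-of (lambda (x) x))   ; => #f, no rank information attached
```

A mapping operator could then consult rank-of to decide how many trailing axes belong to the cell and loop over the rest.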

(I'll reply to this below)

> Thanks a lot for starting the conversation. I would like to see Guile provide enough array functionality for serious scientific computing, and it sounds like you want the same thing. I don't really know what's missing yet, though, because I haven't tried to write a program that would use it.

It's a problem, because one needs at the very least mapping and reductions to write any kind of numeric program. Guile has absolutely nothing for array reductions and the mapping is very low level.

A slow array reduction is easy enough to add, but I'm guessing that's not what we need. :-) Perhaps we should have some sort of (ice-9 array) module where we put useful array functions.
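For reference, the slow version could look something like the sketch below. The name slow-array-fold is made up. It builds on array-for-each, so it visits every element one at a time and knows nothing about axes.

```scheme
;; Sketch of a naive whole-array reduction on top of array-for-each.
;; slow-array-fold is a hypothetical name, not an existing Guile procedure.
(define (slow-array-fold proc init a)
  (let ((acc init))
    (array-for-each (lambda (x) (set! acc (proc x acc))) a)
    acc))

(slow-array-fold + 0 #2((1 2) (3 4)))   ; => 10
(slow-array-fold * 1 #(2 3 4))          ; => 24
```

Reducing along a single axis, or doing this without a closure call per element, is the part that needs more thought.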

> I think the idea of splitting arrays is great. My only concern is making it part of array-ref. I still think that's a really bad idea, because it introduces a new class of errors that are really easy to make - accidentally getting an array when you expected whatever was inside the array. I'm coming at this as a user of Matlab and Fortran. In those languages, this isn't a problem, because operations automatically map over arrays, so having an array where you expected a value doesn't lead to new errors. But in Scheme, operations *don't* automatically map, so getting an array could lead to an exception at some point later in a program when really the error was that you didn't give enough indices to array-ref.

In my experience the kind of rank errors you describe are unlikely to happen, because in most programs the ranks of arrays are static.

That's a good point. I'm still not convinced, because making array-ref do this means that array-ref can return two fundamentally different types of results - arrays and other objects. This is very different from most functions. But I think this comes down to a more fundamental difference - I still don't think that functions should automatically map over arrays, and you do. If they did automatically map, then I would agree with you about array-ref, because then arrays wouldn't be fundamentally different types from the objects they contained.
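To make the two kinds of result concrete, here is a sketch of the behaviour under discussion, written as a separate procedure rather than as a change to array-ref. The name array-ref/prefix is made up, and it assumes the array has zero lower bounds.

```scheme
;; Hypothetical: with all indices, return the element; with fewer
;; indices, return a shared sub-array over the remaining axes.
(define (array-ref/prefix a . idx)
  (let ((k (length idx)))
    (if (= k (array-rank a))
        (apply array-ref a idx)
        (apply make-shared-array a
               (lambda rest (append idx rest))
               (list-tail (array-dimensions a) k)))))

(define a #2((1 2 3) (4 5 6)))

(array-ref/prefix a 1 2)   ; => 6, an element
(array-ref/prefix a 1)     ; a rank-1 view of the second row, not a number
```

With this, (+ 1 (array-ref/prefix a 1)) signals a type error from +, even though the real mistake was the missing index.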

It's a bit like function arity, the general case is important and must be supported, but most functions have fixed arity, and that reveals many optimization opportunities. If the rank of an array is known, then the rank of the array-ref result is also known. The Guile compiler seems to ignore all of this right now, but it probably shouldn't.

I agree that the compiler should be better, but as one of the people working on it, there are lots of things that it should do and presently doesn't. I don't know when we'll get nice array optimizations.

I've implemented the idea of assigning rank to functions and then extending these over arrays of higher rank. At this point I'm mostly interested in having the basic mechanism right, so the code is probably a bit rough.

I wrote some description of how it works in the README. Please have a look and let me know what you think. You can find it at:

https://gitorious.org/guile-ploy

I read through your README. I still haven't looked at the code, but that looks very cool! I would be excited to have a library like that in Guile - but I think that this should be optional, and that not *every* function should have rank information. This is because while it is fairly natural for programs that involve a lot of array processing, I don't think it is as natural for, for example, networking code or the web server. I really like the ply function that lets you connect functions with rank information to functions without it.

I also wanted to write a bit about passing arrays between C/C++ and Guile, but it's really a different matter, so maybe some other time. The problem here is that each library has its own calling convention and has different constraints on the kind of arrays it takes, so it's not something that the ffi can handle transparently.

Yes, that seems like a big issue.

I definitely agree that we should provide enough primitive operators to write fast array code in Guile. Your README says that certain functions can't be implemented efficiently without access to the underlying array representation - that should certainly change. However, I don't think that we should make every function have rank information, when it's not really used in most areas of programming. I think the library you've presented is a great compromise, because it lets you put rank annotations on some functions, but not all.

What do you think?

Best,
Noah

From:	Noah Lavine
Subject:	Re: propose deprecation of generalized-vector-*
Date:	Thu, 28 Feb 2013 22:46:08 -0500