From: jimw@math.umass.edu (Jim Weigang)
Newsgroups: comp.lang.apl
Date: 25 Sep 1994 21:13:54 GMT
Subject: Good code/bad code & looping

Some recent posts on c.l.a have drawn me out of lurk mode to complain. My apologies to the unnamed author whose material I'm commenting on. I'm trying to criticize the ideas, which are fairly widespread, not the person who happened to utter them here. I hope my comments will not discourage further postings by the author.

A recent posting claimed that for "most things that one wants to do, ... there's a better way to do it in APL than using a loop". This may be true, depending on what you want to do, but I believe it gives an incorrect impression about the use of loops in APL code. My experience is that virtually all real-life nontrivial tasks involve the use of loops and ifs in APL, just not as many as in other languages. Many of these tasks simply cannot be coded without loops. Others can be coded either with or without loops.

Some loops can be eliminated by using operations on larger arrays. Doing so usually involves a tradeoff between the number of primitive functions executed and the amount of memory used. A certain amount of loop elimination is important for achieving reasonable efficiency in APL. But once the loop elimination has yielded an array size of O(1000) elements, there is little CPU efficiency to be gained by trading a loop for a larger array operation. Doing so just results in increased memory consumption, not significantly faster execution.
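To make the tradeoff concrete (a toy illustration of my own, not from the posting): summing a numeric vector V with an explicit loop,

    S{<-}0
    I{<-}0
   L1:{->}(({shape}V)<I{<-}I+1)/L2
    S{<-}S+V[I]
    {->}L1
   L2:

can be replaced by the single reduction S{<-}+/V, which executes a handful of primitives instead of several per element. That kind of loop elimination is clearly worthwhile. But once each primitive call is already working on arrays of thousands of elements, consolidating still more loops into even larger array operations buys almost nothing.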

But relentless elimination of every possible loop also has a human cost: it can make APL code very difficult to write, painful for others to read, and much harder to modify.

The posting goes on to call looping code "bad" and programmers who write such code "lazy". Nonlooping code is "good, idiomatic, APL". Personally, I don't think this attitude is at all helpful in trying to increase the use of APL. In other languages, if a program works correctly and is documented, you're done and you use the program. If the APL community demands that users live up to some higher standard of writing "good" nonlooping programs, we are imposing an extra burden not imposed by other languages, and this will turn users away from APL. Not everyone is skilled enough to write nonlooping solutions, and not every loop should be eliminated. I think it's vitally important that nonlooping solutions be presented to novices in terms of, "Look, here's an easier way of doing the job, and here's a clear explanation of how it works. It's efficient, and once you see how it works, it really isn't that obscure." Nonlooping solutions that can't be presented in this way, because they are outrageously complex, should not be put forward as examples of good APL programming.

Manugistics' control structures are described as an "abomination" that "perverts the language", resulting in something that is "purportedly APL." Come on, they're just another way of writing branch statements. If you don't use :IF X, you'll end up doing the same thing with {goto}(~X)/L1. If they seem clunky (in comparison to symbolic alternatives that have been proposed), perhaps that will discourage excessive use of them. And if they seem too conventional, bear in mind that many ideas which seem good on paper turn out to be bad ideas when you try them in practice. Most of the proposals for more "APLish" control structures have not been put into practice, and nobody can be sure whether or not they would really be palatable. By adopting the standard control structures found in other languages, STSC implemented a tried and true (and already familiar) notation. They did so without making massive changes to the language. You can easily write a program that translates :keywords to branch statements when porting APL*PLUS III code to other APLs. I think they did a fine job, and if you don't like the result you can always stick to branching and labels.
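
To spell out the equivalence (my own sketch; FOO and BAR are hypothetical functions):

   :IF X
       R{<-}FOO Y
   :ELSE
       R{<-}BAR Y
   :ENDIF

is simply a keyworded way of writing

    {->}(~X)/L1
    R{<-}FOO Y
    {->}L2
   L1:R{<-}BAR Y
   L2:

Same control flow, same semantics; the keyword form just spares you from inventing label names and reading inverted conditions.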

Jim


From: jimw@math.umass.edu (Jim Weigang)
Newsgroups: comp.lang.apl
Date: 27 Sep 1994 21:29:07 GMT
Subject: Re: Good code/bad code & looping

In case it wasn't clear from my original posting, I'm not speaking out against noniterative APL code, just against the practice of blindly branding looping code as "bad" and scolding beginners who haven't learned how to avoid unnecessary looping.

Bill Chang's examples of dyadic rotate, notequal and other scans, the vanilla-APL methods of vector-to-matrix conversion, and manipulation of n-dimensional arrays are all important basic techniques that I think should be taught to fledgling APLers. (Even to nested APLers who think they don't need flat-array techniques.) These techniques are not what I would call "outrageously complex". Modeling the state machine with partition operations is an example where I think loop avoidance went too far to be considered reasonable.
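As a sample of the sort of basic technique I mean (my own example, in the same {keyword} transliteration used below): the {notequal}-scan marks the characters lying inside quotes in a character vector, with no loop over the string:

   V{<-}'HE SAID ''HI'' TO ME'
   M{<-}{notequal}\V=''''

M is 1 for each opening quote and the characters following it up to the closing quote, and 0 elsewhere. One scan replaces what would otherwise be a character-by-character state loop.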

Yes, complex nonlooping solutions are sometimes dictated by requirements of CPU efficiency. But the vast majority of code is not that critical, either because the application runs fast enough that there's no need to improve it, or because the section of code is not a CPU bottleneck. Rarely do I find APL users saying, "The profiler showed that this routine uses 20% of the execution time. How can I speed it up?". Instead, some APLers seek noniterative solutions for all tasks, without considering whether such a solution is necessary, worth the cost, or clear enough to be maintainable.

I should point out that not all APL programmers do this. (Certainly not me, ha ha.) An APLer with enough experience and a healthy disregard for what purists think of his code can dig right into a problem, instinctively lay out loops and ifs for the parts that need it, and write noniterative code for the doable tasks within the loops.

Thinking more about that application with the incomprehensible code that gives WS FULLs, I realized that although the WS FULLs result from eliminating too many loops, the incomprehensibility stems in large part from a lack of comments. Here's a concrete example. A colleague reported that the following line was giving a WS FULL:

   ANN{<-}{mix}{each}{mix}{each}{mix}{each}{mix}{each}     {+
+}      {split}{split}{split}{split}1 3 5 4 2{transpose}   {+
+}      {mix}(({shape}POS),2,NAA,iben){reshape}ANN

(This is written for APL*PLUS, evolution level 1. {mix}A is {disclose}A in APL2; it removes a level of nesting and puts the revealed dimension(s) to the right of existing dimensions. {split}A is {enclose}[{shape}{shape}A]A in APL2; it hides the last dimension, sinking it into a new level of nesting.)

The code had no comments describing the structure of ANN. It turns out that ANN has dimensions [POS;2,NAA,iben][16]. That is, it's a nested matrix having a row for each element of POS, and a column for each element of a raveled 2-by-NAA-by-iben array (three dimensions raveled into one). Each item of the matrix is a 16-element vector. POS has shape 11, NAA is 86, iben is 7, and the data is floating point, so the array contains 1.7 megabytes worth of data. The colleague figured out that the following loop did the same job, without resorting to small arrays or getting a WS FULL (and it ran faster to boot):

    I{<-}0
    R{<-}({shape}POS){reshape}0
   L10:{->}(({shape}R)<I{<-}I+1)/L11
    R[I]{<-}{enclose}2 4 3 1{transpose}{mix}(2,NAA,iben){reshape}ANN[I;]
    {->}L10
   L11:ANN{<-}R

From the loop, you can easily see that the result is a vector and that each item is a 4-dimensional array. It turns out that the goal of the original statement is to transform the structure of ANN from

             [POS;2,NAA,iben] [16]
to
             [POS] [16;2;iben;NAA]

Too bad there wasn't a comment explaining this, not only to clarify the original statement, but also to aid in understanding subsequent statements that use the transformed ANN. (Yes, I realize that the splits and mix-eaches can be replaced with {enclose}[2 3 4 5], but this isn't available in the version of APL*PLUS we're using.)

Regarding nested blocks and lexically scoped subroutines: I think these fall into the category of "making major changes to the language". Given the lack of agreement among vendors about how to extend APL, I'm glad STSC didn't try to make these changes unilaterally. Doing so would almost certainly widen the gulf between major APL implementations in a way that would make porting applications much more difficult. Adding control structures today in no way rules out the possibility of adding other enhancements such as these in the future.

Jim
