IDL Array Storage and Indexing

QUESTION: I always get confused. Is IDL a column-major or row-major language?

ANSWER: Oh, my Gosh. I can't believe you are bringing this up again! There is no question about IDL that brings up so much confusion and general anguish. Here is an especially cogent article by Stein Vidar Hagfors Haugan, first published on the IDL newsgroup on 3 March 1999 under the title "Ohmygod, another round of column/row-major...". It is still the best article I've read on the subject. He wrote it in response to an IDL newsgroup firefight about this subject. If you have an hour and you are looking for amusement, I encourage you to read the entire set of articles.

Sorry about bringing this up once more, but the authors of the IDL 5.1/5.2 help pages managed to twist my mind so thoroughly, I had to come here again for support.

And yes, I know this subject is covered in the FAQ, under the heading "I'm confused by the meaning of column-major and row-major", but a lot of the FAQ text uses the concepts column-ORDER and row-ORDER. These concepts seem to be opposites, and if they're *not*, then I'm completely lost.... If they are opposites, this should be spelled out in a prominent place in the FAQ entry....(but let's first see if we all agree here!).

So, after reading the FAQ and the online help, I felt like asking for another FAQ entry called "But I'm still confused about column-major and row-major!"

In roaming around to resolve my confusion, I tracked down one of William Clodius' postings in the original discussion that led to the FAQ entry, and got even more confused when I read that he AGREED with the first paragraph of the online help text on "Arrays and Matrices".

He was talking about the online help for version 5.1, where Fortran is referred to as a "column-major language". The current text (version 5.2) says the exact opposite, namely that C/Pascal etc are column-major!

This took me quite a while to figure out. No wonder we're confused, when RSI decides, from one IDL version to the next, that all the other languages in the world have suddenly changed the way they index their arrays! And RSI's current conclusion disagrees with just about everybody else.

How can we trust any help page that uses the concepts row-major vs. column-major, when RSI appears to disagree with the rest of the world on whether IDL as a language is one or the other?

Now, this is my current understanding of this issue:

All three contributors to the current FAQ entry have got this right: You cannot (meaningfully) determine whether a language is row-major or column-major unless you assume a convention for referring to (indexing) a matrix element by its row and column numbers.

There are two possibilities for the indexing of matrices:

      matrix(row,column)  or matrix(column,row)

According to reliable sources the overwhelmingly dominant way of specifying matrix elements in *mathematics* is, in LaTeX notation, a_{row,column}. The first index is the row number, and the last index is the column.

That is, you'd write a 3 by 4 matrix like this:

    a_{11} a_{12} a_{13} a_{14}
    a_{21} a_{22} a_{23} a_{24}
    a_{31} a_{32} a_{33} a_{34}

This indexing convention (row,column) is so common in mathematics that it forms the basis of the traditional classification of computer languages as either column-major or row-major, given their array storage/indexing rules.

The corollary that "everybody else" agrees on:

If the first index runs faster when stepping through the elements of an array, it's a "column-major language".

If the last index runs faster when stepping through the elements of an array, it's a "row-major language".

So, if anybody insists on classifying IDL as a language, it's a column-major language. (And I would really like to know if everybody agrees that column-major == row-order?)
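The classification rule above can be sketched in a few lines. This is a minimal model in Python rather than IDL, so that it stands alone; the function names are invented for illustration, and the matrix is assumed to be indexed as (row,column), as in the mathematical convention.

```python
def offset_first_index_fastest(i, j, n1, n2):
    """Flat memory offset of element (i, j) when the FIRST index varies fastest."""
    return i + n1 * j

def offset_last_index_fastest(i, j, n1, n2):
    """Flat memory offset of element (i, j) when the LAST index varies fastest."""
    return i * n2 + j

def storage_order(offset, n_rows, n_cols):
    """Return the (row, column) pairs of a matrix in the order they sit in memory."""
    pairs = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    return sorted(pairs, key=lambda rc: offset(rc[0], rc[1], n_rows, n_cols))

# A 3 by 4 matrix indexed as (row, column).
first_fastest = storage_order(offset_first_index_fastest, 3, 4)
last_fastest = storage_order(offset_last_index_fastest, 3, 4)

print(first_fastest[:4])  # [(0, 0), (1, 0), (2, 0), (0, 1)]: runs down a column
print(last_fastest[:4])   # [(0, 0), (0, 1), (0, 2), (0, 3)]: runs along a row
```

With the first index running fastest, memory is filled one column at a time, hence "column-major"; with the last index running fastest it is filled one row at a time, hence "row-major". Both labels depend entirely on reading the first index as the row.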

Then, somebody started to write IDL's online "help"....sigh!

Given a computer language with multi-dimensional arrays, it's up to anyone to write a package of matrix routines using either one of the two possible indexing conventions.

It seems like RSI is trying to persuade everybody to switch to the (column,row) indexing convention - most likely because this is easier to map onto the [row][column] convention used in the Numerical Recipes library (in C). And the "help" page for Numerical Recipes Functions puts this in writing:

In IDL versions up to and including IDL version 3.6, mathematics functions based on Numerical Recipes algorithms required that input be in column-major format. This is no longer the case. Routines based on Numerical Recipes algorithms have been reworked and renamed, so that all IDL functions now expect input arrays to be in row-major format (composed of row vectors). [...] We recommend that all new IDL programs take advantage of the new names and input convention.

That's fine. I have no quarrel with this text, but my advice is to include a sentence such as "Row-major format in IDL means that matrices are indexed as matrix(column,row)".
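To see why matrix(column,row) in a first-index-fastest language lines up with [row][column] in C, compare the memory offsets. The sketch below is a Python model, not IDL or C code, and the names c_offset and idl_offset are hypothetical helpers made up for this illustration.

```python
N_ROWS, N_COLS = 3, 4

def c_offset(row, col):
    # C-style a[row][col]: the last index varies fastest in memory.
    return row * N_COLS + col

def idl_offset(col, row):
    # IDL-style a(col, row): the first index varies fastest in memory.
    return col + N_COLS * row

# Every matrix element lands at the same offset under both schemes.
same = all(
    c_offset(r, c) == idl_offset(c, r)
    for r in range(N_ROWS)
    for c in range(N_COLS)
)
print(same)  # True
```

So the two layouts agree once the order of the indices is swapped, which is exactly why the indexing convention has to be stated alongside any "row-major" label.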

Given how IDL prints out arrays and displays images, the new convention has the beneficial side effect of aligning the "image notation" and "matrix notation" for IDL, in that the first dimension will always be horizontal, and the second dimension will always be vertical - whether you're *thinking* about matrices, *printing* matrices/arrays or *displaying* images. That is, you may think of indexing two-dimensional arrays as "matrix(x,y)" or "image(x,y)", "array(x,y)" etc.

Please, RSI, leave it at that. It's OK to opt for the (column,row) indexing notation for matrices, to recommend it to everybody, and to supply matrix routines that rely on that convention.

But don't try to "reclassify" IDL as a "row-major language"! I guarantee that there will be no end to the confusion this will cause.

The reason is that whenever somebody sees a phrase like "IDL indexes data in row-major format", most seasoned programmers will nod and say, "OK, so it's like C", and they've got the whole thing wrong.

It's just plain wrong to say that "IDL is indexing data in row-major format", as is done in the online help in version 5.2. It's even worse trying to say that C and Pascal are using column-major format!

It is, however, correct to say (that is, I think it is correct to say) all of the following:

And a very handy mnemonic rule:

Note that this rule doesn't say anything about the language. It says something about how a matrix is stored in memory.

To come back to whether or not we can trust most online help pages speaking about column-major or row-major stuff, the answer seems to be yes.

The reason is that most help pages on matrix functions (e.g. the Numerical Recipes functions) seem to always stick to the concept "ZZZ-major matrices" (i.e. not messing about with a classification of languages as such). That means they cannot go wrong.

It's left up to the user, however, to remember what this means in terms of how to index their matrices - (column,row) or (row,column), and how to interpret a printout of such a matrix.

The help pages often use the phrase "composed of column vectors" to "explain" the meaning of "column-major format".

To me, the phrase "composed of column vectors" has zero information content, at best.

I mean, using RSI's notation, a column vector is a fltarr(1,N), right? And a row vector is fltarr(N,1), though the last dimension is cannibalized by IDL for your own good :-)

Thus any two-dimensional array or matrix is composed of column vectors, and at the same time it is in fact also composed of row vectors!
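A small sketch makes the point concrete. It models, in Python, a 4 by 3 array stored with the first index running fastest and indexed as array(x,y); the helper names row_vector and column_vector are made up for the example.

```python
N_COLS, N_ROWS = 4, 3                 # array(x, y): 4 across, 3 down
flat = list(range(N_COLS * N_ROWS))   # memory contents, first index fastest

def row_vector(y):
    # Fixed y, all x: a contiguous run of N_COLS elements.
    return flat[y * N_COLS:(y + 1) * N_COLS]

def column_vector(x):
    # Fixed x, all y: every N_COLS-th element, starting at x.
    return flat[x::N_COLS]

rows = [row_vector(y) for y in range(N_ROWS)]
cols = [column_vector(x) for x in range(N_COLS)]

print(rows)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
print(cols)  # [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]]
```

The same twelve numbers split cleanly into three row vectors or four column vectors. The only difference is that, in this layout, the row vectors are contiguous in memory and the column vectors are strided.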

Take for example the explanation of the LUSOL routine. To me, it would be a *lot* easier to understand than the current version if it said, e.g.:

The LUSOL function is used in conjunction with the LUDC procedure to solve a set of n linear equations in n unknowns Ax=b. The parameter A is input not as the original matrix, but as its LU decomposition, created by the routine LUDC. The result is an n-element vector whose type is identical to A.

LUSOL assumes that the matrix A is indexed as A(column,row) i.e. that A is a row-major matrix.

[.....]

Keywords:

COLUMN

Set this keyword if the input matrix A is indexed as a column-major matrix, i.e. A(row,column).
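What the choice between A(column,row) and A(row,column) amounts to can be shown with a toy buffer. The following is a Python model under the assumption of first-index-fastest storage; it says nothing about how LUSOL is implemented internally, and the helper element is hypothetical.

```python
flat = [1, 2, 3, 4, 5, 6]   # six numbers in memory, first index fastest
N1, N2 = 3, 2               # sizes of the first and second dimensions

def element(i, j):
    # Element at first index i, second index j.
    return flat[i + N1 * j]

# Read as A(column, row): 2 rows of 3 columns.
as_col_row = [[element(c, r) for c in range(N1)] for r in range(N2)]

# Read as A(row, column): 3 rows of 2 columns.
as_row_col = [[element(r, c) for c in range(N2)] for r in range(N1)]

transposed = [list(t) for t in zip(*as_col_row)]

print(as_col_row)                # [[1, 2, 3], [4, 5, 6]]
print(as_row_col)                # [[1, 4], [2, 5], [3, 6]]
print(as_row_col == transposed)  # True
```

The two readings of the same memory are transposes of each other, so stating the indexing convention tells the user unambiguously which matrix a routine will actually see.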

So, my advice is this:

There! I feel a lot better now, I think. Unless, of course, somebody replies that I've got the whole thing backwards.

But wouldn't that just prove my point, that things are still confusing?

Regards,

Stein Vidar
