Optimizing for..in constructs
Cython already contains optimisations for a couple of common for-in loop constructs:
for ... in range() will result in a C for-loop if the run variable is declared as a C integer.
for ... in some_dict, including the .iter*() methods, will result in a while-loop over PyDict_Next()
Note that Cython will also apply multiple optimisations to the same loop. For example, iterating over enumerate(some_dict) will result in an efficient PyDict_Next() loop with a separate counter.
This page was created mostly to note down some thoughts coming from prototyping an enumerate transform. They were noted in a mailing list discussion in May 2008.
- Status: implemented as of Cython 0.12.
The idea is trivial enough, turning
1 for idx, value in enumerate(L): 2 ...
into a regular iteration on the iterateable object and a manual int increment.
Of course, the problem is in the details, namely accessing the idx variable afterwards. Consider this Python session:
1 >>> for idx, value in enumerate(range(3)): 2 ... pass 3 ... 4 >>> print idx 5 2 6 >>> for idx2, value in enumerate(): 7 ... pass 8 ... 9 >>> print idx2 10 Traceback (most recent call last): 11 File "<stdin>", line 1, in <module> 12 NameError: name 'idx2' is not defined
So it looks like this must be transformed into a while-construct with manual iterator handling. (Though it should still be an ideal transform example, but may need to boost transforms somewhat first ).
Also there is an issue with overflows. For instance consider this:
1 cdef unsigned char idx 2 for idx, value in enumerate(xrange(1000)): 3 print idx
This should raise an exception after 255 iterations, not raise an exception before or overflow. A quick way to do this could be to check if idx <= 0 after the increment, which should be true only after an overflow both for signed and unsigned.
If idx is a Python object no check is needed, and furthermore, if it is a 32-bit int then one might consider disabling the check with a command-line "-fno-unlikely-overflow-checks" or similar, but I digress...)
The above enumerate() transform might also be considered a special case of a more general zip() transform, as enumerate(it) is only zip(count(),it). So this boils down to transforming
1 for a,b,...,z in zip(A,B,...,Z): # same number of targets and arguments to zip() 2 ...
into a while-loop with separate evaluations for each iterator/target pair.
Note that this cannot be done in all cases, as the input (A,B,...,Z) might be mutable sequences which can be altered inside the loop. So this optimisation is only safe to do for itertools.izip(), or zip() in Py3.