Differences between Cython and Pyrex
- Package names and cross-directory imports
- List/Set/Dict comprehensions and inlined generator expressions
- Conditional expressions "x if b else y" (python 2.5)
- Extended iterable unpacking
- cdef inline
- Assignment on declaration (e.g. "cdef int spam = 5")
- 'by' expression in for loop (e.g. "for i from 0 <= i < 10 by 2")
- Automatic range conversion
- Boolean int type (e.g. it acts like a c int, but coerces to/from python as a boolean)
- Executable class bodies
- Improved error reporting (e.g. syntax error on non-existent builtin)
- Many optimizations (e.g. list indexing, tuple unpacking, caching of builtin names and integer constants)
- cpdef functions
- More friendly type casting
- Optional arguments in cdef/cpdef functions
- Function pointers in structs
- C++ Exception handling
- Source code encoding
- Automatic type checking
In general, modules compiled with Cython are considerably faster than the same modules compiled with Pyrex, because Cython specialises the generated code much more closely to the types and constructs involved. Major examples include loops and function argument unpacking.
Package names and cross-directory imports
These work just like in Python.
List/Set/Dict comprehensions and inlined generator expressions
[expr(x) for x in A if predicate(x)] is available, implementing the full specification at http://www.python.org/dev/peps/pep-0202/ . Set and dict comprehensions, as introduced in Python 3.0, are equally supported. In Python 2.3, set comprehensions will automatically use the implementation in the sets module, whereas later Python versions will use the faster builtin type.
Looping is optimized if A is a list or tuple. The for ... from syntax is also available, e.g. [i*i for i from 0 <= i < 10]
As of Cython 0.13, inlined generator expressions are supported in some cases, so that these will work:
a = sum(i*i for i in range(100))
b = any(x for x in seq)
c = all(t for t in tuples if len(t) > 0)
d = list(items for items in seq)  # same for set/dict
Note that generator expressions are not supported in general, except for the special cases above.
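As a rough illustration of the forms described above, here is a plain-Python sketch (Python 3 syntax). The names `A`, `predicate` and `expr` are placeholders picked for this example, and the snippet shows only the language-level behaviour, not what Cython generates.

```python
# Placeholder data and helpers, chosen only for illustration.
A = [1, 2, 3, 4, 5, 6]

def predicate(x):
    return x % 2 == 0

def expr(x):
    return x * x

squares_list = [expr(x) for x in A if predicate(x)]      # list comprehension
squares_set = {expr(x) for x in A if predicate(x)}       # set comprehension
squares_dict = {x: expr(x) for x in A if predicate(x)}   # dict comprehension

# Generator expressions consumed directly by a builtin, which is the
# special case the text describes as "inlined".
total = sum(i * i for i in range(100))
has_truthy = any(x for x in A)
all_nonempty = all(t for t in [(1,), (2, 3)] if len(t) > 0)
```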
Conditional expressions "x if b else y" (python 2.5)
Conditional expressions are supported as described in http://www.python.org/dev/peps/pep-0308/
X if C else Y
Only one of X and Y is evaluated (depending on the value of C).
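The evaluation rule can be checked with a small plain-Python sketch. The helpers `left` and `right` are hypothetical and exist only to record which branch actually ran.

```python
calls = []

def left():
    calls.append("left")
    return "X"

def right():
    calls.append("right")
    return "Y"

# Each expression evaluates exactly one of its two branches.
result_true = left() if True else right()
result_false = left() if False else right()
```

After running this, `calls` holds one entry per expression, not two.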
Extended iterable unpacking
Cython implements PEP 3132 (see http://www.python.org/dev/peps/pep-3132/). You can therefore do
a, *b, c = [1,2,3,4,5,6]
which is equivalent to
a,b,c = 1, [2,3,4,5], 6
This works for any iterable on the right hand side.
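A short plain-Python sketch of the "any iterable" point, using a list, a string, a generator and a one-element list as arbitrary right-hand sides:

```python
a, *b, c = [1, 2, 3, 4, 5, 6]              # list
first, *rest = "spam"                      # string
*init, last = (n * n for n in range(5))    # generator
head, *tail = [42]                         # the starred name may end up empty
```

The starred target always receives a list, whatever the type of the iterable on the right.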
cdef inline
Module level functions can now be declared inline, with the inline keyword passed on to the C compiler. These can be as fast as macros.
cdef inline int something_fast(int a, int b):
    return a*a+b
Note that class-level cdef functions are handled via a virtual function table, so the compiler won't be able to inline them in almost all cases.
Assignment on declaration (e.g. "cdef int spam = 5")
In Pyrex, one must write
cdef int i, j, k
i = 2
j = 5
k = 7
Now, with Cython, one can write
cdef int i = 2, j = 5, k = 7
The expression on the right hand side can be arbitrarily complicated, e.g.
cdef int n = python_call(foo(x,y), a+b+c) - 32
'by' expression in for loop (e.g. "for i from 0 <= i < 10 by 2")
for i from 0 <= i < 10 by 2:
    print i
prints
0
2
4
6
8
Note also the range conversion below, which is preferred over the for-from syntax.
Automatic range conversion
This will convert statements of the form for i in range(...) to for i from ... when i is any cdef'd integer type, and the direction (i.e. sign of step) can be determined.
WARNING: This may change the semantics if the range causes assignment to i to overflow. Specifically, if this option is set, an error will be raised before the loop is entered, whereas without this option the loop will execute until an overflowing value is encountered. If this affects you, change Cython/Compiler/Options.py (eventually there will be a better way to set this).
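To make the correspondence between the two loop forms concrete, the plain-Python sketch below emulates the for-from loop with a while loop and compares it with range. The helper `for_from_values` is hypothetical, handles only a positive step, and ignores C integer overflow, so it models the ordinary case rather than the one the warning is about.

```python
def for_from_values(start, stop, step):
    """Emulate `for i from start <= i < stop by step` (positive step only)."""
    values = []
    i = start
    while i < stop:
        values.append(i)
        i += step
    return values

even = for_from_values(0, 10, 2)
same_as_range = list(range(0, 10, 2))
```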
Boolean int type (e.g. it acts like a c int, but coerces to/from python as a boolean)
In C, ints are used for truth values. In Python, any object can be used as a truth value (using the __nonzero__ method), but the canonical choices are the two boolean objects True and False. The bint ("boolean int") type is compiled to a C int, but is coerced to and from Python as a boolean. The return type of comparisons and several builtins is a bint as well. This avoids having to wrap things in bool().
For example, one can write
def is_equal(x, y):
    return x == y
which would return 1 or 0 in Pyrex, but returns True or False in Cython. One can declare variables and return values for functions to be of the bint type. For example
cdef int i = x
cdef bint b = x
The first conversion would happen via x.__int__() whereas the second would happen via x.__nonzero__(). (Actually, if x is the Python object True or False then no method call is made.)
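The difference between the two conversion paths can be seen in plain Python. The class `Probe` is hypothetical and makes its integer value and truth value disagree on purpose. In Python 3 the truth hook is spelled `__bool__` rather than `__nonzero__`. The calls to `int()` and `bool()` stand in, approximately, for the coercions that the two cdef declarations would request.

```python
class Probe:
    """Hypothetical object whose integer and truth values differ."""

    def __int__(self):
        return 0

    def __bool__(self):  # named __nonzero__ in Python 2
        return True

x = Probe()
as_int = int(x)        # roughly the path taken for `cdef int i = x`
as_bool = bool(x)      # roughly the path taken for `cdef bint b = x`
comparison = (3 == 3)  # comparisons yield a boolean, not 1 or 0
```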
Executable class bodies
Including a working classmethod
cdef class Blah:
    def some_method(self):
        print self
    some_method = classmethod(some_method)
    a = 2*3
    print "hi", a
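For comparison, here is the same idea as an ordinary Python class (Python 3 syntax). The `log` list is an assumption of this sketch, used in place of print so that the effect of running the class body can be inspected.

```python
log = []

class Blah:
    def some_method(cls):
        return cls.__name__

    # Rebinding inside the class body, as in the example above.
    some_method = classmethod(some_method)

    a = 2 * 3
    log.append(("hi", a))  # runs once, while the class is being created
```

The statements in the body run when the class is defined, not when an instance is created.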
Improved error reporting (e.g. syntax error on non-existent builtin)
Many optimizations (e.g. list indexing, tuple unpacking, caching of builtin names and integer constants)
Runtime checks for (exact) lists and tuples are created and, if true, macros are used rather than generic API for element access, etc.
cpdef functions
Cython adds a third function type on top of the usual def and cdef. If a function is declared cpdef it can be called from and overridden by both extension and normal Python subclasses.
You can essentially think of a cpdef method as a cdef method + some extras. (That's how it's implemented at least.) First, it creates a def method that does nothing but call the underlying cdef method (and does argument unpacking/coercion if needed). At the top of the cdef method a little bit of code is added to check to see if it's overridden. Specifically, in pseudocode
if type(self) has a __dict__:
    foo = self.getattr('foo')
    if foo is not wrapper_foo:
        return foo(args)
[cdef method body]
To detect whether or not a type has a dictionary, it just checks the tp_dictoffset slot, which is zero (by default) for extension types, but non-zero for instance classes. If the dictionary exists, it does a single attribute lookup and can tell (by comparing pointers) whether or not the returned result is actually a new function. If, and only if, it is a new function, then the arguments are packed into a tuple and the method is called. This is all very fast.
A flag is set so this lookup does not occur if one calls the method on the class directly, e.g.
cdef class A:
    cpdef foo(self):
        pass

x = A()
x.foo()   # will check to see if overridden
A.foo(x)  # will call A's implementation whether overridden or not
See EarlyBindingForSpeed for explanation and usage tips.
More friendly type casting
In Pyrex, if one types <int>x where x is a Python object, one will get the memory address of x. Likewise, if one types <object>i where i is a C int, one will get an "object" at location i in memory. This leads to confusing results and segfaults.
In Cython, <type>x will try to do a coercion (as would happen on assignment of x to a variable of type type) if exactly one of the types is a Python object. It does not stop one from casting where there is no conversion (though it will emit a warning). If one really wants the address, cast to a void * first.
As in Pyrex, <MyExtensionType>x will cast x to type MyExtensionType without any type checking. Cython supports the syntax <MyExtensionType?>x to do the cast with type checking (i.e. it will throw an error if x is not a (subclass of) MyExtensionType).
Optional arguments in cdef/cpdef functions
Cython now supports optional arguments for cdef and cpdef functions. The syntax in the .pyx file remains as in Python, but one declares such functions in the .pxd file by writing cdef foo(x=*). The number of arguments may increase on subclassing, but the argument types and order must remain the same. There is a slight performance penalty in some cases when a cdef/cpdef function without any optional arguments is overridden with one that does have default argument values.
For example, one can have the .pxd file
cdef class A:
    cdef foo(self)
cdef class B(A):
    cdef foo(self, x=*)
cdef class C(B):
    cpdef foo(self, x=*, int k=*)
with corresponding .pyx file
cdef class A:
    cdef foo(self):
        print "A"
cdef class B(A):
    cdef foo(self, x=None):
        print "B", x
cdef class C(B):
    cpdef foo(self, x=True, int k=3):
        print "C", x, k
Note: this also demonstrates how cpdef functions can override cdef functions.
Function pointers in structs
Functions declared in structs are automatically converted to function pointers for convenience (especially handy when defining C++ classes).
C++ Exception handling
(Thanks to Felix Wu)
cdef functions can now be declared as
cdef int foo(...) except +
cdef int foo(...) except +TypeError
cdef int foo(...) except +python_error_raising_function
in which case a Python exception will be raised when a C++ error is caught. See WrappingCPlusPlus for more details.
Note that cdef import from means the same thing as cdef extern from.
Source code encoding
Cython supports PEP 3120 and PEP 263, i.e. you can start your Cython source file with an encoding comment and generally write your source code in UTF-8. This impacts the encoding of byte strings and the conversion of unicode string literals like u'abcd' to unicode objects.
Automatic type checking
Rather than introducing a new keyword typecheck as explained in the Pyrex docs at http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/special_methods.html, Cython emits a (non-spoofable and faster) typecheck whenever isinstance is used with an extension type as the second parameter.
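To illustrate what "spoofable" means here, the plain-Python sketch below shows how the ordinary isinstance can be fooled by an object that misreports its `__class__`. The classes `Real` and `Fake` are hypothetical. The `type(obj)`-based checks are only an approximation of a check that looks at the real type, and are not a description of the C code Cython emits.

```python
class Real:
    pass

class Fake:
    # Misreports its class; plain isinstance() falls back to __class__.
    __class__ = property(lambda self: Real)

obj = Fake()
spoofed = isinstance(obj, Real)        # fooled by the __class__ property
exact = type(obj) is Real              # inspects the real type only
strict = issubclass(type(obj), Real)   # subclass-aware, still not fooled
```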