Enhancement proposal: Basic Type Inference
Author: Robert Bradshaw
Status: First pass implemented, not on by default.
Cython makes most of its speed gains, and derives it's ability to efficiently wrap external libraries, by explicitly declaring types. Many of these type declarations are redundant, and forgetting to declare a type can lead to drastic inefficiencies (e.g. converting back and forth between python objects and native C types). Reducing the volume of explicit type declarations will also bring Cython code closer to pure Python code.
I propose that as a first pass undeclared variables be typed according to the union of the types of their assignments. Many objects (e.g. literal integers, lists, etc.) would already have types associated to them which would be propagated throughout the code. The most generic type would be a Python object. Thus for
def foo(self, int x): a = x a *= 1.5 b = x*x c = x*x c = "hi"
The type of a would be a c double, the type of b a c int, and the type of c a Python object.
This would probably require a new phase (specifically, splitting analyze_types into two distinct phases, with a type resolution step between them). It would also probably require more unification of the Python and non-python types, which should be looked at in the context of the other proposals.
A more complete and powerful type inference engine should be developed, but would necessarily be more complicated as well.
The current implementation can be enabled with the infer_types directive. Circular inference is only supported for a single variable.
- locals() will not be an issue when doing type inference because it is read-only.
tree transforms can be used for this (and that also advocates a split of analyze_types). An implementation strategy would then be first applying a more generic "find usages" transform that records all usages of a symbol and links them together, and then have a seperate type inference transform afterwards (the find usages transform has a lot more uses).