Parallel block

Introduction

This proposal describes a form of parallelism that is much more explicit than enhancements/prange. The explicitness makes code somewhat more verbose, but it also simplifies the rules.

This is starting to look like OpenMP, which may be a good thing: why redo a widely used standard? At the same time, there is a great deal of simplification.

The intention is still to provide an almost safe solution that covers 80% of the use cases for parallel computation.

The parallel block

Example:

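from libc.stdlib cimport malloc, free

# One scratch area of 100 doubles per available thread
nthreads = cython.parallel.numavailable(schedule='dynamic')
cdef double* buf = <double*>malloc(100 * nthreads * sizeof(double))
cdef double alpha  # scaling factor; assumed to be assigned before the block
cdef double s = 0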
try:
    with cython.nogil, cython.parallel(schedule='dynamic') as p:
        cdef double* threadbuf = buf + p.threadid * 100
        cdef Py_ssize_t i, j
        for i in range(n): # run in parallel
            compute_frobnication(i, threadbuf)
            for j in range(100): # serial
                s += alpha * threadbuf[j] # s is reduction variable
finally:
    free(buf)

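The cython.parallel construct sets up a scope (block) that is run by multiple threads in parallel. Within the block, as the comments in the example indicate, the outermost for loop is run in parallel, nested loops run serially within each thread, and a variable accumulated in place, such as s, is a reduction variable.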
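The data-sharing behaviour of the example can be modelled in plain Python. This is only a rough sketch of the semantics, not of the proposed implementation: each thread gets a private slice of one shared buffer (indexed by its thread id), iterations are handed out one at a time as an approximation of schedule='dynamic', and s is accumulated per thread and combined at the end. The names parallel_block, nthreads and chunk are made up for illustration, and compute_frobnication is a hypothetical stand-in because the proposal does not define it. Because of the GIL, these Python threads do not actually speed up the computation.

```python
import threading
from array import array


def compute_frobnication(i, threadbuf):
    # Hypothetical stand-in: the proposal never defines this function.
    for j in range(len(threadbuf)):
        threadbuf[j] = float(i + j)


def parallel_block(n, alpha, nthreads=4, chunk=100):
    # One shared allocation, like malloc(100 * nthreads * sizeof(double)).
    buf = array('d', [0.0]) * (chunk * nthreads)
    view = memoryview(buf)
    partial = [0.0] * nthreads  # one partial sum per thread (the reduction)
    next_i = 0                  # next unclaimed iteration of the outer loop
    lock = threading.Lock()

    def worker(threadid):
        nonlocal next_i
        # Thread-private window, like buf + p.threadid * 100.
        threadbuf = view[threadid * chunk:(threadid + 1) * chunk]
        local_s = 0.0
        while True:
            # Claim the next iteration; this approximates dynamic scheduling.
            with lock:
                i = next_i
                next_i += 1
            if i >= n:
                break
            compute_frobnication(i, threadbuf)
            for j in range(chunk):  # serial inner loop
                local_s += alpha * threadbuf[j]
        partial[threadid] = local_s

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Combine the per-thread partial sums into the final value of s.
    return sum(partial)
```

Calling parallel_block(50, 0.5) gives the same total as the equivalent serial double loop. The point of the sketch is that no thread writes to another thread's part of the buffer, and the only shared update is the final combination of partial sums.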

Vs. OpenMP

This is Pythonic in the sense that, rather than relying on many automatic constructs, one combines simpler and more explicit ones.

The rest

This proposal is not complete, but it should be understandable in light of the ongoing mailing list discussion and enhancements/prange.

enhancements/parallelblock (last edited 2011-04-05 07:23:10 by DagSverreSeljebotn)