Parallelization

Intra parallelism via needs

For parallelism, use the needs parameter:

from jina import Flow

f = (Flow()
     .add(name='p1', needs='gateway')
     .add(name='p2', needs='gateway')
     .add(name='p3', needs='gateway')
     .needs(['p1', 'p2', 'p3'], name='r1'))
../../../_images/simple-plot3.svg

p1, p2, p3 now subscribe to Gateway and conduct their work in parallel. The last .needs() blocks all Executors until they finish their work.

.needs() is syntax sugar and roughly equal to:

.add(needs=['p1', 'p2', 'p3'])

.needs_all() is syntax sugar and roughly equal to:

.add(needs=[all_orphan_executors_so_far])

“Orphan” Executors have no connected Executors to their outputs. The above code snippet can be also written as:

from jina import Flow

f = (Flow()
     .add(name='p1', needs='gateway')
     .add(name='p2', needs='gateway')
     .add(name='p3', needs='gateway')
     .needs_all())

Inter parallelism via parallel

Parallelism can also be performed inside an Executor using parallel. The example below starts three p1:

from jina import Flow

f = (Flow()
     .add(name='p1', parallel=3)
     .add(name='p2'))
../../../_images/parallel-explain.svg

Note

By default:

  • only one p1 will receive a message.

  • p2 will be called when any one of p1 finished.

To change the default behavior, you can add polling argument to .add(), e.g. .add(parallel=3, polling='ALL'). Specifically,

polling Who will receive from upstream? When will downstream be called?
ANY one of parallels one of parallels is finished
ALL all parallels all parallels are finished
ALL_ASYNC all parallels one of parallels is finished

You can combine inter and inner parallelization via:

from jina import Flow

f = (Flow()
     .add(name='p1', needs='gateway')
     .add(name='p2', needs='gateway')
     .add(name='p3', parallel=3)
     .needs(['p1', 'p3'], name='r1'))
../../../_images/simple-plot4.svg