Welcome to Jina!

Jina lets you build multimodal AI services and pipelines that communicate via gRPC, HTTP and WebSockets, then scale them up and deploy to production. You can focus on your logic and algorithms, without worrying about the infrastructure complexity.

Jina provides a smooth Pythonic experience for serving ML models, taking them from local deployment to advanced orchestration frameworks like Docker Compose, Kubernetes, or Jina AI Cloud. Jina makes advanced solution engineering and cloud-native technologies accessible to every developer.

Build and serve models for any data type and any mainstream deep learning framework.
Design high-performance services with easy scaling, duplex client-server streaming, batching, dynamic batching, async/non-blocking data processing, and support for any protocol.
Serve LLMs while streaming their output.
Integrate with Docker containers via Executor Hub, and with OpenTelemetry/Prometheus for observability.
Host on CPU or GPU via Jina AI Cloud.
Deploy to your own cloud or system with our Kubernetes and Docker Compose integration.
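
Dynamic batching, one of the features listed above, can be pictured with a small standard-library sketch. This is not Jina's implementation: the `DynamicBatcher` class, its `max_size` and `timeout` parameters, and the upper-casing "model" are all hypothetical stand-ins. It only shows the general idea of grouping individual requests into one batch until the batch is full or a short wait expires.

```python
import asyncio
from typing import List, Tuple


class DynamicBatcher:
    """Group single requests into batches of up to `max_size`,
    waiting at most `timeout` seconds for a batch to fill (hypothetical helper)."""

    def __init__(self, max_size: int = 4, timeout: float = 0.05):
        self.max_size = max_size
        self.timeout = timeout
        self.queue: asyncio.Queue = asyncio.Queue()
        self.batch_sizes: List[int] = []  # recorded only for inspection

    async def submit(self, text: str) -> str:
        # Each request gets a future that the worker resolves later.
        future = asyncio.get_running_loop().create_future()
        await self.queue.put((text, future))
        return await future

    async def worker(self) -> None:
        loop = asyncio.get_running_loop()
        while True:
            batch = [await self.queue.get()]  # block until the first request arrives
            deadline = loop.time() + self.timeout
            while len(batch) < self.max_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), remaining))
                except asyncio.TimeoutError:
                    break
            self.batch_sizes.append(len(batch))
            # Stand-in for a single batched model call.
            for text, future in batch:
                future.set_result(text.upper())


async def main() -> Tuple[List[str], List[int]]:
    batcher = DynamicBatcher(max_size=4, timeout=0.05)
    worker = asyncio.create_task(batcher.worker())
    results = await asyncio.gather(*(batcher.submit(f"req-{i}") for i in range(6)))
    worker.cancel()
    return list(results), batcher.batch_sizes


results, batch_sizes = asyncio.run(main())
print(results)
print(batch_sizes)
```

Callers still submit one request at a time; only the worker sees batches, which is what lets a model process several inputs in a single forward pass.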

Wait, how is Jina different from FastAPI?

Jina's value proposition may seem quite similar to that of FastAPI. However, there are several fundamental differences:

Data structure and communication protocols

FastAPI communication relies on Pydantic, whereas Jina relies on DocArray, which allows Jina to expose its services over multiple protocols. Support for the gRPC protocol is especially useful for data-intensive applications such as embedding services, where embeddings and tensors can be serialized more efficiently.
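
To illustrate why a binary protocol helps with embeddings, the sketch below compares the size of one embedding encoded as JSON text with the same values packed as raw 32-bit floats. It uses only the Python standard library and is not Jina's or DocArray's actual wire format; the dimension (768) and the random values are arbitrary assumptions.

```python
import json
import random
import struct

# Hypothetical embedding: 768 float values, chosen only for illustration.
random.seed(0)
embedding = [random.uniform(-1.0, 1.0) for _ in range(768)]

# Text encoding, as a JSON-over-HTTP API would typically send it.
json_payload = json.dumps(embedding).encode("utf-8")

# Binary encoding: little-endian 32-bit floats, 4 bytes per value.
binary_payload = struct.pack(f"<{len(embedding)}f", *embedding)

print(f"JSON:   {len(json_payload)} bytes")
print(f"Binary: {len(binary_payload)} bytes")
```

The binary payload is always 4 bytes per value, while the JSON payload spends many characters on each number's decimal representation plus separators.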

Advanced orchestration and scaling capabilities

Jina allows you to easily containerize and orchestrate your services and models, providing concurrency and scalability.
Jina lets you deploy applications formed from multiple microservices that can be containerized and scaled independently.

Journey to the cloud

Jina provides a smooth transition from local development (using DocArray), to local serving (using Deployment and Flow), to production-ready services that use Kubernetes to orchestrate the container lifecycle.
With Jina AI Cloud, you get scalable, serverless deployments of your applications in one command.

Install

Make sure that you have Python 3.7+ installed on Linux/macOS/Windows.

via PyPI

pip install -U jina

via Conda

conda install jina -c conda-forge

Getting Started

Jina supports developers in building AI services and pipelines:

Build AI Services

Let’s build a fast, reliable and scalable gRPC-based AI service. In Jina we call this an Executor. Our simple Executor will wrap the StableLM LLM from Stability AI. We’ll then use a Deployment to serve it.

Note: A Deployment serves just one Executor. To combine multiple Executors into a pipeline and serve that, use a Flow.

Let’s implement the service’s logic:

executor.py

from jina import Executor, requests
from docarray import DocList, BaseDoc

from transformers import pipeline


class Prompt(BaseDoc):
    text: str


class Generation(BaseDoc):
    prompt: str
    text: str


class StableLM(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.generator = pipeline(
            'text-generation', model='stabilityai/stablelm-base-alpha-3b'
        )

    @requests
    def generate(self, docs: DocList[Prompt], **kwargs) -> DocList[Generation]:
        generations = DocList[Generation]()
        prompts = docs.text
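        # For a list of prompts, the pipeline returns one list of candidates per prompt;
        # each candidate is a dict with a 'generated_text' key.
        llm_outputs = self.generator(prompts)
        for prompt, output in zip(prompts, llm_outputs):
            generations.append(
                Generation(prompt=prompt, text=output[0]['generated_text'])
            )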
        return generations

Then we deploy it with either the Python API or YAML:

Python API: deployment.py YAML: deployment.yml

from jina import Deployment
from executor import StableLM

dep = Deployment(uses=StableLM, timeout_ready=-1, port=12345)

with dep:
    dep.block()

jtype: Deployment
with:
  uses: StableLM
  py_modules:
    - executor.py
  timeout_ready: -1
  port: 12345

Run the YAML Deployment with the CLI:

jina deployment --uses deployment.yml

Use Jina Client to make requests to the service:

from jina import Client
from docarray import DocList, BaseDoc


class Prompt(BaseDoc):
    text: str


class Generation(BaseDoc):
    prompt: str
    text: str


prompt = Prompt(
    text='suggest an interesting image generation prompt for a mona lisa variant'
)

client = Client(port=12345)  # use port from output above
response = client.post(on='/', inputs=[prompt], return_type=DocList[Generation])

print(response[0].text)

a steampunk version of the Mona Lisa, incorporating mechanical gears, brass elements, and Victorian era clothing details

Build Pipelines

Sometimes you want to chain microservices together into a pipeline. That’s where a Flow comes in.

A Flow is a DAG pipeline composed of a set of steps. It orchestrates a set of Executors and a Gateway to offer an end-to-end service.

Note: If you just want to serve a single Executor, you can use a Deployment.
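
The idea of a DAG of steps can be sketched in plain Python. This is only an illustration of dependency-ordered execution, not how Jina schedules Executors: `run_pipeline`, the `"gateway"` entry name, and the two lambda steps are hypothetical stand-ins that operate on lists of strings instead of Documents.

```python
from typing import Callable, Dict, List, Tuple

Step = Callable[[List[str]], List[str]]


def run_pipeline(
    steps: Dict[str, Tuple[Step, List[str]]], inputs: List[str]
) -> Dict[str, List[str]]:
    """Run each step once every step it needs has produced its output."""
    results: Dict[str, List[str]] = {"gateway": inputs}  # entry point of the pipeline
    pending = dict(steps)
    while pending:
        ready = [
            name
            for name, (_, needs) in pending.items()
            if all(n in results for n in needs)
        ]
        if not ready:
            raise ValueError("cycle or missing dependency in pipeline")
        for name in ready:
            func, needs = pending.pop(name)
            # A step with several upstream steps receives their outputs concatenated.
            upstream = [doc for n in needs for doc in results[n]]
            results[name] = func(upstream)
    return results


# Two chained steps: a stand-in "LLM" followed by a stand-in image generator.
steps = {
    "llm": (lambda docs: [f"prompt for {d}" for d in docs], ["gateway"]),
    "text_to_image": (lambda docs: [f"image of {d}" for d in docs], ["llm"]),
}

out = run_pipeline(steps, ["mona lisa"])
print(out["text_to_image"])
```

Each step only declares what it needs, so steps can be added or rearranged without changing the runner.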

For instance, let’s combine our StableLM language model with a Stable Diffusion image generation model. Chaining these services together into a Flow will give us a service that will generate images based on a prompt generated by the LLM.

text_to_image.py

import numpy as np
from jina import Executor, requests
from docarray import BaseDoc, DocList
from docarray.documents import ImageDoc


class Generation(BaseDoc):
    prompt: str
    text: str


class TextToImage(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        from diffusers import StableDiffusionPipeline
        import torch

        self.pipe = StableDiffusionPipeline.from_pretrained(
            "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
        ).to("cuda")

    @requests
    def generate_image(self, docs: DocList[Generation], **kwargs) -> DocList[ImageDoc]:
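        result = DocList[ImageDoc]()
        # The pipeline returns PIL images (https://pillow.readthedocs.io/en/stable/)
        images = self.pipe(docs.text).images
        # Wrap each image in its own ImageDoc; assigning to .tensor on an
        # empty DocList would leave the result empty.
        for image in images:
            result.append(ImageDoc(tensor=np.array(image)))
        return result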

Build the Flow with either Python or YAML:

Python API: flow.py YAML: flow.yml

from jina import Flow
from executor import StableLM
from text_to_image import TextToImage

flow = (
    Flow(port=12345)
    .add(uses=StableLM, timeout_ready=-1)
    .add(uses=TextToImage, timeout_ready=-1)
)

with flow:
    flow.block()

jtype: Flow
with:
  port: 12345
executors:
  - uses: StableLM
    timeout_ready: -1
    py_modules:
      - executor.py
  - uses: TextToImage
    timeout_ready: -1
    py_modules:
      - text_to_image.py

Then run the YAML Flow with the CLI:

jina flow --uses flow.yml

Then, use Jina Client to make requests to the Flow:

from jina import Client
from docarray import DocList, BaseDoc
from docarray.documents import ImageDoc


class Prompt(BaseDoc):
    text: str


prompt = Prompt(
    text='suggest an interesting image generation prompt for a mona lisa variant'
)

client = Client(port=12345)  # use port from output above
response = client.post(on='/', inputs=[prompt], return_type=DocList[ImageDoc])

response[0].display()

Next steps

Learn DocArray API

DocArray is the foundational data structure of Jina. Before starting with Jina, learn DocArray so you can quickly build a proof of concept.

Learn Executor

Executor is a Python class that can serve logic using Documents.

Learn Deployment

Deployment serves an Executor as a scalable service, making it available to receive Documents over gRPC or HTTP.

Learn Flow

Flow orchestrates Executors using different Deployments into a processing pipeline to accomplish a task.

Learn Gateway

The Gateway is a microservice that serves as the entrypoint of a Flow. It exposes multiple protocols for external communications and routes all internal traffic.

Explore Executor Hub

Executor Hub allows you to containerize, share, explore and make Executors ready for the cloud.

Deploy a Flow to Cloud

Jina AI Cloud is the MLOps platform for hosting Jina projects.

Support

Join our Discord community and chat with other community members about ideas.
Subscribe to the latest video tutorials on our YouTube channel.

Join Us

Jina is backed by Jina AI and licensed under Apache-2.0.
