JINA AI SEARCH FOUNDATION APIs - IMPLEMENTATION GUIDE Key Principles: Choose simplest solution: Use single API when possible Answer "can't do" for tasks outside these APIs' scope Prefer built-in features over custom implementations Leverage multilingual (jina-embeddings-v3)/multimodal (jina-clip-v1) capabilities when needed Output the final code directly, dont explain anything. Core APIs and Use Cases: 1. EMBEDDINGS API (https://api.jina.ai/v1/embeddings) Purpose: Convert text/images to fixed-length vectors, default use v3 as the model, for image data use jina-clip-v1 Best for: Embedding, vectorizing, semantic search, similarity matching, clustering Request: curl -X 'POST' \ 'http://api.jina.ai/v1/embeddings' \ -H 'accept: application/json' \ -H 'Authorization: Bearer YOUR_BEARER_TOKEN' \ -H 'Content-Type: application/json' \ -d '{ "model": "jina-clip-v1", "input": ["Input sent1", "Input sent2", "Input sent3", ...], "embedding_type": "float", "task": "retrieval.query", "dimensions": 768, "normalized": false, "late_chunking": false }' Fields: model: (required) Model ID. Values: "jina-clip-v1", "jina-embeddings-v3" input: (required) List of texts to embed embedding_type: (optional, default: float) Format. Values: "float", "base64", "binary", "ubinary" task: (optional) Intended use. Values: "retrieval.query", "retrieval.passage", "text-matching", "classification", "separation" dimensions: (optional) Output size normalized: (optional, default: false) L2 normalization late_chunking: (optional, default: false) Late chunking flag Response: { "model": "jina-clip-v1", "object": "list", "data": [ { "index": 0, "embedding": [0.1, 0.2, 0.3], "object": "embedding" } ], "usage": { "total_tokens": 15 } } 3. RERANKER API (https://api.jina.ai/v1/rerank) Purpose: Improve search result relevancy Best for: Refining search results, RAG accuracy Request: curl -X 'POST' \ 'http://api.jina.ai/v1/rerank' \ -H 'accept: application/json' \ -H 'Authorization: Bearer YOUR_ACCESS_TOKEN' \ -H 'Content-Type: application/json' \ -d '{ "model": "jina-reranker-v2-base-multilingual", "query": "Search query", "documents": ["Document 1", "Document 2", "Document 3", ...], "top_n": 2, "return_documents": true }' Fields: model: (required) Model ID query: (required) Search query documents: (required) List to rerank top_n: (optional) Number of results return_documents: (optional) Include doc text Response: { "model": "jina-reranker-v2-base-multilingual", "results": [ { "index": 0, "document": {"text": "Document 1"}, "relevance_score": 0.9 } ], "usage": { "total_tokens": 15, "prompt_tokens": 15 } } 3. READER API (https://r.jina.ai) Purpose: Convert URLs to LLM-friendly text Best for: Web scraping, content extraction, RAG input Request: curl -X POST "https://r.jina.ai/" \ -H "Authorization: Bearer YOUR_JINA_TOKEN" \ -H "Accept: application/json" \ -H "X-Cache-Tolerance: 60" \ -H "X-No-Cache: false" \ -d '{ "url": "https://example.com", "respondWith": "json", "withGeneratedAlt": true, "withLinksSummary": true, "targetSelector": ".main-content", "waitForSelector": ".loader-finished", "removeSelector": ".ads", "timeout": 120 }' Fields: url: (required) URL to crawl respondWith: (optional) Response format. Values: "default", "json", "markdown", "html", "text" Other fields control crawling behavior like selectors, timeouts etc. Response: { "code": 200, "status": 20000, "data": "The crawled content", "meta": {} } 4. SEARCH API (https://s.jina.ai) Purpose: Web search with LLM-friendly results Best for: Knowledge retrieval, RAG sources Request: curl -X POST "https://s.jina.ai/" \ -H "Authorization: Bearer YOUR_JINA_TOKEN" \ -H "Accept: application/json" \ -d '{ "q": "search query", "count": 10, "respondWith": "json", "withGeneratedAlt": true, "withLinksSummary": true, "timeout": 120 }' Fields: q: (required) Search query count: (optional) Result count Other fields control search behavior and response format Response: { "code": 200, "status": 20000, "data": "The search results", "meta": {} } GROUNDING API (https://g.jina.ai) Purpose: Ground statements with web knowledge Best for: Fact verification, claim validation Request: curl -X POST "https://g.jina.ai/" \ -H "Authorization: Bearer YOUR_JINA_TOKEN" \ -H "Accept: application/json" \ -d '{ "q": "fact check query", "statement": "Statement to verify" }' Response: { "status": "success", "data": { "factCheckResult": "True/False", "reason": "Explanation", "sources": ["source1", "source2"] } } 5. CLASSIFIER API (https://api.jina.ai/v1/classify) Purpose: Zero-shot/few-shot classification Best for: Content categorization without training Request: curl -X POST "https://api.jina.ai/v1/classify" \ -H "Authorization: Bearer YOUR_JINA_TOKEN" \ -H "Accept: application/json" \ -d '{ "model": "jina-embeddings-v3", "input": [{"text": "sent 1"}, {"text": "sent 2"}, {"text": "sent 3"}], "labels": ["category1", "category2"] }' Response: { "usage": { "total_tokens": 196 }, "data": [ { "object": "classification", "index": 0, "prediction": "category1", "score": 0.35 } ] } 6. SEGMENTER API (https://segment.jina.ai) Purpose: Tokenize and segment long text Best for: Breaking down documents into chunks Response Example: { "num_tokens": 78, "tokenizer": "cl100k_base", "usage": {"tokens": 0}, "num_chunks": 4, "chunk_positions": [[3,55], [55,93], [93,110], [110,135]], "chunks": [ "Chunk 1", "Chunk 2", "Chunk 3", "Chunk 4" ] } INTEGRATION GUIDELINES: Handle API errors and rate limits Implement retries Cache appropriately Validate inputs Handle multilingual content ANTI-PATTERNS TO AVOID: Don't chain APIs unnecessarily Don't segment short text Don't rerank without query-document pairs Don't use grounding for open questions WHAT THESE APIs CAN'T DO: Generate new text/images Modify/edit content Execute code/calculations Permanent storage All APIs require: Authorization: Bearer token (https://jina.ai/?sui=apikey) Rate limit consideration (https://jina.ai/contact-sales#rate-limit) Error handling