Search

The Content Lake provides hybrid search that blends three search methods in a single query:

Lexical search -- BM25 full-text matching for exact words and phrases
Semantic search -- vector similarity that finds conceptually similar results even when they share no exact words
Knowledge graph search -- entity-enriched context from the Packt Knowledge Graph

Results are returned at block level (individual paragraphs, headings, code blocks) by default. Set return_documents: true to receive full parent document bodies alongside block-level results.

Search Pipeline

Query
  |
  v
+--------------+     +--------------+     +--------------+     +----------+
|  Embedding   |---->|  Candidate   |---->|  Re-ranking  |---->| Response |
|  Generation  |     |  Retrieval   |     |  (scoring)   |     | Assembly |
+--------------+     +--------------+     +--------------+     +----------+

1. Embedding Generation

The query text is converted to a vector embedding. This step applies to vector and hybrid modes only; in keyword mode, the query is passed directly to the full-text engine.

2. Candidate Retrieval

Up to candidate_pool_size candidates (default 500) are retrieved using ANN, BM25, or a weighted hybrid of both. Structured filters (tags, block types, document IDs) are applied natively during retrieval.
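
The weighted hybrid blend can be illustrated with a small sketch. This is not the Content Lake's actual implementation: the min-max normalisation, the `vector_weight` parameter, and the function names are assumptions chosen for illustration.

```python
def min_max(scores):
    """Rescale raw scores to [0, 1] so BM25 and similarity values are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {block_id: 1.0 for block_id in scores}
    return {block_id: (s - lo) / (hi - lo) for block_id, s in scores.items()}


def hybrid_candidates(vector_scores, bm25_scores, vector_weight=0.5, pool_size=500):
    """Blend two score maps (block id -> score) and keep the top pool_size.

    vector_weight is a hypothetical knob; the real weighting is not documented here.
    """
    v = min_max(vector_scores)
    b = min_max(bm25_scores)
    blended = {
        block_id: vector_weight * v.get(block_id, 0.0)
        + (1.0 - vector_weight) * b.get(block_id, 0.0)
        for block_id in v.keys() | b.keys()
    }
    # Highest blended score first; ties broken by block id for determinism.
    ranked = sorted(blended.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:pool_size]
```

A block found by only one method still enters the pool, but with a zero contribution from the other method.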

3. Re-ranking

A cross-encoder model re-scores candidates against the original query. The retrieval stage optimises for recall (a wide net); the re-ranker optimises for precision. Candidates that score below min_score are discarded.

4. Response Assembly

The highest-scoring results, up to limit per page, are returned together with a search_id for pagination.
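
The last two stages reduce to a threshold followed by a top-N cut. The sketch below assumes the re-ranker has already produced `(block_id, score)` pairs; the function name and input shape are illustrative, not part of the API.

```python
def assemble_page(rescored, min_score=0.0, limit=20):
    """Drop candidates below min_score, then return the top `limit` by score."""
    kept = [(block_id, score) for block_id, score in rescored if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]
```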

Search Modes

The search_mode parameter controls retrieval strategy:

hybrid (default) -- weighted combination of vector similarity and BM25 full-text matching. Best for general discovery.
keyword -- BM25 full-text ranking only. Best for known-item retrieval (exact titles, error messages, code snippets).
vector -- approximate nearest neighbour only. Best for recommendation-style queries ("content similar to X").

Filters

All filter fields are optional. When multiple fields are provided, they are combined with AND logic.

Field	Type	Description
`tags`	string[]	All listed tags must be present (AND logic)
`tags_any`	string[]	At least one tag must be present (OR logic)
`block_types`	string[]	Restrict to specific block types: `H1`, `H2`, `P`, `Table`, `Image`, `Codeblock`
`document_ids`	string[]	Scope search to specific documents
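
The filter semantics can be expressed as a predicate. This is a sketch of the matching logic only (the service applies filters natively during retrieval). It assumes a flattened block record with `tags`, `block_type`, and `document_id` keys, which is a simplification of the response shape.

```python
def matches_filters(block, filters):
    """Return True when the block satisfies every filter field that is present."""
    tags = set(block.get("tags", []))
    # tags: all listed tags must be present (AND).
    if "tags" in filters and not set(filters["tags"]) <= tags:
        return False
    # tags_any: at least one listed tag must be present (OR).
    # Assumption: an empty tags_any list matches nothing.
    if "tags_any" in filters and not set(filters["tags_any"]) & tags:
        return False
    if "block_types" in filters and block.get("block_type") not in filters["block_types"]:
        return False
    if "document_ids" in filters and block.get("document_id") not in filters["document_ids"]:
        return False
    return True
```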

Result Diversification

By default, blocks from a single document can dominate results. The max_blocks_per_document parameter caps how many blocks from any one document appear, so that results draw on a wider range of documents while keeping their relevance order.
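
The effect of the cap can be sketched client-side as follows. The server's actual algorithm is not documented here; this version assumes results arrive ordered by score and simply skips blocks once a document has reached its quota.

```python
from collections import Counter


def cap_blocks_per_document(results, max_blocks_per_document=None):
    """Keep at most N results per document, preserving the incoming order."""
    if max_blocks_per_document is None:
        return list(results)
    seen = Counter()
    kept = []
    for result in results:
        doc_id = result["document"]["document_id"]
        if seen[doc_id] < max_blocks_per_document:
            kept.append(result)
            seen[doc_id] += 1
    return kept
```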

Pagination

Search uses a session-based pagination model. The first response includes a search_id; pass it in subsequent requests to get the next page, with previously returned results automatically excluded.

Hard limit: 500 total results across all pages of a search session.
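
A client-side paging loop might look like the sketch below. The `post_search` argument is a hypothetical callable that sends a request body to the search endpoint and returns the decoded JSON response; how it does so (HTTP library, authentication) is left out on purpose.

```python
def iter_search_results(post_search, body):
    """Yield results page by page until the session reports has_more == False."""
    request = dict(body)  # copy so the caller's dict is not mutated
    while True:
        response = post_search(request)
        results = response.get("results", [])
        yield from results
        # Stop on the documented signal, or defensively on an empty page.
        if not response.get("has_more") or not results:
            break
        request["search_id"] = response["search_id"]
```

Because the session excludes previously returned results, the loop never needs to track offsets; it only carries the search_id forward.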

API Reference

Endpoint

POST /v1/search

Request Parameters

Parameter	Type	Required	Default	Description
`query`	string	Yes	--	Natural language search query
`search_mode`	string	No	`hybrid`	`keyword`, `vector`, or `hybrid`
`limit`	integer	No	`20`	Results per page (1-100). Auto-capped to 20 when `return_documents` is true
`search_id`	string	No	--	ID from previous response for pagination
`filters`	object	No	--	Structured field filters
`candidate_pool_size`	integer	No	`500`	Candidates before re-ranking (100-2000)
`min_score`	float	No	`0.0`	Minimum relevance score (0.0-1.0)
`max_blocks_per_document`	integer	No	--	Per-document result cap
`return_documents`	boolean	No	`false`	Include full document bodies
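
The ranges and defaults in the table can be checked before a request is sent. The helper below is a minimal sketch; `build_search_request` is a hypothetical name, and the client-side cap on `limit` simply mirrors the auto-capping behaviour described above.

```python
VALID_MODES = {"keyword", "vector", "hybrid"}


def build_search_request(query, *, search_mode="hybrid", limit=20,
                         candidate_pool_size=500, min_score=0.0,
                         return_documents=False, filters=None,
                         max_blocks_per_document=None, search_id=None):
    """Validate parameters against the documented ranges and build the JSON body."""
    if not query:
        raise ValueError("query is required")
    if search_mode not in VALID_MODES:
        raise ValueError(f"search_mode must be one of {sorted(VALID_MODES)}")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if not 100 <= candidate_pool_size <= 2000:
        raise ValueError("candidate_pool_size must be between 100 and 2000")
    if not 0.0 <= min_score <= 1.0:
        raise ValueError("min_score must be between 0.0 and 1.0")
    if return_documents:
        limit = min(limit, 20)  # mirror the documented auto-cap

    body = {
        "query": query,
        "search_mode": search_mode,
        "limit": limit,
        "candidate_pool_size": candidate_pool_size,
        "min_score": min_score,
        "return_documents": return_documents,
    }
    # Optional fields are only sent when set.
    if filters is not None:
        body["filters"] = filters
    if max_blocks_per_document is not None:
        body["max_blocks_per_document"] = max_blocks_per_document
    if search_id is not None:
        body["search_id"] = search_id
    return body
```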

Response

{
  "search_id": "srch_01JQ7Xk9m2v4Rz...",
  "results": [
    {
      "score": 0.87,
      "block": {
        "block_hash": "a1b2c3d4e5f6",
        "block_type": "P",
        "order": 12,
        "content": "Kubernetes uses a declarative model...",
        "highlight": "...<<Kubernetes>> uses a declarative model..."
      },
      "document": {
        "document_id": "9d73b576-e08a-470e-89a6-7f710b251b65",
        "title": "Container Orchestration with Kubernetes",
        "version": 3,
        "tags": ["system/lang/en", "source/type/ebook"]
      }
    }
  ],
  "total_candidates": 487,
  "page_results_returned": 20,
  "has_more": true
}

Field	Type	Description
`search_id`	string	Use to request next page
`results`	array	Ordered matching blocks with scores
`results[].score`	float	Relevance score (0.0-1.0)
`results[].block`	object	Block data: hash, type, order, content, highlight
`results[].document`	object	Parent document metadata
`total_candidates`	integer	Candidates before re-ranking
`page_results_returned`	integer	Cumulative results across all pages
`has_more`	boolean	False when 500-result cap reached or no more results

When return_documents: true, each result gains a document.body field with the full PCF block array.
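
A response of this shape can be summarised with a few lines of code. The sketch assumes, based on the sample above, that matched terms in `highlight` are wrapped in `<<` and `>>`; the function names are illustrative.

```python
import re

# Assumption inferred from the sample response: <<term>> marks a matched term.
HIGHLIGHT = re.compile(r"<<(.+?)>>")


def highlighted_terms(highlight):
    """Return the terms wrapped in << >> markers, in order of appearance."""
    return HIGHLIGHT.findall(highlight)


def summarise_result(result):
    """Format one result as 'score [block_type] title (matched: terms)'."""
    block = result["block"]
    terms = ", ".join(highlighted_terms(block.get("highlight", ""))) or "none"
    title = result["document"]["title"]
    return f"{result['score']:.2f} [{block['block_type']}] {title} (matched: {terms})"
```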

Usage Examples

Editorial / Publishing

Hybrid search lets editors find exactly which paragraphs across the catalogue cover a topic. Block-level results mean you see the relevant paragraph, not just a document title.

{
  "query": "microservices authentication patterns",
  "filters": {
    "tags": [
      "system/lang/en",
      "source/type/ebook"
    ]
  },
  "search_mode": "hybrid",
  "min_score": 0.3
}

An editor reviewing a new manuscript on API security can instantly see every existing paragraph that discusses authentication in a microservices context, making it easy to avoid duplication or identify gaps.

AI Copilot / RAG

Vector mode with tight filters is ideal for retrieval-augmented generation. Restrict results to content that is contractually cleared for AI inference and limit block types to prose and code.

{
  "query": "graceful shutdown signal handling",
  "filters": {
    "tags": [
      "contract/allow/derive-content/ai/inference"
    ],
    "block_types": ["P", "Codeblock"]
  },
  "search_mode": "vector",
  "min_score": 0.5
}

A high min_score keeps only highly relevant blocks in the context window, reducing noise and improving generation quality.
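
Turning results into a prompt context could be done as in the sketch below. The character budget, the snippet layout, and the `build_context` name are assumptions for illustration; a production pipeline would more likely budget by tokens.

```python
def build_context(results, max_chars=4000):
    """Concatenate result blocks, in ranked order, until the budget is used up.

    The budget counts snippet text only, not the blank lines between snippets.
    """
    parts = []
    used = 0
    for result in results:
        snippet = f"[{result['document']['title']}]\n{result['block']['content']}"
        if used + len(snippet) > max_chars:
            break
        parts.append(snippet)
        used += len(snippet)
    return "\n\n".join(parts)
```

Labelling each snippet with its document title lets the generated answer attribute its sources.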

Performance Guidance

Tag-filtered queries: ~10-100 ms regardless of index size.
Full hybrid search: 1-5 s, proportional to index size. Always combine with tag filters when scope is known.
candidate_pool_size: trades latency for recall. 100 for fast filtered lookups, 500 (default) for general use, 1000+ for thorough recall.
return_documents: true: increases payload significantly. Prefer fetching documents via GET /v1/content/{id} when only a few full documents are needed.
Pagination: adds minimal overhead per page request.