Search

The Content Lake provides hybrid search that blends three search methods in a single query:

  1. Lexical search -- BM25 full-text matching for exact words and phrases
  2. Semantic search -- vector similarity that finds conceptually similar results even when they share no exact words
  3. Knowledge graph search -- entity-enriched context from the Packt Knowledge Graph

Results are returned at block level (individual paragraphs, headings, code blocks) by default. Set return_documents: true to receive full parent document bodies alongside block-level results.

Search Pipeline

Query
  |
  v
+--------------+     +--------------+     +--------------+     +----------+
|  Embedding   |---->|  Candidate   |---->|  Re-ranking  |---->| Response |
|  Generation  |     |  Retrieval   |     |  (scoring)   |     | Assembly |
+--------------+     +--------------+     +--------------+     +----------+

1. Embedding Generation

The query text is converted to a vector embedding for use in vector and hybrid modes. In keyword mode, this step is skipped and the query is passed directly to the full-text engine.

2. Candidate Retrieval

Up to candidate_pool_size candidates (default 500) are retrieved using approximate nearest neighbour (ANN) search, BM25, or a weighted hybrid of both, depending on search_mode. Structured filters (tags, block types, document IDs) are applied natively during retrieval.

3. Re-ranking

A cross-encoder model re-scores each candidate against the original query. The retrieval stage optimises for recall (a wide net); the re-ranker optimises for precision. Candidates scoring below min_score are discarded.

4. Response Assembly

The top-scoring results, up to limit per page, are returned together with a search_id for pagination.
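The four stages above can be pictured with a toy, in-memory sketch. Everything in it is a stand-in: the function names are hypothetical, and naive word overlap replaces ANN, BM25, and the cross-encoder. Only the control flow (wide candidate pool, re-score, discard below min_score, keep the top limit) follows the description.

```python
from typing import List, Tuple


def overlap(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def retrieve(query: str, corpus: List[str], candidate_pool_size: int) -> List[str]:
    """Stand-in for candidate retrieval: keep anything sharing a word (recall)."""
    hits = [block for block in corpus if overlap(query, block) > 0]
    return hits[:candidate_pool_size]


def rerank(query: str, candidates: List[str]) -> List[Tuple[float, str]]:
    """Stand-in for the cross-encoder: re-score and sort by descending score."""
    scored = [(overlap(query, block), block) for block in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def search(query: str, corpus: List[str], limit: int = 20,
           candidate_pool_size: int = 500, min_score: float = 0.0):
    candidates = retrieve(query, corpus, candidate_pool_size)
    ranked = rerank(query, candidates)
    # Candidates below min_score are discarded before the page is cut.
    kept = [(score, block) for score, block in ranked if score >= min_score]
    return kept[:limit]
```

Raising min_score in this sketch shrinks the result list without changing the order of what remains, which is the behaviour the re-ranking stage describes.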

Search Modes

The search_mode parameter controls retrieval strategy:

  • hybrid (default) -- weighted combination of vector similarity and BM25 full-text matching. Best for general discovery.
  • keyword -- BM25 full-text ranking only. Best for known-item retrieval (exact titles, error messages, code snippets).
  • vector -- approximate nearest neighbour only. Best for recommendation-style queries ("content similar to X").
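The page does not say how hybrid mode weights or normalises the two signals. As an illustration only, the sketch below uses one common approach: min-max normalise each score set, then blend linearly. The weight alpha and both function names are assumptions, not part of the API.

```python
from typing import Dict


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to the 0.0-1.0 range so the two signals are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def blend(vector_scores: Dict[str, float], bm25_scores: Dict[str, float],
          alpha: float = 0.5) -> Dict[str, float]:
    """Hypothetical hybrid score: alpha * vector + (1 - alpha) * BM25.

    alpha=1.0 behaves like vector-only ranking, alpha=0.0 like keyword-only.
    """
    v = min_max(vector_scores)
    k = min_max(bm25_scores)
    ids = set(v) | set(k)
    return {i: alpha * v.get(i, 0.0) + (1 - alpha) * k.get(i, 0.0) for i in ids}
```

A block found by only one method still receives a score, which is why a hybrid blend tends to suit general discovery better than either method alone.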

Filters

All filter fields are optional. When multiple fields are provided, they are combined with AND logic.

| Field | Type | Description |
|---|---|---|
| tags | string[] | All listed tags must be present (AND logic) |
| tags_any | string[] | At least one tag must be present (OR logic) |
| block_types | string[] | Restrict to specific block types: H1, H2, P, Table, Image, Codeblock |
| document_ids | string[] | Scope search to specific documents |
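The filter semantics can be expressed as a small client-side predicate. This is a sketch for reasoning about the AND/OR rules, not the server's implementation; the flat block record used here (tags, block_type, and document_id on one dict) is a hypothetical shape.

```python
from typing import Any, Dict


def matches(block: Dict[str, Any], filters: Dict[str, Any]) -> bool:
    """Return True when a block satisfies every filter field that is present."""
    tags = set(block.get("tags", []))

    # tags: every listed tag must be present (AND).
    if "tags" in filters and not set(filters["tags"]) <= tags:
        return False
    # tags_any: at least one listed tag must be present (OR).
    if "tags_any" in filters and not (set(filters["tags_any"]) & tags):
        return False
    if "block_types" in filters and block.get("block_type") not in filters["block_types"]:
        return False
    if "document_ids" in filters and block.get("document_id") not in filters["document_ids"]:
        return False
    # Fields that were not supplied impose no constraint.
    return True
```

For example, a block tagged only system/lang/en passes tags_any ["system/lang/en", "system/lang/de"] but fails tags with the same two values.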

Result Diversification

By default, blocks from a single document can dominate results. The max_blocks_per_document parameter caps how many blocks from any one document appear, ensuring diversity without sacrificing relevance.
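The effect of max_blocks_per_document can be sketched as a single pass over a score-ordered result list. The function name is hypothetical and the server may apply the cap differently; the result shape follows the response example on this page.

```python
from collections import defaultdict
from typing import Any, Dict, List


def diversify(results: List[Dict[str, Any]],
              max_blocks_per_document: int) -> List[Dict[str, Any]]:
    """Keep at most N blocks per document, preserving the original ranking."""
    seen = defaultdict(int)  # document_id -> number of blocks kept so far
    kept = []
    for result in results:  # assumed to be sorted by descending score
        doc_id = result["document"]["document_id"]
        if seen[doc_id] < max_blocks_per_document:
            kept.append(result)
            seen[doc_id] += 1
    return kept
```

Because the list is walked in rank order, the blocks that survive for each document are always its highest-scoring ones.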

Pagination

Search uses a session-based pagination model. The first response includes a search_id; pass it with subsequent requests to get the next page with previously-returned results automatically excluded.

Hard limit: 500 total results across all pages of a search session.
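A client-side loop over a search session might look like the sketch below. The post argument is a hypothetical callable that sends a request body to the search endpoint and returns the decoded JSON response; passing search_id as a top-level field follows the request parameter table and is an assumption about the exact body layout.

```python
from typing import Any, Callable, Dict, Iterator

SESSION_CAP = 500  # hard limit on total results per search session


def iter_results(post: Callable[[Dict[str, Any]], Dict[str, Any]],
                 body: Dict[str, Any],
                 max_results: int = SESSION_CAP) -> Iterator[Dict[str, Any]]:
    """Yield results page by page until has_more is false or the cap is hit."""
    request = dict(body)  # copy so the caller's dict is not mutated
    yielded = 0
    while yielded < max_results:
        page = post(request)
        results = page.get("results", [])
        for result in results:
            yield result
            yielded += 1
            if yielded >= max_results:
                return
        if not results or not page.get("has_more"):
            return
        # Reuse the session so already-returned results are excluded.
        request["search_id"] = page["search_id"]
```

Injecting the transport keeps the loop testable without a network connection; in practice post would wrap an HTTP client call to POST /v1/search.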

API Reference

Endpoint

POST /v1/search

Request Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | Yes | -- | Natural language search query |
| search_mode | string | No | hybrid | keyword, vector, or hybrid |
| limit | integer | No | 20 | Results per page (1-100). Auto-capped to 20 when return_documents is true |
| search_id | string | No | -- | ID from previous response for pagination |
| filters | object | No | -- | Structured field filters |
| candidate_pool_size | integer | No | 500 | Candidates before re-ranking (100-2000) |
| min_score | float | No | 0.0 | Minimum relevance score (0.0-1.0) |
| max_blocks_per_document | integer | No | -- | Per-document result cap |
| return_documents | boolean | No | false | Include full document bodies |
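The defaults and ranges in the table can be checked on the client before a request is sent. The helper below is a hypothetical sketch: the function name and error messages are invented, and the server remains the authority on validation.

```python
from typing import Any, Dict

DEFAULTS = {
    "search_mode": "hybrid",
    "limit": 20,
    "candidate_pool_size": 500,
    "min_score": 0.0,
    "return_documents": False,
}


def normalise_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Apply documented defaults and reject values outside documented ranges."""
    p = {**DEFAULTS, **params}
    if not isinstance(p.get("query"), str) or not p["query"].strip():
        raise ValueError("query is required")
    if p["search_mode"] not in ("hybrid", "keyword", "vector"):
        raise ValueError("search_mode must be hybrid, keyword, or vector")
    if not 1 <= p["limit"] <= 100:
        raise ValueError("limit must be between 1 and 100")
    if not 100 <= p["candidate_pool_size"] <= 2000:
        raise ValueError("candidate_pool_size must be between 100 and 2000")
    if not 0.0 <= p["min_score"] <= 1.0:
        raise ValueError("min_score must be between 0.0 and 1.0")
    if p["return_documents"]:
        # Mirrors the documented auto-cap when full bodies are requested.
        p["limit"] = min(p["limit"], 20)
    return p
```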

Response

{
  "search_id": "srch_01JQ7Xk9m2v4Rz...",
  "results": [
    {
      "score": 0.87,
      "block": {
        "block_hash": "a1b2c3d4e5f6",
        "block_type": "P",
        "order": 12,
        "content": "Kubernetes uses a declarative model...",
        "highlight": "...<<Kubernetes>> uses a declarative model..."
      },
      "document": {
        "document_id": "9d73b576-e08a-470e-89a6-7f710b251b65",
        "title": "Container Orchestration with Kubernetes",
        "version": 3,
        "tags": ["system/lang/en", "source/type/ebook"]
      }
    }
  ],
  "total_candidates": 487,
  "page_results_returned": 20,
  "has_more": true
}

| Field | Type | Description |
|---|---|---|
| search_id | string | Use to request next page |
| results | array | Ordered matching blocks with scores |
| results[].score | float | Relevance score (0.0-1.0) |
| results[].block | object | Block data: hash, type, order, content, highlight |
| results[].document | object | Parent document metadata |
| total_candidates | integer | Candidates before re-ranking |
| page_results_returned | integer | Cumulative results across all pages |
| has_more | boolean | False when 500-result cap reached or no more results |

When return_documents: true, each result gains a document.body field with the full PCF block array.
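On the client, the response can be mapped onto small typed records. The class and function names below are hypothetical; the field paths are taken from the response example above, and only a subset of fields is kept.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Hit:
    score: float
    block_type: str
    content: str
    document_id: str
    title: str


@dataclass
class SearchPage:
    search_id: str
    hits: List[Hit]
    has_more: bool


def parse_page(payload: Dict[str, Any]) -> SearchPage:
    """Convert a decoded search response into typed records."""
    hits = [
        Hit(
            score=float(item["score"]),
            block_type=item["block"]["block_type"],
            content=item["block"]["content"],
            document_id=item["document"]["document_id"],
            title=item["document"]["title"],
        )
        for item in payload.get("results", [])
    ]
    return SearchPage(
        search_id=payload["search_id"],
        hits=hits,
        has_more=bool(payload.get("has_more", False)),
    )
```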

Similar Document Search

Find semantically similar content by providing up to 10 source document IDs:

{
  "search": {
    "similar": {
      "ids": ["438d189a-bf37-486d-8944-345e9749f721"],
      "min_score": 0.10
    }
  }
}

For similar-document search, min_score accepts values from 0.10 to 0.99, a narrower range than the 0.0-1.0 allowed for query search. Below 0.10, semantic keyword search generally produces better results.

Similar document search can be combined with tag filters.
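A request body for similar-document search could be built and checked as in the sketch below. The helper name is hypothetical. Placing the tag filter under search.filter is an assumption that mirrors the filter placement in this page's query examples; the source does not show a combined request.

```python
from typing import Any, Dict, List, Optional


def similar_request(ids: List[str], min_score: float = 0.10,
                    tags: Optional[List[str]] = None) -> Dict[str, Any]:
    """Build a similar-document search body, enforcing the documented limits."""
    if not 1 <= len(ids) <= 10:
        raise ValueError("provide between 1 and 10 source document IDs")
    if not 0.10 <= min_score <= 0.99:
        raise ValueError("min_score must be between 0.10 and 0.99")
    search: Dict[str, Any] = {
        "similar": {"ids": list(ids), "min_score": min_score}
    }
    if tags:
        # Assumed placement of the tag filter; verify against the live API.
        search["filter"] = {"tags": list(tags)}
    return {"search": search}
```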

Usage Examples

Editorial / Publishing

Hybrid search lets editors find exactly which paragraphs across the catalogue cover a topic. Block-level results mean you see the relevant paragraph, not just a document title.

{
  "search": {
    "query": {
      "keywords": ["microservices authentication patterns"]
    },
    "filter": {
      "tags": [
        "system/lang/en",
        "source/type/ebook"
      ]
    }
  },
  "search_mode": "hybrid",
  "min_score": 0.3
}

An editor reviewing a new manuscript on API security can instantly see every existing paragraph that discusses authentication in a microservices context, making it easy to avoid duplication or identify gaps.

AI Copilot / RAG

Vector mode with tight filters is ideal for retrieval-augmented generation. Restrict results to content that is contractually cleared for AI inference and limit block types to prose and code.

{
  "search": {
    "query": {
      "keywords": ["graceful shutdown signal handling"]
    },
    "filter": {
      "tags": [
        "contract/allow/derive-content/ai/inference"
      ],
      "block_types": ["P", "Codeblock"]
    }
  },
  "search_mode": "vector",
  "min_score": 0.5
}

A high min_score keeps only highly relevant blocks in the context window, reducing noise and improving generation quality.
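Once blocks are retrieved, a RAG client still has to fit them into a prompt. The sketch below packs score-ordered results into a character budget; the function name, the bracketed title prefix, and the use of characters rather than tokens are all illustrative assumptions.

```python
from typing import Any, Dict, List


def build_context(results: List[Dict[str, Any]], max_chars: int = 2000) -> str:
    """Concatenate top-ranked blocks until the character budget is exhausted."""
    separator = "\n\n"
    parts: List[str] = []
    used = 0
    for result in results:  # assumed sorted by descending score
        entry = f"[{result['document']['title']}] {result['block']['content']}"
        if used + len(entry) > max_chars:
            break  # stop at the first block that does not fit
        parts.append(entry)
        used += len(entry) + len(separator)
    return separator.join(parts)
```

Prefixing each block with its document title gives the generator a citation handle at little cost in context length.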

Performance Guidance

  • Tag-filtered queries: ~10-100 ms regardless of index size.
  • Full hybrid search: 1-5 s, proportional to index size. Always combine with tag filters when scope is known.
  • candidate_pool_size: trades latency for recall. 100 for fast filtered lookups, 500 (default) for general use, 1000+ for thorough recall.
  • return_documents: true: increases payload significantly. Prefer fetching documents via GET /v1/content/{id} when only a few full documents are needed.
  • Pagination: adds minimal overhead per page request.