docctl Skill

"Agent skill for docctl multi-format ingestion and provenance-grounded retrieval."

Install in one line

mfkvault install docctl-skill

Requires the MFKVault CLI.

This helper was discovered by MFKVault crawlers from public sources. Original author retains all rights. To request removal: [email protected]
Description

---
name: "docctl"
description: "Agent skill for docctl multi-format ingestion and provenance-grounded retrieval."
---

# docctl Skill

## When to use

- Use for document-grounded Q&A over local corpora (`.pdf`, `.docx`, `.txt`, `.md`).
- Use when answers must include provenance (`source`, `title`, `chunk_id`).
- Use when the agent can execute shell commands.

## Scope and non-scope

- In scope:
  - `docctl ingest`, `search`, `show`, `stats`, `doctor`, and `session` orchestration.
  - Full lifecycle behavior: bootstrap ingest plus retrieval loops.
  - Metadata-constrained retrieval using `doc_id`, `source`, and `title`.
  - Optional rerank controls and interpretation in retrieval workflows (`--rerank`, `--rerank-candidates`, session `rerank`, `rerank_candidates`).
- Out of scope (agent-owned responsibilities):
  - Query rewriting and query decomposition.
  - Conversation context handling and prior-turn memory policy.
  - Project-specific instruction interpretation and policy reasoning.
  - Hybrid keyword/full-text retrieval design.
  - Reranker model training/tuning and low-level scoring implementation.

## Inputs and assumptions

- Expected inputs:
  - user question,
  - user-preferred response language when observable from the conversation,
  - optional corpus path(s),
  - optional retrieval filters (`doc_id`, `source`, `title`),
  - optional index settings.
- Default CLI assumptions:
  - `--index-path ./.docctl`
  - `--collection default`
  - `--json` enabled for machine consumption.
- Safety assumption:
  - `ingest` is mutating and should only run under explicit lifecycle conditions.
- Language assumption:
  - infer a working retrieval language from the strongest available signal in this order:
    1. explicit user instruction,
    2. language of the latest user turn,
    3. language used by retrieved evidence or cited answers in the active workflow.
  - if the user asks in one language but the evidence and grounded answer are clearly in another, switch query rewriting and follow-up searches to the evidence language while preserving the user's requested answer language unless they ask to change it.

## Operational workflow (ordered)

1. Run readiness checks.
   - Default to `docctl catalog` for readiness and full index inventory.
   - Run `docctl stats` only when quick aggregate counts are specifically needed.
   - Run `docctl doctor` only when diagnostics are needed (for example command failures, config issues, or unexpected index behavior).
2. Apply bootstrap ingest rules (full lifecycle).
   - If index is missing or empty, run `docctl ingest <path>`.
   - Reingest only on explicit user intent or stale corpus signals (file updates/new files).
3. Prepare retrieval query in the agent layer.
   - Rewrite/expand/paraphrase outside `docctl` if needed.
   - Choose the search language from the current working retrieval language, not only from the latest user wording.
   - If the user asks in English but the relevant evidence and grounded answers are in German, reformulate the next retrieval attempts in German.
4. Execute retrieval (session-first).
   - Primary: `docctl session` with `op:"search"` for iterative loops.
   - For two or more read operations in one workflow, open one `docctl session` and send multiple NDJSON requests in that session.
   - Do not run multiple sequential one-shot read commands via repeated tool calls when `session` is available.
   - Secondary fallback: one-shot `docctl search`.
5. Run bounded evidence expansion loop.
   - If no or weak results, broaden query and/or relax filters.
   - If results indicate the corpus language differs from the current query language, retry in the corpus language before exhausting attempts.
   - Increase `top_k` per policy and retry up to max attempts.
6. Inspect top evidence chunks.
   - Call `show` for selected chunk IDs before synthesis when precision matters.
   - Treat high-value returned sentences/snippets as a lead and inspect the full returned chunk before final synthesis to capture qualifiers and surrounding context.
7. Synthesize answer with explicit citations.
   - Include provenance and state uncertainty when evidence is insufficient.

## Tool guidance (docctl command contracts)

- `ingest`:
  - Mutating operation.
  - Use when index is uninitialized/empty or corpus is stale.
  - Avoid repeated reingest unless needed.
- `search`:
  - Use for one-shot retrieval.
  - Do not chain multiple `search`/`show`/`stats`/`catalog` calls via separate tool invocations for the same workflow; switch to `session`.
  - Relevant options: `--doc-id`, `--source`, `--title`, `--top-k`, `--min-score`, `--rerank`, `--rerank-candidates`.
  - Rerank constraints: candidate depth must be in `[1, 100]` and greater than or equal to `top_k`.
- `session`:
  - Use for iterative retrieval workflows.
  - Preferred default for multi-step work: keep one session open and submit all read operations (`search`, `show`, `stats`, `catalog`, `doctor`) as NDJSON lines.
  - Supported operations: `search`, `show`, `stats`, `catalog`, `doctor`.
  - Search request accepts optional fields: `doc_id`, `source`, `title`, `top_k`, `min_score`, `rerank`, `rerank_candidates`.
- `show`:
  - Use to inspect and quote exact chunk evidence by `chunk_id`.
- `stats`:
  - Do not run by default in retrieval loops.
  - Use when quick aggregate counts are needed.
- `catalog`:
  - Use to inspect per-document inventory (`doc_id`, `source`, `title`, `units`, `chunks`) with summary stats.
- `doctor`:
  - Do not run by default in retrieval loops because it adds latency.
  - Use only to diagnose environment/config failures or unexpected runtime behavior.

## Retrieval policy defaults

- Attempt 1 (baseline):
  - `top_k=5`, user query as-is, apply user-specified filters.
- Attempt 2 (broaden):
  - `top_k=10`, relax restrictive filters unless user explicitly requires them.
- Attempt 3 (final):
  - rewrite/broaden query in agent layer, keep only essential filters.
- Hard stop after 3 attempts.
- Evidence selection rule:
  - prioritize chunks with clear provenance and direct semantic match.

## Failure handling and recovery

- Missing corpus path and empty index:
  - ask for corpus path or explicit permission to ingest known path.
- Empty index:
  - ingest if allowed by lifecycle policy; otherwise return actionable instruction.
- Tool or schema errors:
  - surface exact corrective action (for example invalid field type in session request).
- Cross-language mismatch:
  - if the question language and evidence language diverge, state that the search language was switched to match the indexed evidence and keep citations in the original evidence language.
- No verifiable evidence after bounded retries:
  - return `cannot verify from indexed documents` and list missing information.

## Output contract for downstream agents

Return a structured payload (or equivalent human-readable response) with:

- `answer`: grounded response text.
- `answer_language`: language used for the final answer.
- `search_language`: language used for the final retrieval attempt.
- `citations`: list of objects with:
  - `source`
  - `chunk_id`
  - `title`
- `confidence`: one of `high`, `medium`, `low`.
- `limitations`: explicit gaps or uncertainty.
- `next_actions`: concrete follow-up steps.

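The output contract above can be sketched as a small Python structure. This is an illustrative sketch, not part of `docctl`: the class names `Citation` and `GroundedAnswer` and the helper `unverifiable` are hypothetical, and representing `limitations` and `next_actions` as lists of strings is an assumption. Reporting `low` confidence in the no-evidence case is also an assumption; the contract only fixes the allowed values.

```python
import json
from dataclasses import asdict, dataclass, field

# Allowed values come from the output contract.
ALLOWED_CONFIDENCE = ("high", "medium", "low")


@dataclass
class Citation:
    # Provenance fields named in the contract.
    source: str
    chunk_id: str
    title: str


@dataclass
class GroundedAnswer:
    # Hypothetical container; field names mirror the contract keys.
    answer: str
    answer_language: str
    search_language: str
    citations: list = field(default_factory=list)
    confidence: str = "low"
    limitations: list = field(default_factory=list)   # assumed: list of strings
    next_actions: list = field(default_factory=list)  # assumed: list of strings

    def __post_init__(self):
        if self.confidence not in ALLOWED_CONFIDENCE:
            raise ValueError(f"confidence must be one of {ALLOWED_CONFIDENCE}")

    def to_json(self) -> str:
        # asdict() also converts the nested Citation objects to plain dicts.
        return json.dumps(asdict(self), ensure_ascii=False)


def unverifiable(answer_language, search_language, missing):
    """Payload for the 'no verifiable evidence after bounded retries' case."""
    return GroundedAnswer(
        answer="cannot verify from indexed documents",
        answer_language=answer_language,
        search_language=search_language,
        confidence="low",  # assumption, see lead-in
        limitations=list(missing),
    )
```

A downstream agent could then emit `unverifiable("en", "de", ["no chunk covers the requested topic"]).to_json()` when all three attempts fail, which keeps the answer language and the final search language visible side by side.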
## Minimal examples

CLI ingest/catalog/search/show:

```bash
uv run docctl --index-path ./.docctl --collection default --json ingest ./docs --recursive --allow-model-download
uv run docctl --index-path ./.docctl --collection default --json catalog
uv run docctl --index-path ./.docctl --collection default --json search "gateway diagnostics" --top-k 5 --title "operations-manual"
uv run docctl --index-path ./.docctl --collection default --json show <chunk_id>
```

NDJSON session loop:

```bash
cat <<'EOF' | uv run docctl --index-path ./.docctl --collection default session
{"id":"q1","op":"search","query":"gateway diagnostics","top_k":5,"title":"operations-manual"}
{"id":"q2","op":"show","chunk_id":"<chunk_id-from-q1>"}
{"id":"q3","op":"catalog"}
EOF
```

## Evaluation checklist

- Trigger correctness:
  - skill is used only for docctl-retrieval tasks, not unrelated workflows.
- Tool-use correctness:
  - sequence follows readiness -> ingest-if-needed -> retrieval -> evidence inspection.
- Citation completeness:
  - claims map to retrieved chunks with `source` and chunk metadata.
- Boundary adherence:
  - query rewriting is handled by the agent, not attributed to `docctl`.
- Regression control:
  - rerun workflow checks after any meaningful change to this skill text.

## Accuracy note on ranking behavior

- Default `docctl` retrieval ranking is vector-distance based when reranking is not enabled.
- `docctl` supports opt-in two-stage reranking: vector retrieval for candidates, then local cross-encoder reranking.
- When reranking is enabled, hits may include `vector_rank` and `rerank_score` in addition to base hit fields.
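
An agent that generates session requests programmatically may want to check the rerank constraints from the tool guidance (candidate depth in `[1, 100]` and at least `top_k`) before sending a line. The following is a minimal sketch of that check. The helper name `build_search_request` is hypothetical, and sending `rerank` as a JSON boolean is an assumption: the skill text names the field but not its type.

```python
import json

# Optional search fields named in the session contract (besides top_k and rerank*).
ALLOWED_FILTERS = {"doc_id", "source", "title", "min_score"}


def build_search_request(request_id, query, top_k=5, rerank=False,
                         rerank_candidates=None, **filters):
    """Return one NDJSON line for a session `search` op (hypothetical helper)."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported search fields: {sorted(unknown)}")

    request = {"id": request_id, "op": "search", "query": query, "top_k": top_k}
    # Only send filters that were actually provided.
    request.update({k: v for k, v in filters.items() if v is not None})

    if rerank:
        request["rerank"] = True  # assumed to be a JSON boolean
    if rerank_candidates is not None:
        # Constraint from the tool guidance: in [1, 100] and >= top_k.
        if not 1 <= rerank_candidates <= 100 or rerank_candidates < top_k:
            raise ValueError("rerank_candidates must be in [1, 100] and >= top_k")
        request["rerank_candidates"] = rerank_candidates

    # One compact JSON object per line, as NDJSON requires.
    return json.dumps(request)
```

Each returned string is one line to write to the `docctl session` process, for example `build_search_request("q1", "gateway diagnostics", top_k=5, title="operations-manual")`. Rejecting invalid candidate depths on the agent side avoids spending one of the three bounded attempts on a schema error.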
