GhostFetch Skill


Stealthy web fetcher that bypasses anti-bot protections. Fetches content from sites like X.com and converts to clean Markdown for AI agents.

Install in one line

CLI

$ mfkvault install ghostfetch-skill

Requires the MFKVault CLI.

Works with Claude Code, Cursor, Codex, and OpenClaw.

Free to install; no account needed.

Description

---
name: ghostfetch
description: Stealthy web fetcher that bypasses anti-bot protections. Fetches content from sites like X.com and converts to clean Markdown for AI agents.
version: 1.0.0
author: iArsalanshah
tags:
  - web-scraping
  - stealth
  - markdown
  - browser-automation
  - anti-bot-bypass
---

# GhostFetch Skill

Fetch web content from sites that block AI agents. Uses a stealthy headless browser with advanced fingerprinting to bypass anti-bot protections and returns clean Markdown.

## When to Use

- Fetching content from X.com/Twitter posts
- Reading articles from sites that block bots
- Extracting content from JavaScript-heavy sites
- Getting clean Markdown from any webpage for LLM consumption

## Prerequisites

GhostFetch must be running as a service. Start it with:

```bash
# Option 1: If installed via pip
ghostfetch serve

# Option 2: Docker
docker run -p 8000:8000 iarsalanshah/ghostfetch
```

## Usage

### Synchronous Fetch (Recommended)

Use the `/fetch/sync` endpoint for simple, blocking requests:

```bash
curl "http://localhost:8000/fetch/sync?url=https://example.com"
```

### Python

```python
import requests

def ghostfetch(url: str, timeout: float = 120.0) -> dict:
    """
    Fetch content from a URL using GhostFetch.

    Returns:
        dict with 'metadata' and 'markdown' keys
    """
    response = requests.post(
        "http://localhost:8000/fetch/sync",
        json={"url": url, "timeout": timeout}
    )
    response.raise_for_status()
    return response.json()

# Example
result = ghostfetch("https://x.com/user/status/123")
print(result["markdown"])
```

### With SDK

```python
from ghostfetch import fetch

result = fetch("https://x.com/user/status/123")
print(result["metadata"]["title"])
print(result["markdown"])
```

## Response Format

```json
{
  "metadata": {
    "title": "Page Title",
    "author": "Author Name",
    "publish_date": "2024-01-15",
    "images": ["https://example.com/image.jpg"]
  },
  "markdown": "# Page Title\n\nPage content in clean Markdown..."
}
```

## API Reference

### POST /fetch/sync

Synchronous fetch - blocks until content is ready.

**Request:**

```json
{
  "url": "https://example.com",
  "context_id": "optional-session-id",
  "timeout": 120
}
```

**Response:** See Response Format above.

### GET /fetch/sync

Same as POST but via query parameters:

```
GET /fetch/sync?url=https://example.com&timeout=60
```

### POST /fetch

Async fetch - returns job ID immediately, poll for results.

**Request:**

```json
{
  "url": "https://example.com",
  "callback_url": "https://your-webhook.com/callback",
  "github_issue": 42
}
```

**Response:**

```json
{
  "job_id": "abc123",
  "url": "https://example.com",
  "status": "queued"
}
```

### GET /job/{job_id}

Check job status and get results.

### GET /health

Health check endpoint.

## Configuration

Set via environment variables when running the service:

| Variable | Default | Description |
|----------|---------|-------------|
| `SYNC_TIMEOUT_DEFAULT` | 120 | Default timeout for sync requests (seconds) |
| `MAX_SYNC_TIMEOUT` | 300 | Maximum allowed timeout |
| `MAX_CONCURRENT_BROWSERS` | 2 | Concurrent browser contexts |
| `MIN_DOMAIN_DELAY` | 10 | Seconds between requests to same domain |

## Error Handling

| Status Code | Meaning |
|-------------|---------|
| 200 | Success |
| 400 | Invalid request (non-retryable error) |
| 502 | Fetch failed (retryable) |
| 504 | Request timeout |

## Tips

1. **Use context_id for multi-step workflows** - Sessions are persisted per context, maintaining cookies between requests.
2. **Respect rate limits** - GhostFetch has built-in domain delays. Don't bypass these.
3. **Check metadata first** - The structured metadata often has what you need without parsing Markdown.

## Related Skills

- `browser` - General browser automation
- `web_fetch` - Simple HTTP fetching (for non-protected sites)
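
## Example: Retrying on 502

The Error Handling table marks only 502 as retryable, so a caller may want a small wrapper around `POST /fetch/sync` that retries on 502 and fails fast on everything else. The following is a minimal sketch, not part of GhostFetch itself: the helper names (`should_retry`, `post_sync`, `fetch_with_retry`), the linear backoff, and the choice not to retry 504 are all assumptions made for illustration. The 10-second base delay is chosen to line up with the documented `MIN_DOMAIN_DELAY` default. It uses only the standard library so it runs without extra dependencies.

```python
import json
import time
import urllib.error
import urllib.request

# Per the Error Handling table, only 502 is documented as retryable.
# Leaving 504 out of this set is an assumption of this sketch.
RETRYABLE_STATUSES = {502}


def should_retry(status: int) -> bool:
    """Return True when the status code is documented as retryable."""
    return status in RETRYABLE_STATUSES


def post_sync(url: str, timeout: float = 120.0,
              base: str = "http://localhost:8000"):
    """POST to /fetch/sync and return (status_code, parsed_body).

    Hypothetical helper: error responses are returned with an empty body
    because their JSON shape is not documented.
    """
    payload = json.dumps({"url": url, "timeout": timeout}).encode("utf-8")
    request = urllib.request.Request(
        f"{base}/fetch/sync",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        # Give the client a little more time than the server-side timeout.
        with urllib.request.urlopen(request, timeout=timeout + 10) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        return err.code, {}


def fetch_with_retry(url: str, attempts: int = 3, backoff: float = 10.0,
                     send=post_sync, sleep=time.sleep) -> dict:
    """Fetch a URL, retrying only on retryable status codes.

    `send` and `sleep` are injectable so the logic can be exercised
    without a running GhostFetch service.
    """
    for attempt in range(1, attempts + 1):
        status, body = send(url)
        if status == 200:
            return body
        if not should_retry(status) or attempt == attempts:
            raise RuntimeError(f"GhostFetch returned {status} for {url}")
        # Linear backoff: wait longer after each failed attempt.
        sleep(backoff * attempt)
    raise RuntimeError("attempts must be at least 1")
```

With the service running locally, `fetch_with_retry("https://example.com")["markdown"]` would return the page Markdown on success. A 400 raises immediately, since the table labels it non-retryable.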

Security Status

Unvetted

Not yet security scanned
