px-asset-extract
Extracts individual assets from images as transparent PNGs with classification into 10 types using pure computer vision
Description
---
name: px-asset-extract
description: >
  Extract individual assets from images (slides, posters, infographics, diagrams)
  as transparent PNGs with a JSON manifest. Zero ML models, pure classical CV
  (PIL+numpy). Automatically segments, classifies (text, illustration, icon,
  graphic, line, dot, diagram, diagram_network, shadow, element), and crops each
  element with anti-aliased alpha transparency. Supports type filtering
  (--types/--exclude-types) and pre-computed region extraction (--regions) for
  bridging with visual grounding models. Trigger on 'extract assets from image',
  'decompose slide into elements', 'get all icons from poster',
  'extract illustrations', 'segment and crop', 'pull out individual elements',
  or when the user has an image and wants individual transparent PNGs of each
  element.
---

# px-asset-extract: Image Asset Extraction

## What It Does

Decomposes images into individual transparent PNG assets with classification and a JSON manifest. The full pipeline runs in 2-6 seconds on CPU with zero ML models:

1. **Background detection**: median color from image borders
2. **Foreground mask**: Euclidean color distance thresholding
3. **Character bridging**: dilation connects letters into words
4. **Connected components**: union-find with 8-connectivity
5. **Classification**: heuristic typing into 10 categories
6. **Text-line merging**: groups word fragments into text lines
7. **Alpha extraction**: anti-aliased transparent cropping
8. **Deduplication**: removes overlapping and oversized segments

## When to Use This

| Scenario | Use px-asset-extract? |
|----------|----------------------|
| Extract all elements from a slide/poster | Yes, this is the primary use case |
| Get only illustrations, skip text | Yes, use `--types illustration` or `--exclude-types text` |
| Extract specific objects by description | Use with `--regions` + a grounding model (e.g., Florence-2) |
| Remove background from a single photo | No, use a background removal model instead |
| Segment a photo scene | No, use SAM/FastSAM for photographic content |
| Image has textured/photographic background | Limited, works best on clean/solid backgrounds |

## Installation

```bash
git clone https://github.com/JadeLiu-tech/px-asset-extract.git
cd px-asset-extract
pip install .
```

## Usage

### CLI

```bash
# Basic extraction
px-extract <image> -o <output_dir>

# Only extract illustrations and icons
px-extract <image> -o <output_dir> --types illustration,icon

# Extract everything except text, dots, and lines
px-extract <image> -o <output_dir> --exclude-types text,dot,line

# Extract from pre-computed bounding boxes (e.g. from px-ground)
px-extract <image> -o <output_dir> --regions regions.json

# Segment only: output JSON, no PNGs
px-extract <image> --segments-only

# Batch processing
px-extract images/*.png -o output/ --batch

# JSON output to stdout
px-extract <image> -o <output_dir> --json --quiet
```

### Python API

```python
from px_asset_extract import extract_assets, load_regions

# Full extraction
result = extract_assets("slide.png", output_dir="assets/")
for asset in result.assets:
    print(f"{asset.id}: {asset.label} at ({asset.bbox.x}, {asset.bbox.y}) -> {asset.file_path}")

# Type filtering
result = extract_assets("slide.png", output_dir="icons/", types=["illustration", "icon"])
result = extract_assets("slide.png", output_dir="graphics/", exclude_types=["text", "line", "dot"])

# Pre-computed regions (from grounding model output)
regions = load_regions("grounded.json")
result = extract_assets("slide.png", output_dir="targeted/", regions=regions)

# Combine regions + type filter
result = extract_assets("slide.png", output_dir="charts/", regions=regions, types=["chart"])
```

## CLI Options

| Option | Default | Description |
|--------|---------|-------------|
| `-o`, `--output` | `assets` | Output directory |
| `--bg-threshold` | `22.0` | Background color distance (lower = more sensitive) |
| `--min-area` | `60` | Minimum segment area in pixels |
| `--dilation` | `2` | Character gap bridging passes |
| `--padding` | `10` | Extra pixels around each asset |
| `--max-coverage` | `0.5` | Max fraction of image a segment can cover |
| `--types` | | Only extract these types (comma-separated) |
| `--exclude-types` | | Skip these types (comma-separated) |
| `--regions` | | JSON file with bounding boxes (skips segmentation) |
| `--segments-only` | | Output segment JSON without extracting PNGs |
| `--no-visualization` | | Skip visualization image |
| `--batch` | | Create subdirectories per image |
| `--json` | | Output results as JSON to stdout |
| `--quiet` | | Suppress progress messages |

## Output

Each run produces:

- `asset_NNN_<type>.png`: individual transparent PNGs
- `manifest.json`: positions, types, and metadata for all assets
- `visualization.png`: input image with color-coded bounding boxes

### Manifest format

```json
{
  "source_image": "slide.png",
  "source_size": {"width": 1920, "height": 1080},
  "background_color": [255, 255, 255],
  "num_assets": 44,
  "assets": [
    {
      "id": "asset_000_illustration",
      "label": "illustration",
      "file": "asset_000_illustration.png",
      "position": {"x": 100, "y": 50, "width": 400, "height": 300},
      "pixel_area": 120000
    }
  ]
}
```

### Regions JSON format (for --regions)

```json
[
  {"x": 100, "y": 50, "width": 400, "height": 300, "label": "chart"},
  {"x1": 600, "y1": 100, "x2": 800, "y2": 300, "label": "logo"}
]
```

A `{"regions": [...]}` wrapper object is also supported. The label defaults to `"region"` if omitted.

## Asset Types

| Type | Detection Logic |
|------|----------------|
| `text` | dark_ratio > 0.4, uniform ink color |
| `illustration` | Large (>1% image area), colorful |
| `icon` | Small (<3000px area, <60px max dimension) |
| `graphic` | Medium-sized, colored |
| `line` | Thin (min dimension <=5px, extreme aspect ratio) |
| `dot` | Very small (<150px area, <20px dimension) |
| `diagram` | Low fill ratio (<0.25) |
| `diagram_network` | Spans >80% of image, very low fill |
| `shadow` | Bright (>200), low contrast, low saturation |
| `element` | Catch-all for unclassified objects |

## Performance

| Image type | Assets | Time |
|-----------|--------|------|
| Presentation slide | 22-44 | 2-6s |
| Poster | 11 | 3.9s |
| Scientific diagram | 43 | 4.2s |
| Technical diagram | 42 | 4.5s |
| Data chart | 26 | 4.8s |

## Dependencies

Only `Pillow` and `numpy`. Optional `opencv-python` for better alpha edges.
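
As a rough illustration of the first two pipeline stages listed under What It Does (median border color, then Euclidean color distance thresholding), the following NumPy sketch shows one way those steps could work. It is not the package's actual implementation: the function names `estimate_background` and `foreground_mask`, the 4-pixel border width, and the strict `>` comparison are all assumptions made for this example. Only the `22.0` threshold is taken from the documented `--bg-threshold` default.

```python
import numpy as np


def estimate_background(rgb: np.ndarray, border: int = 4) -> np.ndarray:
    """Return the per-channel median color of a strip around the image edge.

    Hypothetical helper: the border width of 4 pixels is an assumption.
    """
    strips = [
        rgb[:border].reshape(-1, 3),      # top rows
        rgb[-border:].reshape(-1, 3),     # bottom rows
        rgb[:, :border].reshape(-1, 3),   # left columns
        rgb[:, -border:].reshape(-1, 3),  # right columns
    ]
    return np.median(np.concatenate(strips), axis=0)


def foreground_mask(rgb: np.ndarray, background: np.ndarray,
                    threshold: float = 22.0) -> np.ndarray:
    """Mark pixels whose Euclidean RGB distance from the background exceeds the threshold.

    Hypothetical helper: 22.0 mirrors the documented --bg-threshold default.
    """
    diff = rgb.astype(np.float64) - background
    distance = np.sqrt((diff ** 2).sum(axis=-1))
    return distance > threshold


# Synthetic 60x40 white canvas with one 20x10 red rectangle away from the edges.
image = np.full((40, 60, 3), 255, dtype=np.uint8)
image[10:20, 15:35] = (200, 30, 30)

bg = estimate_background(image)
mask = foreground_mask(image, bg)
print("background:", bg.tolist())
print("foreground pixels:", int(mask.sum()))
```

Lowering `threshold` makes the mask pick up fainter differences from the background, which matches the "lower = more sensitive" note in the CLI options table. Taking the median rather than the mean of the border pixels keeps a few stray edge elements from skewing the estimate.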