Scrape Robots
Scrape robots let you convert any webpage into clean HTML or LLM-ready Markdown with no setup or recording required.
Just provide a URL, choose your output format, and Maxun handles the rest.
What Are Scrape Robots?
Scrape robots are built for fast, reliable content extraction. Unlike Extract robots (which learn actions through the Recorder), Scrape robots fetch a webpage directly and return a clean, structured version of its content: perfect for AI, RAG pipelines, summarization, and content processing.
Scrape robots are available in both Maxun Cloud and Maxun OSS (v0.0.27 and later).
How It Works
- Enter the URL you want to scrape.
- Choose your output format:
  - Clean HTML
  - Markdown (LLM-ready)
- Run the robot.
- Download the results or use them programmatically via API.
That's all.
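Programmatically, the same URL-in, content-out workflow is a single request. The sketch below is illustrative only: the endpoint path, field names, and auth header are assumptions, not Maxun's documented API, so consult your instance's API reference for the actual routes.

```python
import json
import urllib.request

# Hypothetical values -- replace with your instance URL and API key.
MAXUN_API_URL = "https://your-maxun-instance/api/scrape"
API_KEY = "YOUR_API_KEY"


def build_scrape_request(url: str, output_format: str) -> dict:
    """Build an illustrative request body: the target URL plus output format."""
    if output_format not in ("html", "markdown"):
        raise ValueError("output_format must be 'html' or 'markdown'")
    return {"url": url, "format": output_format}


def run_scrape(url: str, output_format: str = "markdown") -> str:
    """Send the scrape request and return the response body as text."""
    payload = json.dumps(build_scrape_request(url, output_format)).encode()
    req = urllib.request.Request(
        MAXUN_API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()
```

The same request body works for scheduled runs and webhook-driven pipelines; only the trigger changes.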
Output Formats
Clean HTML
- No ads
- No scripts or styling
- Inline noise removed
- Structure preserved
- Ideal for downstream processing or storage
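To make the cleaning concrete, here is a minimal sketch of the kind of transformation Clean HTML performs: noise tags such as scripts and styles are dropped, attributes are stripped, and the element structure is preserved. This is an illustration of the idea, not Maxun's actual implementation, and the set of tags treated as noise is an assumption.

```python
from html.parser import HTMLParser

# Illustrative set of "noise" tags to remove entirely, contents included.
STRIP_TAGS = {"script", "style", "noscript", "iframe"}


class Cleaner(HTMLParser):
    """Strip noise tags and attributes while keeping the element structure."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # >0 while inside a stripped tag

    def handle_starttag(self, tag, attrs):
        if tag in STRIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.out.append(f"<{tag}>")  # drop attributes (inline noise)

    def handle_endtag(self, tag):
        if tag in STRIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)


def clean_html(html: str) -> str:
    cleaner = Cleaner()
    cleaner.feed(html)
    return "".join(cleaner.out)
```

For example, `clean_html("<div><script>alert(1)</script><p style='x'>Hello</p></div>")` keeps the `div`/`p` structure and the text while the script and the inline style disappear.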
Markdown
- Optimized for LLMs
- Headings, lists, tables, links, and text extracted cleanly
- Great for embeddings, summarization, and RAG workflows
Both formats are available for every run.
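The Markdown output maps HTML elements to their Markdown equivalents. As a rough sketch of that mapping (headings, paragraphs, list items, and links; again an illustration, not Maxun's converter):

```python
from html.parser import HTMLParser


class MarkdownExtractor(HTMLParser):
    """Tiny HTML-to-Markdown sketch: headings, paragraphs, list items, links."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines, self.buf = [], []
        self.href, self.mark = None, 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.mark = len(self.buf)  # remember where the link text starts

    def handle_data(self, data):
        self.buf.append(data)

    def _flush(self, prefix=""):
        text = "".join(self.buf).strip()
        self.buf = []
        if text:
            self.lines.append(prefix + text)

    def handle_endtag(self, tag):
        if tag == "a" and self.href:
            # Rewrite the buffered link text as [text](href).
            text = "".join(self.buf[self.mark:])
            self.buf = self.buf[:self.mark] + [f"[{text}]({self.href})"]
            self.href = None
        elif tag in ("h1", "h2", "h3", "h4"):
            self._flush("#" * int(tag[1]) + " ")
        elif tag == "li":
            self._flush("- ")
        elif tag == "p":
            self._flush()


def to_markdown(html: str) -> str:
    extractor = MarkdownExtractor()
    extractor.feed(html)
    return "\n".join(extractor.lines)
```

A real converter also handles tables, nested lists, emphasis, and images; the point here is only the shape of the transformation that makes the output embedding- and RAG-friendly.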
Features
- One-step scraping
- Works on most public webpages
- Consistent, clean output
- API support
- Cloud and OSS support
- Scheduled runs
- Webhook support
- MCP support
Batch scraping is coming soon, allowing you to process multiple URLs in a single run.
When to Use Scrape Robots
Use Scrape robots when you need:
- Fast content extraction
- Clean HTML or Markdown for an LLM
- Zero-configuration scraping
- A simple URL-in → content-out workflow
If you need logins, interactions, pagination, or element-level data capture, use Extract robots instead.