Scrape Robots
Scrape robots let you convert any webpage into clean HTML or LLM-ready Markdown with no setup or recording required.
Just provide a URL, choose your output format, and Maxun handles the rest.
What Are Scrape Robots?
Scrape robots are built for fast, reliable content extraction. Unlike Extract robots (which learn actions through the Recorder), Scrape robots fetch a webpage directly and return a clean, structured version of its content: perfect for AI, RAG pipelines, summarization, and content processing.
Scrape robots are available in both Maxun Cloud and Maxun OSS (v0.0.27 and later).
How It Works
- Enter the URL you want to scrape.
- Choose your output format:
  - Clean HTML
  - Markdown (LLM-ready)
- Run the robot.
- Download the results or use them programmatically via API.
That's all.
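Programmatically, the same URL-in, content-out workflow is a single request. The sketch below is illustrative only: the endpoint path, field names, and auth header are assumptions, not Maxun's documented API, so consult your instance's API reference for the actual routes.

```python
import json
import urllib.request

# Hypothetical values -- replace with your instance URL and API key.
MAXUN_API_URL = "https://your-maxun-instance/api/scrape"
API_KEY = "YOUR_API_KEY"


def build_scrape_request(url: str, output_format: str) -> dict:
    """Build an illustrative request body: the target URL plus output format."""
    if output_format not in ("html", "markdown"):
        raise ValueError("output_format must be 'html' or 'markdown'")
    return {"url": url, "format": output_format}


def run_scrape(url: str, output_format: str = "markdown") -> str:
    """Send the scrape request and return the response body as text."""
    payload = json.dumps(build_scrape_request(url, output_format)).encode()
    req = urllib.request.Request(
        MAXUN_API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()
```

The same request body works for scheduled runs and webhook-driven pipelines; only the trigger changes.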
Output Formats
Clean HTML
- No ads
- No scripts or styling
- Inline noise removed
- Structure preserved
- Ideal for downstream processing or storage
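To make the cleaning concrete, here is a minimal sketch of the kind of transformation Clean HTML performs: noise tags such as scripts and styles are dropped, attributes are stripped, and the element structure is preserved. This is an illustration of the idea, not Maxun's actual implementation, and the set of tags treated as noise is an assumption.

```python
from html.parser import HTMLParser

# Illustrative set of "noise" tags to remove entirely, contents included.
STRIP_TAGS = {"script", "style", "noscript", "iframe"}


class Cleaner(HTMLParser):
    """Strip noise tags and attributes while keeping the element structure."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # >0 while inside a stripped tag

    def handle_starttag(self, tag, attrs):
        if tag in STRIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.out.append(f"<{tag}>")  # drop attributes (inline noise)

    def handle_endtag(self, tag):
        if tag in STRIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)


def clean_html(html: str) -> str:
    cleaner = Cleaner()
    cleaner.feed(html)
    return "".join(cleaner.out)
```

For example, `clean_html("<div><script>alert(1)</script><p style='x'>Hello</p></div>")` keeps the `div`/`p` structure and the text while the script and the inline style disappear.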
Markdown
- Optimized for LLMs
- Headings, lists, tables, links, and text extracted cleanly
- Great for embeddings, summarization, and RAG workflows
Both formats are available for every run.
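The Markdown output maps HTML elements to their Markdown equivalents. As a rough sketch of that mapping (headings, paragraphs, list items, and links; again an illustration, not Maxun's converter):

```python
from html.parser import HTMLParser


class MarkdownExtractor(HTMLParser):
    """Tiny HTML-to-Markdown sketch: headings, paragraphs, list items, links."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines, self.buf = [], []
        self.href, self.mark = None, 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.mark = len(self.buf)  # remember where the link text starts

    def handle_data(self, data):
        self.buf.append(data)

    def _flush(self, prefix=""):
        text = "".join(self.buf).strip()
        self.buf = []
        if text:
            self.lines.append(prefix + text)

    def handle_endtag(self, tag):
        if tag == "a" and self.href:
            # Rewrite the buffered link text as [text](href).
            text = "".join(self.buf[self.mark:])
            self.buf = self.buf[:self.mark] + [f"[{text}]({self.href})"]
            self.href = None
        elif tag in ("h1", "h2", "h3", "h4"):
            self._flush("#" * int(tag[1]) + " ")
        elif tag == "li":
            self._flush("- ")
        elif tag == "p":
            self._flush()


def to_markdown(html: str) -> str:
    extractor = MarkdownExtractor()
    extractor.feed(html)
    return "\n".join(extractor.lines)
```

A real converter also handles tables, nested lists, emphasis, and images; the point here is only the shape of the transformation that makes the output embedding- and RAG-friendly.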
Features
- One-step scraping
- Works on most public webpages
- Consistent, clean output
- API support
- Cloud and OSS support
- Scheduled runs
- Webhook support
- MCP support
Batch scraping is coming soon, allowing you to process multiple URLs in a single run.
When to Use Scrape Robots
Use Scrape robots when you need:
- Fast content extraction
- Clean HTML or Markdown for an LLM
- Zero-configuration scraping
- A simple URL-in → content-out workflow
If you need logins, interactions, pagination, or element-level data capture, use Extract robots instead.