Scrape Robots

Convert webpages into clean HTML, LLM-ready Markdown, or screenshots with zero configuration.

Creating Scrape Robots

import { Scrape } from 'maxun-sdk';

const scraper = new Scrape({
  apiKey: process.env.MAXUN_API_KEY
});

const robot = await scraper.create(
  'Content Scraper',
  'https://example.com/article',
  { formats: ['markdown', 'html'] }
);

Output Formats

Markdown

const robot = await scraper.create(
  'Article Scraper',
  'https://blog.example.com/post',
  { formats: ['markdown'] }
);

const result = await robot.run();
console.log(result.data.markdown);

HTML

const robot = await scraper.create(
  'HTML Scraper',
  'https://example.com',
  { formats: ['html'] }
);

const result = await robot.run();
console.log(result.data.html);

Screenshots

// Visible viewport
const visibleRobot = await scraper.create(
  'Screenshot Bot',
  'https://example.com',
  { formats: ['screenshot-visible'] }
);

// Full page (distinct variable name so both declarations can share one scope)
const fullPageRobot = await scraper.create(
  'Full Page Screenshot',
  'https://example.com',
  { formats: ['screenshot-fullpage'] }
);

Multiple Formats

const robot = await scraper.create(
  'Multi-Format Scraper',
  'https://example.com',
  { formats: ['markdown', 'html', 'screenshot-visible'] }
);

const result = await robot.run();
console.log(result.data.markdown);
console.log(result.data.html);
console.log(result.data.screenshots);
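
When several formats are requested, it can help to confirm that each one actually came back before using it. The sketch below is a hypothetical helper, not part of the SDK. It assumes that `result.data` exposes the `markdown`, `html`, and `screenshots` fields shown above and that both screenshot formats land under `screenshots`.

```typescript
// Hypothetical helper (not part of maxun-sdk): report which requested
// formats have no usable value in a result's data object.
type Format = 'markdown' | 'html' | 'screenshot-visible' | 'screenshot-fullpage';

// Assumption: both screenshot formats are returned under `screenshots`.
const FIELD_FOR_FORMAT: Record<Format, string> = {
  markdown: 'markdown',
  html: 'html',
  'screenshot-visible': 'screenshots',
  'screenshot-fullpage': 'screenshots',
};

function missingFormats(
  data: Record<string, unknown>,
  requested: Format[]
): Format[] {
  return requested.filter((format) => {
    const value = data[FIELD_FOR_FORMAT[format]];
    // Treat absent, null, and empty-string values as missing.
    return value === undefined || value === null || value === '';
  });
}
```

For the robot above, a call such as `missingFormats(result.data, ['markdown', 'html', 'screenshot-visible'])` would return an empty array when every format was produced.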

Examples

RAG Pipeline

const robot = await scraper.create(
  'RAG Content',
  'https://docs.example.com/guide',
  { formats: ['markdown'] }
);

const result = await robot.run();
const markdown = result.data.markdown;

// Send to embedding service
await createEmbeddings(markdown);
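
Embedding services usually limit input size, so long pages are often split before embedding. The following is a minimal sketch of one way to do that. `chunkMarkdown` and its `maxChars` default are hypothetical and not part of the SDK, and the character limit is only a stand-in for whatever limit your embedding service applies.

```typescript
// Hypothetical helper (not part of maxun-sdk): split Markdown into chunks of
// at most maxChars characters, breaking on blank lines where possible.
function chunkMarkdown(markdown: string, maxChars = 1000): string[] {
  const paragraphs = markdown
    .split(/\n{2,}/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = '';

  for (const paragraph of paragraphs) {
    // A single paragraph longer than the limit is hard-split by length.
    if (paragraph.length > maxChars) {
      if (current) {
        chunks.push(current);
        current = '';
      }
      for (let i = 0; i < paragraph.length; i += maxChars) {
        chunks.push(paragraph.slice(i, i + maxChars));
      }
      continue;
    }

    // Otherwise, append to the current chunk while it still fits.
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (candidate.length > maxChars) {
      chunks.push(current);
      current = paragraph;
    } else {
      current = candidate;
    }
  }

  if (current) chunks.push(current);
  return chunks;
}
```

Each chunk could then be passed to `createEmbeddings` in a loop instead of sending the whole document at once.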

Content Aggregation

const urls = [
  'https://blog.example.com/post-1',
  'https://blog.example.com/post-2'
];

for (const url of urls) {
  const robot = await scraper.create(`Article ${url}`, url, {
    formats: ['markdown']
  });

  const result = await robot.run();
  await saveToDatabase(result.data.markdown);
}
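
In the loop above, one failed page stops the whole batch. The sketch below shows one way to isolate failures per URL. `scrapeAll` and the `ScrapeOutcome` type are hypothetical names, and the `scrapeOne` callback is assumed to wrap the `scraper.create(...)` and `robot.run()` calls shown above and return the page's Markdown.

```typescript
// Hypothetical helper (not part of maxun-sdk): scrape URLs one by one and
// record a success or failure for each, instead of aborting on the first error.
type ScrapeOutcome =
  | { url: string; ok: true; markdown: string }
  | { url: string; ok: false; error: string };

async function scrapeAll(
  urls: string[],
  scrapeOne: (url: string) => Promise<string>
): Promise<ScrapeOutcome[]> {
  const outcomes: ScrapeOutcome[] = [];

  for (const url of urls) {
    try {
      const markdown = await scrapeOne(url);
      outcomes.push({ url, ok: true, markdown });
    } catch (err) {
      // Keep the error message and move on to the next URL.
      const error = err instanceof Error ? err.message : String(err);
      outcomes.push({ url, ok: false, error });
    }
  }

  return outcomes;
}
```

After the batch finishes, the successful outcomes can be saved with `saveToDatabase` and the failed ones logged or queued for another attempt.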

Managing Scrape Robots

// Get all scrape robots
const robots = await scraper.getRobots();

// Get specific robot
const robot = await scraper.getRobot('robot-id');

// Delete robot
await scraper.deleteRobot('robot-id');

Running Scrape Robots

// Run immediately
const result = await robot.run();

// Run with timeout (distinct variable name so both declarations can share one scope)
const timedResult = await robot.run({
  timeout: 30000
});
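
Runs can fail for transient reasons such as a slow page or a timeout. The following is a minimal sketch of a generic retry wrapper with exponential backoff. `runWithRetry` and its parameters are hypothetical and not part of the SDK, and the sketch assumes that a failed run rejects its promise.

```typescript
// Hypothetical helper (not part of maxun-sdk): retry an async operation
// with exponential backoff between attempts.
async function runWithRetry<T>(
  run: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 1000
): Promise<T> {
  if (attempts < 1) {
    throw new RangeError('attempts must be at least 1');
  }

  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await run();
    } catch (err) {
      lastError = err;
      if (attempt < attempts - 1) {
        // Wait baseDelayMs, then 2x, then 4x, and so on.
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }

  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

With a robot in scope, usage might look like `const result = await runWithRetry(() => robot.run({ timeout: 30000 }));`.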

For scheduling, webhooks, and other robot management features, see Robot Management.