Scrape API

Convert web pages to clean markdown

Transform web pages into clean markdown or HTML with automatic link extraction. Well suited to content analysis and building knowledge bases.

Cost: 1 credit
Max timeout: 60s
Formats: Markdown, HTML

Quick Start

Get started with a simple API call.

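// Send a scrape request. Replace YOUR_API_KEY with your own key.
const response = await fetch('https://api.crawlkit.sh/v1/crawl/scrape', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://example.com/article',
    options: {
      timeout: 30000,        // in milliseconds (maximum 60s)
      onlyMainContent: true, // drop navigation, ads, footers, and sidebars
      waitFor: '#content'    // wait for this CSS selector before scraping
    }
  })
});

// Stop early on a non-2xx HTTP status instead of reading an error body as data.
if (!response.ok) {
  throw new Error(`Scrape request failed with HTTP ${response.status}`);
}

const { data } = await response.json();
console.log(data.markdown);
console.log('Links:', data.links);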

Key Features

Everything you need for reliable content extraction.

Clean Markdown Output

Converts HTML to well-formatted markdown, removing ads, scripts, and navigation elements automatically.

Link Extraction

Automatically extracts and categorizes internal and external links for comprehensive site mapping.
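
As a rough illustration of working with the categorized links, the sketch below counts them and lists the distinct external hostnames. It assumes the links object has the internal and external arrays shown in the sample response on this page; summarizeLinks is a hypothetical helper, not part of the API.

```javascript
// Summarize the `links` object from a scrape response.
// Assumption: `internal` and `external` are arrays of URL strings,
// as in the sample response on this page.
function summarizeLinks(links) {
  const internal = links?.internal ?? [];
  const external = links?.external ?? [];

  // Collect the distinct hostnames that the page links out to.
  const externalHosts = new Set();
  for (const href of external) {
    try {
      externalHosts.add(new URL(href).hostname);
    } catch {
      // Skip values that are not absolute URLs.
    }
  }

  return {
    internalCount: internal.length,
    externalCount: external.length,
    externalHosts: [...externalHosts].sort(),
  };
}

const linkSummary = summarizeLinks({
  internal: ['https://example.com/page1'],
  external: ['https://external.com'],
});
console.log(linkSummary);
```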

Main Content Detection

A content-detection algorithm identifies and extracts only the main content, filtering out headers, footers, and sidebars.
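
A minimal sketch of toggling main content detection per request. The endpoint and the onlyMainContent option come from the Quick Start on this page; buildScrapeRequest is a hypothetical helper, and the idea that onlyMainContent: false returns the full page is an assumption rather than documented behavior.

```javascript
// Endpoint taken from the Quick Start example on this page.
const SCRAPE_ENDPOINT = 'https://api.crawlkit.sh/v1/crawl/scrape';

// Build the arguments for fetch() without sending anything.
// `buildScrapeRequest` is a hypothetical helper, not part of an SDK.
function buildScrapeRequest(url, apiKey, { onlyMainContent = true } = {}) {
  return {
    endpoint: SCRAPE_ENDPOINT,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ url, options: { onlyMainContent } }),
    },
  };
}

// Main content only (the helper's default) versus, presumably, the full page.
const articleOnly = buildScrapeRequest('https://example.com/article', 'YOUR_API_KEY');
const fullPage = buildScrapeRequest('https://example.com/article', 'YOUR_API_KEY', {
  onlyMainContent: false,
});
console.log(articleOnly.init.body);
console.log(fullPage.init.body);
// To send: await fetch(articleOnly.endpoint, articleOnly.init)
```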

Metadata Extraction

Captures page title, description, author, publish date, and other metadata automatically.
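
The snippet below sketches one way to read the metadata defensively, since not every page provides every field. It uses only the title, description, and author fields shown in the sample response; describePage is a hypothetical helper.

```javascript
// Turn the metadata from a scrape response into a one-line description.
// Only fields that appear in the sample response on this page are used;
// any of them may be missing for a given page.
function describePage(data) {
  const meta = data?.metadata ?? {};
  // Fall back to the URL when the page has no title.
  const title = meta.title || data?.finalUrl || data?.url || '(untitled)';

  const parts = [title];
  if (meta.author) parts.push(`by ${meta.author}`);
  if (meta.description) parts.push(`- ${meta.description}`);
  return parts.join(' ');
}

console.log(
  describePage({
    url: 'https://example.com/article',
    metadata: {
      title: 'Article Title',
      description: 'Article description',
      author: 'John Doe',
    },
  })
);
```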

Custom Wait Conditions

Wait for specific selectors or delays before scraping to handle dynamic content loading.
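
A sketch of building the wait options with some client-side checks. It assumes waitFor accepts either a CSS selector string or a delay given as a number of milliseconds, and that timeout is in milliseconds with the 60s maximum listed on this page. buildWaitOptions and its rule that a delay must be shorter than the timeout are illustrative choices, not API behavior.

```javascript
const MAX_TIMEOUT_MS = 60_000; // 60s maximum timeout listed on this page

// Validate a wait condition before sending it.
// Assumption: `waitFor` is a CSS selector (string) or a delay in ms (number).
function buildWaitOptions(waitFor, timeout = 30_000) {
  if (!Number.isInteger(timeout) || timeout <= 0 || timeout > MAX_TIMEOUT_MS) {
    throw new RangeError(`timeout must be between 1 and ${MAX_TIMEOUT_MS} ms`);
  }

  if (typeof waitFor === 'string') {
    if (waitFor.trim() === '') {
      throw new TypeError('waitFor selector must not be empty');
    }
  } else if (typeof waitFor === 'number') {
    if (!Number.isFinite(waitFor) || waitFor < 0) {
      throw new RangeError('waitFor delay must be a non-negative number');
    }
    // Client-side sanity check only: a delay at or beyond the timeout
    // would leave no time to scrape.
    if (waitFor >= timeout) {
      throw new RangeError('waitFor delay must be shorter than timeout');
    }
  } else {
    throw new TypeError('waitFor must be a CSS selector or a delay in ms');
  }

  return { waitFor, timeout };
}

console.log(buildWaitOptions('#content'));  // wait for a selector
console.log(buildWaitOptions(2000, 15000)); // wait for a fixed delay
```

The returned object can be passed as the options field of the request body.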

Tag Filtering

Include or exclude specific HTML tags to customize the scraping output for your needs.
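
As an illustration, the helper below assembles tag filters for the request options. The option names includeTags and excludeTags appear in the FAQ on this page, but their value format is not shown; this sketch assumes arrays of tag names. buildTagFilter is hypothetical.

```javascript
// Build tag-filter options for a scrape request.
// Assumption: includeTags / excludeTags take arrays of HTML tag names.
function buildTagFilter({ include = [], exclude = [] } = {}) {
  // Normalize to lowercase, trimmed, de-duplicated tag names.
  const normalize = (tags) => [
    ...new Set(tags.map((tag) => tag.trim().toLowerCase()).filter(Boolean)),
  ];
  const includeTags = normalize(include);
  const excludeTags = normalize(exclude);

  // A tag listed on both sides is almost certainly a mistake.
  const conflicts = includeTags.filter((tag) => excludeTags.includes(tag));
  if (conflicts.length > 0) {
    throw new Error(`tags both included and excluded: ${conflicts.join(', ')}`);
  }

  // Only emit the options that are actually set.
  const options = {};
  if (includeTags.length > 0) options.includeTags = includeTags;
  if (excludeTags.length > 0) options.excludeTags = excludeTags;
  return options;
}

console.log(buildTagFilter({ include: ['article', 'H1'], exclude: ['nav', 'footer'] }));
```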

How It Works

Get started in minutes with our simple API.

1. Get Your API Key

Sign up for free and get your API key instantly. No credit card required.

2. Make Your First Request

Use our simple REST API with your favorite programming language.

3. Get Structured Data

Receive clean JSON responses ready to use in your application.

API Response

Every request returns clean, structured JSON. Alongside the scraped content, the response includes a success flag, timing information, the credits used, and your remaining credits.

  • Consistent JSON structure
  • Detailed timing metrics
  • Credit usage tracking
  • Error handling built-in
Response
{
  "success": true,
  "data": {
    "url": "https://example.com/article",
    "finalUrl": "https://example.com/article",
    "markdown": "# Article Title\n\nThis is the main content...",
    "html": "<h1>Article Title</h1><p>This is the main content...</p>",
    "rawHtml": "<!DOCTYPE html>...",
    "metadata": {
      "title": "Article Title",
      "description": "Article description",
      "author": "John Doe"
    },
    "links": {
      "internal": ["https://example.com/page1"],
      "external": ["https://external.com"]
    },
    "timing": { "total": 1250 },
    "creditsUsed": 1,
    "creditsRemaining": 99
  }
}
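
One possible way to consume this response is sketched below. It relies on the fields in the sample above; the shape of a failed response is not shown on this page, so the helper simply treats anything without success: true as an error. unwrapScrapeResponse, the low-credit threshold, and the reading of timing.total as milliseconds are assumptions.

```javascript
// Check a parsed scrape response and pull out the commonly used fields.
// Assumption: failures are signalled by `success` not being true; the
// exact error payload is not documented here, so it is echoed as JSON.
function unwrapScrapeResponse(payload, lowCreditThreshold = 10) {
  if (!payload || payload.success !== true || !payload.data) {
    throw new Error(`Scrape failed: ${JSON.stringify(payload)}`);
  }

  const { data } = payload;
  return {
    data,
    // True when the request was redirected to a different URL.
    redirected: data.finalUrl !== data.url,
    // Assumed to be milliseconds.
    durationMs: data.timing?.total,
    // Flag when the remaining balance is at or below the threshold.
    lowOnCredits: data.creditsRemaining <= lowCreditThreshold,
  };
}

const result = unwrapScrapeResponse({
  success: true,
  data: {
    url: 'https://example.com/article',
    finalUrl: 'https://example.com/article',
    markdown: '# Article Title\n\nThis is the main content...',
    timing: { total: 1250 },
    creditsUsed: 1,
    creditsRemaining: 99,
  },
});
console.log(result.redirected, result.durationMs, result.lowOnCredits);
```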

Use Cases

Common applications for the Scrape API.

Content Archiving

Convert web articles and documentation to markdown for long-term storage and search.
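
For archiving, a scraped page might be turned into a self-describing markdown document along these lines. This is a sketch: the front-matter layout and file naming are illustrative choices, and toArchiveDocument and archiveFileName are hypothetical helpers. Only the url, finalUrl, markdown, and metadata.title fields from the sample response are used.

```javascript
// Prefix the scraped markdown with a small front-matter header so the
// archived file records where and when it was fetched.
function toArchiveDocument(data, fetchedAt = new Date()) {
  const meta = data.metadata ?? {};
  const header = [
    '---',
    // JSON.stringify gives safely quoted string values.
    `source: ${JSON.stringify(data.finalUrl ?? data.url)}`,
    `title: ${JSON.stringify(meta.title ?? '')}`,
    `fetched: ${JSON.stringify(fetchedAt.toISOString())}`,
    '---',
  ];
  return `${header.join('\n')}\n\n${data.markdown}\n`;
}

// Derive a file name from the page URL, e.g. "example-com-article.md".
function archiveFileName(data) {
  const { hostname, pathname } = new URL(data.finalUrl ?? data.url);
  const slug = `${hostname}${pathname}`
    .replace(/[^a-z0-9]+/gi, '-') // collapse anything non-alphanumeric
    .replace(/^-+|-+$/g, '');     // trim leading and trailing dashes
  return `${slug.toLowerCase()}.md`;
}

const page = {
  url: 'https://example.com/article',
  finalUrl: 'https://example.com/article',
  markdown: '# Article Title\n\nThis is the main content...',
  metadata: { title: 'Article Title' },
};
console.log(archiveFileName(page));
console.log(toArchiveDocument(page));
```

The resulting string can be written to disk or to any storage backend.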

Knowledge Base Building

Extract clean content from multiple sources to build comprehensive knowledge databases.

SEO Analysis

Analyze page content, links, and metadata for SEO optimization and competitive research.

Research & Monitoring

Track changes in web content, monitor competitor updates, and collect research data.
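
Change tracking can be as simple as comparing two scrapes of the same URL taken at different times. The sketch below assumes each snapshot is the data object from a response, with the markdown and links fields shown on this page; diffLinks and hasContentChanged are hypothetical helpers, and how snapshots are stored is left out.

```javascript
// Report which links were added or removed between two snapshots.
function diffLinks(previous, current) {
  // Treat internal and external links as one set per snapshot.
  const flatten = (links) =>
    new Set([...(links?.internal ?? []), ...(links?.external ?? [])]);
  const before = flatten(previous);
  const after = flatten(current);

  return {
    added: [...after].filter((href) => !before.has(href)).sort(),
    removed: [...before].filter((href) => !after.has(href)).sort(),
  };
}

// A plain string comparison of the markdown is enough to detect any edit.
function hasContentChanged(previousData, currentData) {
  return previousData.markdown !== currentData.markdown;
}

const lastWeek = {
  markdown: '# Pricing\n\nStarter plan',
  links: { internal: ['https://example.com/page1'], external: [] },
};
const today = {
  markdown: '# Pricing\n\nStarter and Pro plans',
  links: {
    internal: ['https://example.com/page1', 'https://example.com/pro'],
    external: [],
  },
};
console.log(hasContentChanged(lastWeek, today));
console.log(diffLinks(lastWeek.links, today.links));
```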

Frequently Asked Questions

Everything you need to know about the Scrape API.

The Scrape API returns clean markdown, HTML, and raw HTML, along with structured metadata and links categorized as internal or external.

Main content detection analyzes the page structure and removes navigation, ads, footers, and sidebars, leaving only the primary content. Enable it with onlyMainContent: true.

For pages that load content dynamically, set the waitFor option to a CSS selector or a delay in milliseconds so that scraping starts only after the content has loaded.

To filter tags, use includeTags to specify which tags to keep, or excludeTags to remove specific tags from the output.

Start Using Scrape API Today

Get 100 free credits to test the API. No credit card required.