Official JavaScript/TypeScript SDK for the ScrapeGraph AI API. Smart scraping, search, crawling, markdownify, agentic browser automation, scheduled jobs, and structured data extraction — all powered by AI.

ScrapeGraphAI/scrapegraph-js

ScrapeGraph JS SDK


Official TypeScript SDK for the ScrapeGraph AI API. Zero dependencies.

Install

npm i scrapegraph-js
# or
bun add scrapegraph-js

Quick Start

import { smartScraper } from "scrapegraph-js";

const result = await smartScraper("your-api-key", {
  user_prompt: "Extract the page title and description",
  website_url: "https://example.com",
});

if (result.status === "success") {
  console.log(result.data);
} else {
  console.error(result.error);
}

Every function returns ApiResult<T> — no exceptions to catch:

type ApiResult<T> = {
  status: "success" | "error";
  data: T | null;
  error?: string;
  elapsedMs: number;
};
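If you prefer throw-based control flow, the union can be collapsed with a small helper. This is a sketch, not part of the SDK; `unwrap` is a hypothetical name:

```typescript
// The SDK's result type, as declared above.
type ApiResult<T> = {
  status: "success" | "error";
  data: T | null;
  error?: string;
  elapsedMs: number;
};

// Hypothetical helper, not part of the SDK: return the data on
// success, throw on error so callers can use try/catch.
function unwrap<T>(result: ApiResult<T>): T {
  if (result.status === "success" && result.data !== null) {
    return result.data;
  }
  throw new Error(result.error ?? "unknown API error");
}

// Usage: const data = unwrap(await smartScraper("key", { ... }));
```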

API

All functions take (apiKey, params) where params is a typed object.

smartScraper

Extract structured data from a webpage using AI.

const res = await smartScraper("key", {
  user_prompt: "Extract product names and prices",
  website_url: "https://example.com",
  output_schema: { /* JSON schema */ },  // optional
  number_of_scrolls: 5,                  // optional, 0-50
  total_pages: 3,                        // optional, 1-100
  stealth: true,                         // optional, +4 credits
  cookies: { session: "abc" },           // optional
  headers: { "Accept-Language": "en" },  // optional
  steps: ["Click 'Load More'"],          // optional, browser actions
  wait_ms: 5000,                         // optional, default 3000
  country_code: "us",                    // optional, proxy routing
  mock: true,                            // optional, testing mode
});
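`output_schema` takes a plain JSON Schema object. A minimal sketch for the product prompt above; the field names are illustrative, not prescribed by the API:

```typescript
// Illustrative JSON Schema for "Extract product names and prices".
// Shape and field names are assumptions; adjust to your prompt.
const productSchema = {
  type: "object",
  properties: {
    products: {
      type: "array",
      items: {
        type: "object",
        properties: {
          name: { type: "string" },
          price: { type: "number" },
        },
        required: ["name", "price"],
      },
    },
  },
} as const;
```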

searchScraper

Search the web and extract structured results.

const res = await searchScraper("key", {
  user_prompt: "Latest TypeScript release features",
  num_results: 5,                  // optional, 3-20
  extraction_mode: true,           // optional, false for markdown
  output_schema: { /* */ },        // optional
  stealth: true,                   // optional, +4 credits
  time_range: "past_week",         // optional, past_hour|past_24_hours|past_week|past_month|past_year
  location_geo_code: "us",         // optional, geographic targeting
  mock: true,                      // optional, testing mode
});
// res.data.result (extraction mode) or res.data.markdown_content (markdown mode)

markdownify

Convert a webpage to clean markdown.

const res = await markdownify("key", {
  website_url: "https://example.com",
  stealth: true,         // optional, +4 credits
  wait_ms: 5000,         // optional, default 3000
  country_code: "us",    // optional, proxy routing
  mock: true,            // optional, testing mode
});
// res.data.result is the markdown string

scrape

Get raw HTML from a webpage.

const res = await scrape("key", {
  website_url: "https://example.com",
  stealth: true,       // optional, +4 credits
  branding: true,      // optional, extract brand design
  country_code: "us",  // optional, proxy routing
  wait_ms: 5000,       // optional, default 3000
});
// res.data.html is the HTML string
// res.data.scrape_request_id is the request identifier

crawl

Crawl a website and its linked pages. Async — polls until completion.

const res = await crawl(
  "key",
  {
    url: "https://example.com",
    prompt: "Extract company info",       // required when extraction_mode=true
    max_pages: 10,                        // optional, default 10
    depth: 2,                             // optional, default 1
    breadth: 5,                           // optional, max links per depth
    schema: { /* JSON schema */ },        // optional
    sitemap: true,                        // optional
    stealth: true,                        // optional, +4 credits
    wait_ms: 5000,                        // optional, default 3000
    batch_size: 3,                        // optional, default 1
    same_domain_only: true,               // optional, default true
    cache_website: true,                  // optional
    headers: { "Accept-Language": "en" }, // optional
  },
  (status) => console.log(status),        // optional poll callback
);

agenticScraper

Automate browser actions (click, type, navigate) then extract data.

const res = await agenticScraper("key", {
  url: "https://example.com/login",
  steps: ["Type user@example.com in email", "Click login button"],  // required
  user_prompt: "Extract dashboard data",  // required when ai_extraction=true
  output_schema: { /* */ },               // required when ai_extraction=true
  ai_extraction: true,                    // optional
  use_session: true,                      // optional
});

generateSchema

Generate a JSON schema from a natural language description.

const res = await generateSchema("key", {
  user_prompt: "Schema for a product with name, price, and rating",
  existing_schema: { /* modify this */ }, // optional
});

sitemap

Extract all URLs from a website's sitemap.

const res = await sitemap("key", {
  website_url: "https://example.com",
  headers: { /* */ },  // optional
  stealth: true,       // optional, +4 credits
  mock: true,          // optional, testing mode
});
// res.data.urls is string[]

getCredits / checkHealth

const credits = await getCredits("key");
// { remaining_credits: 420, total_credits_used: 69 }

const health = await checkHealth("key");
// { status: "healthy" }
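Since every paid call spends credits, a pre-flight check can bail out early. A hypothetical helper (not part of the SDK), written against any function that resolves to the remaining count, e.g. `() => getCredits("key").then((r) => r.data!.remaining_credits)`:

```typescript
// Hypothetical pre-flight guard: throw before starting a batch job
// when the remaining credit balance is below a chosen minimum.
async function ensureCredits(
  getRemaining: () => Promise<number>,
  minimum: number,
): Promise<void> {
  const remaining = await getRemaining();
  if (remaining < minimum) {
    throw new Error(`only ${remaining} credits left, need ${minimum}`);
  }
}
```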

history

Fetch request history for any service.

const res = await history("key", {
  service: "smartscraper",
  page: 1,       // optional, default 1
  page_size: 10, // optional, default 10
});
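History is paginated, so fetching the full log means looping until a page comes back short. A generic sketch; `fetchPage` is a hypothetical wrapper you would write around `history`, extracting the item array from `res.data` (whose exact shape is not shown here):

```typescript
// Hypothetical pagination loop: keep requesting pages until one
// returns fewer items than the page size, then stop.
async function fetchAllPages<T>(
  fetchPage: (page: number) => Promise<T[]>,
  pageSize: number,
): Promise<T[]> {
  const all: T[] = [];
  for (let page = 1; ; page++) {
    const items = await fetchPage(page);
    all.push(...items);
    if (items.length < pageSize) break;
  }
  return all;
}
```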

Examples

Find complete working examples in the examples/ directory:

| Service | Examples |
| --- | --- |
| SmartScraper | basic, cookies, html input, infinite scroll, markdown input, pagination, stealth, with schema |
| SearchScraper | basic, markdown mode, with schema |
| Markdownify | basic, stealth |
| Scrape | basic, stealth, with branding |
| Crawl | basic, markdown mode, with schema |
| Agentic Scraper | basic, AI extraction |
| Schema Generation | basic, modify existing |
| Sitemap | basic, with smartscraper |
| Utilities | credits, health, history |

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| `SGAI_API_URL` | Override API base URL | `https://api.scrapegraphai.com/v1` |
| `SGAI_DEBUG` | Enable debug logging when set to `"1"` | off |
| `SGAI_TIMEOUT_S` | Request timeout in seconds | `120` |
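For example, to enable debug logging and shorten the timeout for a single run (the script name is illustrative):

```shell
export SGAI_DEBUG=1
export SGAI_TIMEOUT_S=30
node my-scraper.js
```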

Development

bun install
bun test          # 21 tests
bun run build     # tsup → dist/
bun run check     # tsc --noEmit + biome

License

MIT - ScrapeGraph AI
