
Firecrawl transition guide #41

Merged
VinciGit00 merged 8 commits into main from firecrawl-transition-guide on Apr 29, 2026

Firecrawl transition guide#41
VinciGit00 merged 8 commits intomainfrom
firecrawl-transition-guide

Conversation

@VinciGit00
Member

No description provided.

VinciGit00 and others added 8 commits April 16, 2026 09:16
Comprehensive migration guide covering endpoint mapping, SDK migration
(Python + JS), authentication, response format differences, monitoring,
crawling, and a step-by-step migration checklist.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove verbose endpoint mapping code examples, key-differences tables,
environment variables, response format, and features-unique sections.
Keep compact comparison table, auth, SDK install, and migration checklist.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Match canonical SDK init patterns (env-var based), use ScrapeGraphAI
class in JS example, and clarify that the ApiResult envelope is applied
by the SDKs client-side, not by the raw HTTP API.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tested all methods 4x live against both APIs. Findings applied:
- v2 base URL is https://v2-api.scrapegraphai.com/api (not api.scrapegraphai.com/api/v2)
- Endpoint paths are /api/scrape, /api/extract, /api/search, /api/crawl, /api/monitor
- Monitor response returns cronId (not id)
- Added rows for Firecrawl /map, /batch/scrape, changeTracking, and actions
- Aligned SDK snippets with the v1->v2 idioms (ScrapeGraphAI, Request objects, FetchConfig)
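The endpoint layout found during live testing can be captured in a small helper. This is a hypothetical sketch, not SDK code: the base URL and paths are taken from the findings above, while the function name and structure are illustrative.

```python
# Hypothetical helper reflecting the v2 endpoint layout verified above.
# Base URL and paths come from the live-testing findings; everything else
# (names, dict layout) is an illustration, not part of the SDK.

V2_BASE = "https://v2-api.scrapegraphai.com"

ENDPOINTS = {
    "scrape": "/api/scrape",
    "extract": "/api/extract",
    "search": "/api/search",
    "crawl": "/api/crawl",
    "monitor": "/api/monitor",
}

def v2_url(service: str) -> str:
    """Build the full v2 URL for a named service."""
    return V2_BASE + ENDPOINTS[service]

print(v2_url("scrape"))
```

Note the base is the `v2-api` host itself, not a `/api/v2` suffix on the v1 host, which is the easy mistake the testing caught.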

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… args

Verified against the live API that every Python sample in the original
guide raised `ValidationError` / `TypeError` on `scrapegraph-py==2.1.0`
because it still used the removed `ScrapeRequest` / `ExtractRequest` /
`SearchRequest` / `CrawlRequest` / `MonitorCreateRequest` wrappers.

Rewrites every snippet to the 2.1.0 direct positional/keyword form, syncs
the install matrix to `>= 2.1.0`, fixes `res.data.json` -> `json_data`
(the Pydantic alias), and adds a verification script that exercises
scrape, extract, search, crawl (start + get), and monitor (create +
activity + delete) end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@VinciGit00
Member Author

Verification — original samples were broken on scrapegraph-py>=2.1.0

Every Python snippet in this guide used the typed-request wrappers (ScrapeRequest, ExtractRequest, SearchRequest, CrawlRequest, MonitorCreateRequest). Those classes are still importable on 2.1.0, but the SDK methods no longer accept them as the first positional argument — they expect url: str (or query, prompt, …) directly. Reproduced against the live API with the SDK at the version we ship today:

scrapegraph-py: 2.1.0
[scrape ]  RAISED ValidationError: 1 validation error for ScrapeRequest
url
  URL input should be a string or URL [type=url_type, ...]
[extract]  RAISED ValidationError: 1 validation error for ExtractRequest
prompt
  Input should be a valid string [type=string_type, ...]
[search ]  RAISED ValidationError: 1 validation error for SearchRequest
query
  Input should be a valid string [type=string_type, ...]
[crawl  ]  RAISED ValidationError: 1 validation error for CrawlRequest
url
  URL input should be a string or URL [type=url_type, ...]
[monitor]  RAISED TypeError: MonitorResource.create() missing 1 required positional argument: 'interval'

Fix in 904cd8d

  • Rewrote every Python snippet to the 2.1.0 direct positional/keyword style (sgai.scrape("https://…", formats=[…]), sgai.extract(prompt, url=…), sgai.monitor.create(url, "*/30 * * * *", …), etc.).
  • Bumped the SDK matrix to ≥ 2.1.0 (Python ≥ 3.12, Node ≥ 22) — was ≥ 2.0.1.
  • Fixed res.data.json -> res.data.json_data (the snake_case alias on ExtractResponse).
  • Added tests/python-v2.1.0/test_firecrawl_transition.py that exercises every rewritten sample.
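The json_data alias fix can be sketched in miniature. In the real SDK the mapping is a Pydantic field alias on ExtractResponse; the plain class below is a hypothetical stand-in showing only the wire-key-to-attribute relationship.

```python
# Minimal stand-in for the ExtractResponse alias: the wire payload carries
# the key "json", while the Python attribute the samples read is json_data.
# In the real SDK this is a Pydantic alias; here it is a plain assignment.

class ExtractData:
    def __init__(self, payload: dict):
        # "json" is the wire field name; json_data is the snake_case
        # attribute, so res.data.json raises AttributeError on 2.1.0.
        self.json_data = payload.get("json")

data = ExtractData({"json": {"title": "Example"}})
print(data.json_data["title"])
```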

Proof — rewritten samples against the live API

[scrape ] status=success elapsed_ms=50
[extract] status=success elapsed_ms=336
[search ] status=success elapsed_ms=5498
[crawl  ] status=success elapsed_ms=152
           crawl.get -> status=success job=completed 1/2
[monitor] status=success elapsed_ms=147
           monitor.activity -> status=success
           cleaned up cron_id=1c72e4f2-1b11-4c28-95ed-48281cee50c0

All five endpoints (scrape, extract, search, crawl + crawl.get, monitor + monitor.activity + cleanup) now succeed end-to-end. Run yourself with:

SGAI_API_KEY=sgai-... ./venv/bin/python tests/python-v2.1.0/test_firecrawl_transition.py

🤖 Generated with Claude Code

@VinciGit00 VinciGit00 merged commit a8f3cab into main Apr 29, 2026
2 checks passed
@VinciGit00 VinciGit00 deleted the firecrawl-transition-guide branch April 29, 2026 12:55
