🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!
Updated Jun 17, 2026 - Python
Python scraper based on AI
Extract keywords from sentences or replace keywords in sentences.
Browser automation CLI built for AI agents. Break through anti-bot walls, hand off to humans across platforms when stuck. Parallel multi-task execution, independent multi-session operation, isolated multi-account browsing.
Python package for scraping recipes data
ContextGem: Effortless LLM extraction from documents
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
A beginner-friendly yet powerful Python toolkit for financial analysis and automation — built to make modern investing accessible to everyone
Lightweight library for scraping websites with LLMs
⛏️ The extraction engine behind Maigret: turn any profile URL into a structured OSINT record across 150+ sites
📰 Let ChatGPT Summarize Hacker News for You
🤖 AI-powered web scraping editor with visual workflow builder. Build, test & deploy web scrapers using natural language. Powered by ScrapeGraphAI & LangGraph.
Pure Python, lightweight, Pillow-based solver for Amazon's text captcha.
Undetected web-scraping & seamless HTML parsing in Python!
Benchmarking PDF libraries
Apify SDK for Python. The official library for building Apify Actors: serverless cloud programs for web scraping, browser automation, data processing, and AI agents. Manages the Actor lifecycle, storages (datasets, key-value stores, request queues), events, proxies, and pay-per-event monetization. Built on top of the Apify API Client.
A Python utility to digitize plots. A web utility for non-batch mode is available at https://tools.dilawars.me/tool/plot-extractor
A Python client for the Sypht API
This repository provides usage examples for the Python module Newspaper3k.
PDFStract - Extract, Chunking and Embedding Layer in Your RAG Pipeline - Available as CLI - WEBUI - API