laravel by roach-php
Laravel adapter for Roach, the complete web scraping toolkit for PHP.
PHP 224 | Updated: 1 y ago | License: Permissive (MIT)

webscraping_python_selenium by gabrielfroes
Web scraping JavaScript-generated pages using Python and Selenium
Python 221 | Updated: 2 y ago | License: No License (No License)

short-jokes-dataset by amoudgl
Python scripts for building 'Short Jokes' dataset, featured on Kaggle
Python 220 | Updated: 3 y ago | License: Strong Copyleft (GPL-2.0)

media-scraper by elvisyjlin
Scrapes all photos and videos in a web page / Instagram / Twitter / Tumblr / Reddit / pixiv / TikTok
Python 217 | Updated: 4 y ago | License: Permissive (MIT)

amazon-scraper by scrapehero-code
A simple web scraper to extract Product Data and Pricing from Amazon
Python 216 | Updated: 1 y ago | License: No License (No License)

spatula by jamesturk
A modern Python library for writing maintainable web scrapers.
Python 214 | Updated: 2 y ago | License: Permissive (MIT)

transistor by bomquote
Transistor, a Python web scraping framework for intelligent use cases.
Python 213 | Updated: 2 y ago | License: Permissive (MIT)

proxyscrape by JaredLGillespie
Python library for retrieving free proxies (HTTP, HTTPS, SOCKS4, SOCKS5).
Python 212 | Updated: 2 y ago | License: Permissive (MIT)

pastebin_scraper by Critical-Start
Python 211 | Updated: 4 y ago | License: Permissive (MIT)

nutella-scrape by okdistribute
:chocolate_bar: learn to scrape the web with Node.js -- it tastes like chocolate
JavaScript 209 | Updated: 4 y ago | License: No License (No License)

trump-lies by justmarkham
Tutorial: Web scraping in Python with Beautiful Soup
Jupyter Notebook 209 | Updated: 3 y ago | License: No License (No License)

URLExtract by lipoja
URLExtract is a Python class for collecting (extracting) URLs from a given text based on locating TLDs.
Python 208 | Updated: 2 y ago | License: Permissive (MIT)

seekr by seekr-osint
A multi-purpose OSINT toolkit with a neat web-interface.
Go 208 | Updated: 1 y ago | License: Strong Copyleft (GPL-3.0)

InputScanner by zseano
PHP 206 | Updated: 2 y ago | License: No License (No License)

pupflare by unixfox
A webpage proxy that makes requests through Chromium (Puppeteer) - can be used to bypass Cloudflare anti-bot / anti-DDoS protection from any application (like curl)
JavaScript 199 | Updated: 1 y ago | License: No License (No License)

web-scraping by lkuffo
More than 50 web scraping examples using: Requests | Scrapy | Selenium | LXML | BeautifulSoup
Python 198 | Updated: 2 y ago | License: Strong Copyleft (GPL-3.0)

docs-scraper by meilisearch
Scrape documentation into Meilisearch
Python 197 | Updated: 1 y ago | License: Proprietary (Proprietary)

jsonframe-cheerio by gahabeen
Simple multi-level scraper with JSON input/output for Cheerio
JavaScript 196 | Updated: 4 y ago | License: Permissive (MIT)

goodreads-scraper by maria-antoniak
A Python scraper for Goodreads books and reviews.
Jupyter Notebook 193 | Updated: 2 y ago | License: Strong Copyleft (GPL-3.0)

images-scraper by pevers
Simple and fast scraper for Google
JavaScript 192 | Updated: 3 y ago | License: Permissive (ISC)

2016-new-coder-survey by freeCodeCamp
R 191 | Updated: 2 y ago | License: No License (No License)

scrapelib by jamesturk
⛏ a library for scraping unreliable pages
Python 190 | Updated: 2 y ago | License: Permissive (BSD-2-Clause)

mangadex-dl by frozenpandaman
Download manga from MangaDex.org
Python 184 | Updated: 2 y ago | License: Strong Copyleft (GPL-3.0)

python-web-scraping-tutorial by kjam
A Python-based web and data scraping tutorial
Python 183 | Updated: 4 y ago | License: No License (No License)

py-linkedin-jobs-scraper by spinlud
Python 183 | Updated: 1 y ago | License: Permissive (MIT)

bnkextr by eXpl0it3r
Wwise *.BNK File Extractor
C++ 182 | Updated: 1 y ago | License: Proprietary (Proprietary)

Instagram-Follower-Scraper by amitupreti
Scrapes all the data of followers of any Instagram account
Python 176 | Updated: 2 y ago | License: No License (No License)

gh-action-data-scraping by swyxio
Shows how to use GitHub Actions to do periodic data scraping
JavaScript 176 | Updated: 1 y ago | License: Permissive (MIT)

bandcamp-scraper by masterT
A scraper for https://bandcamp.com
JavaScript 173 | Updated: 2 y ago | License: Permissive (MIT)

SerpScrap by ecoron
SEO Python scraper to extract data from major search engine result pages. Extracts data such as URL, title, snippet, rich snippet, and type from search results for given keywords. It can detect ads or take automated screenshots. You can also fetch the text content of URLs provided in search results or supplied by you. It is useful for SEO and business-related research tasks.
Python 172 | Updated: 3 y ago | License: Permissive (MIT)

first-web-scraper by ireapps
A step-by-step guide to writing a web scraper with Python
Python 170 | Updated: 4 y ago | License: Strong Copyleft (GPL-3.0)

oust by addyosmani
Extract URLs to stylesheets, scripts, links, images or HTML imports from HTML
JavaScript 170 | Updated: 2 y ago | License: Permissive (Apache-2.0)

link-preview-generator by AndrejGajdos
Get preview data (a title, description, image, domain name) from a URL. The library uses the Puppeteer headless browser to scrape the website.
JavaScript 168 | Updated: 3 y ago | License: Permissive (MIT)

google-this by LuanRT
A simple yet powerful module to retrieve organic search results and much more from Google.
JavaScript 168 | Updated: 2 y ago | License: Permissive (MIT)

munich-scripts by okainov
Some useful scripts simplifying bureaucracy
Python 165 | Updated: 2 y ago | License: Permissive (MIT)

helena by schasins
A Chrome extension for writing custom web scraping programs and web automation programs. Just demonstrate how to collect the first row of data, then let the extension write the program for collecting all rows.
JavaScript 165 | Updated: 3 y ago | License: Permissive (BSD-2-Clause)

funda-scraper by khpeek
Scraper of the Dutch real estate website www.funda.nl, implemented in Python with Scrapy
Python 164 | Updated: 1 y ago | License: No License (No License)

Easy scraping library

zend-hydrator by zendframework
Object hydrators and array extraction
PHP 162 | Updated: 4 y ago | License: Proprietary (Proprietary)

PHP Link Checker

SEOMacroscope by nazuke
SEO Macroscope is a website scanning tool, to check your website for broken links; including some technical SEO functionality, site scraping, Excel reporting, and more.
C# 161 | Updated: 2 y ago | License: Strong Copyleft (GPL-3.0)

scraper by videomanagertools
A scraper that switches between normal mode and gentleman mode, built on Electron and React
TypeScript 159 | Updated: 2 y ago | License: Permissive (MIT)

tiktok-hashtag-analysis by bellingcat
Provides tools to analyze hashtags within posts scraped from TikTok.
Python 159 | Updated: 2 y ago | License: Permissive (MIT)

Humanoid by evyatarmeged
Node.js package to bypass CloudFlare's anti-bot JavaScript challenges
JavaScript 157 | Updated: 2 y ago | License: Permissive (MIT)

linkedin_learning_courses_downloader by mypan
LLCD is a simple Python scraper tool that downloads video lessons from LinkedIn Learning
Python 153 | Updated: 3 y ago | License: No License (No License)

juno_crawler by mattmurray
Scrapy crawler to collect data on the back catalog of songs listed for sale.
Python 150 | Updated: 4 y ago | License: No License (No License)

nflfastR-data by nflverse
Data scraped from nflfastR package
R 150 | Updated: 3 y ago | License: No License (No License)

scraperwiki-python by sensiblecodeio
ScraperWiki Python library for scraping and saving data
Python 148 | Updated: 4 y ago | License: Permissive (BSD-2-Clause)

shadow-useragent by lobstrio
Pick the most common user-agents on the Internet 👻
Python 148 | Updated: 3 y ago | License: Permissive (MIT)

scrapers by ThaWeatherman
Lots and lots of web scrapers
Python 147 | Updated: 4 y ago | License: Permissive (MIT)