laravel by roach-php
Laravel adapter for Roach, the complete web scraping toolkit for PHP.
PHP
224
Updated: 2 y ago
License: Permissive (MIT)

webscraping_python_selenium by gabrielfroes
Web Scraping Javascript generated pages using Python and Selenium
Python
221
Updated: 2 y ago
License: No License (No License)

short-jokes-dataset by amoudgl
Python scripts for building 'Short Jokes' dataset, featured on Kaggle
Python
220
Updated: 4 y ago
License: Strong Copyleft (GPL-2.0)

media-scraper by elvisyjlin
Scrapes all photos and videos in a web page / Instagram / Twitter / Tumblr / Reddit / pixiv / TikTok
Python
217
Updated: 4 y ago
License: Permissive (MIT)

amazon-scraper by scrapehero-code
A simple web scraper to extract Product Data and Pricing from Amazon
Python
216
Updated: 2 y ago
License: No License (No License)

spatula by jamesturk
A modern Python library for writing maintainable web scrapers.
Python
214
Updated: 2 y ago
License: Permissive (MIT)

transistor by bomquote
Transistor, a Python web scraping framework for intelligent use cases.
Python
213
Updated: 2 y ago
License: Permissive (MIT)

proxyscrape by JaredLGillespie
Python library for retrieving free proxies (HTTP, HTTPS, SOCKS4, SOCKS5).
Python
212
Updated: 2 y ago
License: Permissive (MIT)

pastebin_scraper by Critical-Start
Python
211
Updated: 4 y ago
License: Permissive (MIT)

nutella-scrape by okdistribute
:chocolate_bar: learn to scrape the web with Node.js -- it tastes like chocolate
JavaScript
209
Updated: 5 y ago
License: No License (No License)

trump-lies by justmarkham
Tutorial: Web scraping in Python with Beautiful Soup
Jupyter Notebook
209
Updated: 4 y ago
License: No License (No License)

URLExtract by lipoja
URLExtract is a Python class for collecting (extracting) URLs from a given text based on locating TLDs.
Python
208
Updated: 2 y ago
License: Permissive (MIT)

seekr by seekr-osint
A multi-purpose OSINT toolkit with a neat web-interface.
Go
208
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)

InputScanner by zseano
PHP
206
Updated: 2 y ago
License: No License (No License)

pupflare by unixfox
A webpage proxy that makes requests through Chromium (puppeteer). It can be used to bypass Cloudflare anti-bot / anti-DDoS protection from any application (like curl).
JavaScript
199
Updated: 2 y ago
License: No License (No License)

web-scraping by lkuffo
More than 50 web scraping examples using: Requests | Scrapy | Selenium | LXML | BeautifulSoup
Python
198
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)

docs-scraper by meilisearch
Scrape documentation into Meilisearch
Python
197
Updated: 2 y ago
License: Proprietary (Proprietary)

jsonframe-cheerio by gahabeen
Simple multi-level scraper with JSON input/output for Cheerio
JavaScript
196
Updated: 4 y ago
License: Permissive (MIT)

goodreads-scraper by maria-antoniak
A Python scraper for Goodreads books and reviews.
Jupyter Notebook
193
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)

images-scraper by pevers
Simple and fast scraper for Google
JavaScript
192
Updated: 3 y ago
License: Permissive (ISC)

2016-new-coder-survey by freeCodeCamp
R
191
Updated: 2 y ago
License: No License (No License)

scrapelib by jamesturk
⛏ a library for scraping unreliable pages
Python
190
Updated: 2 y ago
License: Permissive (BSD-2-Clause)

mangadex-dl by frozenpandaman
Download manga from MangaDex.org
Python
184
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)

python-web-scraping-tutorial by kjam
A Python-based web and data scraping tutorial
Python
183
Updated: 4 y ago
License: No License (No License)

py-linkedin-jobs-scraper by spinlud
Python
183
Updated: 2 y ago
License: Permissive (MIT)

bnkextr by eXpl0it3r
Wwise *.BNK File Extractor
C++
182
Updated: 2 y ago
License: Proprietary (Proprietary)

Instagram-Follower-Scraper by amitupreti
Scrapes all the follower data of any Instagram account
Python
176
Updated: 2 y ago
License: No License (No License)

gh-action-data-scraping by swyxio
Shows how to use GitHub Actions to do periodic data scraping
JavaScript
176
Updated: 2 y ago
License: Permissive (MIT)

bandcamp-scraper by masterT
A scraper for https://bandcamp.com
JavaScript
173
Updated: 3 y ago
License: Permissive (MIT)

SerpScrap by ecoron
SEO Python scraper to extract data from major search engine result pages. Extract data like URL, title, snippet, rich snippet and the type from search results for given keywords. Detect ads or make automated screenshots. You can also fetch the text content of URLs provided in search results or supplied by you. It's useful for SEO and business-related research tasks.
Python
172
Updated: 4 y ago
License: Permissive (MIT)

first-web-scraper by ireapps
A step-by-step guide to writing a web scraper with Python
Python
170
Updated: 4 y ago
License: Strong Copyleft (GPL-3.0)

oust by addyosmani
Extract URLs to stylesheets, scripts, links, images or HTML imports from HTML
JavaScript
170
Updated: 2 y ago
License: Permissive (Apache-2.0)

link-preview-generator by AndrejGajdos
Get preview data (a title, description, image, domain name) from a URL. The library uses the puppeteer headless browser to scrape the website.
JavaScript
168
Updated: 3 y ago
License: Permissive (MIT)

google-this by LuanRT
A simple yet powerful module to retrieve organic search results and much more from Google.
JavaScript
168
Updated: 2 y ago
License: Permissive (MIT)

munich-scripts by okainov
Some useful scripts simplifying bureaucracy
Python
165
Updated: 2 y ago
License: Permissive (MIT)

helena by schasins
A Chrome extension for writing custom web scraping programs and web automation programs. Just demonstrate how to collect the first row of data, then let the extension write the program for collecting all rows.
JavaScript
165
Updated: 4 y ago
License: Permissive (BSD-2-Clause)

funda-scraper by khpeek
Scraper of the Dutch real estate website www.funda.nl, implemented in Python with Scrapy
Python
164
Updated: 2 y ago
License: No License (No License)

Easy scraping library

zend-hydrator by zendframework
Object hydrators and array extraction
PHP
162
Updated: 4 y ago
License: Proprietary (Proprietary)

PHP Link Checker

SEOMacroscope by nazuke
SEO Macroscope is a website scanning tool, to check your website for broken links; including some technical SEO functionality, site scraping, Excel reporting, and more.
C#
161
Updated: 2 y ago
License: Strong Copyleft (GPL-3.0)

scraper by videomanagertools
A scraper that switches between normal mode and gentleman mode, built on Electron and React
TypeScript
159
Updated: 2 y ago
License: Permissive (MIT)

tiktok-hashtag-analysis by bellingcat
Provides tools to analyze hashtags within posts scraped from TikTok.
Python
159
Updated: 2 y ago
License: Permissive (MIT)

Humanoid by evyatarmeged
Node.js package to bypass CloudFlare's anti-bot JavaScript challenges
JavaScript
157
Updated: 2 y ago
License: Permissive (MIT)

linkedin_learning_courses_downloader by mypan
LLCD is a simple Python scraper tool that downloads video lessons from LinkedIn Learning
Python
153
Updated: 4 y ago
License: No License (No License)

juno_crawler by mattmurray
Scrapy crawler to collect data on the back catalog of songs listed for sale.
Python
150
Updated: 4 y ago
License: No License (No License)

nflfastR-data by nflverse
Data scraped from nflfastR package
R
150
Updated: 4 y ago
License: No License (No License)

scraperwiki-python by sensiblecodeio
ScraperWiki Python library for scraping and saving data
Python
148
Updated: 4 y ago
License: Permissive (BSD-2-Clause)

shadow-useragent by lobstrio
Pick the most common user-agents on the Internet 👻
Python
148
Updated: 4 y ago
License: Permissive (MIT)

scrapers by ThaWeatherman
Lots and lots of web scrapers
Python
147
Updated: 4 y ago
License: Permissive (MIT)