Batch-download your favorited Douyin short videos.
Multithreaded crawler for full-size images from Sina Weibo albums.
Knownsec crawler challenge, continuously updated version.
Web Crawlers.
Python crawler exercises.
Larbin Web Crawler
Crawl blog posts from CSDN.
A plain and simple spider.
Scrapy_IPProxyPool by monkey-soft
Free IP proxy pool; a plugin for the Scrapy crawler framework.
Python 99 | Updated: 4 y ago | License: No License

ThesaurusSpider by WuLC
Python crawler that downloads dictionary files for the Sogou, Baidu, and QQ input methods; useful for building vocabularies for different industries.
Python 99 | Updated: 4 y ago | License: Permissive (MIT)

crawler_detect by loadkpi
Ruby gem to detect bots and crawlers via the user agent
Ruby 99 | Updated: 2 y ago | License: Permissive (MIT)

scrapy-phantomjs-downloader by flisky
PhantomJS Downloader for Scrapy, Yeah!
Python 98 | Updated: 4 y ago | License: No License

MediaQuickServer by zbfzn
Video-parsing API for Douyin and Kuaishou watermark removal, plus Weibo and NetEase Cloud Music videos.
Java 98 | Updated: 2 y ago | License: Permissive (Apache-2.0)

python-webcrawler by ewa
Simple web crawler written in Python
Python 97 | Updated: 4 y ago | License: No License

scaleable-crawler-with-docker-cluster by tonywangcn
A scalable and efficient crawler with a Docker cluster; crawls a million pages in 2 hours on a single machine.
Python 97 | Updated: 4 y ago | License: No License

get_tumblr_likes by cyang812
Download the content you have liked on Tumblr.
Python 96 | Updated: 4 y ago | License: No License

xueiqiu_spider by xiaobeibei26
Crawler for stock comments across all Shanghai and Shenzhen listings on Xueqiu.
Python 96 | Updated: 2 y ago | License: No License

android-apps-crawler by mssun
An extensible crawler for downloading Android applications in third-party markets.
Python 96 | Updated: 1 y ago | License: No License

compute-image-windows by GoogleCloudPlatform
Windows agents and scripts for Google Compute Engine images.
PowerShell 96 | Updated: 2 y ago | License: Permissive (Apache-2.0)

gopa-abandoned by medcl
GOPA, a spider written in Go. (NOTE: this project moved to https://github.com/infinitbyte/gopa)
Go 96 | Updated: 4 y ago | License: Proprietary

proxy-pool by denghuichao
Proxy IP pool service for crawlers; other crawler programs can fetch proxies through a REST API.
Java 95 | Updated: 4 y ago | License: No License

spider by iofu728
🕷 Website spider applications based on a proxy pool (supports HTTP & WebSocket).
Python 95 | Updated: 3 y ago | License: Permissive (MIT)

Taiwan-news-crawlers by TaiwanStat
Scrapy-based crawlers for Taiwan news
Python 95 | Updated: 1 y ago | License: Permissive (MIT)

spidy by twiny
Domain name collector: crawls a website and finds domain names along with their availability status.
Go 95 | Updated: 2 y ago | License: Permissive (MIT)

BeautifulSoup4 by il-vladislav
BeautifulSoup 4 for Python 3.3
Python 93 | Updated: 2 y ago | License: No License

ITBooks by howie6879
Get IT books for free from ebook websites such as allitebooks, digilibraries, etc.
Python 93 | Updated: 4 y ago | License: No License

ZhihuSpider by KEN-LJQ
Crawler for the public personal information of Zhihu users; can also crawl users' follow relationships. Python-based, using proxies and multithreading.
Python 93 | Updated: 3 y ago | License: Permissive (MIT)

electron-spider-jinshuju by jaxQin
Node.js crawler that scrapes and downloads Jinshuju forms, packaged in Electron with a visual interface for ease of use.
JavaScript 93 | Updated: 4 y ago | License: No License

Leetcode-Questions-Scraper by Bishalsarang
Scrape algorithm questions from LeetCode and generate HTML and EPUB files.
Python 93 | Updated: 1 y ago | License: No License

pycabook by pycabook
Source of the book "Clean Architectures in Python"
CSS 93 | Updated: 4 y ago | License: No License

Bashter by zerobyte-id-bak
Web Crawler, Scanner, and Analyzer Framework (Shell-Script based)
Shell 93 | Updated: 4 y ago | License: Permissive (BSD-3-Clause)

DriveIt by XIAZY
A new multithreaded crawler that supports multiple websites
Python 92 | Updated: 4 y ago | License: Permissive (WTFPL)

Multimodal_Retrieval.pytorch by gujiuxiang
Multi-Modal and Cross-Modal Retrieval
Python 92 | Updated: 4 y ago | License: No License

spider by omengye
Python movie info web crawler
Python 91 | Updated: 5 y ago | License: No License

ant_nest by strongbugman
Simple, clear, and fast web crawler framework built on Python 3.6+, powered by asyncio.
Python 91 | Updated: 4 y ago | License: Weak Copyleft (LGPL-3.0)

NewsCrawler by Jacen789
News crawler that scrapes real-time financial news from Sina, Sohu, and Xinhuanet.
Python 91 | Updated: 4 y ago | License: No License

scrapyscript by jschnurr
Run a Scrapy spider programmatically from a script or a Celery task - no project required.
Python 90 | Updated: 3 y ago | License: Permissive (MIT)

Distributed-Multi-User-Scrapy-System-with-a-Web-UI by aaldaber
Django-based application for creating, deploying, and running Scrapy spiders in a distributed manner
Python 90 | Updated: 4 y ago | License: No License

awesome-python-primer by zkqiang
Index of quality Chinese-language resources for self-taught Python beginners, including books, documentation, and videos; suited to crawling, web, data analysis, and machine learning.
Python 90 | Updated: 2 y ago | License: Permissive (MIT)

buzzprofilecrawl by petewarden
A simple script to crawl Google Profile pages and extract their information as structured data
PHP 90 | Updated: 5 y ago | License: No License

ZhihuSpider by Milloyy
Zhihu crawler that scrapes answers, questions, collections, columns, and more, and saves quality Zhihu content as HTML or Markdown.
Python 89 | Updated: 4 y ago | License: Proprietary

qq-zone by qinjintian
QQ Zone (Qzone) crawler: after logging in with a phone scan, it concurrently downloads original album photos and videos. Yes, it's that simple.
Go 89 | Updated: 2 y ago | License: No License

github_dis by dongfangyuxiao
A lightweight tool for collecting information leaked on GitHub.
Python 88 | Updated: 2 y ago | License: No License

4scanner by pboardman
Continuously search 4chan (and other imageboards) threads for images/webms and download them
Python 88 | Updated: 3 y ago | License: Permissive (MIT)

spider-course-2 by junglelord
HTML 88 | Updated: 5 y ago | License: No License

webcrawler by huntingzhu
Web crawler to download pictures from zhihu.com
Python 87 | Updated: 4 y ago | License: No License

CrawlerJS by CrawlerJS
Open-source crawler framework in Node.js.
JavaScript 87 | Updated: 4 y ago | License: No License

WeiboSpider by CharesFang
Weibo crawler: a lightweight Sina Weibo spider based on the Scrapy framework.
Python 87 | Updated: 2 y ago | License: Strong Copyleft (GPL-3.0)

hub-db by cdipaolo
Dataset of Adult Image Metadata for Training Spam Detection Models
Go 87 | Updated: 4 y ago | License: No License

robox by danclaudiupop
Simple library for exploring/scraping the web or testing a website you're developing
Python 87 | Updated: 2 y ago | License: Permissive (BSD-3-Clause)