.NET-based web crawler
Python crawler: rolling extraction of website URLs from around the world
An Extensible Image Crawler
A Python 3 crawler for the Pornhub website
🌟⏳🌟 Automatic check-in for various websites (no longer maintained)
National Natural Science Foundation of China grant lookup
weibo_scrapy by yoyzhou
WEIBO_SCRAPY is a multi-threaded Sina Weibo data extraction framework in Python.
Python 154 | Updated: 4 y ago | License: No License
adstxtcrawler by InteractiveAdvertisingBureau
A reference implementation in Python of a simple crawler for Ads.txt
Python 154 | Updated: 2 y ago | License: No License
corpuscrawler by google
Crawler for linguistic corpora
Python 153 | Updated: 3 y ago | License: Proprietary
JavPy by TheodoreKrypton
Enjoy driving on a Javascriptive (originally Pythonic) way to Japanese AV!
JavaScript 153 | Updated: 3 y ago | License: Permissive (Apache-2.0)
HttpCode.Core by stulzq
Simple, easy to use, and efficient: an opinionated open-source .NET HTTP request framework. It can be used to build crawlers, make API requests, and more.
C# 153 | Updated: 4 y ago | License: Permissive (Apache-2.0)
ZhihuQuestionsSpider by StevenKin
:blush::blush::blush: Zhihu question crawler
Java 152 | Updated: 4 y ago | License: Permissive (MIT)
cocrawler by cocrawler
CoCrawler is a versatile web crawler built using modern tools and concurrency.
Python 152 | Updated: 3 y ago | License: Permissive (Apache-2.0)
jlitespider by luohaha
A lite distributed Java spider framework :-)
Java 151 | Updated: 4 y ago | License: Permissive (Apache-2.0)
EasyCSRF by 0ang3el
Python 151 | Updated: 4 y ago | License: No License
netease-music-spider by wenhaoliang
netease-music-spider is a spider with which you can find a beautiful girlfriend or a handsome boyfriend.
Python 150 | Updated: 2 y ago | License: No License
hncrawl by mvanveen
A Scrapy-based Hacker News crawler.
Python 150 | Updated: 4 y ago | License: Permissive (MIT)
scrapy_demo by BruceDone
All kinds of Scrapy demos
Python 150 | Updated: 2 y ago | License: No License
neocrawler by ahkimkoo
Node.js crawler with scheduler, spider, web UI configuration, and proxy modules. Built with Node.js, Redis/SSDB, HBase, and PhantomJS. Supports CSS selector and regex extraction rules.
JavaScript 150 | Updated: 3 y ago | License: Proprietary
python-dcdownloader by dev-techmoe
A fully asynchronous batch manga downloader (crawler) for 动漫之家 (dmzj), written in Python
Python 148 | Updated: 3 y ago | License: Permissive (MIT)
weibo_analysis by dingmyu
A Python crawler that automatically crawls the original microblogs and pictures of a specified user, analyzes the microblogs, and displays the results as HTML charts.
HTML 147 | Updated: 4 y ago | License: No License
rpqueue by josiahcarlson
Redis Priority Queue offers a priority/timeline based queue for use with Redis
Python 146 | Updated: 2 y ago | License: Weak Copyleft (LGPL-2.1)
scrape-linkedin by ericfourrier
Scrape a public LinkedIn profile.
Python 146 | Updated: 3 y ago | License: Permissive (MIT)
evine by saeeddhqan
Interactive CLI Web Crawler
Go 146 | Updated: 2 y ago | License: Strong Copyleft (GPL-3.0)
Zhihu_bigdata by yoghurtjia
Data analysis of 3 million Zhihu users with Scrapy and pandas. Scrapy first crawls the profiles of 3 million Zhihu users; pandas then filters the data to find the Zhihu experts of interest and visualizes the results as charts.
HTML 146 | Updated: 4 y ago | License: No License
crawler by trandoshan-io
Go process used to crawl websites
Go 146 | Updated: 4 y ago | License: Strong Copyleft (GPL-3.0)
COVID-19-KSH by whhsky
HTML + Python + Django + crawler + pyecharts: real-time epidemic updates
JavaScript 146 | Updated: 2 y ago | License: No License
scrapy-zhihu-users by ansenhuang
Crawls Zhihu user data with Scrapy
Python 145 | Updated: 4 y ago | License: No License
CrawlerProject by LMFrank
Crawler projects: Lianjia (plain/Scrapy), Hupu, Wikipedia, Baidu Maps API, Fang.com (distributed crawler), WeChat official accounts (crawled through a proxy pool)
Python 145 | Updated: 3 y ago | License: No License
scrapy_guru by michael-yin
Everybody can be a Scrapy guru
JavaScript 145 | Updated: 4 y ago | License: Strong Copyleft (GPL-3.0)
soksaccounts by chenjiandongx
🔥 Shadowsocks account crawler
Python 144 | Updated: 4 y ago | License: No License
tieba_sign by Aruelius
📱 Baidu Tieba multi-threaded QR-code login / automatic check-in / automatic CAPTCHA solving
Python 144 | Updated: 3 y ago | License: Permissive (MIT)
weiquncrawler by liuslevis
A crawler for information on the Sina Weiqun website (WAP), including a given Weiqun's posts, replies, users, and their follow relations. Written in Python 2.7.1; stores data in SQLite3. The relation-crawling part is adapted from the GitHub project sina_reptile.
Python 144 | Updated: 4 y ago | License: No License
galer by dwisiswant0
A fast tool to fetch URLs from HTML attributes by crawl-in.
Shell 144 | Updated: 3 y ago | License: Permissive (MIT)
massivedl by dimkouv
Download a large list of files concurrently
Go 144 | Updated: 3 y ago | License: Strong Copyleft (GPL-3.0)
crawler by bplawler
Scala DSL for web crawling
Scala 144 | Updated: 4 y ago | License: Proprietary
weibosearch by tpeng
A distributed Sina Weibo search spider based on Scrapy and Redis.
Python 143 | Updated: 4 y ago | License: No License
douban-group-spider by kaito-kidd
A crawler for Douban group posts.
Python 143 | Updated: 1 y ago | License: No License
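DouyuBarrage-Pro by Crawler995
(Latest for 2020) Second version of a Douyu danmaku (bullet comment) capture and visualization management platform. Features include danmaku capture, real-time visualization of the danmaku send rate, capture history lookup, danmaku download, custom keyword statistics, loyal-fan statistics, automatic capture of highlight moments, and word clouds of high-frequency danmaku. Lift-off~~~
TypeScript 143 | Updated: 2 y ago | License: No License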
RecordWav by shaoshuai904
Android WAV/PCM recorder that supports pausing and resuming recording, plus a mode that skips silent sections.
Java 141 | Updated: 1 y ago | License: No License
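NGCBot by ngc660sec
A WeChat bot based on a ✨hook mechanism. Supports 🌱scheduled security news pushes [FreeBuf, 先知 (Xianzhi), 安全客 (Anquanke), 奇安信攻防社区 (QiAnXin attack and defense community)], 👯file extension lookup, ⚡ICP filing lookup, ⚡phone number location lookup, ⚡WHOIS lookup, 🎉horoscope lookup, ⚡weather lookup, 🌱a slacking-off calendar, ⚡微步 (ThreatBook) threat intelligence lookup, 🐛videos of beautiful women, ⚡pictures of beautiful women, and a 👯help menu. 📫 Supports a points system and is 😄highly customizable, so even beginners can get started easily!
Python 141 | Updated: 1 y ago | License: Strong Copyleft (GPL-3.0)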
IPProxy by ZKeeer
IP proxies for crawlers: scrapes proxy IPs from nine websites, then checks, cleans, stores, and updates them; also adds a calling interface.
Python 140 | Updated: 3 y ago | License: No License
kuaishou-crawler by oGsLP
As you can see, a Kuaishou crawler
Python 140 | Updated: 3 y ago | License: Permissive (MIT)
Facebook-Page-Crawler by chenjr0719
A Python crawler that uses the Facebook Graph API to crawl a fan page's public posts, comments, and reactions.
Python 140 | Updated: 3 y ago | License: Permissive (MIT)
PornSpider by QuantumLiu
A parallel web spider for the adult website Pornhub.
HTML 140 | Updated: 3 y ago | License: Strong Copyleft (GPL-3.0)
go-crawler by liunian1004
A useful crawler project for practice
Go 140 | Updated: 4 y ago | License: No License
images_spider by 5ime
Watermark removal for short-video image galleries and pictures: Kuaishou, Pipixia, Zuiyou, Xiaohongshu, Weibo
PHP 140 | Updated: 2 y ago | License: No License
mm131 by qwertyuiop6
Image crawler for the MM131 website :rotating_light:
Python 137 | Updated: 3 y ago | License: No License
spotlight.js by bestiejs
An object crawler/property search library.
JavaScript 137 | Updated: 4 y ago | License: Permissive (MIT)
rubyretriever by joenorton
Asynchronous Web Crawler & Scraper
Ruby 137 | Updated: 4 y ago | License: Proprietary