Entry for js13k 2020
A cloud-drive (network disk) system built on go-zero
database meta crawl tool
A distributed cluster for targeted crawling
Collects Sina Weibo data
Python crawler spider
Lagou crawler (lagou spider)
Crawls links from some commonly used movie websites
Yet Another Web Spider
A self-hosted pool of free IP proxies.
Differential by LeiShi1313
A quick torrent-publishing tool that automatically generates PTGen, MediaInfo/BDInfo, and screenshots, and produces the content needed for a release
Python 75 Updated: 3 y ago License: Permissive (MIT)
fetchman by DarkSand
fetchman is a simple, easy-to-use crawler framework
Python 74 Updated: 4 y ago License: Permissive (Apache-2.0)
scrapy-examples by feiskyer
Some Scrapy and web.py examples
Python 74 Updated: 4 y ago License: No License (No License)
L-Spider by LEXUGE
A DHT spider that lets you sniff torrents and magnet links, which you can then download directly.
Python 74 Updated: 2 y ago License: Strong Copyleft (AGPL-3.0)
LeetCodeCN-Submissions-Crawler by JiayangWu
A crawler for code submissions on leetcode-cn (LeetCode China).
Python 74 Updated: 1 y ago License: No License (No License)
keyword_based_Sina_weibo_crawler by KaidiGuo
A web crawler for Sina Weibo that searches for and retrieves microblogs containing certain keywords; a simple Python crawler exercise
Python 74 Updated: 3 y ago License: No License (No License)
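parse-video by wujunwei928
Short-video watermark removal in Golang: Douyin, Pipixia, Huoshan, Weishi, Zuiyou, Kuaishou, Quanmin Xiaoshipin, Pipi Gaoxiao, Xigua Video, Huya, Pear Video, AcFun, Haokan Video...
Go 74 Updated: 2 y ago License: No License (No License)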
AiPa by onblog
A compact, flexible Java multi-threaded crawler framework (AiPa) with Jsoup built in and zero-cost onboarding.
Java 73 Updated: 4 y ago License: Permissive (Apache-2.0)
light-crawler by zhang2333
A simplified, directed, customizable website crawler
JavaScript 73 Updated: 4 y ago License: Permissive (MIT)
crawler-cases-demo by pirateminds
HollyJS Moscow
JavaScript 73 Updated: 3 y ago License: Permissive (MIT)
SimpleZoomeye by OneSourceCat
A simple ZoomEye written in Python; for more details, see: http://blog.csdn.net/u011721501/article/details/41967847
PHP 73 Updated: 4 y ago License: No License (No License)
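DJH-Spider by djhworkhardeveryday
Python crawlers: basics, advanced topics, and frameworks; CSDN, Qiushibaike, Baidu Tieba, Taobao MM, Douban movie rankings, Tencent recruitment site, Douyu streamers, car websites, Baidu Scholar, Bing Academic, encyclopedias, financial entity relations, Weibo (users, posts, comments, social network), and Twitter
Jupyter Notebook 73 Updated: 3 y ago License: No License (No License)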
lrabbit_scrapy by litter-rabbit
A quick-start Python multi-threaded crawler
Python 73 Updated: 3 y ago License: Permissive (MIT)
crawler_examples by WiseDoge
Some classic web crawler projects.
Python 72 Updated: 4 y ago License: Permissive (Apache-2.0)
DaveDaveFind by ecmendenhall
A simple search engine based on the web crawler developed in Udacity's CS101 course.
Python 72 Updated: 4 y ago License: Proprietary (Proprietary)
scrapy-spider-example by AccordBox
Scrapy spider example for Scrapy Tutorial Series
Python 72 Updated: 4 y ago License: No License (No License)
Samurai by OffXec
Samurai Email Discovery (SED) is an email discovery framework that grabs emails via Google dork, company name, or domain name.
Shell 72 Updated: 3 y ago License: No License (No License)
customer-review-crawler by maifeng
A crawler to collect reviews and product information on Amazon.com
Java 71 Updated: 4 y ago License: Permissive (Unlicense)
common-crawl by matpalm
Playing around with the Common Crawl dataset
Java 71 Updated: 4 y ago License: No License (No License)
scrapy-wayback-machine by sangaline
A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Python 71 Updated: 4 y ago License: Permissive (ISC)
weibo_Hot_Search by Writeup001
Weibo crawler: crawls the Weibo hot search list on a daily schedule, preserving the memories of internet users.
Python 71 Updated: 4 y ago License: Strong Copyleft (GPL-3.0)
seCrawler by xtt129
A Scrapy project that can crawl search results from Google/Bing/Baidu
Python 71 Updated: 4 y ago License: No License (No License)
ms17-010-m4ss-sc4nn3r by claudioviviani
MS17-010 multithreaded scanner written in Python.
Python 71 Updated: 4 y ago License: No License (No License)
paper_file by ghostnothing
the scrapy_paper crawls the page file
HTML 71 Updated: 3 y ago License: No License (No License)
AlipaySpider-Scrapy by sunhailin-Leo
AlipaySpider on Scrapy (uses ChromeDriver); an Alipay crawler based on Scrapy
Python 70 Updated: 4 y ago License: No License (No License)
venom by PreferredAI
Your preferred open source focused crawler for the deep web.
Java 70 Updated: 2 y ago License: Permissive (Apache-2.0)
ScrapySeleniumTest by Python3WebSpider
Scrapy Selenium on Taobao Product
Python 70 Updated: 4 y ago License: No License (No License)
webkitcrawler by 7ws
QtWebKit-based web crawler
Python 70 Updated: 4 y ago License: No License (No License)
abotx by sjdirect
Cross-platform C# web crawler framework, headless browser, parallel crawler. Please star this project! +1.
C# 70 Updated: 4 y ago License: No License (No License)
wildsearch-crawler by wondersell
A tool for collecting data on categories, products, and product positions within categories on Wildberries and other Russian marketplaces
Python 70 Updated: 2 y ago License: No License (No License)
castget by mlj
A simple, command-line based RSS enclosure downloader, primarily intended for automatic, unattended downloading of podcasts.
C 70 Updated: 3 y ago License: Proprietary (Proprietary)
redis-oxide by dpbriggs
Multi-threaded implementation of Redis written in Rust
Rust 70 Updated: 4 y ago License: Strong Copyleft (GPL-3.0)
sina_weibo_crawler by yanshengli
Crawls Sina Weibo using urllib2 and BeautifulSoup
Python 69 Updated: 5 y ago License: Permissive (Apache-2.0)
Quora-Crawler by scku
Python crawler for quora.com
Python 69 Updated: 4 y ago License: No License (No License)
pdf-crawler by SimFin
SimFin's open source PDF crawler
Python 69 Updated: 3 y ago License: No License (No License)
scrapy-rabbitmq by roycehaynes
A RabbitMQ Scheduler for Scrapy
Python 69 Updated: 4 y ago License: Permissive (MIT)
pagerank by alixaxel
Weighted PageRank implementation in Go
Go 69 Updated: 4 y ago License: Permissive (MIT)
learning_spider by RecluseXU
These are really study notes, including learning records, crawler practice platforms (websites), and homemade tool scripts
HTML 69 Updated: 2 y ago License: Permissive (MIT)
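DJH-Spider by jasonhavenD
Python crawlers: basics, advanced topics, and frameworks; CSDN, Qiushibaike, Baidu Tieba, Taobao MM, Douban movie rankings, Tencent recruitment site, Douyu streamers, car websites, Baidu Scholar, Bing Academic, encyclopedias, financial entity relations, Weibo (users, posts, comments, social network), and Twitter
Jupyter Notebook 69 Updated: 3 y ago License: No License (No License)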
zhihuWebSpider by QiuMing
A Zhihu crawler; a Java web spider based on the webmagic framework.
Java 68 Updated: 4 y ago License: No License (No License)