scrapeulous | Cloud crawler functions | Crawler library
kandi X-RAY | scrapeulous Summary
Cloud crawler functions for scrapeulous
Top functions reviewed by kandi - BETA
- get img url
- get img tag url
- Returns the nth parent of the given element.
- Insert string into str at index
Community Discussions
Trending Discussions on scrapeulous
QUESTION
I need to visit a hundred thousand or more URLs and check whether each one redirects to a different final URL.
I'm using https://www.scrapeulous.com to do this, but I'll need to write a simple custom function to make it work. Scrapeulous uses the got library, whose documentation for the followRedirect option notes:
followRedirect
Type: boolean
Default: true
Defines if redirect responses should be followed automatically.
Note that if a 303 is sent by the server in response to any request type (POST, DELETE, etc.), Got will automatically request the resource pointed to in the location header via GET. This is in accordance with the spec.
and also notes for Response.url:
url
Type: string
The request URL or the final URL after redirects.
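As a minimal sketch of how the documented Response.url could be used for this check: the helper below compares the URL that was requested with the final URL reported after redirects. The name wasRedirected is hypothetical and is not part of got or scrapeulous; the normalization rules (ignoring the fragment, letting the URL parser lowercase the host and add a root path) are assumptions about what should count as "the same URL".

```javascript
// Hypothetical helper (not part of got or scrapeulous): reports whether the
// final URL differs from the requested one after basic normalization.
function wasRedirected(requestedUrl, finalUrl) {
  const requested = new URL(requestedUrl);
  const final = new URL(finalUrl);

  // Assumption: fragments are never sent to the server, so ignore them.
  requested.hash = '';
  final.hash = '';

  // The URL parser lowercases the host and adds "/" to a bare origin,
  // so "http://Example.com" and "http://example.com/" compare as equal.
  return requested.href !== final.href;
}

console.log(wasRedirected('http://example.com', 'https://example.org/home'));
```

With got, the second argument would presumably be the url property of the response returned for the first argument. This only detects redirects that got itself follows at the HTTP level.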
I've tried the following code to no avail:
...
ANSWER
Answered 2020-Nov-22 at 23:50
EDIT: Have more clarity on this.
First, there are three types of redirects (per this answer):
- HTTP - as information in the response headers (with a 3xx status code such as 301 or 302)
- HTML - as a meta tag in the HTML (Wikipedia: Meta refresh)
- JavaScript - as code such as window.location = new_url
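The HTML and JavaScript types do not appear in the response headers, so they have to be looked for in the page body. The following is a rough sketch under stated assumptions, not a definitive implementation: findClientSideRedirect is a hypothetical helper, and its regular expressions only cover the simplest forms (a meta refresh tag with a url= value, and a direct assignment of a string literal to location or location.href). Calls such as location.replace(...) or URLs built at runtime would be missed.

```javascript
// Hypothetical helper (not part of got or scrapeulous): scans an HTML string
// for the two client-side redirect types and returns the first one found.
function findClientSideRedirect(html) {
  // HTML redirect: <meta http-equiv="refresh" content="0; url=https://...">
  const metaTag = html.match(/<meta[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*>/i);
  if (metaTag) {
    const target = metaTag[0].match(/url\s*=\s*['"]?([^'"\s>;]+)/i);
    if (target) {
      return { type: 'html', target: target[1] };
    }
  }

  // JavaScript redirect: window.location = "..." or location.href = "..."
  // (assumption: the target is a plain string literal).
  const assignment = html.match(
    /(?:window\.|document\.)?location(?:\.href)?\s*=\s*["']([^"']+)["']/i
  );
  if (assignment) {
    return { type: 'javascript', target: assignment[1] };
  }

  // Nothing recognizable was found in the body.
  return null;
}

console.log(findClientSideRedirect('<script>window.location = "https://example.org/new";</script>'));
```

A regex scan like this is cheap enough to run over many pages, but a headless browser would be needed to follow redirects that only happen when scripts actually execute.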
With respect to the example domain, debianit.com: it redirects to experait.com via JavaScript, specifically through this script:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported