php-spider | A configurable and extensible PHP web spider | Crawler library

 by mvdbos | PHP | Version: v0.7.0 | License: MIT

kandi X-RAY | php-spider Summary

php-spider is a PHP library typically used in Automation, Crawler, and PhantomJS applications. It has no reported vulnerabilities, a permissive license, and medium support. However, it has 12 bugs. You can download it from GitHub.

A configurable and extensible PHP web spider

            Support

              php-spider has a medium active ecosystem.
              It has 1,291 stars, 237 forks, and 85 watchers.
              It had no major release in the last 6 months.
              There are 3 open issues and 39 closed issues. On average, issues are closed in 337 days. There is 1 open pull request and there are 0 closed pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of php-spider is v0.7.0

            Quality

              php-spider has 12 bugs (0 blocker, 0 critical, 12 major, 0 minor) and 42 code smells.

            Security

              php-spider has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              php-spider code analysis shows 0 unresolved vulnerabilities.
              There is 1 security hotspot that needs review.

            License

              php-spider is licensed under the MIT License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            Reuse

              php-spider releases are not available. You will need to build from source code and install.
              Installation instructions are not available. Examples and code snippets are available.
              php-spider saves you 1619 person hours of effort in developing the same functionality from scratch.
              It has 3596 lines of code, 222 functions and 84 files.
              It has low code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed php-spider and identified the functions below as its top functions. The list is intended to give you quick insight into the functionality php-spider implements and help you decide whether it suits your requirements.
            • Perform the crawl
            • Discover all URIs
            • Fetch a resource
            • Add a new URI
            • Check if the given URI matches the allowed host
            • Return the crawler
            • Complete the path
            • Called before the request is executed
            • Get all subscribed events
            • Get the event dispatcher

            Community Discussions

            QUESTION

            Using php-spider, is there a standard Xpath that might discover the URIs on most web sites?
            Asked 2020-Apr-16 at 20:51

            I am using the wonderful script entitled php-spider with the goal of scraping the Title, Desc, H1, H2, H3, and H4 from a few web sites. As part of configuring the script, it is necessary to set an 'XpathExpressionDiscoverer' to instruct the script how to find additional hyperlinks on each page for crawling. I assume this refers to the standard Xpath query language.

            My goal is to find an XpathExpressionDiscoverer that will generally work for most web sites (rather than requiring me to customize it for each site).

            Here is what I have tried:

            I noticed the example provided by the author uses a very specific XpathExpressionDiscoverer to crawl the given example site:

            ...

            ANSWER

            Answered 2020-Apr-16 at 20:51

            The example code contains this line:

            Source https://stackoverflow.com/questions/61221263
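
The question above asks for an XPath expression that finds links on most sites. A minimal sketch of the idea, assuming only PHP's built-in DOMDocument and DOMXPath rather than php-spider's own discoverer classes: a generic expression such as `//a[@href]` selects every anchor that carries an href, while a site-specific expression narrows the search to one part of the page. The helper name `discoverHrefs` and the sample HTML are hypothetical and exist only for illustration; whether php-spider's discoverer accepts these expressions unchanged is not confirmed by this page.

```php
<?php
// Hypothetical helper (not part of php-spider): return the href values of
// the elements that an XPath expression selects from an HTML string.
function discoverHrefs(string $html, string $xpathExpression): array
{
    $document = new DOMDocument();
    // Real-world HTML is rarely well formed, so collect parser warnings
    // internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($document);
    $nodes = $xpath->query($xpathExpression);
    if ($nodes === false) {
        throw new InvalidArgumentException("Invalid XPath: $xpathExpression");
    }

    $hrefs = [];
    foreach ($nodes as $node) {
        if ($node instanceof DOMElement) {
            $hrefs[] = $node->getAttribute('href');
        }
    }
    return $hrefs;
}

// Made-up sample page: one link inside a navigation block, one in the body,
// and one anchor without an href that should be ignored.
$html = '<html><body><div id="nav"><a href="/about">About</a></div>'
      . '<p><a href="https://example.com/docs">Docs</a> <a name="top">Top</a></p>'
      . '</body></html>';

// Generic expression: every anchor that has an href attribute.
print_r(discoverHrefs($html, '//a[@href]'));

// Site-specific expression: only anchors inside the element with id "nav".
print_r(discoverHrefs($html, '//div[@id="nav"]//a[@href]'));
```

The generic expression trades precision for coverage: it also picks up navigation, footer, and off-site links, so a crawler that uses it typically needs additional filtering, for example by allowed host.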

            Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.

            Vulnerabilities

            No vulnerabilities reported

            Install php-spider

            You can download it from GitHub.
            PHP requires the Visual C runtime (CRT). The Microsoft Visual C++ Redistributable for Visual Studio 2019 is suitable for all these PHP versions, see visualstudio.microsoft.com. You MUST download the x86 CRT for PHP x86 builds and the x64 CRT for PHP x64 builds. The CRT installer supports the /quiet and /norestart command-line switches, so you can also script it.

            Support

            To request new features, make suggestions, or report bugs, create an issue on GitHub. If you have questions, search for answers or ask on the Stack Overflow community page.
            CLONE
            • HTTPS: https://github.com/mvdbos/php-spider.git
            • CLI: gh repo clone mvdbos/php-spider
            • SSH: git@github.com:mvdbos/php-spider.git


            Consider Popular Crawler Libraries

            • scrapy by scrapy
            • cheerio by cheeriojs
            • winston by winstonjs
            • pyspider by binux
            • colly by gocolly

            Try Top Libraries by mvdbos

            • vdb-uri by mvdbos (PHP)
            • node-skeleton-test by mvdbos (JavaScript)
            • dotfiles by mvdbos (Perl)