spidermonkey | DEFUNCT : PHP Web Spider | Crawler library
kandi X-RAY | spidermonkey Summary
PHP Web Scraping Engine. I started this project in January 2011. It was going to be an easy-to-use web scraper that anyone could configure. It has an attractive GUI built with jQuery UI elements. Selectors can be entered using three different methods. The first is the tried-and-true regular expression method. The second, which is the easiest and most powerful to use, is CSS selectors: someone could visit the pages they want to scrape, use their developer console to debug some selectors, and then toss them into the engine. The third method is a simplified regex syntax I call asterisk, where the user enters the start string and the end string with an asterisk in the middle. There is also an easy-to-use configuration screen for data storage. The engine was going to be smart enough to build different MySQL tables and even build the relationships between the data. Alternatively, data could be stored in XML or JSON documents, and a hierarchy would be maintained. But I never finished the project.
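The "asterisk" syntax described above can be sketched as a translation into an ordinary regular expression. This is only an illustration written in JavaScript, not code from the project (which is PHP); the function names escapeRegExp, asteriskToRegExp, and extractAll are hypothetical, and the sketch assumes a single asterisk per pattern.

```javascript
// Hypothetical helpers: none of these names come from the spidermonkey source.

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Turn 'START*END' into a RegExp that captures the text between START and END.
function asteriskToRegExp(pattern) {
  const star = pattern.indexOf('*');
  if (star === -1) {
    throw new Error('pattern must contain an asterisk');
  }
  const start = escapeRegExp(pattern.slice(0, star));
  const end = escapeRegExp(pattern.slice(star + 1));
  // [\s\S]*? matches lazily across newlines and stops at the first END.
  return new RegExp(start + '([\\s\\S]*?)' + end, 'g');
}

// Return every captured value found in the given text.
function extractAll(text, pattern) {
  return Array.from(text.matchAll(asteriskToRegExp(pattern)), (m) => m[1]);
}
```

For example, extractAll(html, '<li>*</li>') would collect the contents of each list item in html. Escaping the start and end strings keeps characters such as $ or . literal, which is presumably what a user of a simplified syntax would expect.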
Top functions reviewed by kandi - BETA
- Read a tag
- Seek to xpath
- Perform a rolling request
- Parse the charset
- Get inner text of node
- Get options for the curl request
- Loads a class
- Get singleton instance
- Get the HTTP status
- Get queue item
Community Discussions
Trending Discussions on spidermonkey
QUESTION
Consider this chunk of code
...ANSWER
Answered 2021-May-03 at 19:11
If you want part of a math expression to be precalculated outside of a function invocation you can do something like this:
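The answer's original snippet is not reproduced on this page. One possible sketch of the idea, with hypothetical names (scaleByTau, tau), is a closure that computes the constant part once and reuses it on every call:

```javascript
// Hypothetical example: these names do not come from the original question.
const scaleByTau = (() => {
  // Computed once, when the outer function runs.
  const tau = 2 * Math.PI;
  // Each later call reuses the stored value instead of recomputing it.
  return (x) => x * tau;
})();
```

Whether this is measurably faster than writing x * 2 * Math.PI inline depends on the engine, so it is worth measuring before relying on it.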
QUESTION
As we already know, one of the differences between an array and object is:
"If you want to supply specific keys, the only choice is an object. If you don't care about the keys, an array it is" (Read more here)
Besides, according to MDN's documentation:
Arrays cannot use strings as element indexes (as in an associative array) but must use integers
However, I was surprised that:
...ANSWER
Answered 2021-Apr-21 at 17:14
An array in JavaScript is not a separate type, but a subclass of object (an instance of the class Array, in fact). The array (and list) semantics are layered on top of the behavior that you get automatically with every object in JavaScript. JavaScript objects are actually instances of the associative array abstract data type, also known as a dictionary, map, table, etc., and as such can have arbitrarily-named properties.
The rest of the section of the MDN docs you quoted is important:
Setting or accessing via non-integers using bracket notation (or dot notation) will not set or retrieve an element from the array list itself, but will set or access a variable associated with that array's object property collection. The array's object properties and list of array elements are separate, and the array's traversal and mutation operations cannot be applied to these named properties.
So sure, you can always set arbitrary properties on an array, because it's an object. When you do this, you might, depending on implementation, get a weird-looking result when you display the array in the console. But if you iterate over the array using the standard mechanisms (the for...of loop or .forEach method), those properties are not included:
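The answer's original snippet is not reproduced on this page. A minimal sketch of the behavior it describes, using made-up variable names:

```javascript
const fruits = ['apple', 'banana'];
fruits.color = 'red'; // a named property on the array object, not an element

// for...of visits only the array elements.
const viaForOf = [];
for (const item of fruits) {
  viaForOf.push(item);
}

// forEach also visits only the array elements.
const viaForEach = [];
fruits.forEach((item) => viaForEach.push(item));

// for...in enumerates property keys, so the named property shows up here.
const viaForIn = [];
for (const key in fruits) {
  viaForIn.push(key);
}
```

Here viaForOf and viaForEach both hold only 'apple' and 'banana', and fruits.length stays 2, while viaForIn holds '0', '1', and 'color'. This is one reason for...in is usually avoided for iterating arrays.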
QUESTION
Doing a small check, it looks like neither V8 nor spidermonkey unroll loops, even if it is completely obvious, how long they are (literal as condition, declared locally):
...ANSWER
Answered 2021-Feb-06 at 21:30
I recommend you read this answer, as it explains it quite clearly. In a nutshell, unrolling doesn't necessarily mean that the code will run faster.
For example, if, instead of a simple counter++, you had a function call (taken from the linked answer):
QUESTION
I'm building a babel plugin and can find numerous examples of already written plugins in the Babel repo.
What I can't find is definitive API documentation for writing such a plugin, especially for the operations I can perform on the resulting AST.
I have checked
- babel itself (which is also what's linked from the ASTExplorer README)
- Acorn documentation
- Googled "Babel transform API"
- The Babel Handbook
- Mozilla Parser API and the ESTree API linked therein
Just to list a few of the places. None of which have even defined the ubiquitous .get method I see called so often in existing plugins, to say nothing of other functions I can call on a path, node, scope, or binding.
Does a definitive source of documentation exist for Babel 7 transforms? If so, where is it?
...ANSWER
Answered 2020-Jul-26 at 23:26
I'm not a "babel" expert, but after a few hours, this is what I've found out. There is no documentation of the API other than the actual source code.
As an example, I've decided to use this plugin, called babel-plugin-transform-spread. Open those links as we go further.
The first stop is AST Spec. In the source code of the plugin above I see some CallExpression, which can easily be found in the spec. According to the spec, this type has a couple of properties (e.g. callee and arguments). And I can see a clear usage of them in the source code too. Nothing special at this point.
But you might want to ask: okay, but what about methods?
Let's have a look at ArrayExpression, for example. There are no methods in the spec. But in the source code, there are a lot of them, like .replaceWith(). Where the heck did this come from? I found this API doc. Quite old, yes, but still helpful IMO. Try to find replaceWith on this page and you will see some hints like babel-core.traverse.NodePath.prototype.replaceWith.
Okay, the next step is to open babel's GitHub page and to find something about replaceWith in babel/packages/babel-traverse. That leads us to this line. And right here you can see other related methods.
As an exercise, you can open the babel handbook and try to find something else, getPrevSibling for example. And again, open GitHub, open search, see the results, and here we go. Or findParent, or insertAfter, etc.
This method is not the easiest one, but without proper documentation, this is what we have to deal with. Unfortunately.
QUESTION
This issue seems to only affect Chrome/V8, and may not be reproducible in Firefox or other browsers. In summary, the execution time of a function callback increases by an order of magnitude or more if the function is called with a new callback anywhere else.
Simplified Proof-of-Concept
Calling test(callback) arbitrarily many times works as expected, but once you call test(differentCallback), the execution time of the test function increases dramatically no matter what callback is provided (i.e., another call to test(callback) would suffer as well).
This example was updated to use arguments so as to not be optimized to an empty loop. Callback arguments a and b are summed and added to total, which is logged.
ANSWER
Answered 2020-Jul-03 at 01:15
V8 developer here. It's not a bug, it's just an optimization that V8 doesn't do. It's interesting to see that Firefox seems to do it...
FWIW, I don't see "ballooning to 400ms"; instead (similar to Jon Trent's comment) I see about 2.5ms at first, and then around 11ms.
Here's the explanation:
When you click only one button, then transition only ever sees one callback. (Strictly speaking it's a new instance of the arrow function every time, but since they all stem from the same function in the source, they're "deduped" for type feedback tracking purposes. Also, strictly speaking it's one callback each for stateTransition and transitionCondition, but that just duplicates the situation; either one alone would reproduce it.) When transition gets optimized, the optimizing compiler decides to inline the called function, because having seen only one function there in the past, it can make a high-confidence guess that it's also always going to be that one function in the future. Since the function does extremely little work, avoiding the overhead of calling it provides a huge performance boost.
Once the second button is clicked, transition sees a second function. It must get deoptimized the first time this happens; since it's still hot it'll get reoptimized soon after, but this time the optimizer decides not to inline, because it's seen more than one function before, and inlining can be very expensive. The result is that from this point onwards, you'll see the time it takes to actually perform these calls. (The fact that both functions have identical source doesn't matter; checking that wouldn't be worth it because outside of toy examples that would almost never be the case.)
There's a workaround, but it's something of a hack, and I don't recommend putting hacks into user code to account for engine behavior. V8 does support "polymorphic inlining", but (currently) only if it can deduce the call target from some object's type. So if you construct "config" objects that have the right functions installed as methods on their prototype, you can get V8 to inline them. Like so:
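The answer's original snippet is not reproduced on this page. As a rough sketch of what such "config" objects might look like, the classes below install stateTransition and transitionCondition as prototype methods. The class names, the method bodies, and the signature of transition are all assumptions made for illustration, and whether V8 actually inlines these calls depends on its version and heuristics.

```javascript
// Hypothetical config classes: the callbacks live on each class's prototype,
// so the call target can be derived from the object's type.
class IncrementConfig {
  stateTransition(state) {
    return state + 1;
  }
  transitionCondition(state) {
    return state < 1000;
  }
}

class DoubleConfig {
  stateTransition(state) {
    return state * 2;
  }
  transitionCondition(state) {
    return state < 1000;
  }
}

// Assumed shape of the hot function: it calls methods on the config object
// instead of receiving bare callback functions as arguments.
function transition(config, state) {
  while (config.transitionCondition(state)) {
    state = config.stateTransition(state);
  }
  return state;
}
```

Calling transition(new IncrementConfig(), 0) and transition(new DoubleConfig(), 1) then exercises the same call sites with two different object types, which is the situation the answer describes.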
QUESTION
Why is my while loop getting logged this way? Is it because the internal workings of V8 and SpiderMonkey differ?
...ANSWER
Answered 2020-Mar-15 at 18:31
Yes, the internal workings differ. I believe the difference you are noticing here is actually because in WebKit browsers, like Chrome and Safari, console.log is asynchronous, whereas in Firefox (and Node.js) it is strictly synchronous.
I first read about this in the book Async JavaScript by Trevor Burnham. You can find the relevant section of the book by going to this Google Books search link for Async JavaScript and the relevant page should be the first response at the top.
WebKit's console.log has surprised many a developer by behaving asynchronously.
To help understand what is going on, try simply entering this into the console:
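The answer's original snippet is not reproduced on this page. A common way to observe the effect, shown here with made-up names, is to log an object and mutate it immediately afterwards; logging a string snapshot taken before the mutation gives a value that cannot change later.

```javascript
const record = {};

// Logs a reference to the object. Some browser consoles may render it lazily,
// so an expanded view can show properties added after this call.
console.log(record);

// A string snapshot is fixed at the moment it is taken.
const snapshot = JSON.stringify(record);

record.foo = 'bar';

// Always prints the state from before the mutation.
console.log(snapshot);
```

What the first console.log displays is the part that can vary between environments; the snapshot line is the portable way to record the state at a given moment.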
QUESTION
I am trying to build SpiderMonkey on Windows. I am following the documentation at https://wiki.mozilla.org/JavaScript:New_to_SpiderMonkey
I have installed the prerequisites from https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Windows_Prerequisites
The one difference is that I use the current VS 15.9.11 instead of the older 15.8.
If I try to call configure inside the Mozilla build shell, it can't find the C compiler:
ANSWER
Answered 2020-Feb-05 at 10:30
Instructions from the answer above were moved to a new location https://theqwertiest.github.io/foo_spider_monkey_panel/docs/for_developers/building_spidermonkey
PS: esr68 branch (as well as master branch) has a slightly different build procedure and requires additional patching to be compatible with MSVC. The linked instructions will be updated once I start using the corresponding branch.
PPS: I had to post an answer instead of comment due to the lack of rep.
PPPS: Never expected anyone to actually use this guide =)
QUESTION
On Chrome/Node (V8 in general, I suppose), the following gives an error message:
...Uncaught TypeError: f is not iterable
ANSWER
Answered 2020-Jan-31 at 18:13
V8 developer here. This looks like a bug. Please file a bug at crbug.com/v8/new.
QUESTION
In the following codepen the "stop earth" button initiates an infinite loop.
I was expecting that the loop would prevent the browser from initiating the rendering pipeline, but I was surprised to see that in Chrome the gif continues smoothly.
When testing this on Firefox the gif stops immediately.
...ANSWER
Answered 2020-Jan-21 at 18:30
What this experiment tells you is that in Chrome, JavaScript execution and gif animation/drawing are happening on separate threads (or possibly even separate processes). Just because JavaScript is single-threaded doesn't mean that the entire rest of the browser also has to run on that same thread. Your infinite loop will still prevent additional events from being processed, because control never returns to the event loop -- but the event loop is a JavaScript concept, and gifs don't need to care about it.
Here is an updated experiment to demonstrate that:
QUESTION
I am working on JavaScript exploitation in Firefox. I am using gdb to set breakpoints in the SpiderMonkey JS engine, and I want to break at the point where a specific allocation is made and observe the heap state. How should I set the breakpoint?
I have tried something like inserting a Math.cos call. For example,
...ANSWER
Answered 2020-Jan-16 at 23:39
I added the option --disable-e10s when launching Firefox, i.e. running ./mach run --disable-e10s --debug, and it now breaks on all breakpoints in the script.
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install spidermonkey
PHP requires the Visual C runtime (CRT). The Microsoft Visual C++ Redistributable for Visual Studio 2019 is suitable for all these PHP versions, see visualstudio.microsoft.com. You MUST download the x86 CRT for PHP x86 builds and the x64 CRT for PHP x64 builds. The CRT installer supports the /quiet and /norestart command-line switches, so you can also script it.