cheeriogs | Cheerio for Google Apps Script | Crawler library
kandi X-RAY | cheeriogs Summary
Cheerio for Google Apps Script
Community Discussions
Trending Discussions on cheeriogs
QUESTION
I use the cheeriogs library to work in Google Apps Script:
Library ID: 1ReeQ6WO8kKNxoaA_O0XEQ589cIrRvEBA9qcWpNqdOP17i47u6N9M5Xh0
Using =IMPORTXML('url','xpath')
I make the call with this XPATH:
//div[contains(@class,'match-card') and ../../td[@class='score-time ']/a[contains(@href, 'matches')]]
The idea is to collect the div elements whose @class contains the word match-card, BUT only when they have a td with @class='score-time' whose a contains an @href with the word matches.
I tried to find a way to do this with cheeriogs, but it always returns blank. My attempts were:
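A plain-JS sketch of what the XPath above asks for, using invented mock rows rather than Cheerio's API, can make the two conditions explicit:

```javascript
// Mock sketch (not Cheerio's API; data is invented): keep a row's match-card
// only when the row has a td with class "score-time" whose link href
// contains "matches", mirroring the two XPath predicates.
const rows = [
  { tdClass: 'score-time ', href: '/br/matches/123', card: 'Team A vs Team B' },
  { tdClass: 'score-time ', href: '/br/news/55', card: 'Team C vs Team D' },
  { tdClass: 'other', href: '/br/matches/99', card: 'Team E vs Team F' },
];

const cards = rows
  .filter(r => r.tdClass.trim() === 'score-time' && r.href.includes('matches'))
  .map(r => r.card);

console.log(cards); // ['Team A vs Team B']
```

In Cheerio selectors, the same intent would need a filter over the ancestor row, since CSS alone cannot look "up" the tree the way the XPath's `../../` step does.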
...ANSWER
Answered 2022-Mar-04 at 05:38

In your situation, how about the following modified script?
Modified script:

QUESTION
I use the cheeriogs library to work in Google Apps Script:
Script ID: 1ReeQ6WO8kKNxoaA_O0XEQ589cIrRvEBA9qcWpNqdOP17i47u6N9M5Xh0
https://github.com/tani/cheeriogs
My current code trying to use not contains looks like this:
...ANSWER
Answered 2022-Mar-01 at 23:56

You need the : before contains, since it's a pseudo-class, so it's:
a:not(:contains(text))
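What that selector expresses can be sketched in plain JS over mock anchors (invented data, not a parsed document): keep the anchors whose text does not contain the given word.

```javascript
// Approximation of a:not(:contains('report')) using mock anchor objects:
// :contains matches elements whose text includes the substring, and :not
// inverts that, so the filter keeps only anchors without the word.
const anchors = [
  { text: 'Full report' },
  { text: 'Highlights' },
  { text: 'Match report' },
];

const kept = anchors.filter(a => !a.text.includes('report')).map(a => a.text);
console.log(kept); // ['Highlights']
```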
QUESTION
I'm using the CheerioGS library:
ID → 1ReeQ6WO8kKNxoaA_O0XEQ589cIrRvEBA9qcWpNqdOP17i47u6N9M5Xh0
Project → https://github.com/tani/cheeriogs
My full code (included in its entirety so it can be used in tests):
...ANSWER
Answered 2022-Feb-19 at 23:12

Instead of using Array.prototype.forEach and Range.setValue to fill cell by cell, one column at a time, build an Array of Arrays of values, then add all the values at once by using Range.setValues.
In order to do this, instead of grabbing one cell (td) from all tables (tbody) at once, grab the whole table (tbody) and check if the required cell / content exists; if not, add an empty string ('') at the corresponding position in the Array being built.
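The batching advice above can be sketched with mock scraped tables (the data and field names are invented): build one 2D array, filling '' where a cell is missing, so a single Range.setValues call can write everything.

```javascript
// Mock stand-ins for what Cheerio would extract from each tbody:
const tables = [
  { home: 'Team A', score: '2 - 1' },
  { home: 'Team B' },               // score cell missing on the page
  { home: 'Team C', score: '0 - 0' },
];

// One row per table; missing cells become '' so every row has the same width.
const values = tables.map(t => [t.home, t.score !== undefined ? t.score : '']);

// In Apps Script, a single write replaces many setValue calls:
// sheet.getRange(2, 1, values.length, 2).setValues(values);
console.log(values); // [['Team A', '2 - 1'], ['Team B', ''], ['Team C', '0 - 0']]
```

Batching matters because each setValue call is a round trip to the spreadsheet, while setValues writes the whole block in one.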
QUESTION
The sitemap (https://futebolnatv.com.br/jogos-hoje/) looks like this:
...ANSWER
Answered 2022-Feb-19 at 21:44

You need to change your selector from
$(value).text().trim()
to
$(value).contents().last().text().trim()
Explanation: instead of retrieving the text of the whole matched element, you need to get all of its nodes first (via contents()), then get the text node you need (via last()). The rest of the code is unchanged.
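The difference can be sketched with a mock of the node list that contents() would return (invented data): the matched element holds a child element plus a trailing text node, so taking the last node isolates the trailing label instead of concatenating everything.

```javascript
// Mock node list for an element like: <div><span>21:30</span> Campeonato Carioca </div>
// .text() on the whole element would yield '21:30 Campeonato Carioca ';
// taking the last (text) node yields only the trailing label.
const contents = [
  { type: 'tag', data: '21:30' },           // the nested <span>
  { type: 'text', data: ' Campeonato Carioca ' },
];

const lastText = contents[contents.length - 1].data.trim();
console.log(lastText); // 'Campeonato Carioca'
```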
QUESTION
I use the Cheeriogs library for scraping:
https://github.com/tani/cheeriogs
This is the element I need to collect the href value from:
ANSWER
Answered 2021-Oct-05 at 08:25

In your situation, how about the following selectors?
From:

QUESTION
The Cheeriogs library (1ReeQ6WO8kKNxoaA_O0XEQ589cIrRvEBA9qcWpNqdOP17i47u6N9M5Xh0) for Google Apps Script is in this GitHub repository:
https://github.com/tani/cheeriogs
I'm trying to collect the line of text that contains the value:
...ANSWER
Answered 2021-Sep-13 at 21:06

Will this suffice?
QUESTION
Documentation for CheerioGS:
https://github.com/tani/cheeriogs
The idea is to collect only data from the table with the name Argentinos Jrs, and to skip lines with the value Away on International duty in the info column.
Note: I really need to select by the value Argentinos Jrs and filter out Away on International duty, because the position of this table is not fixed, and neither are the values in its lines.
The expected result in this example I'm looking for is this:
...ANSWER
Answered 2021-Aug-19 at 02:57

function PaginaDoJogo() {
  const sheet = SpreadsheetApp.getActive().getSheetByName('Dados Importados');
  const url = 'https://www.sportsgambler.com/injuries/football/argentina-superliga/';
  const response = UrlFetchApp.fetch(url);
  const content = response.getContentText();
  // Cut the page down to the section that starts at the team name
  // (the end boundary below is an assumption; it was lost when this snippet was scraped).
  const match = content.match(/Argentinos Jrs[\s\S]+?<\/table>/);
  // Capture the three span values (name, info, return date) of each row;
  // the opening span tags are likewise reconstructed.
  const regExp = /<span[^>]*>(.+?)<\/span>[\s\S]+?<span[^>]*>(.+?)<\/span>[\s\S]+?<span[^>]*>(.+?)<\/span>[\s\S]+?<\/div>/g;
  const values = [];
  let r;
  while ((r = regExp.exec(match[0])) !== null) {
    // Skip the header row and any player away on international duty.
    if (r[1] !== 'Name' && r[2] !== 'Away on International duty') {
      values.push([r[1], r[3]]);
    }
  }
  sheet.getRange(2, 1, values.length, 2).setValues(values);
}
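The exec loop pattern in that answer can be demonstrated on its own with a toy regex and string (both invented): with the g flag, each regExp.exec call resumes from lastIndex, so the while loop walks every match, and the if guard drops the header row just like the r[1] !== 'Name' check.

```javascript
// Toy demo of the global-regex exec loop used above.
const regExp = /(\w+):(\w+)/g;
const section = 'Name:Info Player1:Injured Player2:Doubtful';
const values = [];
let r;
while ((r = regExp.exec(section)) !== null) {
  if (r[1] !== 'Name') {       // skip the header pair
    values.push([r[1], r[2]]);
  }
}
console.log(values); // [['Player1', 'Injured'], ['Player2', 'Doubtful']]
```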
QUESTION
I have a question regarding the structural part of the calls: there is a limitation of 1,000 daily calls to UrlFetchApp. In the model used in my example code, how many UrlFetchApp calls are used? Only one, from which each of the four var lines below is worked, or are four UrlFetchApp calls needed?
Documentation:
https://developers.google.com/apps-script/reference/url-fetch/url-fetch-app
Additional info, documentation for CheerioGS:
https://github.com/tani/cheeriogs
ANSWER
Answered 2021-Aug-16 at 16:03

As Ouroborus mentioned, just one, every time PaginaDoJogo() is called.
The statement with Cheerio only accesses the result of that one UrlFetchApp.fetch call, which is in contentText at that point. The lines below it then access the $, which is the result of loading that content with Cheerio.
Thanks for showing Cheerio. It seems very helpful for scraping a site, and I might use it as well in the future.
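Why only one quota unit is spent per run can be sketched with a mock fetch (mockFetch is an invented stand-in for UrlFetchApp.fetch; no real network call happens here): the page is fetched once, and every later query runs against the in-memory result.

```javascript
// Count how many times the "network" is hit.
let fetchCalls = 0;
function mockFetch(url) {
  fetchCalls += 1;
  return '<html><body><a href="/x">x</a></body></html>';
}

const contentText = mockFetch('https://example.com'); // the single fetch
// const $ = Cheerio.load(contentText);  // parsing reuses the string, no fetch
// $('a'), $('div'), ...                 // selector calls also cost no fetches
console.log(fetchCalls); // 1
```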
QUESTION
CheerioGS project:
https://github.com/tani/cheeriogs
I would like to know how I can send the lineup values to Column A of the spreadsheet and the substitute values to Column B of the spreadsheet. I am learning to work with CheerioGS.
So far I've learned to save the results with Logger.log(), but I don't understand how to send them to the spreadsheet cells!
ANSWER
Answered 2021-Aug-12 at 02:18

Here is an example:
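The answer's original example is not preserved on this page; a hedged sketch (mock player names, assumed two-column layout) of moving Logger.log-style results into the sheet would pair the two lists into rows and write them in one call:

```javascript
// Mock scraped results standing in for the lineup and substitute values.
const lineup = ['Player A', 'Player B', 'Player C'];
const substitutes = ['Sub A', 'Sub B'];

// One row per lineup entry; pad missing substitutes with ''.
const rows = lineup.map((name, i) => [name, substitutes[i] || '']);

// In Apps Script: column A = lineup, column B = substitutes.
// sheet.getRange(1, 1, rows.length, 2).setValues(rows);
console.log(rows); // [['Player A', 'Sub A'], ['Player B', 'Sub B'], ['Player C', '']]
```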
QUESTION
I'm trying to scrape a website and put the value in cache, so I don't hit the daily limit of UrlFetchApp.
Here is the script I did:
...ANSWER
Answered 2020-Jul-14 at 14:24

As written in the official documentation:
The maximum length of a key is 250 characters. The maximum amount of data that can be stored per key is 100KB.
If the size of the data put in the cache exceeds either of the above limitations, the error
Exception: Argument too large
is shown. In your case, value exceeds 100KB. The solution would be to cache only the necessary data, or not to cache at all, depending on your specific needs.
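One way to act on that limit is to measure the payload before caching it. This sketch uses Node's Buffer.byteLength as a stand-in for the stored size (in Apps Script, Utilities.newBlob(value).getBytes().length would give the byte count); the threshold is the documented 100KB per key.

```javascript
const MAX_CACHE_BYTES = 100 * 1024; // CacheService's per-key data limit

// Returns true when the value is small enough to cache under that limit.
function fitsInCache(value) {
  return Buffer.byteLength(value, 'utf8') <= MAX_CACHE_BYTES;
}

console.log(fitsInCache('a small page excerpt')); // true
console.log(fitsInCache('x'.repeat(200 * 1024))); // false
```

When the check fails, trimming the scraped content down to only the fields you actually need is usually enough to fit under the limit.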
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported