51job | Download 51job job postings, company profiles, job descriptions, and latitude/longitude
kandi X-RAY | 51job Summary
For address, it is enough to use the center from the original website.
Community Discussions
Trending Discussions on 51job
QUESTION
I am trying to build a database with rvest. Since I have a lot of data to download, I tried to write several functions that would let me interrupt the scraping process and restart it where I left off. However, while the functions more or less work, whenever I interrupt them manually, I lose the output. Does anyone know a solution that would let me stop the function without losing the data frame that the loop is building? I would be glad for any advice!
Some URLs that I am trying to scrape data from:
...
ANSWER
Answered 2020-Feb-19 at 21:54
I come across this problem often in web scraping. The key is to store the intermediate results in an environment where they are accessible if your function throws an error. The obvious place is the global environment, but this depends on how you are using your function. If it is part of a package, then you don't want to write to the global workspace. In that case you can have a "storage" environment as part of the package.
Perhaps the neatest way to do this is to delete the intermediate object after the loop is complete, so it will only ever be visible / accessible if the loop throws an error.
Here is a function that demonstrates the principle:
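The answerer's original function is not reproduced in this extract. As a stand-in, here is a minimal R sketch of the principle described above, not the original code. The names `.scrape_store`, `scrape_all`, and `fetch_one` are hypothetical, and `fetch_one` is assumed to be whatever rvest-based function turns one URL into a one-row data frame with a `url` column.

```r
# Hypothetical storage environment. In a package this would be created inside
# the package rather than in the global workspace.
.scrape_store <- new.env()

# urls:      character vector of pages to scrape
# fetch_one: assumed user-supplied function(url) returning a one-row
#            data frame that includes a `url` column
scrape_all <- function(urls, fetch_one) {
  # Pick up partial results left behind by an earlier error or interrupt
  # (get0() returns NULL if nothing is stored).
  results <- get0("partial", envir = .scrape_store, inherits = FALSE)
  done <- results$url

  for (u in setdiff(urls, done)) {
    results <- rbind(results, fetch_one(u))
    # Save after every iteration, so stopping here loses at most one page.
    assign("partial", results, envir = .scrape_store)
  }

  # The loop completed: delete the intermediate object so it is only ever
  # visible after a failure.
  if (exists("partial", envir = .scrape_store, inherits = FALSE)) {
    rm(list = "partial", envir = .scrape_store)
  }
  results
}

# Usage with a stand-in scraper that fails on the third URL.
flaky <- function(u) {
  if (u == "u3") stop("connection dropped")
  data.frame(url = u, title = paste("page", u), stringsAsFactors = FALSE)
}
try(scrape_all(c("u1", "u2", "u3"), flaky), silent = TRUE)
.scrape_store$partial  # the rows for u1 and u2 are still available
```

Calling `scrape_all()` again with the same URLs and a working `fetch_one` skips the pages already stored and fetches only the remaining ones. A manual interrupt behaves like the error in this respect, because the copy in `.scrape_store` is written before the next request starts.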
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported