LinkedIn_Scraper | Selenium-based automated program | Machine Learning library
kandi X-RAY | LinkedIn_Scraper Summary
An automated console GUI program that scrapes information procedurally using Selenium and Parsel, as described below.
Community Discussions
Trending Discussions on LinkedIn_Scraper
QUESTION
I am currently working on a web scraper that takes URLs as input, finds each page, scrapes it, and returns the results in a CSV. The scraper works well for a single URL at a time. Unfortunately, whenever it writes a new line to the results CSV, it also appends the previous URL's scrape results in each column. I need a loop that creates new class variables on each iteration so that this doesn't happen: something that takes a list of URLs and creates a unique class instance for each one.
...ANSWER
Answered 2021-Mar-08 at 06:13
Could you try instantiating a new driver each time? That should reset the counters in driver for you.
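The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the asker's code: `scrape_urls`, `make_driver`, and `scrape_page` are hypothetical names, and the sketch assumes the leftover results live on the driver (or on an object created alongside it). `make_driver` would typically be something like `lambda: webdriver.Chrome(options=chrome_options)`.

```python
import csv


def scrape_urls(urls, make_driver, scrape_page, out_path):
    """Scrape each URL with a fresh driver and write one CSV row per URL.

    make_driver: zero-argument callable returning a new driver (hypothetical).
    scrape_page: callable (driver, url) -> list of field values (hypothetical).
    """
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        for url in urls:
            driver = make_driver()  # new instance, so no state carries over
            try:
                fields = scrape_page(driver, url)
            finally:
                driver.quit()  # always release the browser, even on errors
            writer.writerow([url, *fields])
```

If the stale values are instead stored in class-level attributes of the scraper class, moving them into `__init__` and creating one scraper object per URL should have the same effect.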
QUESTION
import os
from selenium import webdriver
import time
from linkedin_scraper import actions
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.common import exceptions
from selenium.common.exceptions import StaleElementReferenceException
from selenium.webdriver.chrome.options import Options
from credentials import email,password
chrome_options = Options()
chrome_options.add_argument("--headless")
driver = webdriver.Chrome("driver/chromedriver", options=chrome_options)
# email = os.getenv("LINKEDIN_USER")
# password = os.getenv("LINKEDIN_PASSWORD")
actions.login(driver, email, password) # if email and password isnt given, it'll prompt in terminal
urls = open('C:/Users/reddy/AppsTek/scraping/LinkedIn Scraping/LinkedIn Scraping1/urls3.csv')
for u in urls:
    try:
        driver.get(u)
        companies = []
        element = driver.find_element_by_class_name('pv-profile-section__toggle-detail-icon')
        if element:
            driver.execute_script("arguments[0].click();", element)
            _ = WebDriverWait(driver, 3).until(EC.presence_of_element_located((By.ID, "experience-section")))
            all_urls = driver.find_elements_by_css_selector("div > a")
            for elem in all_urls:
                text = elem.text
                company = elem.get_property('href')
                if "linkedin.com/company" in company:
                    z = company + 'about/'
                    companies.append(z)
        else:
            _ = WebDriverWait(driver, 3).until(EC.presence_of_element_located((By.ID, "experience-section")))
            all_urls = driver.find_elements_by_css_selector("div > a")
            for elem in all_urls:
                text = elem.text
                company = elem.get_property('href')
                if "linkedin.com/company" in company:
                    z = company + 'about/'
                    companies.append(z)
        print(companies)
    except:
        print('Nothing found')
...ANSWER
Answered 2020-Dec-24 at 00:12
Next time you use try/except, make use of the exception that is thrown instead of silently handling it. In your case you were getting a NoSuchElementException and you didn't see it.
To handle that exception you can use find_elements_by_class_name instead of find_element_by_class_name. It returns a list, so you can check whether that list contains any elements. Slight modifications fixed your code.
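A rough sketch of the fix described in this answer, not the answerer's actual code. It uses the generic `driver.find_elements(by, value)` call rather than the older `find_elements_by_class_name` helper. The locator strings are written out literally so the snippet has no import dependency; they are the values of Selenium's `By.CLASS_NAME` and `By.CSS_SELECTOR`. The function names are hypothetical, and the explicit wait on `experience-section` from the question is left out for brevity.

```python
CLASS_NAME = "class name"      # same value as selenium's By.CLASS_NAME
CSS_SELECTOR = "css selector"  # same value as selenium's By.CSS_SELECTOR

TOGGLE_CLASS = "pv-profile-section__toggle-detail-icon"  # class name from the question


def collect_company_urls(driver):
    """Return the company 'about' URLs linked from the currently loaded page."""
    # find_elements returns an empty list instead of raising NoSuchElementException.
    toggles = driver.find_elements(CLASS_NAME, TOGGLE_CLASS)
    if toggles:
        driver.execute_script("arguments[0].click();", toggles[0])

    companies = []
    for link in driver.find_elements(CSS_SELECTOR, "div > a"):
        href = link.get_property("href") or ""  # guard against a missing href
        if "linkedin.com/company" in href:
            companies.append(href + "about/")
    return companies


def scrape_profiles(driver, urls):
    """Visit each URL and report any exception by name instead of hiding it."""
    results = {}
    for raw in urls:
        url = raw.strip()  # lines read from a file end with a newline
        if not url:
            continue
        try:
            driver.get(url)
            results[url] = collect_company_urls(driver)
        except Exception as exc:
            print(f"{url}: {type(exc).__name__}: {exc}")
            results[url] = []
    return results
```

Printing the exception type makes a failure such as `NoSuchElementException` visible, and the single code path removes the duplicated `if`/`else` branches from the question.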
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install LinkedIn_Scraper
You can use LinkedIn_Scraper like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.
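As a rough illustration of the environment setup described above, the sketch below builds the two commands involved: creating a virtual environment and upgrading pip, setuptools, and wheel inside it. The helper names and the `.venv` directory are assumptions made for this example. The command that installs LinkedIn_Scraper itself is not shown because this page does not name the install target.

```python
import os
import sys
from pathlib import Path


def venv_python(env_dir):
    """Return the path of the Python interpreter inside a virtual environment."""
    env_dir = Path(env_dir)
    if os.name == "nt":  # Windows layout
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"  # POSIX layout


def setup_commands(env_dir=".venv"):
    """Build the commands that create the venv and upgrade the packaging tools."""
    py = str(venv_python(env_dir))
    return [
        [sys.executable, "-m", "venv", str(env_dir)],
        [py, "-m", "pip", "install", "--upgrade", "pip", "setuptools", "wheel"],
    ]
```

Each command list can be passed to `subprocess.run(cmd, check=True)` in order.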