linkedin_scraper | A library that scrapes Linkedin for user data | Scraper library
kandi X-RAY | linkedin_scraper Summary
A library that scrapes Linkedin for user data
Top functions reviewed by kandi - BETA
- Login
- Login with a given cookie
- Prompt the user for an email address
- Scrape the user
- Scrape while logged in
- Scrape while not logged in
- Click an element by its class name
- Adds an article to the list
- Adds an achievement to the person
- Add a contact
- Add an education
- Add an experience to the person
- Add an interest
- Set the location
- Scrape logged in
- Scrape the logged in page
- Get a list of people
- Parse employee
- Gets the text under the given title
- Return the text under the given title
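A minimal usage sketch tying the functions above together, based on the project's documented flow; the e-mail, password, and profile URL below are placeholders:

from selenium import webdriver
from linkedin_scraper import Person, actions

driver = webdriver.Chrome()
# Placeholder credentials; omitting them makes actions.login prompt in the terminal.
actions.login(driver, "you@example.com", "your-password")
# Placeholder profile URL; the Person object scrapes the profile when it is constructed.
person = Person("https://www.linkedin.com/in/some-profile", driver=driver)
print(person.experiences)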
linkedin_scraper Key Features
linkedin_scraper Examples and Code Snippets
class MainBackgroundThread(QThread):
    def __init__(self, keyword, sector):
        QThread.__init__(self)
        self.keyword, self.sector = keyword, sector

    def run(self):
        main(self.keyword, self.sector)
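A short usage sketch for the snippet above, assuming a running Qt application and a main(keyword, sector) scraper entry point as referenced in the snippet; the argument values are placeholders:

thread = MainBackgroundThread("python developer", "software")  # placeholder keyword and sector
thread.start()  # run() executes main(keyword, sector) off the GUI thread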
Community Discussions
Trending Discussions on linkedin_scraper
QUESTION
I am scraping company names as well as company leads from LinkedIn Sales Navigator. While I get the names of companies in my output, I fail to get the company leads, i.e. the names of people, from the navigator. Here's the code for it.
...ANSWER
Answered 2021-Sep-01 at 12:10
First, I think this should be a css_selector, not a class name.
Replace this:
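The replacement itself is cut off in this excerpt. As a hedged illustration of the point (locating by css_selector rather than class name, using the Selenium 3-style calls the question already uses, with a placeholder selector):

# Instead of a class-name lookup like this ...
element = driver.find_element_by_class_name('some-class')
# ... the answer suggests locating via a CSS selector, e.g.:
element = driver.find_element_by_css_selector('span.some-class a')  # placeholder selector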
QUESTION
I am trying to scrape details of a few companies and their leads from the LinkedIn Sales Navigator. To log in, I have created a text file named config.txt which holds the username and password. The problem is that it logs in successfully, only to display another login page.
So, for example: if I log in through https://www.linkedin.com/checkpoint/rm/sign-in-another-account it logs in successfully, but then straight away gives me another login page: https://www.linkedin.com/sales/login
If I repeat the process for the second URL, then ideally it should give me the Sales Navigator homepage, but it again gives me the same page, i.e. https://www.linkedin.com/sales/login
Here's my code:
...ANSWER
Answered 2021-Aug-06 at 10:07
As Ben said in the comment, LinkedIn uses a bot detector, and that is why you are unable to log in. For this reason you have to use some additional Chrome options.
The following code snippet will solve your problem.
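The snippet itself is not reproduced in this excerpt. As a hedged sketch, the Chrome options commonly added to make Selenium look less like an automated browser are along these lines (not necessarily the exact options from the original answer):

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("--disable-blink-features=AutomationControlled")  # hide the automation flag
chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
chrome_options.add_experimental_option("useAutomationExtension", False)
driver = webdriver.Chrome(options=chrome_options)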
QUESTION
I am currently working on a web scraper that takes URLs as input, finds each page, scrapes it, and then returns the results in a CSV. The scraper works well for a single URL at a time. Unfortunately, whenever it writes a new line to the results CSV it also appends the previous URL's scrape results in each column. I need a loop that essentially creates a new class instance per URL so that this doesn't happen, i.e. something that takes a list of URLs and creates a unique class instance for each.
...ANSWER
Answered 2021-Mar-08 at 06:13
Could you try instantiating a new driver each time? That should reset the counters in the driver for you.
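A hedged sketch of that suggestion, using a hypothetical Scraper class standing in for the asker's scraper; the point is simply that each URL gets a fresh driver, and therefore fresh scraper state:

from selenium import webdriver

urls = ["https://www.example.com/profile-1", "https://www.example.com/profile-2"]  # placeholder URLs
rows = []
for url in urls:
    driver = webdriver.Chrome()  # new driver per URL, so nothing carries over between iterations
    try:
        rows.append(Scraper(driver, url).scrape())  # hypothetical Scraper class from the question
    finally:
        driver.quit()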
QUESTION
import os
from selenium import webdriver
import time
from linkedin_scraper import actions
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.common import exceptions
from selenium.common.exceptions import StaleElementReferenceException
from selenium.webdriver.chrome.options import Options
from credentials import email, password

chrome_options = Options()
chrome_options.add_argument("--headless")
driver = webdriver.Chrome("driver/chromedriver", options=chrome_options)

# email = os.getenv("LINKEDIN_USER")
# password = os.getenv("LINKEDIN_PASSWORD")
actions.login(driver, email, password)  # if email and password isnt given, it'll prompt in terminal

urls = open('C:/Users/reddy/AppsTek/scraping/LinkedIn Scraping/LinkedIn Scraping1/urls3.csv')
for u in urls:
    try:
        driver.get(u)
        companies = []
        element = driver.find_element_by_class_name('pv-profile-section__toggle-detail-icon')
        if element:
            driver.execute_script("arguments[0].click();", element)
            _ = WebDriverWait(driver, 3).until(EC.presence_of_element_located((By.ID, "experience-section")))
            all_urls = driver.find_elements_by_css_selector("div > a")
            for elem in all_urls:
                text = elem.text
                company = elem.get_property('href')
                if "linkedin.com/company" in company:
                    z = company + 'about/'
                    companies.append(z)
        else:
            _ = WebDriverWait(driver, 3).until(EC.presence_of_element_located((By.ID, "experience-section")))
            all_urls = driver.find_elements_by_css_selector("div > a")
            for elem in all_urls:
                text = elem.text
                company = elem.get_property('href')
                if "linkedin.com/company" in company:
                    z = company + 'about/'
                    companies.append(z)
        print(companies)
    except:
        print('Nothing found')
...ANSWER
Answered 2020-Dec-24 at 00:12
Next time you use try/except, make use of the exception that is actually thrown instead of silently handling it. In your case you were getting a NoSuchElementException and you didn't see it.
To avoid that exception you can use find_elements_by_class_name instead of find_element_by_class_name: it returns a list, and you can check whether that list contains any elements. Slight modifications fixed your code.
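A hedged sketch of that change, keeping the Selenium 3-style calls used in the question; the class name is the one from the question's code:

elements = driver.find_elements_by_class_name('pv-profile-section__toggle-detail-icon')
if elements:  # an empty list means the toggle is absent; no NoSuchElementException is raised
    driver.execute_script("arguments[0].click();", elements[0])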
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported
Install linkedin_scraper
First, you must set your chromedriver location, e.g. export CHROMEDRIVER=~/chromedriver.
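Assuming the library itself was installed with pip (e.g. pip3 install linkedin_scraper), a sketch of doing the driver setup from Python instead of the shell; the ~/chromedriver path is a placeholder for wherever the ChromeDriver binary actually lives:

import os
from selenium import webdriver

chromedriver_path = os.path.expanduser("~/chromedriver")  # placeholder path
os.environ["CHROMEDRIVER"] = chromedriver_path            # same effect as the shell export above
driver = webdriver.Chrome(chromedriver_path)              # or create your own driver and pass it to the scraper classes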