Web scraping with Selenium, Python and Google Chrome

  python, python-3.x, selenium, web-scraping

I am trying to scrape links with Selenium and the Google Chrome driver. I have managed to write a script that scrapes the links on each page and navigates to the next one.

The goal is to save each page's links to a file. I seem to be having some trouble saving the links to the file, and I'm also not confident that the loop is getting fresh links for each page it visits. The last part of the code, where it checks for the next page button, doesn't seem to be working properly either.

import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

DRIVER_PATH = 'chromedriver'
driver = webdriver.Chrome(executable_path=DRIVER_PATH)
driver.get('https://google.com')

#Next Page Function
def nextPage():
    nextPageBtn = driver.find_element_by_id('pnnext')
    nextPageBtn.click()

#Search Box Google
searchbox = driver.find_element_by_class_name('gLFyf')

time.sleep(3)

#Send Text to input
searchbox.send_keys('cheap cat food')

time.sleep(2)

#Google Search Button
searchbtn = driver.find_element_by_class_name('gNO89b')
searchbtn.click()

time.sleep(5)

#Scrapes links off the current page
def scrapeLinks():

    all_links = driver.find_elements_by_css_selector('.yuRUbf a')

    #Write links to file. Note the \n escape ("n" alone would not
    #start a new line), and the file only needs to be opened once.
    with open("testlinks.txt", "a") as f:
        for link in all_links:
            href = link.get_attribute('href')
            f.write(f"{href}\n")

    #Check if next page available. find_elements returns an empty
    #list when nothing matches, whereas find_element would raise
    #NoSuchElementException and never reach the else branch.
    if driver.find_elements_by_css_selector('#pnnext'):
        nextPage()

        time.sleep(5)

        scrapeLinks()

    #Last page. Quit session.
    else:
        driver.quit()


scrapeLinks()

Source: Python Questions
