Category : web-crawler

While scraping data from ali_baba, I ran into an issue: I want to scrape product_name, price, quantity, and company name, but all of the data sits under the same XPath as a single block. I want to split it into separate columns, e.g. price in a price column. import scrapy from .. items import ..
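One common way to get each field into its own column is to select the repeating product container first and then run a relative query per field inside it. The following is a minimal sketch using only the standard library; the markup, class names, and sample values are hypothetical and do not reflect the real page structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: one container per product, one child element per field.
SAMPLE = """
<div>
  <div class="product">
    <span class="name">Steel bolt</span>
    <span class="price">$0.10</span>
    <span class="quantity">1000 pieces</span>
    <span class="company">Example Co.</span>
  </div>
  <div class="product">
    <span class="name">Brass nut</span>
    <span class="price">$0.25</span>
    <span class="quantity">500 pieces</span>
    <span class="company">Sample Ltd.</span>
  </div>
</div>
"""

# Output column -> hypothetical class name that marks the field in the markup.
FIELDS = {
    "product_name": "name",
    "price": "price",
    "quantity": "quantity",
    "company_name": "company",
}


def extract_products(markup):
    """Return one dict per product, with each field in its own key."""
    root = ET.fromstring(markup)
    rows = []
    # Step 1: select every product container.
    for product in root.findall(".//div[@class='product']"):
        row = {}
        # Step 2: query each field relative to that container only.
        for column, css_class in FIELDS.items():
            node = product.find(f".//span[@class='{css_class}']")
            has_text = node is not None and node.text
            row[column] = node.text.strip() if has_text else None
        rows.append(row)
    return rows


rows = extract_products(SAMPLE)
for row in rows:
    print(row)
```

In a Scrapy spider the same pattern would typically be a loop over response.xpath(...) for the containers, with a relative product.xpath("./...").get() call per field, yielding one item per container.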


While executing the command scrapyd-deploy default, I'm running into the following error: File "/home/user/miniconda3/envs/quickcompany/lib/python3.8/site-packages/scrapyd_client/deploy.py", line 23, in <module> from scrapy.utils.http import basic_auth_header ModuleNotFoundError: No module named 'scrapy.utils.http' I have tried uninstalling and reinstalling the relevant libraries, and I have also tried both the GitHub and packaged versions of scrapyd-client. Source: Python..
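The traceback says that the active environment has no scrapy.utils.http module, although the installed scrapyd-client still imports it. A possible explanation, offered here as an assumption rather than a confirmed diagnosis, is a version mismatch between Scrapy and scrapyd-client. The sketch below only checks which modules can be located in the current environment; w3lib.http is listed as a candidate home for basic_auth_header, and the helper name module_available is made up for illustration.

```python
import importlib.util


def module_available(dotted_name):
    """Return True if the module can be located in the current environment.

    find_spec imports parent packages (e.g. "scrapy") while searching,
    but it does not import the target module itself.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when a parent package is missing or fails to import.
        return False


# Candidate modules to probe; adjust the list to the environment in question.
for candidate in ("scrapy.utils.http", "w3lib.http"):
    status = "found" if module_available(candidate) else "missing"
    print(f"{candidate} -> {status}")
```

Running this with the same interpreter that scrapyd-deploy uses shows whether the failing import is missing from that specific environment, which helps rule out a mix-up between several Python installations.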


import urllib.parse
import scrapy
from scrapy.http import Request

class VORnamen(scrapy.Spider):
    name = "namen"
    start_urls = ["https://www.govdata.de/web/guest/daten/-/details/liste-der-haufigen-vornamen-2012 "]

    def parse(self, response):
        for href in response.css('div#all_results h3 a::attr(href)').extract():
            yield Request(
                url=response.urljoin(href),
                callback=self.parse_article
            )

    def parse_article(self, response):
        for href in response.css('div.download_wrapper a[href$=".csv"]::attr(href)').extract():
            yield Request(
                url=response.urljoin(href),
                callback=self.save_csv
            )

    def save_csv(self, response):
        path = response.url.split('/')[-1]
        self.logger.info('Saving CSV %s', path)
        with ..


Trying to run scrapyd, I run into the following error: ib/python3.8/site-packages/scrapyd_client/deploy.py", line 23, in <module> from scrapy.utils.http import basic_auth_header ModuleNotFoundError: No module named 'scrapy.utils.http' The command I used to launch scrapyd is scrapyd-deploy local, with the settings assigned to local. All the libraries, including scrapy, scrapyd, and scrapyd-client, are installed on the system. Source: Python ..
