python - Scrapy CrawlSpider Not Crawling
I am currently writing a spider to scrape musical instruments and their data.
For this I am using CrawlSpider, and the end result should take all of this data and store it in MongoDB documents keyed by model name. I have not finished that part, obviously, and I am not at that point yet.
Edit: I was able to fix the error and the spider now runs, but it crawls 0 pages and the CSV file it outputs to contains no data. What could be the problem?
Here is what I have:
    # -*- coding: utf-8 -*-
    import scrapy
    from scrapy.contrib.spiders import CrawlSpider, Rule
    from scrapy.selector import Selector
    from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
    from scrapy.item import Item

    class FenderSpider(CrawlSpider):
        name = "fender"
        allowed_domains = ["example.org/"]
        start_urls = (
            'http://www.example.org/fender/?ob=model_asc#results',
        )
        rules = (
            Rule(SgmlLinkExtractor(allow=('item&amp;amp;pn=*',)), callback='parse_item'),
        )

        def parse_item(self, response):
            item = scrapy.Item()
            item['data'] = response.xpath('//span[@class="itemResult"]/text()').extract()
            return item
Here is my items file:
    # -*- coding: utf-8 -*-

    # Define here the models for your scraped items
    #
    # See documentation in:
    # http://doc.scrapy.org/en/latest/topics/items.html

    import scrapy


    class MdbItem(scrapy.Item):
        # define the fields for your item here like:
        # name = scrapy.Field()
        name = 'MdbItem'
        items = scrapy.Field()
        # company = scrapy.Field()
        # model = scrapy.Field()
        # model_name = scrapy.Field()
        # instrument_type = scrapy.Field()
        # year = scrapy.Field()
        # serial = scrapy.Field()
        # sku = scrapy.Field()
Everything runs fine, but no data is scraped. I do not understand why.
Can anyone help? I am just learning both Python and Scrapy, so I am a newbie.
Replace the spider's base class with
> CrawlSpider
which has already been imported from scrapy.contrib.spiders.
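Another thing that may be worth checking is the trailing slash in allowed_domains. As far as I understand it, Scrapy compares the hostname of each followed link against the listed domains, so an entry like "example.org/" can never equal a real hostname and every extracted link would be dropped as off-site. The snippet below is only a rough standard-library approximation of that comparison, not Scrapy's own code; host_allowed is a hypothetical helper name made up for this illustration.

```python
import re
from urllib.parse import urlparse


def host_allowed(url, allowed_domains):
    """Approximate check (an assumption, not Scrapy's code): the URL's
    hostname must equal an allowed domain or be a subdomain of it."""
    host = urlparse(url).hostname or ""
    pattern = r"^(.*\.)?(%s)$" % "|".join(re.escape(d) for d in allowed_domains)
    return re.search(pattern, host) is not None


url = "http://www.example.org/fender/?ob=model_asc#results"

# With the trailing slash, the hostname "www.example.org" cannot match.
print(host_allowed(url, ["example.org/"]))  # False

# With a bare domain name, the hostname matches as a subdomain.
print(host_allowed(url, ["example.org"]))   # True
```

If that turns out to be the cause, changing the entry to "example.org" should let the rule's links through. The allow pattern and the item fields would still need to be checked separately.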