
Scrapy CrawlerRunner: load new spiders at runtime

I'm running Scrapy spiders from a script every N hours. The spiders folder may be refreshed while the CrawlerRunner is still running. My problem is: how can I load new spiders from the spiders folder into an already running CrawlerRunner?

    from twisted.internet import reactor
    from twisted.internet.task import LoopingCall
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.log import configure_logging
    from scrapy.utils.project import get_project_settings

    def startProcess():
        configure_logging()
        runner = CrawlerRunner(get_project_settings())
        task = LoopingCall(lambda: def_process(runner))
        task.start(60 * 60)  # run every hour
        reactor.run()

    def def_process(runner: CrawlerRunner):
        if new_spider():
            runner.spider_loader.from_settings(get_project_settings())  # not working
        process()  # in a loop that yields runner.crawl('spider.name', sett=sett)

I tried runner.spider_loader.from_settings(get_project_settings()), but it has no effect. I also tried runner.create_crawler(SpiderClass), but how do I register the crawler returned by that method with the CrawlerRunner so that I can execute it as yield runner.crawl("new_spider_name", _config=config)?

Right now, yield runner.crawl("new_spider_name", _config=config) raises a "Spider not found" exception.
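One possible explanation, offered as a hedged sketch rather than a confirmed fix: SpiderLoader.from_settings() is a classmethod that builds and returns a *new* loader; it does not mutate the loader it is called on, so the freshly scanned registry is discarded unless you assign it back to the runner. The helper name refresh_spiders below is made up for illustration:

    from scrapy.crawler import CrawlerRunner
    from scrapy.spiderloader import SpiderLoader
    from scrapy.utils.project import get_project_settings

    def refresh_spiders(runner: CrawlerRunner) -> None:
        """Rebuild the runner's spider registry from the project settings.

        SpiderLoader walks the SPIDER_MODULES packages when it is
        constructed, so creating a new instance re-scans the spiders
        folder and should pick up modules added since the runner started.
        """
        runner.spider_loader = SpiderLoader.from_settings(get_project_settings())

After calling refresh_spiders(runner) inside def_process, yield runner.crawl("new_spider_name", _config=config) should resolve the name through the refreshed loader. One caveat: Python caches imported modules in sys.modules, so this picks up *new* spider files, while *edited* spider modules may additionally need importlib.reload.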



source https://stackoverflow.com/questions/75748924/scrapy-crawlerrunner-load-new-spiders-in-runtime
