Add possibility to configure Run with entire Downloader, ItemPipeline and Processor instances #67

Open: pavlokomarov wants to merge 1 commit into main

@pavlokomarov pavlokomarov commented Oct 2, 2022

The purpose of this PR is to allow reusable configurations for Downloader, ItemPipeline and Processor objects.

I use Roach with the Symfony bundle https://github.com/Ne-Lexa/roach-php-bundle, and currently I have to duplicate the pipeline and middleware configuration for each spider in my application. I think my proposal avoids duplication like this:

    roach.run.first:
        parent: roach.run.base
        arguments:
            $downloaderMiddleware:
                - '@roach.downloader_middleware.first'
                - '@roach.downloader_middleware.second'
            $itemProcessors:
                - '@roach.item_processor.first'
                - '@roach.item_processor.second'
                
    roach.run.second:
        parent: roach.run.base
        arguments:
            $downloaderMiddleware:
                - '@roach.downloader_middleware.first'
                - '@roach.downloader_middleware.second'
            $itemProcessors:
                - '@roach.item_processor.first'
                - '@roach.item_processor.second'

and allows this:

    roach.downloader: ~
    roach.item_pipeline: ~

    roach.run.first:
        ~
        arguments:
            $downloader: '@roach.downloader'
            $itemPipeline: '@roach.item_pipeline'
    roach.run.second:
        ~
        arguments:
            $downloader: '@roach.downloader'
            $itemPipeline: '@roach.item_pipeline'

To avoid a BC break, I have extended the Run fields and added the possibility to configure the Engine either with an array of middlewares/processors or with entire instances of Downloader/ItemPipeline/Processor.
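
As a rough sketch of what this backward compatibility could look like in a Symfony service configuration: one run keeps the existing array arguments while another uses the shared instances. The service ids `roach.run.legacy` and `roach.run.shared` are hypothetical, the new argument names are assumed to be `$downloader` and `$itemPipeline`, and `roach.downloader` and `roach.item_pipeline` are assumed to be defined elsewhere as in the example above.

    # Existing style (assumed to keep working unchanged): arrays of middleware and processors.
    roach.run.legacy:            # hypothetical service id
        parent: roach.run.base
        arguments:
            $downloaderMiddleware:
                - '@roach.downloader_middleware.first'
            $itemProcessors:
                - '@roach.item_processor.first'

    # New style: entire instances that can be shared between runs.
    roach.run.shared:            # hypothetical service id
        parent: roach.run.base
        arguments:
            $downloader: '@roach.downloader'        # assumed argument name
            $itemPipeline: '@roach.item_pipeline'   # assumed argument name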

add possibility to configure Run with entire Downloader, ItemPipeline and Processor objects
@pavlokomarov pavlokomarov changed the title Add possibility to configure Run with entire Downloader, ItemPipeline and Processor objects Add possibility to configure Run with entire Downloader, ItemPipeline and Processor instances Oct 2, 2022
@ksassnowski ksassnowski self-requested a review as a code owner March 10, 2023 15:38
@ksassnowski ksassnowski force-pushed the main branch 3 times, most recently from cce5cf5 to 33eb25e Compare March 27, 2023 09:23