
Fix rare case when crawler can get deadlocked #1694

@Pijukatel

Description

The crawler from this test https://github.com/apify/apify-sdk-python/blob/master/tests/integration/actor/test_crawlers_with_storages.py#L80 can become deadlocked and never finish:

Example failing CI run: https://github.com/apify/apify-sdk-python/actions/runs/21358210249/job/61470616028?pr=747#step:6:369
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> Status: RUNNING, Message: 
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:23.291Z ACTOR: Pulling container image of build zi7siweXp20fGFNlp from registry.
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:25.353Z ACTOR: Creating container.
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:25.539Z ACTOR: Starting container.
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:25.541Z ACTOR: Running under "LIMITED_PERMISSIONS" permission level.
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:27.756Z [apify._configuration] WARN  Actor is running on the Apify platform, `disable_browser_sandbox` was changed to True.
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:27.763Z [apify] INFO  Initializing Actor ({"apify_sdk_version": "3.1.1", "apify_client_version": "2.4.0", "crawlee_version": "1.3.0", "python_version": "3.10.19", "os": "linux"})
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.364Z [ParselCrawler] INFO  Current request statistics:
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.366Z ┌───────────────────────────────┬──────┐
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.368Z │ requests_finished             │ 0    │
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.370Z │ requests_failed               │ 0    │
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.372Z │ retry_histogram               │ [0]  │
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.374Z │ request_avg_failed_duration   │ None │
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.376Z │ request_avg_finished_duration │ None │
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.378Z │ requests_finished_per_minute  │ 0    │
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.380Z │ requests_failed_per_minute    │ 0    │
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.381Z │ request_total_duration        │ 0s   │
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.383Z │ requests_total                │ 0    │
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.385Z │ crawler_runtime               │ 0s   │
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.387Z └───────────────────────────────┴──────┘
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.431Z [crawlee._autoscaling.autoscaled_pool] INFO  current_concurrency = 0; desired_concurrency = 10; cpu = 0; mem = 0; event_loop = 0.0; client_info = 0.0
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:28.436Z [ParselCrawler] INFO  Crawled 0/1 pages, 0 failed requests, desired concurrency 10.
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:29.737Z [ParselCrawler] WARN  Retrying request to http://localhost:8080/ due to: Some error. File "/usr/src/app/src/main.py", line 26, in default_handler,     raise RuntimeError('Some error')
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:29.829Z [ParselCrawler] WARN  Retrying request to http://localhost:8080/ due to: Some error. File "/usr/src/app/src/main.py", line 26, in default_handler,     raise RuntimeError('Some error')
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:29.879Z [ParselCrawler] WARN  Retrying request to http://localhost:8080/ due to: Some error. File "/usr/src/app/src/main.py", line 26, in default_handler,     raise RuntimeError('Some error')
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:30.056Z [ParselCrawler] ERROR Request to http://localhost:8080/ failed and reached maximum retries
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:30.058Z  Traceback (most recent call last):
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:30.060Z   File "/usr/local/lib/python3.10/site-packages/crawlee/crawlers/_basic/_context_pipeline.py", line 114, in __call__
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:30.063Z     await final_context_consumer(cast('TCrawlingContext', crawling_context))
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:30.064Z   File "/usr/local/lib/python3.10/site-packages/crawlee/_utils/wait.py", line 37, in wait_for
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:30.066Z     return await asyncio.wait_for(operation(), timeout.total_seconds())
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:30.068Z   File "/usr/local/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:30.070Z     return fut.result()
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:30.072Z   File "/usr/local/lib/python3.10/site-packages/crawlee/router.py", line 108, in __call__
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:30.074Z     return await user_defined_handler(context)
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:30.076Z   File "/usr/src/app/src/main.py", line 26, in default_handler
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:30.077Z     raise RuntimeError('Some error')
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:30.079Z RuntimeError: Some error
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:38.599Z [ParselCrawler] INFO  Crawled 0/1 pages, 1 failed requests, desired concurrency 10.
[apify.python-sdk-tests-crawler-max-retries-generated-1U0Ejtq0 runId:VwSGval1HMpakuQEE] -> 2026-01-26T13:00:48.655Z [ParselCrawler] INFO  Crawled 0/1 pages, 1 failed requests, desired concurrency 10.

... after this point the crawler keeps running indefinitely, never finishing and never making any further requests.
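For context, here is a minimal sketch of what the failing Actor roughly does, reconstructed from the traceback above (a `ParselCrawler`, a `default_handler` in main.py that raises `RuntimeError('Some error')`, and a single `http://localhost:8080/` start URL). The exact configuration, in particular the `max_request_retries` value, is an assumption and may differ from the real test code:

```python
import asyncio

from apify import Actor
from crawlee.crawlers import ParselCrawler, ParselCrawlingContext


async def main() -> None:
    async with Actor:
        # max_request_retries is an assumption based on the test name
        # ("crawler-max-retries"); the real test may configure other options too.
        crawler = ParselCrawler(max_request_retries=3)

        @crawler.router.default_handler
        async def default_handler(context: ParselCrawlingContext) -> None:
            # Corresponds to main.py line 26 in the traceback above.
            raise RuntimeError('Some error')

        # Once the single request exhausts its retries, the crawler should
        # finish with one failed request. In the rare failing case it instead
        # keeps logging "Crawled 0/1 pages" forever.
        await crawler.run(['http://localhost:8080/'])


if __name__ == '__main__':
    asyncio.run(main())
```

With this setup the run is expected to terminate shortly after the "reached maximum retries" error, instead of repeating the "Crawled 0/1 pages, 1 failed requests" status line indefinitely.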
