Crawler: robots.txt handling #9

Closed
opened 2026-04-23 02:28:04 +02:00 by myrmidex · 0 comments
Owner

Fetch robots.txt, parse it (User-agent, Allow/Disallow, Crawl-delay), and cache the parsed rules per domain with a TTL. Honor the rules on every fetch. Use an existing PHP library if viable; otherwise write a minimal implementation.
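
If no library turns out to be viable, the "minimal implementation" fallback might look roughly like the sketch below. It is only an illustration under stated assumptions: PHP 8.0+, plain prefix matching (no `*` or `$` wildcards), longest-match precedence with Allow winning ties, and a fetcher passed in as a callable so the HTTP layer and error handling stay out of scope. The names `parseRobots`, `isAllowed`, `RobotsCache`, and the `trove-bot` agent string are hypothetical and not taken from the codebase.

```php
<?php
declare(strict_types=1);

/** Parse a robots.txt body and return the group that applies to $agent (hypothetical helper). */
function parseRobots(string $body, string $agent): array
{
    $groups = [];     // lowercased user-agent token => ['rules' => [], 'delay' => null]
    $current = [];    // tokens of the group currently being read
    $inRules = false; // true once the current group has seen a rule line

    foreach (explode("\n", str_replace(["\r\n", "\r"], "\n", $body)) as $line) {
        $line = trim(explode('#', $line, 2)[0]); // strip comments
        if ($line === '' || !str_contains($line, ':')) {
            continue;
        }
        [$field, $value] = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);

        if ($field === 'user-agent') {
            if ($inRules) { // a User-agent line after rules starts a new group
                $current = [];
                $inRules = false;
            }
            $token = strtolower($value);
            $current[] = $token;
            $groups[$token] ??= ['rules' => [], 'delay' => null];
        } elseif ($current !== [] && ($field === 'allow' || $field === 'disallow')) {
            $inRules = true;
            if ($value === '') {
                continue; // an empty value restricts nothing
            }
            foreach ($current as $token) {
                $groups[$token]['rules'][] = [$field === 'allow', $value];
            }
        } elseif ($current !== [] && $field === 'crawl-delay' && is_numeric($value)) {
            $inRules = true;
            foreach ($current as $token) {
                $groups[$token]['delay'] = (float) $value;
            }
        }
    }

    // Assumption: the longest token contained in the agent name wins, else the "*" group.
    $agent = strtolower($agent);
    $best = null;
    foreach ($groups as $token => $unused) {
        $token = (string) $token;
        if ($token !== '*' && $token !== '' && str_contains($agent, $token)
            && ($best === null || strlen($token) > strlen($best))) {
            $best = $token;
        }
    }
    return $groups[$best ?? '*'] ?? ['rules' => [], 'delay' => null];
}

/** Longest matching prefix decides; Allow wins ties; no matching rule means allowed. */
function isAllowed(array $group, string $path): bool
{
    $allowed = true;
    $bestLen = -1;
    foreach ($group['rules'] as [$allow, $prefix]) {
        $len = strlen($prefix);
        if (str_starts_with($path, $prefix) && ($len > $bestLen || ($len === $bestLen && $allow))) {
            $allowed = $allow;
            $bestLen = $len;
        }
    }
    return $allowed;
}

/** In-memory per-domain cache with a TTL in seconds (hypothetical class). */
final class RobotsCache
{
    private array $entries = []; // host => ['group' => array, 'expires' => int]

    public function __construct(private $fetch, private string $agent, private int $ttl = 3600)
    {
    }

    public function groupFor(string $host, ?int $now = null): array
    {
        $now ??= time();
        $entry = $this->entries[$host] ?? null;
        if ($entry === null || $entry['expires'] <= $now) {
            // $fetch is a caller-supplied callable returning the robots.txt body for $host.
            $entry = ['group' => parseRobots(($this->fetch)($host), $this->agent), 'expires' => $now + $this->ttl];
            $this->entries[$host] = $entry;
        }
        return $entry['group'];
    }
}

// Usage with a stubbed fetcher; a real one would request https://{host}/robots.txt.
$stub = fn (string $host): string => "User-agent: *\nDisallow: /private/\nAllow: /private/public/\nCrawl-delay: 2\n";
$cache = new RobotsCache($stub, 'trove-bot');
$group = $cache->groupFor('example.com');
var_dump(isAllowed($group, '/private/x'));        // bool(false)
var_dump(isAllowed($group, '/private/public/x')); // bool(true)
```

Calling `groupFor()` before each request and checking `isAllowed()` covers the "honor on every fetch" requirement without refetching robots.txt until the TTL expires. The `delay` value is returned as parsed; how the crawler schedules around it is left open here.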
myrmidex added this to the v0.1 milestone 2026-04-23 02:28:04 +02:00
myrmidex added the enhancement label 2026-04-26 01:28:09 +02:00
Reference: lvl0/trove#9