Commit graph

28 commits

Author SHA1 Message Date
1538ceeb6e 11 - Gate ProcessCrawlJob with per-domain politeness lock 2026-04-27 01:25:46 +02:00
7171348370 11 - Add PolitenessService and crawler delay config 2026-04-27 00:59:45 +02:00
69aa5d9d3e 10 - Add /bot page with crawler identity and opt-out instructions 2026-04-27 00:41:12 +02:00
c80be24e6e chore - Extract mockFetchPageAction helper in ProcessCrawlJobTest 2026-04-27 00:24:38 +02:00
3297c4bb3b 14 - Fix ProcessCrawlJob outcome write and status mapping bugs 2026-04-27 00:18:34 +02:00
720e4bcc1f 14 - Implement ProcessCrawlJob orchestration with retry logic 2026-04-26 23:50:57 +02:00
2a586ecac4 14 - Add PageCrawlObserver and ProcessCrawlJob skeleton 2026-04-26 21:15:07 +02:00
118de0023a 14 - Simplify page_crawls schema (queue moves to Redis) 2026-04-26 20:58:07 +02:00
6c0e1fe12d chore - Simplify call-site tests now that RegisterDiscoveredPageAction is unit-tested 2026-04-26 20:22:24 +02:00
649aeb3627 chore - Extract RegisterDiscoveredPageAction for shared Page::firstOrCreate logic 2026-04-26 20:18:18 +02:00
dda5b0f770 12 - Apply pr-reviewer follow-ups: validation, link filters, readonly VO, docs 2026-04-26 19:49:08 +02:00
35e1147823 12 - Add HTML content extraction (title, text, links, word count) 2026-04-26 19:35:04 +02:00
1b7fbbfd0c 12 - Add FetchPageAction with Http::fake-driven outcome paths 2026-04-26 17:56:13 +02:00
bb7906e193 12 - Make FetchResult fields nullable and add missing crawler config 2026-04-26 16:50:43 +02:00
a9f2d689ae 12 - Add crawler config and FetchResult value object 2026-04-26 16:45:07 +02:00
abbcedf2e7 12 - Add Rejected case to CrawlOutcomeEnum and PageStatusEnum 2026-04-26 16:35:46 +02:00
6f75be7328 8 - Tighten UrlService validation and add observer integration tests 2026-04-26 16:09:28 +02:00
de14ae3ad4 8 - Wire PageObserver to enqueue page_crawls on Page creation 2026-04-26 15:56:38 +02:00
81209125a1 8 - Add UrlService with host extraction method 2026-04-26 14:52:40 +02:00
f2c1fab4e4 7 - Add int casts on PageCrawl and tests for cascade-delete + pending scope 2026-04-26 14:23:13 +02:00
fe8ca7fc10 7 - Add page_crawls migration, PageCrawl model, factory, and Page relationships 2026-04-26 14:15:49 +02:00
9dd6d84d65 7 - Add CrawlOutcomeEnum for crawl attempt outcomes 2026-04-26 13:06:22 +02:00
b1b7adeacd 7 - Add language column to pages for crawler-detected language 2026-04-26 12:53:21 +02:00
43837a99db 5 - Add UrlSubmissionForm Livewire component with rate limiting 2026-04-26 11:58:51 +02:00
6b610b699e 4 - Drop status promotion in UrlDiscoveredListener; defer to keywords listener (CI / ci (push): failing after 3h0m0s) 2026-04-26 03:52:12 +02:00
3ad473f4a1 4 - Add UrlDiscoveredListener wiring fediverse polling to pages graph 2026-04-26 03:31:32 +02:00
424ad2ff78 4 - Add Page and PageLink models with factories and unit tests 2026-04-26 02:51:49 +02:00
1fe6ae5cff 1 - Install Laravel 13 with Livewire 2026-04-23 03:13:33 +02:00