Convert Belga to real RSS feed parsing #78

Closed
opened 2026-03-08 10:12:57 +01:00 by myrmidex · 1 comment
Owner

Summary

Belga News Agency offers an RSS feed at https://www.belganewsagency.eu/feed. Currently Belga is configured as type: 'website' (fixed in #37) and uses HTML scraping. Convert it to use the RSS parsing infrastructure built in #37.

Tasks

  • Update Belga provider config to type: 'rss' with RSS feed URL
  • Verify existing Belga articles still parse correctly via RSS
  • Delete Belga HTML parser classes (BelgaHomepageParser, BelgaHomepageParserAdapter, BelgaArticleParser, BelgaArticlePageParser) — recoverable from git history if ever needed
  • Write tests

Dependencies

  • #37 (RSS parsing infrastructure)
myrmidex added this to the v1.2.0 milestone 2026-03-08 10:12:57 +01:00
myrmidex self-assigned this 2026-03-08 16:15:18 +01:00
Author
Owner

Implemented in 0bb1072.

Changes:

  • Config: belga.type → rss, URL → https://www.belganewsagency.eu/feed, removed homepage parser
  • Deleted: BelgaHomepageParser, BelgaHomepageParserAdapter
  • Updated: FeedFactory::belga() state, test expectations in CreateFeedActionTest and FeedsControllerTest
  • Added: Belga-specific RSS test in ArticleFetcherRssTest
  • Verified live: both Belga (10 items) and Guardian (121 items) RSS feeds return valid XML

Note: BelgaArticleParser and BelgaArticlePageParser were kept — they're still needed by ArticleFetcher::fetchArticleData() to extract content from individual article pages. Only the homepage discovery parsers were removed.
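
For illustration only, the resulting Belga entry might look roughly like the fragment below. This is a sketch, not the contents of 0bb1072: the `providers` and `parsers` keys, the `article` and `article_page` key names, and the `App\Parsers` namespace are all assumptions. It also assumes the retained article parsers are registered in the same config block, which the comment above does not confirm. Only the `rss` type, the feed URL, the removal of the `homepage` parser, and the two class names come from this issue.

```php
<?php

// Hypothetical config excerpt. Key names and the namespace are assumed,
// not taken from the repository.
return [
    'providers' => [
        'belga' => [
            // Was 'website'; article discovery now goes through the RSS feed.
            'type' => 'rss',
            'url' => 'https://www.belganewsagency.eu/feed',
            'parsers' => [
                // The 'homepage' entry is gone: the feed replaces homepage scraping.
                // The article-level parsers stay, since content is still
                // extracted from the individual article pages.
                'article' => \App\Parsers\BelgaArticleParser::class,
                'article_page' => \App\Parsers\BelgaArticlePageParser::class,
            ],
        ],
    ],
];
```

Under these assumptions, the feed only supplies the list of article URLs, while `ArticleFetcher::fetchArticleData()` keeps using the two remaining parsers for each page.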

Reference: lvl0/fedi-feed-router#78