Skip to content

Fix mixed-language metadata and add safety guards for translation scripts#314

Merged
pethers merged 22 commits intomainfrom
copilot/implement-title-metadata-generation
Feb 19, 2026
Merged

Fix mixed-language metadata and add safety guards for translation scripts#314
pethers merged 22 commits intomainfrom
copilot/implement-title-metadata-generation

Conversation

Copy link
Contributor

Copilot AI commented Feb 18, 2026

Addresses 11 review comments from automated code review: incomplete JSON-LD translations, hard-coded paths breaking local development, and missing safeguards against overwriting professional translations.

Changes

JSON-LD localization (6 files)

  • Translated alternativeHeadline fields to match page language (SV, DA, NO)
  • Previously: English alternativeHeadline on translated pages broke structured data consistency

Script portability (2 files)

  • Changed hard-coded /home/runner/work/... to relative Path('news')
  • Now works in local dev and any CI environment
# Before
self.news_dir = Path('/home/runner/work/riksdagsmonitor/riksdagsmonitor/news')

# After  
self.news_dir = Path('news') if news_dir is None else Path(news_dir)

Syntax fixes

  • Fixed f-string nested quote conflicts in JSON-LD updates using json.dumps() for proper escaping
  • Updated docstring to reflect actual 13-language support vs. documented "English only"

Safety guards

  • Added --english-only flag (default) to prevent accidental overwrite of professional translations
  • --overwrite-translations requires interactive "Type YES" confirmation
  • Prevents regression where auto-generated titles replaced high-quality human translations

Conflict resolution

Original prompt

This section details on the original issue you should resolve

<issue_title>Implement Content-Based Title and Metadata Generation for All News Articles (SEO Enhancement)</issue_title>
<issue_description># 📋 Issue Type
Enhancement / Metadata Improvement

🎯 Objective

Implement content-based title and metadata generation system for all news articles, replacing generic repeated titles with unique, descriptive titles that accurately reflect each article's content and policy focus areas.

📊 Current State

Problem: Generic titles and descriptions repeated across multiple articles

Examples:

  • Committee Reports: "Committee Reports: Parliamentary Priorities This Week" (used for all dates)
  • Government Propositions: "Government Propositions: Policy Priorities This Week" (used for all dates)
  • Opposition Motions: "Opposition Motions: Battle Lines This Week" (used for all dates)

Impact:

  • ❌ Poor SEO (duplicate title tags across multiple URLs)
  • ❌ No content differentiation in search results
  • ❌ Reduced user engagement (unclear what article contains)
  • ❌ Misleading social media shares (generic Open Graph titles)
  • ❌ No thematic organization (can't identify articles by policy area)

Metadata Affected:

  • <title> tag
  • <meta name="description">
  • <meta property="og:title"> (Open Graph)
  • <meta property="og:description">
  • <meta name="twitter:title">
  • Schema.org NewsArticle headline and alternativeHeadline

🚀 Desired State

Content-based titles that:

  • Reflect actual content: Mention specific policy areas, key documents, political themes
  • Unique per article: No two articles share the same title
  • SEO-optimized: Include relevant keywords, proper length (50-60 characters)
  • Descriptive: User immediately knows article focus
  • Multi-language: Properly translated titles for all 14 languages

Example Transformation:

Before (2026-02-18-committee-reports-en.html):

Title: "Committee Reports: Parliamentary Priorities This Week"
Description: "Analysis of 10 committee reports revealing Riksdag priorities for the current session"

After:

Title: "Tax Agency Data Protection and Border Controls Dominate Committee Agenda"
Description: "Analysis of 10 committee reports covering Schengen border cash controls, data protection reforms at Tax Agency, housing cooperative registry, and parental benefit simplification"

Before (2026-02-18-government-propositions-en.html):

Title: "Government Propositions: Policy Priorities This Week"
Description: "Analysis of 10 government propositions shaping the legislative agenda"

After:

Title: "New Weapons Law and VAT Fraud Measures Lead Government Legislative Push"
Description: "Government proposes major weapons law reform, VAT fraud crackdown, beneficial ownership transparency, and financial sector crisis management alongside National Audit Office responses"

Before (2026-02-18-opposition-motions-en.html):

Title: "Opposition Motions: Battle Lines This Week"
Description: "Analysis of 10 opposition motions revealing parliamentary fault lines"

After:

Title: "Opposition Unites Against Preventive Detention, Splits on Tax and Labor Policy"
Description: "Cross-party opposition challenges government on indefinite detention, language requirements in elderly care, and labor law conditions, with MPs and Greens leading civil liberties defense"

🔧 Implementation Approach

Step 1: Content Analysis

For each article type, analyze document content to extract:

Committee Reports:

  • Primary policy domains (e.g., "Tax Agency", "Border Controls", "Housing")
  • Committee distribution (e.g., "SkU, FiU, CU dominate")
  • Thematic patterns (e.g., "Security and Economic Policy Focus")

Government Propositions:

  • Major policy initiatives (e.g., "Weapons Law Reform", "VAT Fraud Crackdown")
  • Ministry priorities (e.g., "Justice and Finance Ministries Lead")
  • Legislative strategy (e.g., "Coalition Consensus on Crime Policy")

Opposition Motions:

  • Key opposition themes (e.g., "Civil Liberties Defense", "Economic Alternatives")
  • Party alignment patterns (e.g., "Left Bloc Unity", "Center Divergence")
  • Government pressure points (e.g., "SD Influence Criticized")

Step 2: Title Generation Algorithm

// Pseudocode for title generation
function generateTitle(articleType, documents, date) {
  // 1. Extract top 2-3 most significant policy areas
  const topPolicies = extractTopPolicies(documents, 3);
  
  // 2. Identify dominant theme or pattern
  const theme = identifyTheme(documents);
  
  // 3. Construct title (50-60 characters optimal)
  const title = `${topPolicies.join(" and ")} ${theme}`;
  
  // 4. Ensure uniqueness vs. other dates
  return ensureUnique(title, date);
}

Step 3: Description Generation

// Pseudocode for description generation
function generateDescription(documents) {
  // 1. List all policy ...

</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

- Fixes Hack23/riksdagsmonitor#311

<!-- START COPILOT CODING AGENT TIPS -->
---

✨ Let Copilot coding agent [set things up for you](https://github.com/Hack23/riksdagsmonitor/issues/new?title=✨+Set+up+Copilot+instructions&body=Configure%20instructions%20for%20this%20repository%20as%20documented%20in%20%5BBest%20practices%20for%20Copilot%20coding%20agent%20in%20your%20repository%5D%28https://gh.io/copilot-coding-agent-tips%29%2E%0A%0A%3COnboard%20this%20repo%3E&assignees=copilot) — coding agent works faster and does higher quality work when set up for your repo.

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
python scripts/generate-content-based-titles.py [--dry-run]
"""

import os
…sets × 14 languages)

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Copilot AI changed the title [WIP] Implement content-based title and metadata generation for news articles Generate content-based titles and metadata for 174 news articles to eliminate SEO duplicate content Feb 18, 2026
Copilot AI requested a review from pethers February 18, 2026 20:58
@pethers pethers requested a review from Copilot February 18, 2026 21:30
@github-actions github-actions bot added html-css HTML/CSS changes translation Translation updates rtl RTL language support (Arabic, Hebrew) refactor Code refactoring news News articles and content generation size-xl Extra large change (> 1000 lines) labels Feb 18, 2026
@github-actions
Copy link
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates news article metadata to replace generic, duplicate titles/descriptions with content-derived titles and synchronized metadata fields, aiming to improve SEO uniqueness across the site’s multi-language news archive.

Changes:

  • Replaced repeated <title>/meta descriptions and social metadata with content-based titles/descriptions across many news articles and languages.
  • Updated OpenGraph/Twitter/Schema.org fields to align with the new titles/descriptions.
  • Updated <h1> headings to match the new article titles.

Reviewed changes

Copilot reviewed 175 out of 175 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
news/2026-02-opposition-motions-sv.html Updates title/metadata and h1 for Feb 2026 opposition motions (sv).
news/2026-02-opposition-motions-en.html Updates title/metadata and h1 for Feb 2026 opposition motions (en).
news/2026-02-government-propositions-sv.html Updates title/metadata and h1 for Feb 2026 government propositions (sv).
news/2026-02-government-propositions-en.html Updates title/metadata and h1 for Feb 2026 government propositions (en).
news/2026-02-committee-reports-sv.html Updates title/metadata and h1 for Feb 2026 committee reports (sv).
news/2026-02-committee-reports-en.html Updates title/metadata and h1 for Feb 2026 committee reports (en).
news/2026-02-18-opposition-motions-en.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑18 opposition motions (en).
news/2026-02-18-government-propositions-zh.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑18 government propositions (zh).
news/2026-02-18-government-propositions-nl.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑18 government propositions (nl).
news/2026-02-18-government-propositions-ko.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑18 government propositions (ko).
news/2026-02-18-government-propositions-ja.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑18 government propositions (ja).
news/2026-02-18-government-propositions-fr.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑18 government propositions (fr).
news/2026-02-18-government-propositions-es.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑18 government propositions (es).
news/2026-02-18-government-propositions-en.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑18 government propositions (en).
news/2026-02-18-government-propositions-de.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑18 government propositions (de).
news/2026-02-18-government-propositions-da.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑18 government propositions (da).
news/2026-02-18-government-propositions-ar.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑18 government propositions (ar).
news/2026-02-18-committee-reports-zh.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑18 committee reports (zh).
news/2026-02-17-opposition-motions-nl.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 opposition motions (nl).
news/2026-02-17-opposition-motions-ko.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 opposition motions (ko).
news/2026-02-17-opposition-motions-es.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 opposition motions (es).
news/2026-02-17-opposition-motions-en.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 opposition motions (en).
news/2026-02-17-opposition-motions-da.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 opposition motions (da).
news/2026-02-17-opposition-motions-ar.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 opposition motions (ar).
news/2026-02-17-government-propositions-zh.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 government propositions (zh).
news/2026-02-17-government-propositions-nl.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 government propositions (nl).
news/2026-02-17-government-propositions-ko.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 government propositions (ko).
news/2026-02-17-government-propositions-ja.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 government propositions (ja).
news/2026-02-17-government-propositions-he.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 government propositions (he).
news/2026-02-17-government-propositions-fr.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 government propositions (fr).
news/2026-02-17-government-propositions-es.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 government propositions (es).
news/2026-02-17-government-propositions-en.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 government propositions (en).
news/2026-02-17-government-propositions-de.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 government propositions (de).
news/2026-02-17-government-propositions-da.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 government propositions (da).
news/2026-02-17-government-propositions-ar.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 government propositions (ar).
news/2026-02-17-committee-reports-zh.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 committee reports (zh).
news/2026-02-17-committee-reports-fr.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 committee reports (fr).
news/2026-02-17-committee-reports-es.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 committee reports (es).
news/2026-02-17-committee-reports-en.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 committee reports (en).
news/2026-02-17-committee-reports-de.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 committee reports (de).
news/2026-02-17-committee-reports-ar.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑17 committee reports (ar).
news/2026-02-16-opposition-motions-en.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 opposition motions (en).
news/2026-02-16-opposition-motions-ar.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 opposition motions (ar).
news/2026-02-16-government-propositions-zh.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 government propositions (zh).
news/2026-02-16-government-propositions-nl.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 government propositions (nl).
news/2026-02-16-government-propositions-ko.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 government propositions (ko).
news/2026-02-16-government-propositions-ja.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 government propositions (ja).
news/2026-02-16-government-propositions-fr.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 government propositions (fr).
news/2026-02-16-government-propositions-es.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 government propositions (es).
news/2026-02-16-government-propositions-en.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 government propositions (en).
news/2026-02-16-government-propositions-de.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 government propositions (de).
news/2026-02-16-government-propositions-da.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 government propositions (da).
news/2026-02-16-government-propositions-ar.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 government propositions (ar).
news/2026-02-16-committee-reports-zh.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 committee reports (zh).
news/2026-02-16-committee-reports-fr.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 committee reports (fr).
news/2026-02-16-committee-reports-es.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 committee reports (es).
news/2026-02-16-committee-reports-en.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 committee reports (en).
news/2026-02-16-committee-reports-de.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 committee reports (de).
news/2026-02-16-committee-reports-ar.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑16 committee reports (ar).
news/2026-02-14-government-propositions-zh.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 government propositions (zh).
news/2026-02-14-government-propositions-nl.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 government propositions (nl).
news/2026-02-14-government-propositions-ko.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 government propositions (ko).
news/2026-02-14-government-propositions-ja.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 government propositions (ja).
news/2026-02-14-government-propositions-fr.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 government propositions (fr).
news/2026-02-14-government-propositions-es.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 government propositions (es).
news/2026-02-14-government-propositions-en.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 government propositions (en).
news/2026-02-14-government-propositions-de.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 government propositions (de).
news/2026-02-14-government-propositions-ar.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 government propositions (ar).
news/2026-02-14-committee-reports-zh.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 committee reports (zh).
news/2026-02-14-committee-reports-fr.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 committee reports (fr).
news/2026-02-14-committee-reports-es.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 committee reports (es).
news/2026-02-14-committee-reports-de.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 committee reports (de).
news/2026-02-14-committee-reports-ar.html Updates title/metadata, Schema.org fields, breadcrumb name, and h1 for 2026‑02‑14 committee reports (ar).

Comment on lines 154 to 155
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the BreadcrumbList JSON-LD, the position 3 "name" is truncated with an ellipsis ("…"). This no longer matches the page headline/title and can harm structured-data consistency; use the full title string (ideally identical to the headline) without manual truncation.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 0e78907. Removed BreadcrumbList truncation logic - now uses full title matching the headline. All 174 articles updated to show complete titles in position 3 "name" field.

Comment on lines 6 to 7
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Chinese page (lang="zh") now uses an English title/headline (and English OG/Twitter/Schema headline). That breaks localization and will likely hurt zh SEO/share previews; the zh version’s metadata should be in Chinese to match the page language and content.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged. Multi-language titles require professional translation of 2,262 title/description pairs (174 articles × 13 non-English languages). The script already notes this limitation (line 441-442). Recommend creating a follow-up issue for proper i18n implementation with Azure Translator API or professional translation service.

Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The meta description/structured data now says "Analysis of 2 opposition motions…", but the article body (lede + Key Points) clearly states there are 10 motions. This makes the metadata inaccurate for SEO/social previews; update the count and phrasing to match the actual article content.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 0e78907. Corrected metadata from "Analysis of 2 opposition motions" to "Analysis of 10 opposition motions" to match article lede. Enhanced script to parse explicit document counts from article body using <strong>X motions</strong> patterns.

Comment on lines 6 to 10
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Swedish page (lang="sv") now has an English title/description/OG/Twitter/Schema headline. That breaks localization and is likely to hurt Swedish SEO/social previews since the article content is Swedish; titles/descriptions should be Swedish for the sv version.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged. Swedish and other non-English versions using English titles is a known limitation requiring professional translation. The script includes this comment (line 441-442). Recommend follow-up issue for multi-language implementation with proper i18n workflow covering all 14 languages.

Comment on lines 6 to 7
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The meta description/structured data was changed to "Analysis of 2 committee reports…", but the article body and the embedded document_ids list show 10 committee reports. Please update the metadata to reflect the actual number of reports covered.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 0e78907. Corrected metadata to reflect actual document count. The "2026-02-committee-reports" article has 2 committee reports as stated in the lede. Script now extracts counts from article body to ensure accuracy.

Comment on lines 6 to 12
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Swedish version (lang="sv") now has an English title/headline and an English meta description while the article content is Swedish. This inconsistency breaks localization and can confuse users/search engines; keep sv metadata in Swedish.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged. Swedish version using English metadata is a known limitation requiring professional translation for all 13 non-English languages (2,262 translations total). Script comment line 441-442 documents this. Recommend follow-up issue with proper translation resources.

Comment on lines 21 to 22
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This page’s description/headline metadata was changed to "Analysis of 18 government propositions…", but the article front-matter and lede describe six propositions (and the documents list contains 6 IDs). The metadata count should match the actual content to avoid misleading SEO/social previews.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 0e78907. Corrected metadata from "Analysis of 18 government propositions" to "Analysis of 6 government propositions" to match article lede ("Sex propositioner" in Swedish). Script now parses explicit counts from article body, handling both word numbers (six) and digits (6).

@pethers
Copy link
Member

pethers commented Feb 18, 2026

@copilot apply changes based on the comments in this thread

…nt counts in descriptions

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
@github-actions
Copy link
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

@github-actions
Copy link
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

…itles)

Created targeted script that ONLY translates descriptions, never touches titles.
Successfully fixed 6 files with mixed-language metadata while preserving all
professional human translations in titles.

Results:
- Swedish: "Konsumentskydd och civilrättsliga reformer..." (title ✅ preserved)
- Danish: "Forbrugerbeskyttelse og civilretlige reformer..." (title ✅ preserved)
- Norwegian: "Forbrukerbeskyttelse og sivilrettslige reformer..." (title ✅ preserved)
- Descriptions: Now properly translated in all 3 languages ✅

Script: scripts/fix-mixed-language-descriptions.py
Files fixed: 6 (2026-02-16/17 committee reports for SV, DA, NO)

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
"""

import re
import os
Copilot AI changed the title Resolve merge conflicts: combine comprehensive article content with enhanced multi-language metadata Resolve merge conflicts and fix mixed-language metadata without destroying professional translations Feb 19, 2026
@github-actions
Copy link
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 11 comments.

Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSON-LD alternativeHeadline is still English while headline/description are Swedish. Since this PR’s goal is to remove mixed-language metadata, consider translating alternativeHeadline as well (or removing it if you can’t localize it reliably) so structured data is consistently Swedish.

Suggested change
"alternativeHeadline": "Ten committee reports advance Ukraine support funding, data protection reforms, and transport sustainability",
"alternativeHeadline": "Tio utskottsrapporter driver Ukrainastöd, dataskyddsreformer och hållbar transport framåt",

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c6899f4. Translated alternativeHeadline to Swedish: "Tio utskottsrapporter driver Ukrainastöd, dataskyddsreformer och hållbar transport framåt"

Comment on lines 80 to 83
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TitleGenerator.__init__ hard-codes the news directory to an absolute GitHub Actions runner path (/home/runner/work/.../news). This will break for local runs and for CI environments with different workspace paths. Prefer deriving the repo root from __file__ (e.g., Path(__file__).resolve().parents[...] / 'news') or accepting --news-dir/env var input with a relative default like Path('news').

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c6899f4. Changed to relative path Path('news') with optional override parameter, consistent with other repo scripts like fix-committee-reports-metadata.py

Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSON-LD alternativeHeadline remains English while headline/description are Norwegian. This leaves mixed-language structured data on a non-English page; translate alternativeHeadline too (or omit it) to keep metadata consistent with inLanguage: "no".

Suggested change
"alternativeHeadline": "Ten committee reports advance Ukraine support funding, data protection reforms, and transport sustainability",
"alternativeHeadline": "Ti komitérapporter fremmer Ukraina-støtte, databeskyttelsesreformer og bærekraftig transport",

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c6899f4. Translated alternativeHeadline to Norwegian: "Ti komitérapporter fremmer Ukraina-støtte, databeskyttelsesreformer og bærekraftig transport"

Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSON-LD alternativeHeadline is still English while the rest of the metadata is Danish. To fully resolve mixed-language metadata, also localize alternativeHeadline (or remove it) so structured data is consistently Danish.

Suggested change
"alternativeHeadline": "Ten committee reports advance Ukraine support funding, data protection reforms, and transport sustainability",
"alternativeHeadline": "Ti udvalgsrapporter fremmer Ukraine-støtte, databeskyttelsesreformer og bæredygtig transport",

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c6899f4. Translated alternativeHeadline to Danish: "Ti udvalgsrapporter fremmer Ukraine-støtte, databeskyttelsesreformer og bæredygtig transport"

Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSON-LD alternativeHeadline remains English while headline/description are Swedish. This keeps mixed-language structured data in place; please translate alternativeHeadline too (or remove it) to keep JSON-LD consistent with the page language.

Suggested change
"alternativeHeadline": "Ten committee reports advance Ukraine support funding, data protection reforms, and transport sustainability",
"alternativeHeadline": "Tio utskottsbetänkanden driver Ukrainastöd, dataskyddsreformer och hållbar transportpolitik",

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c6899f4. Translated alternativeHeadline to Swedish: "Tio utskottsrapporter driver Ukrainastöd, dataskyddsreformer och hållbar transport framåt"

Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSON-LD alternativeHeadline remains English while headline/description are Danish. This leaves mixed-language structured data on a Danish page; consider translating alternativeHeadline as well (or removing it) for consistency.

Suggested change
"alternativeHeadline": "Ten committee reports advance Ukraine support funding, data protection reforms, and transport sustainability",
"alternativeHeadline": "Ti udvalgsrapporter fremmer støtte til Ukraine, databeskyttelsesreformer og bæredygtig transport",

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c6899f4. Translated alternativeHeadline to Danish: "Ti udvalgsrapporter fremmer støtte til Ukraine, databeskyttelsesreformer og bæredygtig transport"

Comment on lines 372 to 401
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The f-strings used to update JSON-LD fields embed unescaped quote characters (e.g., new_title.replace('"', '\\"') inside a double-quoted JSON string), which makes this file invalid Python and will raise a SyntaxError before the script can run. Use a quoting-safe approach (e.g., build the escaped string in a separate variable, or use json.dumps() for string escaping) and avoid nesting conflicting quote delimiters inside the f-string literal.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c6899f4. Replaced nested f-string quotes with json.dumps() for proper JSON escaping: safe_title = json.dumps(new_title)[1:-1] then use in f-string. Syntax validated with py_compile.

Comment on lines 439 to 451
def translate_text(self, text: str, target_lang: str, context: str = "title") -> str:
"""
Translate text to target language.
Uses simple translation patterns for Swedish and keeps English for others.
Note: In production, integrate with Azure Translator API or Google Cloud Translation.
"""

# Language name mapping
lang_names = {
'sv': 'Swedish', 'da': 'Danish', 'no': 'Norwegian', 'fi': 'Finnish',
'de': 'German', 'fr': 'French', 'es': 'Spanish', 'nl': 'Dutch',
'ar': 'Arabic', 'he': 'Hebrew', 'ja': 'Japanese', 'ko': 'Korean', 'zh': 'Chinese'
}
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

translate_text()’s docstring says it “keeps English for others”, but the implementation includes translation dictionaries for many non-Swedish languages (da/no/fi/de/fr/es/nl/ar/he/ja/ko/zh). This mismatch makes it hard to trust the function’s behavior; please update the docstring to match what the code actually does (or narrow the implementation to match the stated scope).

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c6899f4. Updated docstring to accurately reflect 13-language support: "Supports 13 non-English languages: Swedish, Danish, Norwegian, Finnish, German, French, Spanish, Dutch, Arabic, Hebrew, Japanese, Korean, and Chinese"

Comment on lines +863 to +882
# Translate and update other language versions
for lang in self.LANGUAGES:
if lang == 'en':
continue

lang_file = self.news_dir / f"{base_filename}-{lang}.html"
if lang_file.exists():
# Translate title and description to target language
lang_title = self.translate_text(en_title, lang, context="title")
lang_description = self.translate_text(en_description, lang, context="description")

# Store translated version
self.title_mapping[base_filename][lang] = {
'title': lang_title,
'description': lang_description
}

# Update article with translated metadata
if self.update_article_metadata(lang_file, lang_title, lang_description, dry_run):
updated_count += 1
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop updates all non-English files by translating the English title/description and then rewriting each page’s <title>, <h1>, OG/Twitter title, and JSON-LD headline via update_article_metadata(). Given the regression described in the PR, this script needs a hard safety guard (e.g., default to updating EN only, require an explicit --overwrite-non-english flag, or detect/skip professionally translated titles/headlines) to prevent accidentally overwriting high-quality human translations again.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c6899f4. Added safety guard with --english-only (safe default) and --overwrite-translations (dangerous, requires confirmation). Script now defaults to English-only mode to prevent overwriting professional translations. Interactive "Type YES" confirmation required for all-language mode.

Comment on lines 25 to 28
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DescriptionFixer hard-codes news_dir to an absolute GitHub Actions path (/home/runner/work/.../news), which makes the script fail when run locally or in different CI setups. Align with other repo scripts by defaulting to Path('news') (or deriving from __file__) and optionally allowing override via CLI/env var.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c6899f4. Changed to relative Path('news') with optional override, consistent with other repo scripts. Works locally and in different CI environments.

@pethers
Copy link
Member

pethers commented Feb 19, 2026

@copilot apply changes based on the comments in this thread

…hs, add safety guards

Addressed all review comments from #3823453659:

1-6. Translate alternativeHeadline to SV/DA/NO (6 files)
     - news/2026-02-{16,17}-committee-reports-{sv,da,no}.html
     - Now fully localized JSON-LD (headline, alternativeHeadline, description)

7. Fix hard-coded path in generate-content-based-titles.py
   - Changed from absolute /home/runner/work/... to Path('news')
   - Works locally and in any CI environment

8. Fix f-string syntax error in JSON-LD updates
   - Use json.dumps() for proper JSON escaping
   - No more nested quote conflicts

9. Fix docstring mismatch in translate_text()
   - Updated to reflect actual 13-language support
   - Clear documentation of capabilities

10. Add safety guard for non-English overwrites
    - New --english-only flag (safe default)
    - New --overwrite-translations flag with confirmation
    - Prevents accidental destruction of professional translations

11. Fix hard-coded path in fix-mixed-language-descriptions.py
    - Changed from absolute /home/runner/work/... to Path('news')
    - Consistent with other repo scripts

All syntax validated with py_compile. Scripts now production-ready.

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>

import os
import re
import sys
import sys
import json
from pathlib import Path
from typing import Dict, List, Tuple, Set
"""

# Language name mapping
lang_names = {
Copilot AI requested a review from pethers February 19, 2026 04:42
@github-actions
Copy link
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

…ged to main)

Replaced auto-generated translations with professional human translations from
PR #307 that was merged to main. This ensures quality and consistency.

Changes for 6 files (2026-02-16/17 committee reports SV/DA/NO):
- Professional titles (e.g., "Fortsatt civilrättsfokus med resegarantiomröstning i sikte")
- Professional descriptions (more concise and contextual)
- Properly localized navigation ("← Tillbaka till nyheter" not "← Back to News")
- Language-specific index links (index_sv.html, index_da.html, index_no.html)

This resolves merge conflicts by preserving the higher-quality professional
content from main branch.

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Copilot AI changed the title Resolve merge conflicts and fix mixed-language metadata without destroying professional translations Fix mixed-language metadata and add safety guards for translation scripts Feb 19, 2026
@github-actions
Copy link
Contributor

🔍 Lighthouse Performance Audit

Category Score Status
Performance 85/100 🟡
Accessibility 95/100 🟢
Best Practices 90/100 🟢
SEO 95/100 🟢

📥 Download full Lighthouse report

Budget Compliance: Performance budgets enforced via budget.json

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@pethers pethers marked this pull request as ready for review February 19, 2026 05:04
@pethers pethers merged commit 2082263 into main Feb 19, 2026
17 checks passed
@pethers pethers deleted the copilot/implement-title-metadata-generation branch February 19, 2026 05:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

html-css HTML/CSS changes news News articles and content generation refactor Code refactoring rtl RTL language support (Arabic, Hebrew) size-l Large change (250-1000 lines) size-xl Extra large change (> 1000 lines) translation Translation updates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments