changed the output csv format to get all data #1941

Open
niveditasing wants to merge 2 commits into datacommonsorg:master from niveditasing:Changed_output_format


@niveditasing niveditasing commented Apr 1, 2026

Code changes:

Switched to a groupby and concat method that combines the data with an outer join, which is more robust to missing values.

The processing logic now supports sparse data: incomplete 2024 entries are kept instead of triggering row deletions.

Each Statistical Variable is now processed as a subset and joined horizontally, so a lack of data in one category does not affect data availability in another.
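The approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in preprocess.py: the column names (`geo`, `year`, `stat_var`, `value`), the StatVar names, the helper `widen`, and the sample values are all hypothetical, and the sketch assumes each (geo, year) pair appears at most once per Statistical Variable.

```python
import pandas as pd

# Hypothetical long-format input: one row per (geo, year, StatVar) observation.
# All names and values here are made up for illustration only.
long_df = pd.DataFrame({
    "geo": ["AT", "AT", "AT", "BE", "BE"],
    "year": [2023, 2024, 2023, 2023, 2023],
    "stat_var": [
        "LifeExpectancy_Male", "LifeExpectancy_Male", "LifeExpectancy_Female",
        "LifeExpectancy_Male", "LifeExpectancy_Female",
    ],
    "value": [79.0, 79.4, 84.0, 80.1, 84.3],
})


def widen(df: pd.DataFrame) -> pd.DataFrame:
    """Build one column per StatVar and join the columns with an outer join."""
    subsets = []
    for stat_var, group in df.groupby("stat_var"):
        # Each StatVar becomes its own Series indexed by (geo, year).
        series = group.set_index(["geo", "year"])["value"].rename(stat_var)
        subsets.append(series)
    # join="outer" keeps every (geo, year) key seen in any subset, so a gap
    # in one StatVar shows up as NaN instead of removing the whole row.
    wide = pd.concat(subsets, axis=1, join="outer")
    wide = wide.rename_axis(["geo", "year"]).sort_index()
    return wide.reset_index()


wide_df = widen(long_df)
print(wide_df)
```

In this sample, the 2024 row for AT has a value only for the male variable; the outer join keeps that row and leaves the female column empty, whereas an inner join would have dropped it.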

TMCF changes: Because the output format has changed, the TMCF has been updated for all rows.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the Eurostat life expectancy data processing pipeline by updating the preprocess.py script and simplifying the associated TMCF file. The script has been rewritten to transform input data into a long-format CSV, dynamically generating StatVar DCIDs from age and sex dimensions and mapping geographic codes to standardized DCIDs. Consequently, the TMCF has been reduced to a single template node to accommodate this new data structure. I have no feedback to provide.

@niveditasing niveditasing requested a review from saanikaaa April 3, 2026 10:19
@HarishC727
LGTM

2 participants