Changed output format and tmcf for import Eurostatdata_lifeexpectancy by niveditasing · Pull Request #1943 · datacommonsorg/data

niveditasing · 2026-04-02T11:48:43Z

No description provided.

gemini-code-assist

Code Review

This pull request refactors the Eurostat life expectancy data pipeline by simplifying the TMCF file into a single template node and overhauling the preprocessing script. The Python script has been updated to produce a long-format CSV, improve StatVar and place mapping logic, and remove the automated download functionality. Feedback includes removing an unused re import, cleaning up internal notes within the code comments, and simplifying a redundant argument in a string split operation.
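For readers unfamiliar with the term, a minimal sketch of a wide-to-long reshape with `pandas.DataFrame.melt`. The PR's actual reshaping code is not shown on this page, so the column names (`geo`, `year`, `value`) and the numbers below are made up for illustration and may not match the script.

```python
import pandas as pd

# Made-up wide table: one row per region, one column per year.
# Column names and values are illustrative only, not from the PR.
wide_df = pd.DataFrame({
    "geo": ["AT11", "BE10"],
    "2019": [81.9, 81.5],
    "2020": [81.2, 80.1],
})

# Long format: one row per (region, year) observation.
long_df = wide_df.melt(id_vars="geo", var_name="year", value_name="value")
```

Each row of `long_df` then carries a single observation, which is the shape a single-template TMCF node would typically map column by column.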

gemini-code-assist · 2026-04-02T11:54:46Z

scripts/eurostat/regional_statistics_by_nuts/life_expectancy/preprocess.py

 import pandas as pd
-from six.moves import urllib
+import numpy as np
 import re


The re module is imported but no longer explicitly used in the refactored code. Pandas' str.extract handles regex internally without requiring this import.
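A small sketch of the point above: `Series.str.extract` takes the pattern as a plain string, so the calling module needs no `import re` of its own. The sample keys and the pattern are assumptions loosely modelled on Eurostat's `unit,sex,age,geo` layout, not code from the PR.

```python
import pandas as pd

# Hypothetical combined key column; the values are invented for illustration.
keys = pd.Series(["YR,F,Y1,AT11", "YR,M,Y_LT1,BE10"])

# The pattern is passed as a string; pandas compiles it internally.
# Named groups become the column names of the resulting DataFrame.
parts = keys.str.extract(
    r"^(?P<unit>[^,]+),(?P<sex>[^,]+),(?P<age>[^,]+),(?P<geo>[^,]+)$")
```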

gemini-code-assist · 2026-04-02T11:54:46Z

scripts/eurostat/regional_statistics_by_nuts/life_expectancy/preprocess.py

+    # But wait, freq is stripped in the original preprocess.py?
+    # data['unit,sex,age,geo\time'] = data['unit,sex,age,geo\time'].str.slice(2)
+    # Let's check the first column content


These comments appear to be internal notes or monologue and should be removed to maintain code cleanliness.

gemini-code-assist · 2026-04-02T11:54:47Z

scripts/eurostat/regional_statistics_by_nuts/life_expectancy/preprocess.py

+    if pd.isna(entry) or entry == ':':
+        return np.nan
+    if isinstance(entry, str):
+        entry = entry.split(' ', maxsplit=-1)[0]
+        if entry == ':':
+            return np.nan
+        try:
+            return float(entry)
+        except ValueError:
+            return np.nan
+    return entry


The maxsplit=-1 argument in split() is the default behavior and can be omitted for brevity. The logic for handling the ':' character and converting to float is maintained for correctness.

Suggested change

-    if pd.isna(entry) or entry == ':':
-        return np.nan
-    if isinstance(entry, str):
-        entry = entry.split(' ', maxsplit=-1)[0]
-        if entry == ':':
-            return np.nan
-        try:
-            return float(entry)
-        except ValueError:
-            return np.nan
-    return entry
+def obtain_value(entry):
+    """Extract value from entry."""
+    if pd.isna(entry) or entry == ':':
+        return np.nan
+    if isinstance(entry, str):
+        entry = entry.split(' ')[0]
+        if entry == ':':
+            return np.nan
+        try:
+            return float(entry)
+        except ValueError:
+            return np.nan
+    return entry
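A quick, hedged check of the claim in this comment: the function below is the suggested `obtain_value`, and the sample entries (a value followed by a flag letter, a `:` placeholder) are assumptions about what the Eurostat cells look like rather than data taken from the PR.

```python
import numpy as np
import pandas as pd


def obtain_value(entry):
    """Extract value from entry."""
    if pd.isna(entry) or entry == ':':
        return np.nan
    if isinstance(entry, str):
        # Keep only the token before the first space (drops a trailing flag).
        entry = entry.split(' ')[0]
        if entry == ':':
            return np.nan
        try:
            return float(entry)
        except ValueError:
            return np.nan
    return entry


# maxsplit=-1 is the default, so both calls split identically.
sample = "80.5 b p"
same_split = sample.split(' ', maxsplit=-1) == sample.split(' ')

# Illustrative inputs only.
with_flag = obtain_value("80.5 b")
missing = obtain_value(": c")
numeric = obtain_value(81.2)
```

Here `missing` is NaN, so it needs `pd.isna` rather than `==` when compared.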

niveditasing added 5 commits April 1, 2026 09:55

changed the output csv format to get all data

8f4d3df

changed the output csv format to get all data

b415f56

added logic to skip HTML file returned by the server

25b5bc4

fixed mistakenly changed file

b822349

fixed code

556d21a

gemini-code-assist bot reviewed Apr 2, 2026

code fix

7927668

niveditasing wants to merge 6 commits into datacommonsorg:master from niveditasing:Changed_output_format_and_tmcf
