Add fields missing from facets in observation response #694
Add fields missing from facets in observation response #694kmoscoe merged 84 commits intodatacommonsorg:masterfrom
Conversation
Co-authored-by: Christie Ellks <calinc@google.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Code Review
This pull request updates the Python and REST observation API documentation by adding scaling_factor, unit, and is_dc_aggregate fields to facet objects and transitioning examples to median household income data. The review feedback highlights multiple syntax errors in the updated examples, such as missing commas and brackets, and suggests using camelCase for the new fields to ensure consistency with existing API conventions.
| "scaling_factor": "..." | ||
| "unit": "..." |
There was a problem hiding this comment.
The JSON example is missing commas at the end of the lines, which makes the structure invalid. Additionally, please note that the preceding line (observationPeriod) also requires a comma to accommodate these new fields.
| "scaling_factor": "..." | |
| "unit": "..." | |
| "scaling_factor": "...", | |
| "unit": "..." |
| "provenanceUrl": "...", | ||
| "measurementMethod": "...", | ||
| "observationPeriod": "..." | ||
| "scaling_factor": "..." |
There was a problem hiding this comment.
The new fields scaling_factor and is_dc_aggregate use snake_case, which is inconsistent with the existing camelCase fields in the same object (e.g., importName, measurementMethod, observationPeriod). Unless there is a specific requirement for snake_case in these new fields, it is recommended to use camelCase (scalingFactor, isDcAggregate) to maintain consistency across the API documentation and response structures.
| "scaling_factor": "..." | ||
| "unit": "..." |
| "scaling_factor": "..." | ||
| "unit": "..." |
| "scaling_factor": "..." | ||
| "unit": "..." |
| "scaling_factor": "..." | ||
| "unit": "..." |
| "scaling_factor": "..." | ||
| "unit": "..." |
| | observations | list of objects | Date and value pairs for the observations made in the time period | | ||
| | facets | object | Various properties of reported facets, where available, including the provenance of the data, etc. | | ||
| | facets | object | Various properties of reported facets, where available. | | ||
| | facets.importName | string | The name of the [provenance](/data_model.html#sources) or [dataset](/data_model.html#sources). | |
There was a problem hiding this comment.
would this ever be the dataset name? (I think it's only provenance, but not sure of any counterexamples)
There was a problem hiding this comment.
You're right that it is technically the "provenance" name. However, in many (most?) the provenance and dataset are the same. Since "provenance" is often misunderstood, I used "dataset" as a more intuitive name. But if you think it's bad to make that shortcut, I can change it.
| | facets.observationPeriod | string | The time period over which the observations were recorded, in [ISO 8601 duration format](https://docs.digi.com/resources/documentation/digidocs/90001488-13/reference/r_iso_8601_duration_format.htm). Not returned if unset. | | ||
| | facets.scaling_factor | integer | The denominator used in variables representing percentages or ratios. Not returned if unset. | | ||
| | facets.unit | string | The unit of measurement used. Not returned if unset. | | ||
| | facets.is_dc_aggregate | boolean | Set to true for variables that are auto-generated by Data Commons to aggregate data by place hierarchies. Not returned if false. | |
There was a problem hiding this comment.
place hierarchy is only one type of aggregation (for example we also aggregate multiple stat vars together or aggregate events into time series)
There was a problem hiding this comment.
This is a good point. Is there a typical time interval we use for these? Or does it vary, and it's encoded in the variable name? Can you give me an example of one? (I'll also include an example of a place-based one.)
| "provenanceUrl": "...", | ||
| "measurementMethod": "...", | ||
| "observationPeriod": "..." | ||
| "importName": "<var>DATASET_NAME</var>", |
There was a problem hiding this comment.
(see other comment) not sure if we're explicitly distinguishing between provenance and dataset
There was a problem hiding this comment.
Yeah, see my above comment. Provenance is such a confusing concept that I hate to use it. But if you think the inaccuracy is bad, I will fix it.
This PR corrects an oversight that neglected to include 3 extra optional fields in the facet returned in the observation API response. It also changes one of the examples (in both REST and Python) to a different kind of statvar that shows units.