Conversation
bshand
left a comment
There was a problem hiding this comment.
I think this needs an ensure block, or a little reworking to simplify the logic. Further details in the code comment.
lib/ndr_import/table.rb
Outdated
| # Keep a copy of the original column mappings for use later if columns are mutated | ||
| original_columns = @columns.deep_dup | ||
| @columns_mutated = false | ||
| last_col = last_column_to_transform | ||
| skip_footer_lines(lines, footer_lines).each do |line| | ||
| line.is_a?(Array) ? process_line(line[0..last_col], &block) : process_line(line, &block) | ||
| end | ||
|
|
||
| # @columns may have been mutated where column is a regular expression. | ||
| # We want to restore `@columns` back to its original state, so the column regexp | ||
| # will work on the next file | ||
| @columns = original_columns if @columns_mutated | ||
| # Also ensure that @masked_mappings are recalculated if @columns has been mutated | ||
| @masked_mappings = nil if @columns_mutated | ||
|
|
There was a problem hiding this comment.
This feels clunky.
Can we rather set @original_columns = @columns in the initializer, and then always set @columns = @original_columns.deep_dup in this method?
Alternatively, the last block (lines 62-67) should be in an ensure block, in case the earlier code fails.
Is it worth keeping the @columns_mutated flag? It seems like a lot of extra complexity for a small performance optimisation. Do we ever call #transform inside a tight loop?
There was a problem hiding this comment.
Thanks @bshand - some changes made in e72a86a
- Removed
@columns_mutatedflag - Set
@original_columnsin the initializer.
and then always set @columns = @original_columns.deep_dup in this method?
I needed to use .deep_dup in the initializer when setting @original_columns, otherwise @original_columns was mutated when @columns was mutated.
There was a problem hiding this comment.
Hmm, this is breaking some downstream tests when pointing at this branch, I'll investigate
| # @columns may have been mutated where column is a regular expression. | ||
| # We want to restore `@columns` back to its original state, so the column regexp | ||
| # will work on the next file | ||
| @columns = @original_columns |
There was a problem hiding this comment.
This should be @columns = @original_columns.deep_dup otherwise it will work twice but not 3 times.
Maybe also move this to immediately after @notifier.try(:started) ?
There was a problem hiding this comment.
Yes, I've moved it up to @notifier.try(:started) locally, but lot of tests are erroring.
Despite instance_variable_set("@#{key}", options[key], it seems like @columns is not set in the initializer, so @original_columns is nil.
There was a problem hiding this comment.
A downstream issue appears to be loading NdrImport::Table classes via YAML; this doesn't hit the initializer.
This PR fixes two bugs.
vcf_file_metadatawas not defined as anattr_readeronVcf::Table.Vcf::Tableupdated accordingly.@columnsremained mutated after the table was transformed, so would then cause a header row error on subsequent files.The fix was to reset
@columnsback to its original state after the table has been transformed. We also ensure that@masked_mappingsare reculated if@columnshave been mutated. The test added inUniversalImporterHelperTestproves that mutated@columnsdo not persist beyond the transforming of a table.