mrregister: fix race condition in displacement update#3277
mrregister: fix race condition in displacement update#3277
Conversation
| } | ||
|
|
||
| private: | ||
| default_type &overall_max_norm_squared; |
There was a problem hiding this comment.
warning: member 'overall_max_norm_squared' of type 'default_type &' (aka 'double &') is a reference [cppcoreguidelines-avoid-const-or-ref-data-members]
default_type &overall_max_norm_squared;
^There was a problem hiding this comment.
I know we've used this approach throughout the code base, but I think it may have been criticised at some point. Is there something better, even if only for my own knowledge when I need to do similar in the future? My naive instinct would be std::shared_ptr<default_type>, but there's lots in the language I don't know about.
There was a problem hiding this comment.
In the code above, I did stick with using references to conform to our existing conventions, but perhaps we should change this. My opinion is that we should use raw pointers for this sort of thing. In particular, we should use const pointers:
// Cannot modify the pointer once set, but can modify value
default_type* const overall_max_norm_squared;
The "modern C++" recommendation is to avoid raw pointers only if they're owning, but I think it's perfectly legitimate to use them non-owning versions.
Shared pointers would be a mistake here I think. std::shared_ptr is about ownership, in this case the functor isn't an owner of the resource but just a reader.
There was a problem hiding this comment.
std::shared_ptris about ownership
Would this not be a circumstance in which to use std::weak_ptr<>? Wouldn't obviate mutexing but would better implicitly communicate that it's not accessed unless .lock() is called.
(again, happy to refactor all instances of such at a later date if a design pattern is accepted)
There was a problem hiding this comment.
std::weak_ptr would make sense if we have a std::shared_ptr already. In this case, we're dealing a local stack variable, so smart pointers aren't in the picture. If max_norm_squared was encapsulated in a shared pointer, then a weak_ptr may makes sense.
|
Even if the fault was discovered on |
|
OK will rebase this onto |
Replace the shared max norm lambda in scaling-and-squaring with a thread local functor reduction and mutex-protected merge.
7b81eb1 to
88ffd23
Compare
The following race condition was identified with TSAN:
The cause was this piece of code:
Here,
max_normis a local stack variable. The lambda captures it by reference thus the variable is subject to reads/writes from different threads without any synchronisation. This PR fixes the problem by replacing the shared max norm lambda with a thread local functor reduction and mutex-protected merge. Also rather than computing as square root inside the functor, we compute the squared norm and only at the end evaluate the square root.