Set UTF-8 locale in integration_test_setup.sh for all shell tests#28994
Draft
fmeum wants to merge 4 commits intobazelbuild:masterfrom
Draft
Set UTF-8 locale in integration_test_setup.sh for all shell tests#28994fmeum wants to merge 4 commits intobazelbuild:masterfrom
fmeum wants to merge 4 commits intobazelbuild:masterfrom
Conversation
Move the LC_ALL export from individual test files (loading_phase_test.sh, unicode_test.sh) into the shared integration_test_setup.sh so that all shell integration tests handle multi-byte characters correctly. Use C.UTF-8 on Linux and MSYS2 (Windows) where it is supported, and en_US.UTF-8 on macOS where C.UTF-8 is not available. The previous per-file fix used en_US.UTF-8 for all non-Linux platforms, but en_US.UTF-8 may not be available on MSYS2. Fixes bazelbuild#28924 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
The previous commit set LC_ALL=C.UTF-8 globally for all integration tests, which may have changed locale-dependent behavior (collation, numeric formatting) in tests that don't deal with Unicode. Instead: - Set only LC_CTYPE in integration_test_setup.sh to enable UTF-8 character classification without affecting other locale categories. - Keep LC_ALL in loading_phase_test.sh and unicode_test.sh which specifically test Unicode handling, but fix the platform detection: use is_darwin (not is_linux) so that Windows/MSYS2 gets C.UTF-8 instead of en_US.UTF-8, which may not be available on MSYS2. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The global LC_CTYPE/LC_ALL in integration_test_setup.sh may affect tests that don't deal with Unicode. Instead, set LC_ALL only in the specific test files that have Unicode tests: - loading_phase_test.sh (already had it, fix platform detection) - unicode_test.sh (already had it, fix platform detection) - ui_test.sh (new: for test_fancy_symbol_encoding) - target_pattern_file_test.sh (new: for test_target_pattern_file_unicode) The key platform detection fix: use is_darwin (not is_linux) to distinguish macOS from other platforms, so Windows/MSYS2 gets C.UTF-8 instead of en_US.UTF-8. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Revert to the same platform detection as the confirmed-working fix in loading_phase_test.sh: use is_linux to select C.UTF-8, fall through to en_US.UTF-8 for macOS and Windows/MSYS2 (where en_US.UTF-8 is available and confirmed working by kotlaja). This also reverts loading_phase_test.sh and unicode_test.sh to their original code. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Thank you for contributing to the Bazel repository! This pull request has been marked as stale since it has not had any activity in the last 30 days. It will be closed in the next 30 days unless any other activity occurs. If you think this PR is still relevant and should stay open, please post any comment here and the PR will no longer be marked as stale. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
LC_ALLexport from individual test files (loading_phase_test.sh,unicode_test.sh) into the sharedintegration_test_setup.shso that all shell integration tests handle multi-byte characters correctly.C.UTF-8on Linux and MSYS2 (Windows), anden_US.UTF-8on macOS (whereC.UTF-8is not available).en_US.UTF-8for all non-Linux platforms, buten_US.UTF-8may not be available on MSYS2, which could explain why the same fix didn't work forui_test.shon Windows.Fixes #28924
Test plan
ui_test.sh(test_fancy_symbol_encoding)target_pattern_file_test.sh(test_target_pattern_file_unicode)loading_phase_test.sh(Unicode tests)unicode_test.sh