Set UTF-8 locale in integration_test_setup.sh for all shell tests #28994

Draft
fmeum wants to merge 4 commits into bazelbuild:master from fmeum:fix-windows-unicode-tests

Conversation

Collaborator

@fmeum fmeum commented Mar 13, 2026

Summary

  • Move the LC_ALL export from individual test files (loading_phase_test.sh, unicode_test.sh) into the shared integration_test_setup.sh so that all shell integration tests handle multi-byte characters correctly.
  • Use C.UTF-8 on Linux and MSYS2 (Windows), and en_US.UTF-8 on macOS (where C.UTF-8 is not available).
  • The previous per-file fix used en_US.UTF-8 for all non-Linux platforms, but en_US.UTF-8 may not be available on MSYS2, which could explain why the same fix didn't work for ui_test.sh on Windows.
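The platform selection described in the summary could be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: `pick_utf8_locale` is a hypothetical helper, and it keys off the output of `uname -s` rather than whatever platform helpers the shared test setup provides.

```shell
# Hypothetical helper: map a `uname -s` value to the locale proposed in the summary.
pick_utf8_locale() {
  case "$1" in
    Darwin*) echo "en_US.UTF-8" ;;  # macOS: C.UTF-8 is not available per the summary
    *)       echo "C.UTF-8" ;;      # Linux and MSYS2 (Windows)
  esac
}

# Apply the choice for the current platform.
export LC_ALL="$(pick_utf8_locale "$(uname -s)")"
```

Taking the platform name as an argument keeps the mapping easy to exercise for platforms other than the one the script happens to run on.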

Fixes #28924

Test plan

  • Observe Windows CI logs for ui_test.sh (test_fancy_symbol_encoding)
  • Observe Windows CI logs for target_pattern_file_test.sh (test_target_pattern_file_unicode)
  • Observe Windows CI logs for loading_phase_test.sh (Unicode tests)
  • Observe Windows CI logs for unicode_test.sh
  • Verify no regressions on Linux/macOS
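Since the hypothesis above hinges on which locales actually exist on each platform, one way to check this directly on a given machine, rather than inferring it from CI logs, is to query `locale -a`. The sketch below assumes the `locale` utility is present; `normalize_locale_name` and `locale_available` are hypothetical helpers, not part of the Bazel test setup.

```shell
# Hypothetical helper: lowercase the name and drop hyphens so that
# spellings such as "en_US.UTF-8" and "en_US.utf8" compare equal.
normalize_locale_name() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g'
}

# Hypothetical helper: succeeds if `locale -a` lists the given locale.
# If the `locale` utility is missing, the locale is reported as not listed.
locale_available() {
  local want
  want="$(normalize_locale_name "$1")"
  locale -a 2>/dev/null | tr '[:upper:]' '[:lower:]' | sed 's/-//g' \
    | grep -qxF -- "$want"
}

for candidate in C.UTF-8 en_US.UTF-8; do
  if locale_available "$candidate"; then
    echo "$candidate: available"
  else
    echo "$candidate: not listed by locale -a"
  fi
done
```

The normalization step matters because some systems list the codeset as `utf8` while scripts usually spell it `UTF-8`.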

Move the LC_ALL export from individual test files
(loading_phase_test.sh, unicode_test.sh) into the shared
integration_test_setup.sh so that all shell integration tests
handle multi-byte characters correctly.

Use C.UTF-8 on Linux and MSYS2 (Windows) where it is supported,
and en_US.UTF-8 on macOS where C.UTF-8 is not available.

The previous per-file fix used en_US.UTF-8 for all non-Linux
platforms, but en_US.UTF-8 may not be available on MSYS2.

Fixes bazelbuild#28924

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

google-cla Bot commented Mar 13, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

fmeum and others added 3 commits March 13, 2026 20:02
The previous commit set LC_ALL=C.UTF-8 globally for all integration
tests, which may have changed locale-dependent behavior (collation,
numeric formatting) in tests that don't deal with Unicode.

Instead:
- Set only LC_CTYPE in integration_test_setup.sh to enable UTF-8
  character classification without affecting other locale categories.
- Keep LC_ALL in loading_phase_test.sh and unicode_test.sh which
  specifically test Unicode handling, but fix the platform detection:
  use is_darwin (not is_linux) so that Windows/MSYS2 gets C.UTF-8
  instead of en_US.UTF-8, which may not be available on MSYS2.
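Setting only LC_CTYPE leaves collation and numeric formatting alone because of how POSIX resolves each locale category: LC_ALL overrides everything, then the category's own variable applies, then LANG. A small sketch of that lookup follows; `effective_locale` is a hypothetical helper, and the `C` fallback is an assumption (the real default when nothing is set is implementation-defined).

```shell
# effective_locale LC_ALL_VALUE CATEGORY_VALUE LANG_VALUE
# Hypothetical helper: print the locale that would apply to one category.
effective_locale() {
  local lc_all="$1" category="$2" lang="$3"
  if [ -n "$lc_all" ]; then
    echo "$lc_all"        # LC_ALL overrides every category
  elif [ -n "$category" ]; then
    echo "$category"      # e.g. LC_CTYPE or LC_COLLATE
  elif [ -n "$lang" ]; then
    echo "$lang"          # LANG is the general fallback
  else
    echo "C"              # assumed default for this sketch
  fi
}

# With only LC_CTYPE=C.UTF-8 exported:
effective_locale "" "C.UTF-8" ""   # character classification uses C.UTF-8
effective_locale "" "" ""          # collation keeps the default
```

One consequence is that exporting LC_CTYPE alone has no effect if LC_ALL is already set in the environment.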

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The global LC_CTYPE/LC_ALL in integration_test_setup.sh may affect
tests that don't deal with Unicode. Instead, set LC_ALL only in the
specific test files that have Unicode tests:

- loading_phase_test.sh (already had it, fix platform detection)
- unicode_test.sh (already had it, fix platform detection)
- ui_test.sh (new: for test_fancy_symbol_encoding)
- target_pattern_file_test.sh (new: for test_target_pattern_file_unicode)

The key platform detection fix: use is_darwin (not is_linux) to
distinguish macOS from other platforms, so Windows/MSYS2 gets
C.UTF-8 instead of en_US.UTF-8.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Revert to the same platform detection as the confirmed-working fix in
loading_phase_test.sh: use is_linux to select C.UTF-8, fall through
to en_US.UTF-8 for macOS and Windows/MSYS2 (where en_US.UTF-8 is
available and confirmed working by kotlaja).

This also reverts loading_phase_test.sh and unicode_test.sh to their
original code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Thank you for contributing to the Bazel repository! This pull request has been marked as stale since it has not had any activity in the last 30 days. It will be closed in the next 30 days unless any other activity occurs. If you think this PR is still relevant and should stay open, please post any comment here and the PR will no longer be marked as stale.

github-actions Bot added the stale label Apr 13, 2026

Labels

stale Issues or PRs that are stale (no activity for 30 days)

Development

Successfully merging this pull request may close these issues.

[Bazel CI] Windows Integration Failure: Encoding Mismatch in test_actions_write_utf8_path