[fix](test) Make test_analyze_long_string Case 5 stable against sample rows randomness #64408
Open
yujun777 wants to merge 1 commit into
…e rows randomness
Case 5 forces the DUJ1 template via debug point and uses `sample rows 3`
on 5 rows, only 1 of which had a big_str value longer than
statistics_max_string_column_length (1024). With only 1 of 5 rows exceeding
the limit, the sample had a ~40% chance of missing the long row, in which
case the assert_true guard never fired and big_str completed without a
skip message.
Key changes:
- Change all 5 rows in Case 5 to use repeat('z', 2048) for big_str so
the long-string guard triggers regardless of which rows are sampled
Contributor
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Contributor
Author
run buildall
morrySnow
approved these changes
Jun 11, 2026
Contributor
PR approved by at least one committer and no changes requested.
Contributor
PR approved by anyone and no changes requested.
related PR: #62686
Case 5 (DUJ1 template) uses `sample rows 3` on a table with 5 rows, but
only 1 row had a long string exceeding statistics_max_string_column_length
(1024). With `sample rows 3` reading only 3 of 5 rows, there was a ~40%
chance that the long row was missed and the assert_true guard never fired.
When missed, big_str completed normally with an empty skip message,
causing the test assertion to fail:
expected skip reason visible for col big_str, got msg=
==> expected: <true> but was: <false>
The fix makes all 5 rows have repeat('z', 2048) for big_str, so the
long-string guard always triggers regardless of which rows are sampled.
No other cases are affected: Case 1/2/6 use full-table analyze, Case 3
uses sample percent 100, and Case 4 explicitly expects the guard NOT
to apply (partition path).
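
The ~40% figure above, and the effect of the fix, can be checked with a short calculation. This is only a sketch: it assumes the 3 sampled rows are drawn uniformly at random without replacement, which may not match how the sampling is actually performed, and the function name `miss_probability` is made up for illustration.

```python
from math import comb


def miss_probability(total_rows: int, long_rows: int, sample_rows: int) -> float:
    """Chance that a uniform sample (without replacement) contains no long row.

    Assumption: every subset of `sample_rows` rows is equally likely.
    """
    short_rows = total_rows - long_rows
    # Count the samples made up only of short rows, divided by all samples.
    # comb() returns 0 when there are fewer short rows than sampled rows.
    return comb(short_rows, sample_rows) / comb(total_rows, sample_rows)


# Before the fix: 5 rows, 1 long row, sample rows 3.
before_fix = miss_probability(total_rows=5, long_rows=1, sample_rows=3)
# After the fix: all 5 rows are long, so no sample can miss them.
after_fix = miss_probability(total_rows=5, long_rows=5, sample_rows=3)

print(f"before fix: {before_fix:.0%}")
print(f"after fix: {after_fix:.0%}")
```

Under that assumption, 4 of the 10 possible 3-row samples contain no long row before the fix, and none do after it.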