-
Notifications
You must be signed in to change notification settings - Fork 5k
fix: update stream compatibility tests and improve result validation #33961
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @Tony2h, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly improves the reliability and accuracy of stream compatibility tests by introducing a more sophisticated validation mechanism. Instead of merely checking row counts, the tests now directly compare processed stream data with expected outcomes. Additionally, the changes refactor test utility functions for better flexibility and simplify the test environment setup by removing unnecessary Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request updates the stream compatibility tests by refactoring how stream results are checked, introducing a more powerful checkStreamResults method that compares actual query results against expected results. The changes also improve logging and remove obsolete code. My feedback focuses on improving the robustness and maintainability of the new test logic. I've identified a potential bug in the retry logic and suggested a refactoring to make it safer and cleaner. I also noted a minor duplication of a shell command.
I am having trouble creating individual review comments. Click here to see my feedback.
test/cases/18-StreamProcessing/23-Compatibility/test_new_stream_compatibility.py (294-339)
This new method is a great addition for more robust result validation. However, there are a couple of areas for improvement:
- Potential
UnboundLocalError: Ifretry_timesis 0, theforloop will not execute, andresult1andresult2will not be defined, leading to anUnboundLocalErroron line 327. This is a potential bug. - Code Duplication: The data extraction and logging calls are repeated after the loop. This can be avoided to improve maintainability.
- Fragile Parsing: The
extract_datafunction usesline.startswith('=' * 10)to find the data section. This is brittle and will break if the number of=characters in thetaosclient output changes. Using a regular expression would be more robust.
Here is a suggested refactoring that addresses these points by fixing the potential bug, removing duplication, and making the parsing more robust.
def checkStreamResults(self, stream: StreamItem, library_path="/usr/lib", retry_times=300):
command1 = f"LD_LIBRARY_PATH={library_path} taos -s '{stream.res_query}'"
command2 = f"LD_LIBRARY_PATH={library_path} taos -s '{stream.exp_query}'"
def extract_data(output):
lines = output.strip().split('\n')
data_lines = []
start_collecting = False
for line in lines:
if re.match(r'^=+$', line.strip()): # More robust separator check
start_collecting = True
continue
if start_collecting:
if line.startswith('Query OK'):
break
data_lines.append(line.strip())
return data_lines
for _ in range(retry_times):
result1 = subprocess.run(command1, shell=True, text=True, capture_output=True)
result2 = subprocess.run(command2, shell=True, text=True, capture_output=True)
if result1.returncode == 0 and result2.returncode == 0:
res_data = extract_data(result1.stdout)
exp_data = extract_data(result2.stdout)
tdLog.info(f"Stream {stream.id} result data: {res_data}")
tdLog.info(f"Stream {stream.id} expect data: {exp_data}")
if res_data == exp_data:
tdLog.info(f"Stream {stream.id} result matches expected data.")
return
time.sleep(1)
if 'result1' not in locals():
raise Exception(f"Stream result check for stream {stream.id} did not run, retry_times might be <= 0.")
if result1.returncode != 0 or result2.returncode != 0:
tdLog.error("Stream result retrieval failed.")
tdLog.error(f"Error1:\n{result1.stderr}")
tdLog.error(f"Error2:\n{result2.stderr}")
raise Exception("Stream result retrieval failed.")
# Re-extract data from last attempt for the assert message
res_data = extract_data(result1.stdout)
exp_data = extract_data(result2.stdout)
assert res_data == exp_data, f"Stream {stream.id} result mismatch! \nResult: {res_data}\nExpect: {exp_data}"test/cases/18-StreamProcessing/23-Compatibility/test_new_stream_compatibility.py (249-250)
The command to start taosd is constructed twice, once for logging and once for execution with os.system. This is slightly inefficient and could lead to inconsistencies if one is updated and the other is forgotten. It's better to define the command in a variable and reuse it.
command = f"rm -rf {dataPath}/* && nohup /usr/bin/taosd -c {cPath} &"
tdLog.info(f"start taosd: {command}")
os.system(command)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request updates stream compatibility tests to improve result validation and refactors the test infrastructure. The main focus is replacing manual stream result checking with a more robust validation approach that compares actual stream results against expected query results.
- Refactored result validation to use a new
checkStreamResultsmethod that compares full data output rather than just row counts - Updated library path handling to use build-specific paths instead of hardcoded
/usr/lib - Simplified process management by replacing framework calls with direct system commands
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| result1 = subprocess.run(command1, shell=True, text=True, capture_output=True) | ||
| result2 = subprocess.run(command2, shell=True, text=True, capture_output=True) | ||
| if result1.returncode == 0 and result2.returncode == 0: | ||
| res_data = extract_data(result1.stdout) | ||
| exp_data = extract_data(result2.stdout) | ||
|
|
||
| tdLog.info(f"Stream {stream.id} result data: {res_data}") | ||
| tdLog.info(f"Stream {stream.id} expect data: {exp_data}") | ||
|
|
||
| if res_data == exp_data: | ||
| tdLog.info(f"Stream {stream.id} result matches expected data.") | ||
| return | ||
| time.sleep(1) | ||
|
|
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method executes two subprocess commands on every retry iteration, which could be inefficient when retry_times is 300 (the default). Consider adding exponential backoff or increasing the sleep interval after multiple failed attempts to reduce unnecessary system load during extended wait periods.
| time.sleep(1) | ||
|
|
||
| if result1.returncode != 0 or result2.returncode != 0: | ||
| tdLog.error("Stream result retrieval failed.") |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variables result1 and result2 may not be defined at this point if the retry loop exits without entering the conditional block. This occurs when all retries fail with a non-zero return code but the loop completes without entering the if block at line 315. This will cause a NameError when trying to access result1.returncode or result1.stderr.
| test_stream_compatibility.res_count;""", | ||
| exp_query="""select _wstart, _wend, sum(v1) as sum_v1, avg(v2) as | ||
| exp_query="""select _wstart ts, _wend te, sum(v1) sum_v1, avg(v2) | ||
| avg_v2 from test_stream_compatibility.ctb1 count_window(3) |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent SQL query format. The stream with id=0 uses aliases without the AS keyword (e.g., _wstart ts), but streams with id=1 and id=2 use full aliases with AS keyword and commas (e.g., _wstart, _wend). This inconsistency means stream id=0's expected query will have different column names than the result query, potentially causing validation failures.
| assert self.checkStreamResultRows("res_state", 3) | ||
| assert self.checkStreamResultRows("res_inter", 3) | ||
|
|
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing library_path parameter. The method checkStreamResultRows is called without passing the library_path parameter, which means it will use the default /usr/lib instead of the correct library path for the old version being tested. This inconsistency could cause the method to use the wrong library.
| tdLog.info(f"Successfully installed enterprise package from {package_path}") | ||
|
|
||
| os.system(f"pkill -9 taosd") | ||
| os.system(f"pkill taosd") |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change from pkill -9 to pkill removes the forceful termination signal (SIGKILL). While this is generally better practice to allow graceful shutdown, if the process doesn't respond to SIGTERM, it may hang indefinitely. Consider adding a timeout mechanism or a fallback to pkill -9 if graceful shutdown fails.
| self.prepareDataOnOldVersion(base_version) | ||
|
|
||
| tdCb.killAllDnodes() | ||
| os.system(f"pkill taosd") |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent process killing approach. Line 97 uses os.system(f"pkill taosd") while line 245 uses the same command, but the original code at line 99 used tdCb.killAllDnodes(). Using the framework's method would be more consistent and may provide better error handling. The direct pkill command should be replaced with the framework's method or the framework method should be updated at line 99.
| start_collecting = False | ||
| for line in lines: | ||
| if line.startswith('=' * 10): # Separator line | ||
| start_collecting = True |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The data extraction logic assumes separator lines start with at least 10 equal signs. If the taos CLI output format changes or uses fewer equal signs in separator lines, the extraction will fail silently by returning empty data. Consider making the separator pattern more robust or documenting the expected output format.
| res_data = extract_data(result1.stdout) | ||
| exp_data = extract_data(result2.stdout) | ||
|
|
||
| tdLog.info(f"Stream {stream.id} result data: {res_data}") | ||
| tdLog.info(f"Stream {stream.id} expect data: {exp_data}") | ||
|
|
||
| assert res_data == exp_data, f"Stream {stream.id} result mismatch! \nResult: {res_data}\nExpect: {exp_data}" |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential variable scope issue. The variables res_data and exp_data are extracted inside the retry loop (lines 316-317) and then used outside the loop (lines 335-336, 338). If the loop exits after all retries without success but with return code 0, the assertion at line 338 will use stale data from the last iteration rather than indicating a timeout failure. Consider raising an exception for timeout instead of proceeding to assertion.
… in compatibility tests
… stream compatibility tests
Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.