Skip to content

Comments

FEAT: mssql-python driver changes to support bulk copy logging.#430

Open
subrata-ms wants to merge 9 commits intomainfrom
subrata-ms/bulkcopy_py
Open

FEAT: mssql-python driver changes to support bulk copy logging.#430
subrata-ms wants to merge 9 commits intomainfrom
subrata-ms/bulkcopy_py

Conversation

@subrata-ms
Copy link
Contributor

@subrata-ms subrata-ms commented Feb 12, 2026

Work Item / Issue Reference

AB#41882


Summary

This pull request introduces significant improvements to the logging system in the mssql_python package, focusing on performance optimization, better integration with Rust (py-core), and enhanced test coverage. The changes optimize how logging levels are checked, ensure that logging from Rust code is handled correctly, and add thorough tests for these new features.

Key changes:

Logging Performance and Behavior Improvements

  • Switched the default logging level from CRITICAL to WARNING, ensuring that warnings and errors are visible by default without explicit configuration. Also, a default stderr handler is now added for warnings and errors. [1] [2]
  • Introduced cached logging level checks (_cached_level and _is_debug_enabled) for fast, low-overhead decision-making in logging calls, with a new is_debug_enabled property for quick checks. [1] [2] [3]
  • Updated all logging calls in _bulkcopy (in cursor.py) to use the new fast check for debug logging, and to avoid unnecessary logger passing to Rust unless logging is enabled. [1] [2] [3]

Rust (py-core) Logging Integration

  • Added a new py_core_log method to the logger, allowing Rust-side code to emit logs with custom source locations and levels, and ensuring these logs are formatted and filtered consistently with Python logs.
  • Enhanced the CSV log formatter to recognize and format logs originating from py-core distinctly.

Robustness and Error Handling

  • Ensured that logging always allows WARNING and ERROR messages, even when the logger is otherwise disabled, both in Python and Rust-originated logs. [1] [2]
  • Improved error logging in _bulkcopy for various failure cases, such as missing dependencies or invalid parameters. [1] [2] [3]

Test Suite Enhancements

  • Added comprehensive tests for the new is_debug_enabled property and for py_core_log, covering level filtering, custom source locations, CSV formatting, error fallback, and integration with different Rust files.
  • Updated test cleanup utilities to maintain consistency with the new cached logging level logic. [1] [2]

These changes collectively improve logging performance, reliability, and cross-language integration, while ensuring robust test coverage for the new features.
High Level Design
image

Success Test Log
image

Failure Test Log
image

@github-actions
Copy link

github-actions bot commented Feb 12, 2026

📊 Code Coverage Report

🔥 Diff Coverage

62%


🎯 Overall Coverage

76%


📈 Total Lines Covered: 5514 out of 7188
📁 Project: mssql-python


Diff Coverage

Diff: main...HEAD, staged and unstaged changes

  • mssql_python/logging.py (62.5%): Missing lines 165-166,175-176,359-361,364,366,377-378,380-383

Summary

  • Total: 40 lines
  • Missing: 15 lines
  • Coverage: 62%

mssql_python/logging.py

Lines 161-170

  161         class CSVFormatter(logging.Formatter):
  162             def format(self, record):
  163                 # Check if this is from py-core (via py_core_log method)
  164                 if hasattr(record, "funcName") and record.funcName == "py-core":
! 165                     source = "py-core"
! 166                     message = record.getMessage()
  167                 else:
  168                     # Extract source from message (e.g., [Python] or [DDBC])
  169                     msg = record.getMessage()
  170                     if msg.startswith("[") and "]" in msg:

Lines 171-180

  171                         end_bracket = msg.index("]")
  172                         source = msg[1:end_bracket]
  173                         message = msg[end_bracket + 2 :].strip()  # Skip '] '
  174                     else:
! 175                         source = "Unknown"
! 176                         message = msg
  177 
  178                 # Format timestamp with milliseconds using period separator
  179                 timestamp = self.formatTime(record, "%Y-%m-%d %H:%M:%S")
  180                 timestamp_with_ms = f"{timestamp}.{int(record.msecs):03d}"

Lines 355-370

  355             msg: Message string (already formatted)
  356             filename: Source filename (e.g., 'cursor.rs')
  357             lineno: Line number in source file
  358         """
! 359         try:
! 360             # Fast level check using cached level (same optimization as _log method)
! 361             # Exception: Always allow WARNING and ERROR messages through
  362             if level < self._cached_level and level < logging.WARNING:
  363                 return
! 364 
  365             # Create a custom LogRecord with Rust source location
! 366             import logging as log_module
  367 
  368             record = log_module.LogRecord(
  369                 name=self._logger.name,
  370                 level=level,

Lines 373-387

  373                 msg=msg,
  374                 args=(),
  375                 exc_info=None,
  376                 func="py-core",
! 377                 sinfo=None,
! 378             )
  379             self._logger.handle(record)
! 380         except Exception:
! 381             # Fallback - use regular logging
! 382             try:
! 383                 self._logger.log(level, msg)
  384             except:
  385                 pass
  386 
  387     def _log(self, level: int, msg: str, add_prefix: bool = True, *args, **kwargs):


📋 Files Needing Attention

📉 Files with overall lowest coverage (click to expand)
mssql_python.pybind.logger_bridge.hpp: 58.8%
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.row.py: 66.2%
mssql_python.pybind.ddbc_bindings.cpp: 69.3%
mssql_python.pybind.ddbc_bindings.h: 69.7%
mssql_python.pybind.connection.connection.cpp: 75.3%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.pybind.connection.connection_pool.cpp: 79.6%
mssql_python.logging.py: 81.7%
mssql_python.cursor.py: 84.7%

🔗 Quick Links

⚙️ Build Summary 📋 Coverage Details

View Azure DevOps Build

Browse Full Coverage Report

@subrata-ms subrata-ms changed the title bulk copy changes for logging FEAT: mssql-python driver changes to support bulk copy logging. Feb 16, 2026
@github-actions github-actions bot added the pr-size: medium Moderate update size label Feb 16, 2026
@subrata-ms subrata-ms marked this pull request as ready for review February 16, 2026 14:23
Copilot AI review requested due to automatic review settings February 16, 2026 14:23
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request enhances the bulk copy operation in mssql_python with comprehensive logging infrastructure to improve troubleshooting and observability. The changes introduce performance-optimized logging with cached level checks, a new method for Rust core integration, and extensive debug logging throughout the bulk copy workflow.

Changes:

  • Added cached logging level (_cached_level and _is_debug_enabled) to avoid repeated isEnabledFor() calls in performance-critical paths
  • Implemented py_core_log() method to support structured logging from Rust/TDS core with custom source location tracking
  • Enhanced _bulkcopy method with detailed debug and info logging at all major stages (validation, connection setup, execution, cleanup), all gated by is_debug_enabled checks for minimal performance impact when disabled

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
mssql_python/logging.py Adds cached logging level for fast checks, implements py_core_log for Rust integration, updates _log to use cached level and proper stacklevel, and updates CSVFormatter to handle py-core records
mssql_python/cursor.py Adds comprehensive logging throughout _bulkcopy method including parameter validation, connection setup, column mappings, execution, results, and cleanup - all conditionally gated by is_debug_enabled

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@bewithgaurav
Copy link
Collaborator

@subrata-ms - I think there are some tests which need to be fixed, let me know once done, will review it

python_logger=logger if is_logging else None, # Only pass logger if enabled
)

if is_logging:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would happen if we dont have the is_logging check?
The note about is_logging says that its the fast path check.

But could we simply not do this check before logger.info and just call logger info? I am trying to understand the applicability of this check for bulk copy before calling logger.info. If its applicable here, then is there a plan to have this path everywhere else as well?
It would nice to see this code align with other parts of the mssql-python

# Create the underlying Python logger
self._logger = logging.getLogger("mssql_python")
self._logger.setLevel(logging.CRITICAL) # Disabled by default
self._logger.setLevel(logging.WARNING) # Allow WARNING and ERROR by default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change in behavior needed for a bulk copy logger integration? Should this change not be a part of a separate improvement to change the default level?


@property
def is_debug_enabled(self) -> bool:
"""Fast check if debug logging is enabled (cached for performance)"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the performance hit noticeable ? I am wondering the gains that you saw? In logging the overhead due to I/O seems large enough that this optimization may or may not make sense. I will let this be a team call though, however I am interested in knowing any impactful perf improvements. Else I would err on the side of a simpler code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-size: medium Moderate update size

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants