add tracker to support swanlab#213
Conversation
Cover SwanLabTracker behavioral tests (23 cases) and dispatch system tests (38 cases): register_tracker, dispatch, dispatch_hyperparams, clear_trackers, set_rank, auto_init_from_env. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
There was a problem hiding this comment.
Code Review
This pull request introduces a unified experiment tracking dispatch system for Twinkle training metrics, adding support for SwanLab and Weights & Biases (W&B) trackers along with environment-variable auto-initialization. The review feedback highlights several critical issues and improvement opportunities: a potential TypeError in WandbTracker when custom settings are passed, distributed training issues caused by initializing trackers at import time rather than lazily, a lack of idempotency in dispatch_hyperparams when adapter_name is None (which would flood tracking servers), and a silent failure in SwanLabTracker if the output directory does not exist. Additionally, suggestions were made to fully support lazy initialization in list_trackers and dispatch, and to correct the type hint for raw metrics.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| settings = None | ||
| proxy = kwargs.pop("wandb_proxy", None) or os.environ.get("WANDB_PROXY") | ||
| if proxy: | ||
| settings = wandb.Settings(https_proxy=proxy) |
There was a problem hiding this comment.
If settings is passed in kwargs when instantiating WandbTracker, it will cause a TypeError because settings is also explicitly passed as a keyword argument to wandb.init (line 43) without being popped from kwargs first. Popping settings from kwargs beforehand resolves this issue and allows users to safely pass custom configurations.
| settings = None | |
| proxy = kwargs.pop("wandb_proxy", None) or os.environ.get("WANDB_PROXY") | |
| if proxy: | |
| settings = wandb.Settings(https_proxy=proxy) | |
| settings = kwargs.pop("settings", None) | |
| proxy = kwargs.pop("wandb_proxy", None) or os.environ.get("WANDB_PROXY") | |
| if proxy: | |
| settings = wandb.Settings(https_proxy=proxy) |
| # Run auto-init once at import time (before user code or atexit runs) | ||
| _auto_init_from_env() |
There was a problem hiding this comment.
Initializing trackers at import time is problematic for distributed training. When multiple processes (e.g., in DDP or Ray) import twinkle.tracker, they will all trigger _auto_init_from_env() simultaneously. Since set_rank has not been called yet, _rank is 0 on all processes, causing every rank to initialize its own tracking run. Removing the import-time call and running auto-initialization lazily ensures it only runs on rank 0 after the distributed environment is set up.
| # Run auto-init once at import time (before user code or atexit runs) | |
| _auto_init_from_env() | |
| # Auto-init is run lazily on first tracker registration or dispatch |
| def dispatch_hyperparams(params: Dict[str, Any], adapter_name: Optional[str] = None) -> None: | ||
| """Send hyperparameters to all registered trackers (call once at training start). | ||
|
|
||
| Idempotent per ``(adapter_name,)`` — repeated calls with the same | ||
| *adapter_name* are silently ignored so that this can safely be called | ||
| from ``calculate_metrics`` on its first invocation without | ||
| flooding trackers with redundant config updates. | ||
|
|
||
| Args: | ||
| params: Flat or nested dict of hyperparameters (e.g. model config, | ||
| training args, LoRA config). | ||
| adapter_name: Optional adapter identifier. If omitted, the params | ||
| are dispatched unconditionally on every call. | ||
| """ | ||
| if not _trackers or _rank != 0: | ||
| return | ||
|
|
||
| # Idempotency guard: only dispatch once per adapter | ||
| if adapter_name is not None: | ||
| if adapter_name in _hparams_dispatched: | ||
| return | ||
| _hparams_dispatched.add(adapter_name) |
There was a problem hiding this comment.
This function has two issues:
- It needs to call
_auto_init_from_env()to support lazy initialization. - If
adapter_nameisNone(which is the default for full fine-tuning without LoRA), the idempotency guard is skipped entirely. This causesdispatch_hyperparamsto flood tracking servers with redundant config updates on every single training step wherecalculate_metricsis called. Using a default key like"_default_"whenadapter_nameisNoneprevents this flooding.
def dispatch_hyperparams(params: Dict[str, Any], adapter_name: Optional[str] = None) -> None:
"""Send hyperparameters to all registered trackers (call once at training start).
Idempotent per ``(adapter_name,)`` — repeated calls with the same
*adapter_name* are silently ignored so that this can safely be called
from ``calculate_metrics`` on its first invocation without
flooding trackers with redundant config updates.
Args:
params: Flat or nested dict of hyperparameters (e.g. model config,
training args, LoRA config).
adapter_name: Optional adapter identifier. If omitted, the params
are dispatched unconditionally on every call.
"""
_auto_init_from_env()
if not _trackers or _rank != 0:
return
# Idempotency guard: only dispatch once per adapter (or once globally if None)
guard_key = adapter_name if adapter_name is not None else "_default_"
if guard_key in _hparams_dispatched:
return
_hparams_dispatched.add(guard_key)| def _save_experiment_info(self, output_dir: str) -> None: | ||
| try: | ||
| info = {"swanlab_experiment_url": self._run.get_run().url} | ||
| out = Path(output_dir) / "swanlab_config.json" | ||
| out.write_text(json.dumps(info, indent=2)) | ||
| except Exception: | ||
| pass |
There was a problem hiding this comment.
If the specified output_dir does not exist, out.write_text will raise a FileNotFoundError because Python does not automatically create parent directories. Although this is wrapped in a try...except block, it results in a silent failure where the experiment info is not saved. Creating the parent directories first ensures robustness.
| def _save_experiment_info(self, output_dir: str) -> None: | |
| try: | |
| info = {"swanlab_experiment_url": self._run.get_run().url} | |
| out = Path(output_dir) / "swanlab_config.json" | |
| out.write_text(json.dumps(info, indent=2)) | |
| except Exception: | |
| pass | |
| def _save_experiment_info(self, output_dir: str) -> None: | |
| try: | |
| info = {"swanlab_experiment_url": self._run.get_run().url} | |
| out = Path(output_dir) / "swanlab_config.json" | |
| out.parent.mkdir(parents=True, exist_ok=True) | |
| out.write_text(json.dumps(info, indent=2)) | |
| except Exception: | |
| pass |
| def list_trackers() -> List[ExperimentTracker]: | ||
| """Return a snapshot of currently registered trackers.""" | ||
| return list(_trackers) |
There was a problem hiding this comment.
To support lazy initialization of trackers from environment variables, _auto_init_from_env() should be called when listing trackers.
| def list_trackers() -> List[ExperimentTracker]: | |
| """Return a snapshot of currently registered trackers.""" | |
| return list(_trackers) | |
| def list_trackers() -> List[ExperimentTracker]: | |
| """Return a snapshot of currently registered trackers.""" | |
| _auto_init_from_env() | |
| return list(_trackers) |
| def dispatch(data: Dict[str, float], step: int) -> None: | ||
| """Send computed metrics to all registered trackers. | ||
|
|
||
| Metric values are normalized to ``float`` via :func:`clean_metrics` | ||
| before dispatching. Only the rank-0 process performs the dispatch; | ||
| all other ranks return immediately with no overhead. | ||
|
|
||
| Args: | ||
| data: Raw metric dict (may contain strings, ints, floats). | ||
| step: Current training step (``cur_step`` from optimizer group). | ||
| """ | ||
| if not _trackers: | ||
| return |
There was a problem hiding this comment.
To support lazy initialization of trackers, _auto_init_from_env() should be called inside dispatch. Additionally, the type hint for data should be Dict[str, Any] since raw metrics can contain strings, ints, or floats before being normalized by clean_metrics.
| def dispatch(data: Dict[str, float], step: int) -> None: | |
| """Send computed metrics to all registered trackers. | |
| Metric values are normalized to ``float`` via :func:`clean_metrics` | |
| before dispatching. Only the rank-0 process performs the dispatch; | |
| all other ranks return immediately with no overhead. | |
| Args: | |
| data: Raw metric dict (may contain strings, ints, floats). | |
| step: Current training step (``cur_step`` from optimizer group). | |
| """ | |
| if not _trackers: | |
| return | |
| def dispatch(data: Dict[str, Any], step: int) -> None: | |
| """Send computed metrics to all registered trackers. | |
| Metric values are normalized to ``float`` via :func:`clean_metrics` | |
| before dispatching. Only the rank-0 process performs the dispatch; | |
| all other ranks return immediately with no overhead. | |
| Args: | |
| data: Raw metric dict (may contain strings, ints, floats). | |
| step: Current training step (``cur_step`` from optimizer group). | |
| """ | |
| _auto_init_from_env() | |
| if not _trackers: | |
| return |
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Ultraworked with Sisyphus Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Ultraworked with Sisyphus Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
…otency, settings kwarg, mkdir Ultraworked with Sisyphus Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Ultraworked with Sisyphus Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
PR type
PR information
add tracker submodule.
Experiment results
Paste your experiment result here(if needed).