[T-PAMI'25] MADiff: Motion-Aware Mamba Diffusion Models for Hand Trajectory Prediction on Egocentric Videos
This repository trains and evaluates MADiff-style hand trajectory forecasting models on H2O or EgoPAT3D-style preprocessed data.
Thanks to Haoran Yang for helping organize the code.
Use Python 3.8+ and install the runtime packages with:

    pip install -r requirements.txt

If you need a CUDA-specific PyTorch build, install the matching PyTorch wheel first, then run the command above for the remaining packages.
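After installation, a quick check can confirm which PyTorch build is active. The snippet below is an illustrative sketch and not part of this repository; `describe_torch` is a hypothetical helper name, and it relies only on standard PyTorch attributes.

```python
def describe_torch():
    """Return a one-line summary of the installed PyTorch build."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    # torch.version.cuda is None when the build was compiled without CUDA.
    cuda_build = torch.version.cuda or "none"
    return (
        f"torch {torch.__version__}, CUDA build: {cuda_build}, "
        f"GPU available: {torch.cuda.is_available()}"
    )


if __name__ == "__main__":
    print(describe_torch())
```

If the summary reports no CUDA build on a GPU machine, the PyTorch wheel probably needs to be reinstalled before the remaining packages.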
The default scripts run the H2O backend.
- H2O config: configs/h2o.yml
- EgoPAT3D config: configs/egopat3d.yml
- H2O default data root: /data
- EgoPAT3D default data root: /data

- Download the H2O-PT and EgoPAT3D-DT datasets from the dataset instructions in oppo-us-research/USST.
- Download the preprocessed files and MADiff pretrained weights from SJTU Pan.
- Evaluation checkpoints expected by run_val_traj.py:
  - H2O: ./diffip_weights/checkpoint_h2o.pth.tar
  - EgoPAT3D: ./diffip_weights/checkpoint_egopat3d.pth.tar
With the default configs, H2O data is read from /data/h2o_dataset, and EgoPAT3D data is read from /data/EgoPAT3D-postproc.
After downloading the pretrained weights, place or rename them to the checkpoint paths listed above.
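Before launching a run, it can help to confirm that the default data directories and checkpoint files are in place. The following is a minimal sketch, not a script shipped with this repository: `EXPECTED_PATHS` and `find_missing` are hypothetical names, and the paths are the defaults quoted in this README.

```python
from pathlib import Path

# Default locations quoted in this README; adjust them if your configs differ.
EXPECTED_PATHS = {
    "h2o": [
        "/data/h2o_dataset",
        "./diffip_weights/checkpoint_h2o.pth.tar",
    ],
    "egopat3d": [
        "/data/EgoPAT3D-postproc",
        "./diffip_weights/checkpoint_egopat3d.pth.tar",
    ],
}


def find_missing(backend, expected=EXPECTED_PATHS):
    """Return the expected paths for `backend` that do not exist on disk."""
    return [p for p in expected[backend] if not Path(p).exists()]


if __name__ == "__main__":
    # Run from the repository root so the relative checkpoint paths resolve.
    for backend in EXPECTED_PATHS:
        missing = find_missing(backend)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{backend}: {status}")
```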
Edit the YAML config files or pass --extra_args through run_train.py if your data or checkpoint paths differ.
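When run_train.py is launched from another Python script, the value of --extra_args has to reach the wrapper as a single argument. The sketch below shows one way to assemble such a command; `build_train_command` is a hypothetical helper, and it uses only flags that appear in this README (--dataset_backend, --cuda_devices, --extra_args).

```python
import shlex


def build_train_command(backend="h2o", cuda_devices=None, extra_args=None):
    """Build the argument list for run_train.py from flags shown in this README."""
    cmd = ["python", "run_train.py", "--dataset_backend", backend]
    if cuda_devices is not None:
        cmd += ["--cuda_devices", str(cuda_devices)]
    if extra_args:
        # The whole string stays one list element, so it is passed as one argument.
        cmd += ["--extra_args", extra_args]
    return cmd


if __name__ == "__main__":
    cmd = build_train_command(
        "h2o", cuda_devices=0, extra_args="--epochs 100 --lr 0.0001"
    )
    # shlex.join quotes the extra-args string so the line can be pasted into a shell.
    print(shlex.join(cmd))
    # To launch from the repository root: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` avoids shell quoting entirely, while `shlex.join` is only needed when the command is printed or logged.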
Train with the default script:

    bash train.sh

Equivalent direct command:

    python run_train.py --dataset_backend h2o

Train EgoPAT3D instead:

    python run_train.py --dataset_backend egopat3d

Evaluate with the default script:

    bash val_traj.sh

Equivalent direct command:

    python run_val_traj.py --dataset_backend h2o

Evaluate EgoPAT3D instead:

    python run_val_traj.py --dataset_backend egopat3d

Select GPU ids used by the wrapper:

    python run_train.py --cuda_devices 0 --dataset_backend h2o

Forward extra training arguments to traineval.py:

    python run_train.py --dataset_backend h2o --extra_args "--epochs 100 --lr 0.0001"

Show available options:

    python traineval.py --help

If this work is useful for your research, kindly cite our paper:
@article{ma2024madiff,
title={MADiff: Motion-Aware Mamba Diffusion Models for Hand Trajectory Prediction on Egocentric Videos},
author={Ma, Junyi and Chen, Xieyuanli and Bao, Wentao and Xu, Jingyi and Wang, Hesheng},
journal={arXiv preprint arXiv:2409.02638},
year={2024}
}

This repository is released under the MIT License. See LICENSE for details.