SDSTrack: RGB-E Checkpoint for VisEvent
This repository contains the RGB-E (RGB-Event) checkpoint for SDSTrack, a self-distillation symmetric adapter learning tracker for multi-modal single object tracking (CVPR 2024).
Model Details
| Attribute | Value |
|---|---|
| Tracker | SDSTrack (Self-Distillation Symmetric Adapter Learning) |
| Backbone | ViT-B (Vision Transformer, base size) with MAE pretraining |
| Modality | RGB-E (RGB + Event camera) |
| Dataset | VisEvent |
| Config | cvpr2024_rgbe |
| Training epochs | 50 |
| Batch size | 16 |
| Learning rate | 1e-4 |
| Upstream commit | 822d985 (SDSTrack@main) |
Reproduction Results
This checkpoint was independently reproduced as part of the EvTrack project (Pattern Recognition course project, Topic #65).
Corrected metrics (MATLAB-equivalent protocol, absent frames excluded):
| Metric | Paper (CVPR 2024) | Reproduction | Delta |
|---|---|---|---|
| Success AUC | ~0.597 | 0.5829 | -1.4 pts |
| Precision @ 20px | ~0.767 | 0.7506 | -1.6 pts |
| SR @ 0.50 | n/a | 0.6929 | n/a |
Evaluation details:
- 319/320 VisEvent test sequences evaluated
- 1 sequence (00331_UAV_outdoor5) excluded: target absent in first frame
- See EvTrack experiments/sdstrack for full reproduction docs
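The metrics in the table above can be sketched in NumPy as follows. This is a minimal OTB-style formulation, not the exact EvTrack or VisEvent toolkit code: the 21 overlap thresholds, the strict `>` comparison, the `[x, y, w, h]` box layout, and the function names are all assumptions made for illustration. Frames flagged as absent are dropped before any metric is computed.

```python
import numpy as np

def iou_xywh(pred, gt):
    """IoU between boxes given as (N, 4) arrays in [x, y, w, h] layout (assumed)."""
    px2, py2 = pred[:, 0] + pred[:, 2], pred[:, 1] + pred[:, 3]
    gx2, gy2 = gt[:, 0] + gt[:, 2], gt[:, 1] + gt[:, 3]
    iw = np.clip(np.minimum(px2, gx2) - np.maximum(pred[:, 0], gt[:, 0]), 0, None)
    ih = np.clip(np.minimum(py2, gy2) - np.maximum(pred[:, 1], gt[:, 1]), 0, None)
    inter = iw * ih
    union = pred[:, 2] * pred[:, 3] + gt[:, 2] * gt[:, 3] - inter
    return np.where(union > 0, inter / np.maximum(union, 1e-12), 0.0)

def center_error(pred, gt):
    """Euclidean distance in pixels between box centers."""
    pc = pred[:, :2] + pred[:, 2:] / 2
    gc = gt[:, :2] + gt[:, 2:] / 2
    return np.linalg.norm(pc - gc, axis=1)

def sequence_metrics(pred, gt, absent):
    """Per-sequence metrics; `absent` is a boolean mask, True where the target is absent."""
    keep = ~np.asarray(absent, dtype=bool)
    pred = np.asarray(pred, dtype=float)[keep]
    gt = np.asarray(gt, dtype=float)[keep]
    ious = iou_xywh(pred, gt)
    errs = center_error(pred, gt)
    thresholds = np.linspace(0, 1, 21)  # assumed: 0.00, 0.05, ..., 1.00
    success_curve = np.array([(ious > t).mean() for t in thresholds])
    return {
        "success_auc": float(success_curve.mean()),
        "precision_20px": float((errs <= 20).mean()),
        "sr_050": float((ious > 0.5).mean()),
    }
```

Dataset-level numbers would then be an aggregate of these per-sequence values; whether the reproduction averages per sequence or per frame is not stated here, so that step is left out.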
Files
| File | Size | Description |
|---|---|---|
| SDSTrack_cvpr2024_rgbe.pth.tar | ~490 MB | Trained checkpoint for RGB-E evaluation on VisEvent |
| results/vis_event_test/ | ~13 MB | Tracker predictions (320 .txt files, one per sequence) |
Checksums
| Algorithm | Hash |
|---|---|
| SHA256 | b573dec59e9537204efbc131dccae047e27aeb41a26af7fbd4af222c8eaf0b74 |
| MD5 | bd0c98b7a2ea898d8cfdc3942158b9fa |
Verify with:
sha256sum SDSTrack_cvpr2024_rgbe.pth.tar
md5sum SDSTrack_cvpr2024_rgbe.pth.tar
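Where `sha256sum` is not available (for example on Windows), the same check can be done from Python with the standard library. This is a small sketch; the helper names `sha256_of` and `verify` are illustrative and not part of this repository.

```python
import hashlib

# SHA256 listed in the Checksums table of this README.
EXPECTED_SHA256 = "b573dec59e9537204efbc131dccae047e27aeb41a26af7fbd4af222c8eaf0b74"

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so a ~490 MB checkpoint is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

Calling `verify("SDSTrack_cvpr2024_rgbe.pth.tar")` should return `True` for an intact download.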
Usage
Loading the checkpoint in Python
from huggingface_hub import hf_hub_download
import torch
checkpoint_path = hf_hub_download(
repo_id="krisspy39/sdstrack-rgbe",
filename="SDSTrack_cvpr2024_rgbe.pth.tar",
repo_type="model"
)
checkpoint = torch.load(checkpoint_path, map_location="cpu", weights_only=False)
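Before wiring the checkpoint into a model, it can help to look at what the file contains. The sketch below assumes the loaded object is a dictionary and that the weights sit under a `"net"` key, as is common in OSTrack-style checkpoints; that key name is an assumption and has not been verified for this file, so the helper falls back to treating the checkpoint as a flat state dict. `summarize_checkpoint` is a hypothetical helper, not part of SDSTrack.

```python
def summarize_checkpoint(checkpoint, weights_key="net", limit=5):
    """Return the top-level keys and the first few (name, shape) pairs of the weights."""
    top_level = sorted(checkpoint.keys())
    # Assumed layout: weights under `weights_key`; otherwise use the dict itself.
    state = checkpoint.get(weights_key, checkpoint)
    entries = []
    for name, value in list(state.items())[:limit]:
        shape = getattr(value, "shape", None)  # tensors expose .shape; other values do not
        entries.append((name, tuple(shape) if shape is not None else None))
    return top_level, entries
```

With the `checkpoint` object loaded above, `top_level, entries = summarize_checkpoint(checkpoint)` gives a quick view of the stored keys and a few parameter shapes.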
Evaluation with upstream code
- Clone SDSTrack:
git clone https://github.com/hoqolo/SDSTrack.git
cd SDSTrack
- Download the pretrained OSTrack foundation model to ./pretrained/vitb_256_mae_ce_32x4_ep300/OSTrack_ep0300.pth.tar
- Symlink or copy this checkpoint to ./models/SDSTrack_cvpr2024_rgbe.pth.tar
- Run evaluation:
python ./RGBE_workspace/test_rgbe_mgpus.py \
--script_name sdstrack \
--num_gpus 1 \
--threads 4 \
--epoch 50 \
--yaml_name cvpr2024_rgbe
Note: The upstream code requires PyTorch 1.11 and Python 3.8. For PyTorch 2.x compatibility patches, see EvTrack/sdstrack_eval.py.
Dataset
This checkpoint is trained and evaluated on VisEvent, a large-scale RGB-Event single object tracking benchmark.
- Train: 120 sequences
- Test: 320 sequences
- Data format: Each sequence contains vis_imgs/ (RGB frames), event_imgs/ (event frames), groundtruth.txt, and absent_label.txt
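A sequence directory with this layout could be read roughly as follows. The file and folder names come from the list above; everything else is an assumption: that groundtruth.txt holds one box per line with comma- or whitespace-separated numbers, that absent_label.txt holds one integer flag per line, and that frame files sort correctly by name. `load_sequence` is a hypothetical helper.

```python
from pathlib import Path

def load_sequence(seq_dir):
    """Collect frame paths, boxes, and absence flags for one VisEvent-style sequence."""
    seq_dir = Path(seq_dir)
    rgb_frames = sorted(p for p in (seq_dir / "vis_imgs").iterdir() if p.is_file())
    event_frames = sorted(p for p in (seq_dir / "event_imgs").iterdir() if p.is_file())

    boxes = []
    for line in (seq_dir / "groundtruth.txt").read_text().splitlines():
        if line.strip():
            # Accept either "x,y,w,h" or "x y w h" (format assumed, not confirmed).
            boxes.append([float(v) for v in line.replace(",", " ").split()])

    absent_flags = [
        int(float(line))
        for line in (seq_dir / "absent_label.txt").read_text().splitlines()
        if line.strip()
    ]
    return {
        "rgb_frames": rgb_frames,
        "event_frames": event_frames,
        "boxes": boxes,
        "absent_flags": absent_flags,
    }
```

Which flag value marks an absent target is not stated here, so the flags are returned as raw integers rather than converted to a mask.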
The VisEvent dataset is also available as a webdataset on Hugging Face: krisspy39/visevent
Citation
If you use this model or the SDSTrack tracker, please cite:
@inproceedings{hou2024sdstrack,
title={Self-Distillation Symmetric Adapter Learning for Multi-Modal Object Tracking},
author={Hou, Xiaojun and others},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2024}
}
License
This checkpoint is provided for research purposes. Please refer to the original SDSTrack repository for licensing terms.
Acknowledgments
- Original SDSTrack implementation by hoqolo
- VisEvent dataset by wangxiao5791509
- This checkpoint was reproduced as part of a university Pattern Recognition course project (Topic #65: Event-camera-based object tracking)