SDSTrack: RGB-E Checkpoint for VisEvent

This repository contains the RGB-E (RGB-Event) checkpoint for SDSTrack, a self-distillation symmetric adapter learning tracker for multi-modal single object tracking (CVPR 2024).

Model Details

Attribute         Value
Tracker           SDSTrack (Self-Distillation Symmetric Adapter Learning)
Backbone          ViT-B (Vision Transformer, base size) with MAE pretraining
Modality          RGB-E (RGB + Event camera)
Dataset           VisEvent
Config            cvpr2024_rgbe
Training epochs   50
Batch size        16
Learning rate     1e-4
Upstream commit   822d985 (SDSTrack@main)

Reproduction Results

This checkpoint was independently reproduced as part of the EvTrack project (Pattern Recognition course design, Topic #65).

Corrected metrics (MATLAB-equivalent protocol, absent frames excluded):

Metric             Paper (CVPR 2024)   Reproduction   Delta (percentage points)
Success AUC        ~0.597              0.5829         -1.4
Precision @ 20px   ~0.767              0.7506         -1.6
SR @ 0.50          n/a                 0.6929         n/a

Evaluation details:

  • 319/320 VisEvent test sequences evaluated
  • 1 sequence (00331_UAV_outdoor5) excluded because the target is absent in the first frame
  • See EvTrack experiments/sdstrack for full reproduction docs
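
For readers who want to sanity-check the numbers, the three metrics can be sketched in plain Python as below. This is a minimal illustration, not the EvTrack or MATLAB evaluation code: the 21-point threshold grid from 0 to 1, the strict inequality on overlap, the (x, y, w, h) box layout, and the convention that a truthy flag marks an absent frame are all assumptions, and the function names are hypothetical.

```python
import math


def iou_xywh(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a, b):
    """Euclidean distance in pixels between the two box centers."""
    dx = (a[0] + a[2] / 2) - (b[0] + b[2] / 2)
    dy = (a[1] + a[3] / 2) - (b[1] + b[3] / 2)
    return math.hypot(dx, dy)


def sequence_metrics(preds, gts, absent, n_thresholds=21, dist_px=20.0):
    """Success AUC, precision at dist_px, and SR at 0.5 for one sequence.

    Frames whose absent flag is truthy are dropped before scoring
    (assumed convention; check it against the dataset's label files).
    """
    kept = [(p, g) for p, g, a in zip(preds, gts, absent) if not a]
    if not kept:
        raise ValueError("no frames left after excluding absent frames")
    ious = [iou_xywh(p, g) for p, g in kept]
    dists = [center_error(p, g) for p, g in kept]
    n = len(kept)
    # Success curve: fraction of frames whose IoU exceeds each threshold.
    thresholds = [i / (n_thresholds - 1) for i in range(n_thresholds)]
    success = [sum(v > t for v in ious) / n for t in thresholds]
    return {
        "success_auc": sum(success) / len(success),
        "precision": sum(d <= dist_px for d in dists) / n,
        "sr_050": sum(v > 0.5 for v in ious) / n,
    }
```

Dataset-level scores are typically obtained by averaging the per-sequence curves over all evaluated sequences; whether the reported figures use exactly that aggregation is not stated here.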

Files

File                             Size      Description
SDSTrack_cvpr2024_rgbe.pth.tar   ~490 MB   Trained checkpoint for RGB-E evaluation on VisEvent
results/vis_event_test/          ~13 MB    Tracker predictions (320 .txt files, one per sequence)

Checksums

Algorithm   Hash
SHA256      b573dec59e9537204efbc131dccae047e27aeb41a26af7fbd4af222c8eaf0b74
MD5         bd0c98b7a2ea898d8cfdc3942158b9fa

Verify with:

sha256sum SDSTrack_cvpr2024_rgbe.pth.tar
md5sum SDSTrack_cvpr2024_rgbe.pth.tar
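
On systems without sha256sum or md5sum, the same check can be done from Python with the standard library. The sketch below streams the file in chunks so the ~490 MB checkpoint is never held in memory at once; the function names are illustrative, and the expected digests are copied from the Checksums table above.

```python
import hashlib

EXPECTED_SHA256 = "b573dec59e9537204efbc131dccae047e27aeb41a26af7fbd4af222c8eaf0b74"
EXPECTED_MD5 = "bd0c98b7a2ea898d8cfdc3942158b9fa"


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in 1 MiB chunks."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_checkpoint(path):
    """True only if both the SHA256 and MD5 digests match the table."""
    return (
        file_digest(path, "sha256") == EXPECTED_SHA256
        and file_digest(path, "md5") == EXPECTED_MD5
    )
```

For example, verify_checkpoint("SDSTrack_cvpr2024_rgbe.pth.tar") should return True for an intact download.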

Usage

Loading the checkpoint in Python

from huggingface_hub import hf_hub_download
import torch

checkpoint_path = hf_hub_download(
    repo_id="krisspy39/sdstrack-rgbe",
    filename="SDSTrack_cvpr2024_rgbe.pth.tar",
    repo_type="model"
)

checkpoint = torch.load(checkpoint_path, map_location="cpu", weights_only=False)
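
Once loaded, it can help to look at what the checkpoint actually contains before wiring it into a model. The helpers below are a hedged sketch: the internal layout of this file is not documented here, so the candidate keys ("net", "state_dict", "model") are guesses based on common PyTorch conventions, and both function names are hypothetical.

```python
def find_state_dict(checkpoint, candidate_keys=("net", "state_dict", "model")):
    """Return (key, mapping) for the first candidate key that holds a dict.

    Falls back to (None, checkpoint) when no candidate matches, which covers
    files that store the parameter mapping at the top level.
    """
    if isinstance(checkpoint, dict):
        for key in candidate_keys:
            value = checkpoint.get(key)
            if isinstance(value, dict):
                return key, value
    return None, checkpoint


def summarize(state_dict, limit=5):
    """List the first few parameter names with their shapes."""
    lines = []
    for name, tensor in list(state_dict.items())[:limit]:
        shape = tuple(getattr(tensor, "shape", ()))
        lines.append(f"{name}: {shape}")
    return lines
```

Calling print(list(checkpoint.keys())) first, then summarize(find_state_dict(checkpoint)[1]), gives a quick view of the top-level entries and the leading parameters.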

Evaluation with upstream code

  1. Clone SDSTrack:

git clone https://github.com/hoqolo/SDSTrack.git
cd SDSTrack

  2. Download the pretrained OSTrack foundation model to ./pretrained/vitb_256_mae_ce_32x4_ep300/OSTrack_ep0300.pth.tar

  3. Symlink or copy this checkpoint to ./models/SDSTrack_cvpr2024_rgbe.pth.tar

  4. Run evaluation:

python ./RGBE_workspace/test_rgbe_mgpus.py \
  --script_name sdstrack \
  --num_gpus 1 \
  --threads 4 \
  --epoch 50 \
  --yaml_name cvpr2024_rgbe

Note: The upstream code requires PyTorch 1.11 + Python 3.8. For PyTorch 2.x compatibility patches, see EvTrack/sdstrack_eval.py.
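
The symlink-or-copy step above can also be scripted, for instance right after hf_hub_download returns the cached path. This is a minimal sketch with a hypothetical helper name; it tries a symlink first and falls back to a copy where symlinks are not permitted (as on some Windows setups). Run it from the SDSTrack repository root so the default relative destination resolves correctly.

```python
import os
import shutil
from pathlib import Path


def place_checkpoint(src, dest="models/SDSTrack_cvpr2024_rgbe.pth.tar"):
    """Link (or copy) the checkpoint to dest; returns 'symlink' or 'copy'."""
    src = Path(src).resolve()
    if not src.is_file():
        raise FileNotFoundError(src)
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Remove a stale file or dangling link left over from an earlier run.
    if dest.exists() or dest.is_symlink():
        dest.unlink()
    try:
        os.symlink(src, dest)
        return "symlink"
    except OSError:
        shutil.copy2(src, dest)
        return "copy"
```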

Dataset

This checkpoint is trained and evaluated on VisEvent, a large-scale RGB-Event single object tracking benchmark.

  • Train: 120 sequences
  • Test: 320 sequences
  • Data format: Each sequence contains vis_imgs/ (RGB frames), event_imgs/ (event frames), groundtruth.txt, and absent_label.txt
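
A quick layout check against the entries listed above can catch incomplete downloads before an evaluation run. The sketch below only uses the four names from the data-format bullet; the annotation parser additionally assumes one box per line with four numbers separated by commas or whitespace, which should be confirmed against the actual files. Function names are illustrative.

```python
from pathlib import Path

REQUIRED_DIRS = ("vis_imgs", "event_imgs")
REQUIRED_FILES = ("groundtruth.txt", "absent_label.txt")


def missing_entries(seq_dir):
    """Return the expected sub-directories and files absent from seq_dir."""
    seq_dir = Path(seq_dir)
    missing = [d for d in REQUIRED_DIRS if not (seq_dir / d).is_dir()]
    missing += [f for f in REQUIRED_FILES if not (seq_dir / f).is_file()]
    return missing


def read_boxes(path):
    """Parse one box per line; assumes four comma- or space-separated numbers."""
    boxes = []
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            fields = line.replace(",", " ").split()
            if not fields:
                continue  # skip blank lines
            if len(fields) != 4:
                raise ValueError(f"expected 4 values, got {len(fields)}: {line!r}")
            boxes.append(tuple(float(v) for v in fields))
    return boxes
```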

The VisEvent dataset is also available in WebDataset format on Hugging Face: krisspy39/visevent

Citation

If you use this model or the SDSTrack tracker, please cite:

@inproceedings{hou2024sdstrack,
  title={Self-Distillation Symmetric Adapter Learning for Multi-Modal Object Tracking},
  author={Hou, Xiaojun and others},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}

License

This checkpoint is provided for research purposes. Please refer to the original SDSTrack repository for licensing terms.

Acknowledgments

  • Original SDSTrack implementation by hoqolo
  • VisEvent dataset by wangxiao5791509
  • This checkpoint was reproduced as part of a university Pattern Recognition course project (Topic #65: Event-camera-based object tracking)