Robust and Real-time Deep Tracking via Multi-Scale Domain Adaptation

Xinyu Wang¹      Hanxi Li¹*      Yi Li²      Fumin Shen³      Fatih Porikli⁴

Jiangxi Normal University, China¹

Toyota Research Institute of North America, USA²

University of Electronic Science and Technology of China, China³

Australian National University, Australia⁴

1. Abstract

Visual tracking is a fundamental problem in computer vision. Recently, several deep-learning-based tracking algorithms have achieved record-breaking performance. However, because of the high complexity of deep learning, most deep trackers suffer from low tracking speed and are therefore impractical in many real-world applications. Some newer deep trackers with smaller network structures achieve high efficiency, but at the cost of a significant drop in precision. In this paper, we propose to transfer features learned for image classification to the visual tracking domain via convolutional channel reduction. The channel reduction can be viewed simply as an additional convolutional layer dedicated to the specific task. It not only extracts information useful for object tracking but also significantly increases the tracking speed. To better capture the useful features of the target at different scales, the adaptation filters are designed with different sizes. The resulting visual tracker runs in real time and achieves state-of-the-art accuracy in experiments on two widely adopted benchmarks with more than 100 test videos.
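The channel reduction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the released MSDAT code: the channel counts (64 in, 4 per branch), the filter sizes (1, 3, 5), the random weights and the helper names `conv2d_same` and `multi_scale_reduce` are all assumptions chosen for illustration. In practice the adaptation filters would be learned rather than sampled at random.

```python
import numpy as np

def conv2d_same(features, filters):
    """Zero-padded 'same' cross-correlation.

    features: (C_in, H, W) feature map.
    filters:  (C_out, C_in, k, k) filter bank with odd k.
    Returns a (C_out, H, W) map with fewer channels when C_out < C_in.
    """
    c_out, c_in, k, _ = filters.shape
    pad = k // 2
    padded = np.pad(features, ((0, 0), (pad, pad), (pad, pad)))
    _, h, w = features.shape
    out = np.zeros((c_out, h, w))
    for dy in range(k):
        for dx in range(k):
            window = padded[:, dy:dy + h, dx:dx + w]  # (C_in, H, W)
            # Mix input channels into output channels for this filter tap.
            out += np.einsum('oc,chw->ohw', filters[:, :, dy, dx], window)
    return out

def multi_scale_reduce(features, filter_bank):
    """Apply filters of several spatial sizes and stack the reduced channels."""
    return np.concatenate([conv2d_same(features, f) for f in filter_bank], axis=0)

rng = np.random.default_rng(0)
# Stand-in for deep classification features (hypothetical size).
features = rng.standard_normal((64, 16, 16))
# One branch per filter size; each branch reduces 64 channels to 4.
filter_bank = [rng.standard_normal((4, 64, k, k)) * 0.01 for k in (1, 3, 5)]
reduced = multi_scale_reduce(features, filter_bank)
print(reduced.shape)  # (12, 16, 16)
```

In this sketch the 64 input channels shrink to 12, which is where the speed gain for the subsequent tracker would come from, while the different filter sizes let each branch respond to structure at a different spatial scale.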

2. Downloads

Robust and Real-time Deep Tracking via Multi-Scale Domain Adaptation

Xinyu Wang, Hanxi Li*, Yi Li, Fumin Shen, Fatih Porikli

IEEE International Conference on Multimedia and Expo (ICME) 2017, Hong Kong

[Paper]     [Code]

[Our results on OTB-100 and VOT-2016]

 

3. Evaluation Results

Location Error of OTB50
Success Rate of OTB50

4. Conclusion and Future Work

In this work, we propose a simple yet effective algorithm for transferring features from the classification domain to the visual tracking domain. The resulting visual tracker, termed MSDAT, runs in real time and achieves tracking accuracy comparable to that of state-of-the-art deep trackers. The experiments verify the validity of the proposed domain adaptation. Admittedly, updating the neural network online can improve tracking accuracy significantly. However, the existing online updating schemes cause a dramatic reduction in speed. One possible future direction is to simultaneously update the KCF model and a certain part of the neural network (e.g. the last convolutional layer). In this way, one could strike a balance between accuracy and efficiency and thus obtain a better tracker. Another direction is to replace the KCF tracker with hashing models, which can be trained and applied efficiently.
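For readers unfamiliar with the KCF model mentioned above, the following is a minimal sketch of a linear-kernel correlation filter on a multi-channel feature map, written with NumPy. It is an illustration in the spirit of KCF, not the authors' implementation: the feature size, the regularizer `lam`, the label width `sigma` and all function names are assumptions.

```python
import numpy as np

def gaussian_labels(h, w, sigma=2.0):
    """Regression target peaked at the origin, with wrap-around distances."""
    ys = np.minimum(np.arange(h), h - np.arange(h))
    xs = np.minimum(np.arange(w), w - np.arange(w))
    return np.exp(-(ys[:, None] ** 2 + xs[None, :] ** 2) / (2 * sigma ** 2))

def linear_kernel_fft(a, b):
    """Fourier transform of the linear kernel correlation of (C, H, W) maps."""
    a_hat = np.fft.fft2(a)  # FFT over the last two (spatial) axes
    b_hat = np.fft.fft2(b)
    return np.sum(np.conj(a_hat) * b_hat, axis=0) / a.size

def train(x, y, lam=1e-4):
    """Closed-form ridge regression over all cyclic shifts of x."""
    return np.fft.fft2(y) / (linear_kernel_fft(x, x) + lam)

def estimate_shift(alpha_hat, x, z):
    """Locate the response peak on z and convert it to a signed shift."""
    response = np.real(np.fft.ifft2(linear_kernel_fft(x, z) * alpha_hat))
    h, w = response.shape
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    dy = int(dy) - h if dy > h // 2 else int(dy)
    dx = int(dx) - w if dx > w // 2 else int(dx)
    return dy, dx

rng = np.random.default_rng(0)
x = rng.standard_normal((12, 32, 32))   # hypothetical reduced feature map
alpha_hat = train(x, gaussian_labels(32, 32))
z = np.roll(x, (3, -5), axis=(1, 2))    # simulate target motion
shift = estimate_shift(alpha_hat, x, z)
print(shift)  # (3, -5)
```

Because training and detection reduce to element-wise operations in the Fourier domain, the cost grows with the number of feature channels, which is why reducing the channels before this stage helps the tracking speed.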

5. Reference

If you find this research helpful, please consider citing our paper.

@inproceedings{wang2017robust,
  title={Robust and Real-time Deep Tracking Via Multi-Scale Domain Adaptation},
  author={Wang, Xinyu and Li, Hanxi and Li, Yi and Shen, Fumin and Porikli, Fatih},
  booktitle={Proceedings of the IEEE International Conference on Multimedia and Expo (ICME)},
  year={2017}
}