DroneMOT: Drone-based Multi-Object Tracking Considering Detection Difficulties and Simultaneous Moving of Drones and Objects

Proceedings of ICRA 2024

 

Peng Wang1, Yongcai Wang1,*, Deying Li1

1 School of Information, Renmin University of China, Beijing, 100872

 

image-20240529183359317 image-20240529183422179

 

Overview

This paper proposes DroneMOT. It first introduces a Dual-domain Integrated Attention (DIA) module that accounts for the fast movements of drones to enhance drone-based object detection and feature embedding for small-sized, blurred, and occluded objects. Then, a Motion-Driven Association (MDA) scheme is introduced that considers the concurrent movements of both the drone and the objects. Within MDA, an Adaptive Feature Synchronization (AFS) technique updates the object features seen from different angles, and a Dual Motion-based Prediction (DMP) method forecasts the object positions. Finally, the refined feature embeddings and the predicted positions are integrated to enhance object association. Comprehensive evaluations on the VisDrone2019-MOT and UAVDT datasets show that DroneMOT provides substantial performance improvements over the state of the art in drone-based MOT.

System Architecture

DroneMOT is split into two main modules: the network module for detection and feature embedding, and the data association module, which operates on the output of the network module. The image $I_t \in \mathbb{R}^{W \times H \times 3}$ captured by the moving drone at the $t$-th frame is fed into the network along with the previous frame image $I_{t-1}$. The output of the network module, denoted $O_t = \{o_1, o_2, \cdots, o_i, \cdots, o_M\}$, consists of $M$ detections, where $o_i = (b_i, s_i, f_i)$. Here, $b_i$ is the bounding box $(x, y, w, h)$, $s_i$ is the detection score, and $f_i$ is the feature embedding vector. The data association module takes as inputs the detections $O_t$ and all $N$ stored object trajectories $T_{t-1} = \{T_1, T_2, \cdots, T_j, \cdots, T_N\}$, where $T_j = \{o_j^{1}, o_j^{3}, \cdots, o_j^{t-1}\}$ and $o_j^{t-1}$ is the detection associated with trajectory $j$ in the $(t-1)$-th frame. The goal of the data association module is to match each detection with a trajectory, treat the unmatched detections as new trajectories, and ultimately produce the final tracking results $T_t$.
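The data flow described above can be illustrated with a minimal Python sketch. It is not the DroneMOT implementation: the names `Detection`, `Trajectory`, and `associate`, the weight `alpha`, and the threshold `min_sim` are hypothetical, the box is assumed to use a top-left corner, and a simple greedy match on a blend of IoU and cosine similarity stands in for the paper's Motion-Driven Association. The sketch only shows how detections $o_i = (b_i, s_i, f_i)$ are matched to stored trajectories and how unmatched detections start new trajectories.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import List, Sequence, Tuple

# Assumption: (x, y) is the top-left corner of the box; the source only says (x, y, w, h).
Box = Tuple[float, float, float, float]


@dataclass
class Detection:
    box: Box                  # b_i
    score: float              # s_i
    feature: Sequence[float]  # f_i


@dataclass
class Trajectory:
    track_id: int
    detections: List[Detection] = field(default_factory=list)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(p * q for p, q in zip(u, v))
    nu = sqrt(sum(p * p for p in u))
    nv = sqrt(sum(q * q for q in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def associate(detections: List[Detection], trajectories: List[Trajectory],
              alpha: float = 0.5, min_sim: float = 0.3) -> List[Trajectory]:
    """Greedy stand-in for data association; alpha and min_sim are illustrative values."""
    candidates = []
    for j, traj in enumerate(trajectories):
        if not traj.detections:
            continue
        last = traj.detections[-1]  # most recent detection of trajectory j
        for i, det in enumerate(detections):
            sim = alpha * iou(det.box, last.box) \
                + (1 - alpha) * cosine(det.feature, last.feature)
            if sim >= min_sim:
                candidates.append((sim, i, j))

    # Take the most similar pairs first; each detection and trajectory is used once.
    candidates.sort(reverse=True)
    used_dets, used_trajs = set(), set()
    for _, i, j in candidates:
        if i in used_dets or j in used_trajs:
            continue
        trajectories[j].detections.append(detections[i])
        used_dets.add(i)
        used_trajs.add(j)

    # Unmatched detections become new trajectories.
    next_id = max((t.track_id for t in trajectories), default=0) + 1
    for i, det in enumerate(detections):
        if i not in used_dets:
            trajectories.append(Trajectory(next_id, [det]))
            next_id += 1
    return trajectories
```

Calling `associate(detections, trajectories)` once per frame extends the matched trajectories in place and returns the updated list. A production tracker would typically replace the greedy loop with an optimal assignment solver and compare against motion-predicted positions rather than the last observed box.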

image-20240530120858787

Contributions

Evaluations

 

image-20240530120945596

image-20240530121007711

image-20240530121121040

image-20240530121121031

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China under Grants No. 61972404 and 12071478; the Public Computing Cloud, Renmin University of China; and the Blockchain Laboratory, Metaverse Research Center, Renmin University of China.