SOD-MTGAN: Small Object Detection via Multi-Task Generative Adversarial Network
Type
Conference Paper
KAUST Department
Visual Computing Center (VCC)
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Electrical Engineering Program
Date
2018-10-06
Online Publication Date
2018-10-06
Print Publication Date
2018
Permanent link to this record
http://hdl.handle.net/10754/630628
Abstract
Object detection is a fundamental and important problem in computer vision. Although impressive results have been achieved on large/medium sized objects in large-scale detection benchmarks (e.g. the COCO dataset), the performance on small objects is far from satisfactory. The reason is that small objects lack sufficient detailed appearance information, which can distinguish them from the background or similar objects. To deal with the small object detection problem, we propose an end-to-end multi-task generative adversarial network (MTGAN). In the MTGAN, the generator is a super-resolution network, which can up-sample small blurred images into fine-scale ones and recover detailed information for more accurate detection. The discriminator is a multi-task network, which describes each super-resolved image patch with a real/fake score, object category scores, and bounding box regression offsets. Furthermore, to make the generator recover more details for easier detection, the classification and regression losses in the discriminator are back-propagated into the generator during training. Extensive experiments on the challenging COCO dataset demonstrate the effectiveness of the proposed method in restoring a clear super-resolved image from a blurred small one, and show that the detection performance, especially for small sized objects, improves over state-of-the-art methods.
Citation
Bai Y, Zhang Y, Ding M, Ghanem B (2018) SOD-MTGAN: Small Object Detection via Multi-Task Generative Adversarial Network. Lecture Notes in Computer Science: 210–226. Available: http://dx.doi.org/10.1007/978-3-030-01261-8_13.
Sponsors
This work was supported mainly by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research and by Natural Science Foundation of China, Grant No. 61603372.
Publisher
Springer Nature
Conference/Event name
15th European Conference on Computer Vision, ECCV 2018
Additional Links
https://link.springer.com/chapter/10.1007%2F978-3-030-01261-8_13
DOI
10.1007/978-3-030-01261-8_13
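
Illustrative sketch
The abstract describes a discriminator that scores each super-resolved patch with a real/fake score, object category scores, and bounding box regression offsets, and states that the classification and regression losses are back-propagated into the generator. The Python sketch below shows one way such a combined generator-side objective could be assembled for a single patch. It is not the authors' implementation: the loss forms (log-loss on a real/fake logit, softmax cross-entropy, smooth L1), the weights w_adv, w_cls and w_reg, and all function names are assumptions made for illustration, since this record does not specify them.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def cross_entropy(logits, true_class):
    """Softmax cross-entropy for one sample, via log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[true_class]


def smooth_l1(pred, target):
    """Smooth L1 distance summed over box offsets (an assumed choice)."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def generator_side_loss(real_fake_logit, class_logits, box_offsets,
                        true_class, true_offsets,
                        w_adv=1.0, w_cls=1.0, w_reg=1.0):
    """Combine the three discriminator heads into one loss for the generator.

    The weights are hypothetical hyperparameters, not values from the paper.
    """
    # Adversarial term: the generator wants the patch to be scored as real,
    # i.e. -log(sigmoid(logit)) = softplus(-logit).
    adv = softplus(-real_fake_logit)
    # Classification term: the super-resolved patch should be classified correctly.
    cls = cross_entropy(class_logits, true_class)
    # Regression term: predicted box offsets should match the ground truth.
    reg = smooth_l1(box_offsets, true_offsets)
    total = w_adv * adv + w_cls * cls + w_reg * reg
    return {"adv": adv, "cls": cls, "reg": reg, "total": total}


if __name__ == "__main__":
    # Made-up discriminator outputs for one super-resolved patch.
    losses = generator_side_loss(
        real_fake_logit=-0.3,
        class_logits=[1.2, -0.4],          # e.g. object vs. background
        box_offsets=[0.10, -0.05, 0.20, 0.00],
        true_class=0,
        true_offsets=[0.00, 0.00, 0.25, 0.10],
    )
    print(losses)
```

In a real training loop the total would be differentiated with respect to the generator's parameters by an automatic differentiation framework, so that errors in classification and box regression on the super-resolved patch push the generator toward detail that helps detection.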