Building Detection based on Faster RCNN with Distributional Soft Actor-Critic with Three Refinements
Keywords:
Deep learning, Distributional soft actor critic, Fine region proposal network, Recurrent convolutional neural networks, Variance-based mechanism.
Abstract
This research presents a framework for building detection in high-resolution images that integrates techniques from computer vision and reinforcement learning. The methodology employs the Faster Region-based Convolutional Neural Network (Faster RCNN) architecture for feature extraction and region proposal generation. A novel Fine Region Proposal Network (FRPN) adapts region proposals to image characteristics, dynamically adjusting the candidate regions to improve efficiency. The study introduces three refinements to the Distributional Soft Actor-Critic algorithm (DSAC-T) that address stability and sensitivity concerns: fine-tuning the critic gradient, incorporating twin value distribution learning, and introducing a variance-based mechanism for clipping the target return. Evaluations on the Massachusetts and WHU building datasets support the efficacy of the proposed framework, which achieves an average precision of 69.48% and an average recall of 84.29% on the Massachusetts dataset, and an average precision of 65.82% and an average recall of 81.52% on the WHU dataset. The research thus contributes a robust solution for building detection in diverse urban and suburban environments.
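The two value-related refinements named in the abstract, twin value distribution learning and variance-based clipping of the target return, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Gaussian value distributions given as (mean, standard deviation) pairs, a twin rule that selects the distribution with the smaller mean, and a clipping bound proportional to the predicted standard deviation. The function names, the multiplier `xi`, and the parameter `entropy_term` (standing for the temperature-weighted log-probability of the next action) are hypothetical.

```python
import random


def twin_min_mean(dist_a, dist_b):
    """Return the (mean, std) pair with the smaller mean.

    Assumption: choosing the more pessimistic of two learned value
    distributions is how the twin mechanism limits overestimation.
    """
    return dist_a if dist_a[0] <= dist_b[0] else dist_b


def clipped_target_return(reward, gamma, next_dists, entropy_term,
                          q_current, sigma_current, xi=3.0, rng=None):
    """Sample a soft target return and clip it around the current estimate.

    next_dists    : two (mean, std) pairs for the next state-action pair
    entropy_term  : alpha * log pi(a'|s'), subtracted as in soft actor-critic
    q_current     : current mean value estimate for (s, a)
    sigma_current : current predicted standard deviation for (s, a)
    xi            : hypothetical multiplier for the clipping bound
    """
    rng = rng or random.Random(0)
    mean_next, std_next = twin_min_mean(*next_dists)
    # Draw one return sample from the selected Gaussian value distribution.
    z_next = rng.gauss(mean_next, std_next)
    target = reward + gamma * (z_next - entropy_term)
    # Variance-based clipping: the bound scales with the predicted spread
    # instead of being a fixed, reward-scale-dependent constant.
    bound = xi * sigma_current
    return min(max(target, q_current - bound), q_current + bound)


if __name__ == "__main__":
    y = clipped_target_return(
        reward=1.0, gamma=0.5, next_dists=((10.0, 0.0), (4.0, 0.0)),
        entropy_term=0.0, q_current=0.0, sigma_current=0.5,
    )
    print(y)
```

In this usage example the unclipped target is 1.0 + 0.5 * 4.0 = 3.0, and the bound is 3.0 * 0.5 = 1.5, so the target is clipped to 1.5. Tying the bound to the predicted standard deviation is one way to reduce sensitivity to reward scale, which is the kind of concern the abstract attributes to the refinements.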
License
Authors who publish with this journal agree to the following terms:
Authors hold and retain copyright, and grant the journal right of first publication, with the work simultaneously licensed after publication under a Creative Commons Attribution 4.0 License (CC BY) that permits any use, reproduction and distribution of the work and article without further permission, provided that the original work is properly cited.
Authors are permitted and encouraged to post their work online in institutional repositories, on websites and on social media before and after publication, as it can lead to productive exchanges, as well as earlier and greater citation of published work.





