I-ROD: An Ensemble CNNs for Object Detection in Unconstrained Road Scenarios

Sep 1, 2024·

Abhishek Mukhopadhyay

Harshitha Belagavi Rajaprakash

Prashant T Gaikwad

Imon Mukherjee

Pradipta Biswas



Abstract

Object detection in complex and unstructured environments is crucial for enhancing the safety and efficiency of autonomous systems. This paper introduces a semantic segmentation model that detects objects accurately against complex backgrounds by integrating multiple Convolutional Neural Networks (CNNs). The system combines two distinct types of segmentation model: an encoder-decoder architecture that learns abstract feature representations, and a dilated convolutional branch that handles variation in object size. A dynamic fusion mechanism, based on the confidence scores from each branch, lets the model adapt to varying and dynamic situations. The model is evaluated on the Indian Driving Dataset (IDD), which features unstructured road environments, and on the Cityscapes dataset. In a comparative pixel-wise analysis, the proposed model outperforms four other state-of-the-art segmentation models by 12.91% on the IDD and exceeds the second-best model by 19.7% on the Cityscapes dataset in terms of F1 score. Furthermore, an extensive ablation study validates the efficacy of the ensemble approach and underscores the effectiveness of categorical cross-entropy as the chosen loss function.
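The abstract does not spell out the fusion rule, so the following is only a minimal sketch of one way a confidence-based fusion of two segmentation branches could work. It assumes each branch emits a per-pixel class probability vector, takes the highest class probability as that branch's confidence, and weights the two branches by their normalised confidences. The function names `fuse_pixel` and `fuse_maps` and the weighting scheme are illustrative assumptions, not the method used in the paper.

```python
def fuse_pixel(p_enc, p_dil):
    """Fuse two class-probability vectors for one pixel.

    p_enc: probabilities from the (assumed) encoder-decoder branch.
    p_dil: probabilities from the (assumed) dilated-convolution branch.
    Confidence is taken to be the highest class probability, which is an
    assumption made for this sketch.
    """
    c_enc, c_dil = max(p_enc), max(p_dil)
    total = c_enc + c_dil
    # Fall back to equal weights if both branches report zero confidence.
    w_enc = c_enc / total if total > 0 else 0.5
    w_dil = 1.0 - w_enc
    return [w_enc * a + w_dil * b for a, b in zip(p_enc, p_dil)]


def fuse_maps(probs_enc, probs_dil):
    """Fuse two flattened probability maps (one vector per pixel).

    Returns the fused probabilities and the predicted label per pixel.
    """
    fused = [fuse_pixel(a, b) for a, b in zip(probs_enc, probs_dil)]
    labels = [max(range(len(p)), key=p.__getitem__) for p in fused]
    return fused, labels


if __name__ == "__main__":
    # Two pixels, two classes; values are made up for illustration.
    enc = [[0.9, 0.1], [0.5, 0.5]]
    dil = [[0.4, 0.6], [0.2, 0.8]]
    fused, labels = fuse_maps(enc, dil)
    print(labels)
```

Because the weights sum to one, the fused vector remains a valid probability distribution whenever both inputs are. The more confident branch dominates each pixel, which is one plausible reading of "dynamic" fusion.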

Type

Journal article

Publication

Signal, Image and Video Processing Journal, Springer


Last updated on Sep 1, 2024