3D Vision and image processing
3D Vision
Acquiring 3D information from the environment is crucial for many applications. While active sensors such as LiDAR are the common choice for this purpose, stereo cameras offer a potentially more cost-efficient alternative. The typical shortcomings of stereo vision algorithms, however, are accuracy and computation time. For that reason, we are working on a real-time stereo matching solution based on the rapidly growing field of Graph Neural Networks (GNNs). Currently, we utilize stereo vision for the following research applications:
3D Object Detection. For some time, point-cloud-based 3D object detection was far ahead of its image-based counterpart. Stereo-based detection is becoming more accurate, however, and that gap is now shrinking. Using GNNs, we aim to replicate the success of point cloud learning and build a real-time, stereo-based, end-to-end multitasking network for 3D object detection and stereo prediction (Fig. 1).
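The geometry that lets a stereo network recover 3D information can be sketched with the standard rectified-stereo relation Z = f·B/d. The sketch below is illustrative only; the focal length and baseline values are placeholders (loosely KITTI-like), not parameters of our setup, and a learned matching network would supply the disparity map.

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    """Convert a disparity map (pixels) to metric depth (meters).

    For a rectified stereo pair, depth follows Z = f * B / d, where
    f is the focal length in pixels and B is the baseline in meters.
    Zero (or near-zero) disparity corresponds to a point at infinity.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    return np.where(disparity > eps,
                    focal_px * baseline_m / np.maximum(disparity, eps),
                    np.inf)

# Illustrative values only (not our calibration): f ~ 721 px, B ~ 0.54 m
depth = disparity_to_depth(np.array([[38.9, 7.8]]),
                           focal_px=721.0, baseline_m=0.54)
```

Note the inverse relation: halving the disparity doubles the depth, which is why stereo depth error grows quadratically with range.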
SLAM. We built a dense SLAM algorithm (Fig. 2) using the dense stereo prediction. Dense mapping of the environment, however, requires large amounts of memory. By coupling the map with lane semantic segmentation and 3D object detection, we are working towards a lane- and object-level SLAM algorithm.
Robust monocular perception pipeline for connected autonomous vehicles
Vision sensors are preferred in applications such as intelligent infrastructure and autonomous vehicles because they capture rich surrounding information while remaining cost-effective for mass-scale deployment. They do, however, face challenges from lens distortion and from environmental variations such as haze, snow, and fog. Present state-of-the-art (SoTA) algorithms for vision tasks such as object detection, lane detection, and vehicle re-identification rely on convolutional neural networks (CNNs), which are sensitive to domain shifts arising from environmental and sensor variations. To obtain accurate information from vision sensors that retains performance under different deployment conditions, we are working on both the front end (camera ISP) and the back end (high-level vision tasks) to determine and demonstrate performance efficacy.
Learning Camera ISP: Learning the transformation from the RAW to the RGB domain using CNNs, with emphasis on obtaining DSLR-quality images from a single low-cost camera sensor.
Image Denoising: Noise removal (pixel, patch, homogeneous, and non-homogeneous) using CNNs.
3D Monocular Object Detection: Using monocular cameras to determine an accurate 3D bounding box enclosing an object.
Lane Detection: Using predefined anchor points to estimate lane line characteristics when lane markers are absent or visually degraded.
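To make the ISP front end concrete, here is a toy hand-tuned RAW-to-RGB pipeline: nearest-neighbor demosaic of an RGGB Bayer mosaic, white balance, and gamma correction. This is a minimal sketch for illustration only; the white-balance gains and gamma are arbitrary assumptions, and our learned ISP replaces these fixed stages with a CNN trained against DSLR reference images rather than implementing them this way.

```python
import numpy as np

def toy_isp(raw, wb_gains=(2.0, 1.0, 1.6), gamma=2.2):
    """Toy RAW->RGB pipeline (illustrative; gains/gamma are assumptions).

    raw: (H, W) RGGB Bayer mosaic with values in [0, 1].
    Returns an (H/2, W/2, 3) RGB image.
    """
    rgb = np.zeros((raw.shape[0] // 2, raw.shape[1] // 2, 3))
    rgb[..., 0] = raw[0::2, 0::2]                            # R sites
    rgb[..., 1] = 0.5 * (raw[0::2, 1::2] + raw[1::2, 0::2])  # avg of two G sites
    rgb[..., 2] = raw[1::2, 1::2]                            # B sites
    rgb *= np.array(wb_gains)                                # white balance
    return np.clip(rgb, 0.0, 1.0) ** (1.0 / gamma)           # gamma correction

raw = np.random.default_rng(0).random((8, 8))
out = toy_isp(raw)  # (4, 4, 3) RGB image, values in [0, 1]
```

Each of these stages (and others, such as denoising and tone mapping) is exactly what a learned ISP absorbs into a single trainable mapping.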
LiDAR De-noising (snow, rain, etc.)
LiDAR sensors can generate high-resolution imaging quickly, day and night; however, their performance is severely limited in adverse weather conditions such as snow, rain, and dense fog. We therefore proposed a new intensity-based filter that, unlike existing distance-based filters, avoids their low processing speed. The method achieves strong performance in both speed and accuracy by removing only snow particles while preserving important environmental features. This was made possible by deriving intensity criteria for snow removal from an analysis of the properties of laser light and snow particles.
<Reference> [1] Ji-il Park, Jihyuk Park, and Kyung-Soo Kim, "Fast and Accurate De-Snowing Algorithm of LiDAR Point Clouds"
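The idea of an intensity-based snow filter can be sketched as follows. The threshold form and the numeric values here are illustrative assumptions for demonstration, not the published criteria from [1]: snow particles are small diffuse reflectors near the sensor, so returns that are both close in range and low in intensity are treated as snow candidates.

```python
import numpy as np

def intensity_snow_filter(points, intensity, i_thresh=0.08, r_max=30.0):
    """Illustrative intensity-based snow filter.

    points:    (N, 3) LiDAR returns in sensor coordinates.
    intensity: (N,) normalized return intensities.
    A point is dropped only if it is both near (range < r_max) and
    weak (intensity < i_thresh); thresholds are assumptions, not
    the criteria derived in [1].
    """
    rng = np.linalg.norm(points, axis=1)           # range of each return
    is_snow = (rng < r_max) & (intensity < i_thresh)
    return points[~is_snow], intensity[~is_snow]

pts = np.array([[1.0, 0.0, 0.0],     # near + weak   -> snow candidate
                [1.0, 0.0, 0.0],     # near + strong -> kept
                [50.0, 0.0, 0.0]])   # far  + weak   -> kept (beyond r_max)
inten = np.array([0.02, 0.90, 0.02])
kept_pts, kept_inten = intensity_snow_filter(pts, inten)  # 2 points survive
```

Because this is a single vectorized mask rather than a per-point neighborhood search (as in distance-based outlier filters), it runs in linear time over the cloud, which is the source of the speed advantage described above.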
Image de-noising process of the model-based approach for object detection in snowfall environments
Sensors for autonomous vehicles operate in outdoor environments, where conditions such as snow and rain appear as sensor noise and degrade the performance of the algorithms running on the sensor data. Consequently, controlling the sensor noise generated by environmental conditions is one way to improve the performance of the algorithms applied to autonomous-vehicle sensors.
<Reference> [1] Kim, Jin-Hwan, Jae-Young Sim, and Chang-Su Kim. "Video deraining and desnowing using temporal correlation and low-rank matrix completion." IEEE Transactions on Image Processing 24.9 (2015): 2658-2670.
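A minimal sketch of the temporal intuition behind video de-snowing: a snowflake occupies any given pixel for only a frame or two, so over a short window the per-pixel temporal median recovers the static scene. This toy example does not implement the method of [1], which combines temporal correlation with low-rank matrix completion; it only illustrates why temporal redundancy makes the problem tractable.

```python
import numpy as np

def temporal_median_desnow(frames):
    """Per-pixel temporal median over a window of video frames.

    frames: list of (H, W) grayscale frames. Pixels corrupted by snow
    in a minority of frames fall back to the static-scene value.
    """
    return np.median(np.stack(frames, axis=0), axis=0)

# 5 frames of a constant scene, each corrupted by sparse bright "snow"
rng = np.random.default_rng(1)
scene = np.full((16, 16), 0.3)
frames = []
for _ in range(5):
    f = scene.copy()
    f[rng.random(scene.shape) < 0.05] = 1.0   # ~5% snow pixels per frame
    frames.append(f)
clean = temporal_median_desnow(frames)
```

A pixel remains corrupted only if snow hits it in a majority of the window's frames, which is rare for sparse, fast-moving particles; low-rank completion in [1] handles the harder cases this simple median misses.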