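Hot Papers Bring AI Active Vision into Focus: Library Frontier Literature Recommendation Service (No. 17)
Published: 2020-07-03
In the previous issue of our AI literature recommendations, we recommended frontier papers on topics such as visual understanding in computer vision. In this issue, we continue with hot papers on active vision in computer vision.
In recent years, as traditional visual perception techniques have matured and visual sensing devices have advanced considerably, active vision has attracted growing attention. It has become an important direction for the future of computer vision and has seen further development in applications such as drone-based follow filming, autonomous driving, and intelligent surveillance. Unlike traditional vision techniques, active vision must continually produce precise feedback based on the current situation and then adjust its strategy efficiently and proactively.
This issue selects four papers that introduce the latest developments in computer vision, covering active visual categorization, a grid-based motion statistics method for fast computation, a navigation strategy that uses visual odometry, and background modeling that uses active perception for foreground segmentation. They are recommended to researchers in related fields.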
End-to-End Policy Learning for Active Visual Categorization
Jayaraman, Dinesh, et al.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2019, 41(7): 1601-1614
Visual recognition systems mounted on autonomous moving agents face the challenge of unconstrained data, but simultaneously have the opportunity to improve their performance by moving to acquire new views at test time. In this work, we first show how a recurrent neural network-based system may be trained to perform end-to-end learning of motion policies suited for this "active recognition" setting. Further, we hypothesize that active vision requires an agent to have the capacity to reason about the effects of its motions on its view of the world. To verify this hypothesis, we attempt to induce this capacity in our active recognition pipeline, by simultaneously learning to forecast the effects of the agent's motions on its internal representation of the environment conditional on all past views. Results across three challenging datasets confirm both that our end-to-end system successfully learns meaningful policies for active category recognition, and that "learning to look ahead" further boosts recognition performance.
GMS: Grid-Based Motion Statistics for Fast, Ultra-robust Feature Correspondence
Bian, Jia-Wang, et al.
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2020, 128(6): 1580-1593
Feature matching aims at generating correspondences across images, which is widely used in many computer vision tasks. Although considerable progress has been made on feature descriptors and fast matching for initial correspondence hypotheses, selecting good ones from them is still challenging and critical to the overall performance. More importantly, existing methods often take a long computational time, limiting their use in real-time applications. This paper attempts to separate true correspondences from false ones at high speed. We term the proposed method grid-based motion statistics (GMS), which incorporates the smoothness constraint into a statistical framework for separation and uses a grid-based implementation for fast calculation. GMS is robust to various challenging image changes, including changes in viewpoint, scale, and rotation. It is also fast, e.g., taking only 1 or 2 ms in a single CPU thread, even when 50K correspondences are processed. This has important implications for real-time applications. What's more, we show that incorporating GMS into the classic feature matching and epipolar geometry estimation pipeline can significantly boost the overall performance. Finally, we integrate GMS into the well-known ORB-SLAM system for monocular initialization, resulting in a significant improvement.
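The idea of counting supporting matches in grid cells can be illustrated with a minimal Python sketch. It is not the authors' implementation: it ignores rotation and scale handling, and the function names, the grid size, and the threshold form `alpha * sqrt(n)` with `alpha = 6.0` are assumptions chosen for illustration. A match is kept when the cell pair it falls into is supported by enough matches between the neighboring cell pairs, which is one simple way to encode a smoothness constraint.

```python
from collections import Counter
from math import sqrt


def cell_of(pt, size, grid):
    """Map a point (x, y) to a (col, row) cell for an image of size (width, height)."""
    x, y = pt
    w, h = size
    col = min(int(x * grid / w), grid - 1)
    row = min(int(y * grid / h), grid - 1)
    return col, row


def gms_filter(matches, size1, size2, grid=20, alpha=6.0):
    """Return one boolean per match: True if the match is kept.

    matches: list of ((x1, y1), (x2, y2)) point pairs (hypothetical input format).
    alpha and the threshold alpha * sqrt(n) are illustrative assumptions.
    """
    cells = [(cell_of(p, size1, grid), cell_of(q, size2, grid)) for p, q in matches]
    pair_count = Counter(cells)                      # matches per (left cell, right cell)
    left_count = Counter(c1 for c1, _ in cells)      # matches per left cell

    # For every left cell, pick the right cell that receives most of its matches.
    best = {}
    for (c1, c2), n in pair_count.items():
        if c1 not in best or n > pair_count[(c1, best[c1])]:
            best[c1] = c2

    offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    accepted = set()
    for c1, c2 in best.items():
        score = 0   # supporting matches between corresponding neighbor cells
        total = 0   # all matches that start in the 3x3 neighborhood of c1
        for dx, dy in offsets:
            n1 = (c1[0] + dx, c1[1] + dy)
            n2 = (c2[0] + dx, c2[1] + dy)
            score += pair_count.get((n1, n2), 0)
            total += left_count.get(n1, 0)
        threshold = alpha * sqrt(total / len(offsets))
        if score > threshold:
            accepted.add((c1, c2))

    return [pair in accepted for pair in cells]
```

Because the work is a few counter lookups per cell rather than per match pair, the cost grows roughly linearly with the number of matches, which is consistent with the speed emphasis in the abstract.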
Minimal navigation solution for a swarm of tiny flying robots to explore an unknown environment
McGuire, K. N., et al.
SCIENCE ROBOTICS, 2019, 4(35)
Swarms of tiny flying robots hold great potential for exploring unknown, indoor environments. Their small size allows them to move in narrow spaces, and their light weight makes them safe for operating around humans. Until now, this task has been out of reach due to the lack of adequate navigation strategies. The absence of external infrastructure implies that any positioning attempts must be performed by the robots themselves. State-of-the-art solutions, such as simultaneous localization and mapping, are still too resource demanding. This article presents the swarm gradient bug algorithm (SGBA), a minimal navigation solution that allows a swarm of tiny flying robots to autonomously explore an unknown environment and subsequently come back to the departure point. SGBA maximizes coverage by having robots travel in different directions away from the departure point. The robots navigate the environment and deal with static obstacles on the fly by means of visual odometry and wall-following behaviors. Moreover, they communicate with each other to avoid collisions and maximize search efficiency. To come back to the departure point, the robots perform a gradient search toward a home beacon. We studied the collective aspects of SGBA, demonstrating that it allows a group of 33-g commercial off-the-shelf quadrotors to successfully explore a real-world environment. The application potential is illustrated by a proof-of-concept search-and-rescue mission in which the robots captured images to find "victims" in an office environment. The developed algorithms generalize to other robot types and lay the basis for tackling other similarly complex missions with robot swarms in the future.
Active Perception for Foreground Segmentation: An RGB-D Data-Based Background Modeling Method
Sun, Yuxiang, et al.
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2019, 16(4): 1596-1609
Foreground moving object segmentation is a fundamental problem in many computer vision applications. As a solution for foreground segmentation, background modeling has been intensively studied over the past years and many effective algorithms have been developed. However, accurate foreground segmentation is still a difficult problem. Currently, most of the algorithms work solely within the color space, in which the segmentation performance is prone to be degraded by a multitude of challenges, such as illumination changes, shadows, automatic camera adjustments, and color camouflage. RGB-D cameras are active visual sensors that provide depth measurements along with color images. We present in this paper an innovative background modeling method by using both the color and depth information from an RGB-D camera. The proposed method is evaluated using a public RGB-D data set. Various experiments confirm that our method is able to achieve superior performance compared with existing well-known methods.
Note to Practitioners: This paper investigates background modeling for foreground segmentation with active perception. Recent RGB-D cameras that leverage the active perception technology have advanced many computer vision algorithms. In this paper, we develop a background modeling method to achieve superior performance by using an RGB-D camera instead of a color camera. Due to the use of the active sensing technology, the proposed method is characterized by its robustness to common challenges. Our method could be used for improving existing infrastructures, such as visual surveillance systems for parking spaces. Moreover, the simple design of our method allows it to be easily deployed on various computing platforms, which facilitates many practical applications that usually require embedded computing devices. However, our method cannot currently run in real time. We believe that it can be further improved using parallel programming techniques to meet the real-time requirement.
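To show why depth can help with challenges such as illumination changes and color camouflage, here is a small Python sketch. It is not the method proposed in the paper, whose details the abstract does not give. It uses a generic running-average background and a simple fusion rule, both assumptions: trust depth when a valid measurement is available, otherwise fall back to intensity. The function names, the tolerances, and the convention that a depth of 0 means "no measurement" are likewise hypothetical.

```python
def update_background(bg, frame, rate=0.05):
    """Blend a new frame into a running-average background (generic technique)."""
    return [[b + rate * (f - b) for b, f in zip(bg_row, fr_row)]
            for bg_row, fr_row in zip(bg, frame)]


def is_foreground(gray, depth, bg_gray, bg_depth, gray_tol=25.0, depth_tol=0.15):
    """Classify one pixel. Depth 0 is treated as a missing measurement (assumption)."""
    if depth > 0 and bg_depth > 0:
        # Depth is unaffected by lighting or by an object's color.
        return abs(depth - bg_depth) > depth_tol
    # No valid depth: fall back to the intensity difference.
    return abs(gray - bg_gray) > gray_tol


def segment(frame_gray, frame_depth, bg_gray, bg_depth):
    """Return a boolean foreground mask with the same shape as the input frames."""
    return [[is_foreground(g, d, bg, bd)
             for g, d, bg, bd in zip(g_row, d_row, bg_row, bd_row)]
            for g_row, d_row, bg_row, bd_row
            in zip(frame_gray, frame_depth, bg_gray, bg_depth)]
```

Under this rule, a pixel whose intensity matches the background but whose depth is clearly smaller is still marked as foreground, while a pixel that only becomes brighter is not. Each pixel is processed independently, so a loop like this is a natural candidate for parallelization.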