We describe an ego-motion estimation system developed specifically for humanoid robots, integrating visual and inertial sensors. It addresses the challenge of significant scale changes due to forward motion with a finite field of view by using recent sparse multi-scale feature tracking techniques. Additionally, it addresses the challenge of long-range temporal correlation due to walking gaits by employing a kinematic-statistical model that does not require accurate knowledge of the robot dynamics and calibration.
The walking-motion experiment is a challenging dataset comprising brisk walking with the head-mounted system through the hallways of a rectangular building. Throughout the dataset the operator frequently changes gaze direction. On this challenging example of humanoid legged motion we achieve a drift of 1.01% over a 292 m course.
@inproceedings{tps_humanoids2012, title={Visual-Inertial Ego-Motion Estimation for Humanoid Platforms}, author={Tsotsos, K. and Pretto, A. and Soatto, S.}, booktitle={Proc. of the IEEE-RAS International Conference on Humanoid Robots}, year={2012}, pages={704--711} }
In this work, we propose an effective dense 3D reconstruction and navigation system based on omnidirectional vision, well suited to large-scale scenarios. The aim of this work is to provide a robust and efficient way to build 3D maps with exhaustive information about both the structure and the appearance of the environment.
We start from the assumption that the surrounding environment (the scene) forms a piecewise smooth surface represented by a triangle mesh. Our system infers the structure of the environment along with the ego-motion of the camera by robustly tracking the projection of this surface in the omnidirectional image.
A Delaunay triangulation based on corner features is used as an initial guess of the surface tessellation; the triangulation is then modified using a constrained edge-insertion algorithm, where the edgelet features detected in the image represent the new predefined edges added to the triangulation. Both the camera motion and the 3D environment structure parameters are estimated with a direct method inside an optimization framework, taking the topology of the subdivision into account in a robust and efficient way.
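As an illustration of this mesh parameterization (not the paper's implementation): once depths are attached to the triangle vertices, the depth of any image point inside a triangle follows by barycentric interpolation. All names and numbers below are hypothetical.

```python
import numpy as np

def barycentric(p, a, b, c):
    """Barycentric coordinates of point p w.r.t. triangle (a, b, c)."""
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    return np.array([1.0 - v - w, v, w])

def depth_at(p, verts, vert_depths):
    """Depth of an image point, assuming the scene is planar inside the triangle."""
    lam = barycentric(np.asarray(p, float), *(np.asarray(v, float) for v in verts))
    return float(lam @ np.asarray(vert_depths, float))

# A triangle in the image with depths attached to its three vertices
verts = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
depths = [2.0, 4.0, 6.0]
d = depth_at((5.0, 0.0), verts, depths)  # midpoint of the first edge
```

At the midpoint of the edge joining the first two vertices, the interpolated depth is the average of their depths.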
@inproceedings{prettoICRA2011, author = {Pretto, A. and Menegatti, E. and Pagello, E.}, title = {Omnidirectional Dense Large-Scale Mapping and Navigation Based on Meaningful Triangulation}, booktitle = {Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}, year = {2011}, pages = {3289--3296} }
@inproceedings{prettoOVR2010, author = {Pretto, A. and Soatto, S. and Menegatti, E.}, title = {Scalable Dense Large-Scale Mapping and Navigation}, booktitle = {Proc. of the Workshop on Omnidirectional Robot Vision (ICRA)}, year = {2010} }
This system presents a new approach to automatically monitor an indoor environment on a thermodynamic basis. It uses temperature as the driving parameter and is especially suited for comfort analysis or the evaluation of moisture (Patent n.PD2006A000191, E. Grinzato). The system measures all fundamental environment parameters (e.g., air temperature, relative humidity and air speed) by imaging with a thermal camera (Figure 1) a set of special targets arranged in a grid (the reference grid), which can be placed close to a wall or in any other place of the room (Figure 2). The thermal camera is mounted on a pan-tilt unit so that the monitoring process can be carried out automatically. The core of the device is software that processes the thermal images in real time and autonomously controls the pan-tilt unit. A fast automatic learning procedure enables the system to recognize the special targets on the grid even in challenging environments and under very different environmental conditions.
The system detects a set of putative targets inside every thermal image using an efficient blob-detection technique.
Every candidate is then classified as inlier or outlier using a trained Adaptive Boosting (AdaBoost) machine learning algorithm. For every recorded thermal image, we are then able to recover the metric positions where the measurements are taken. We address this task by exploiting the relationship between the IR camera image plane and the planar surface of the reference grid. Our approach is to represent the state of the system (i.e., the position of the intersection point between the optical axis of the camera and the planar surface of the reference grid) in a probabilistic fashion through a posterior density function (PDF). The PDF can be estimated recursively over time from incoming observations (thermal images) and actions (pan-tilt movements) using the recursive Bayes filter, which exploits the Markov assumption of the stochastic process. The action model represents the probability density of the state at time t given the state at time t-1 and the last action (pan-tilt movement) performed. The observation model represents the probability density of the observation at time t given the state at the same time. To solve the Bayes filter we use a particle filter, an approximate solution of the Bayes filter. Particle filters are well suited to cope with multi-modal PDFs, as in our case, due to the strong spatial symmetry of the reference grid. Before starting the scanning process, the system performs a reliable global localization of the position of the thermal camera. During the scanning of the wall surfaces, a set of special positions is automatically and sequentially reached by the moving IR camera: for each position a thermal image is recorded. These special positions represent the centroids of neighbouring targets. Images taken at such locations usually contain four or more targets, enabling the estimation of the homography that provides an accurate mapping between image points and world points. Given the homography, we rectify the image (i.e., remove the distortion due to perspective projection) in order to obtain a more accurate temperature sampling. Figure 7 shows the graphical user interface (GUI) of the developed application.
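The homography between image points and grid points can be estimated from four (or more) target correspondences with the standard Direct Linear Transform (DLT); the coordinates below are made up for illustration, this is not the system's code.

```python
import numpy as np

def homography_dlt(src, dst):
    """Direct Linear Transform: estimate H such that dst ~ H @ src (homogeneous)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def map_points(H, pts):
    """Apply a homography to 2D points (with homogeneous normalization)."""
    ph = np.hstack([np.asarray(pts, dtype=float), np.ones((len(pts), 1))])
    q = ph @ H.T
    return q[:, :2] / q[:, 2:3]

# Four detected targets (image pixels) and their known positions on the grid
img_pts = [(100.0, 120.0), (420.0, 130.0), (430.0, 400.0), (90.0, 390.0)]
grid_pts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
H = homography_dlt(img_pts, grid_pts)
# Any image point can now be mapped to metric grid coordinates
corner = map_points(H, [img_pts[0]])[0]
```

With exact correspondences the minimal four-point DLT reproduces each target's grid position; with more than four targets the same SVD gives a least-squares estimate.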
We successfully tested our system in several challenging environments.
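The recursive Bayes filter with its particle-filter approximation, as described above, can be sketched in one dimension; the action and observation models and all numbers here are illustrative assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, action, observation,
                         motion_noise=0.05, obs_noise=0.1):
    """One predict/update/resample cycle of a particle filter (1D state)."""
    # Predict: sample from the action model p(x_t | x_{t-1}, u_t)
    particles = particles + action + rng.normal(0.0, motion_noise, particles.size)
    # Update: weight each particle by the observation likelihood p(z_t | x_t)
    weights = np.exp(-0.5 * ((observation - particles) / obs_noise) ** 2)
    weights /= weights.sum()
    # Resample: draw particles in proportion to their weights
    return particles[rng.choice(particles.size, size=particles.size, p=weights)]

# Toy run: a state drifting by +0.1 per step, observed directly with noise
particles = rng.uniform(-1.0, 1.0, 500)
true_state = 0.0
for _ in range(30):
    true_state += 0.1
    z = true_state + rng.normal(0.0, 0.1)
    particles = particle_filter_step(particles, action=0.1, observation=z)
estimate = particles.mean()  # posterior mean, should track true_state
```

Because the particles are a non-parametric sample of the posterior, the same loop handles the multi-modal PDFs induced by the grid's spatial symmetry, which a Kalman-style Gaussian filter could not represent.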
@inproceedings{prettoISR2010, author = {Pretto, A. and Menegatti, E. and Bison, P. and Grinzato, E. and Cadelano, G. and Pagello, E.}, title = {An Autonomous Robotized System for a Thermographic Camera}, booktitle = {Proc. of the International Symposium on Robotics (ISR)}, year = {2010} }
Motion blur is a severe problem in images grabbed
by legged robots and, in particular, by small humanoid robots.
Standard feature extraction and tracking approaches typically
fail when applied to sequences of images strongly affected
by motion blur.
We propose a new feature detection and tracking scheme that is robust even to non-uniform motion blur. To reliably extract and track the features, we estimate the point spread function (PSF) of the motion blur with a direct method, individually for image patches obtained via a clustering technique. The estimated PSFs are then used to build an adapted scale-space representation that minimizes the undesired effects of the motion blur. In a nutshell, we smooth the image less along the motion blur direction.
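The idea of smoothing less along the blur direction can be illustrated by constructing an oriented, anisotropic Gaussian kernel; this is a sketch under the assumption of a locally linear PSF, not the authors' implementation.

```python
import numpy as np

def oriented_gaussian(size, sigma_along, sigma_across, theta):
    """Anisotropic Gaussian kernel; sigma_along acts along direction theta."""
    r = (size - 1) // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
    # Rotate coordinates: u runs along the blur direction, v across it
    u = x * np.cos(theta) + y * np.sin(theta)
    v = -x * np.sin(theta) + y * np.cos(theta)
    k = np.exp(-0.5 * ((u / sigma_along) ** 2 + (v / sigma_across) ** 2))
    return k / k.sum()  # normalize so convolution preserves brightness

# Horizontal blur (theta = 0): little extra smoothing along x, normal along y
k = oriented_gaussian(9, sigma_along=0.8, sigma_across=2.0, theta=0.0)
```

Convolving each patch with a kernel oriented by that patch's estimated PSF smooths the image mostly across the blur, where real structure survives, instead of compounding the blur along it.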
Moreover, we only consider highly distinctive features during matching.
Furthermore, we developed a framework for visual odometry based on features extracted from and matched across monocular image sequences. We estimate the ego-motion using the five-point algorithm together with the MLESAC estimator in order to discard outliers.
The experiments demonstrate that our technique is able to reliably extract and match features and, furthermore, to generate a correct visual odometry, even in the presence of strong motion blur effects and without the aid of any inertial measurement sensor.
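The hypothesize-and-verify outlier rejection can be sketched with a plain RANSAC line fit; MLESAC differs in scoring each hypothesis by its likelihood rather than by the inlier count, and in the paper the model is an essential matrix from five-point samples, not a line. Data and thresholds below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def ransac_line(points, iters=200, tol=0.1):
    """Hypothesize-and-verify fit of y = a*x + b, robust to gross outliers.
    (MLESAC would score each hypothesis by likelihood, not inlier count.)"""
    best, best_score = None, -1
    for _ in range(iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue  # degenerate minimal sample
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        score = int((residuals < tol).sum())  # number of inliers
        if score > best_score:
            best, best_score = (a, b), score
    return best

# 80 points on y = 2x + 1 with small noise, plus 20 gross outliers
x = rng.uniform(0.0, 10.0, 80)
inliers = np.column_stack([x, 2.0 * x + 1.0 + rng.normal(0.0, 0.02, 80)])
outliers = rng.uniform(0.0, 20.0, (20, 2))
a, b = ransac_line(np.vstack([inliers, outliers]))
```

The recovered (a, b) stays close to the inlier model because any hypothesis drawn from outliers explains far fewer points within the tolerance.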
@inproceedings{PrettoICRA09, author = {Pretto, A. and Menegatti, E. and Bennewitz, M. and Burgard, W. and Pagello, E.}, title = {A Visual Odometry Framework Robust to Motion Blur}, booktitle = {Proc. of the IEEE International Conference on Robotics and Automation (ICRA)}, year = {2009}, pages = {2250--2257} }
@inproceedings{prettoHumanoids2007, title={Reliable Features Matching for Humanoid Robots}, author={Pretto, A. and Menegatti, E. and Pagello, E.}, booktitle={Proc. of the IEEE-RAS International Conference on Humanoid Robots}, year={2007}, pages={532--538} }
We propose a similarity measure for images which can be used in image-based topological localization and topological SLAM problems by autonomous robots with low computational resources.
Instead of storing the images in the robot memory, we propose a compact signature to be extracted from the images.
The signature is based on the 2D Haar wavelet transform of the gray-level image and weighs only 170 bytes. We call this signature the DWT-signature. We exploit the frequency and space localization properties of the wavelet transform to match the images grabbed by the perspective camera mounted on board the robot against reference panoramic images built using an automatic image-stitching procedure. The proposed signature allows, at the same time, memory savings and fast, efficient similarity calculation.
For the topological SLAM problem we also present a simple implementation of loop-closure detection based on the proposed signature. The effectiveness of the proposed image similarity measure is tested on two small robots: an AIBO ERS-7 robot of the RoboCup Araibo Team of the University of Tokyo and a Kondo KHR-1HV humanoid robot of the IAS-Lab of the University of Padua.
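A signature of this kind can be sketched as follows: repeated one-level 2D Haar decompositions, keeping only the quantized coarse subband, compared with an L1 distance. The sizes and quantization below are illustrative and do not reproduce the paper's 170-byte layout.

```python
import numpy as np

def haar2d_level(img):
    """One level of the 2D Haar transform: (LL, LH, HL, HH) subbands."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    return ((a + b + c + d) / 4.0, (a - b + c - d) / 4.0,
            (a + b - c - d) / 4.0, (a - b - c + d) / 4.0)

def dwt_signature(img, levels=3):
    """Compact signature: the quantized coarse (LL) subband after `levels` steps."""
    ll = np.asarray(img, dtype=float)
    for _ in range(levels):
        ll = haar2d_level(ll)[0]  # keep only the low-frequency approximation
    return np.clip(ll, 0.0, 255.0).astype(np.uint8).ravel()

def signature_distance(s1, s2):
    """L1 distance between two signatures: the image dissimilarity measure."""
    return int(np.abs(s1.astype(int) - s2.astype(int)).sum())

img = np.tile(np.linspace(0.0, 255.0, 64), (64, 1))    # toy 64x64 gray image
sig = dwt_signature(img)                               # 8x8 subband -> 64 bytes
zero = signature_distance(sig, dwt_signature(img))     # identical images -> 0
```

Matching a query image then reduces to comparing a few dozen bytes per stored place instead of whole images, which is what makes the approach viable on low-computation robots.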
@article{prettoRAS2010, title={Image similarity based on Discrete Wavelet Transform for robots with low-computational resources}, author={Pretto, A. and Menegatti, E. and Jitsukawa, Y. and Ueda, R. and Arai, T.}, journal = {Robotics and Autonomous Systems}, volume = {58}, number = {7}, month = {July}, year = {2010}, pages = {879--888} }
The aim of this work is to develop an interactive robotic sculpture able to locate people in the environment, navigate toward them while avoiding obstacles, and approach them as a polite waiter would do. The sculpture was conceived and realized by the artist Albano Guatti. The robotic part was entirely developed by people at the IAS-Lab and at IT+Robotics according to Guatti's concept. The sculpture represents a waiter and a waitress (actually their suits: the statues do not have bodies, they represent empty suits). The waiters chat with each other (by playing different pre-recorded voice files) while wandering in the environment. Once the robot locates a person and gets close to her, it asks the customer one of several pre-recorded questions. The most likely is "Would you like a drink?", which is the title of the sculpture.
The fundamental ability of this robot is the capability to detect people in an unstructured environment and to reach them, avoiding obstacles, while staying within the bounds of its working space. To implement these functionalities, three different sensors have been used, each one devoted to a specific task: (i) an omnidirectional camera used to detect people in the environment; (ii) a ring of sonars used to detect obstacles; (iii) a set of floor sensors that detect light-reflectivity discontinuities on the floor (this can be used to bound the working space of the robot by using a carpet of a different color or by putting stripes of duct tape on the floor to delimit the working area).
The detection algorithm is based on two filters: a motion detection filter based on background subtraction, and a novel color-based skin detection filter. The sonar data are processed using a modified VFH (Vector Field Histogram) algorithm to be robust to the intrinsic noise of these sensors.
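The motion-detection filter can be sketched with a running-average background model and thresholded frame differencing; this is a minimal stand-in with synthetic data, not the deployed code.

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Exponential running-average background model."""
    return (1.0 - alpha) * bg + alpha * frame

def motion_mask(bg, frame, thresh=25.0):
    """Foreground pixels: those differing from the background beyond thresh."""
    return np.abs(frame - bg) > thresh

# Synthetic static scene with a bright blob (a "person") entering the view
bg = np.full((48, 64), 100.0)
frame = bg.copy()
frame[10:30, 20:40] = 200.0          # moving foreground region
mask = motion_mask(bg, frame)
n_fg = int(mask.sum())               # 20 x 20 blob -> 400 foreground pixels
bg = update_background(bg, frame)    # slowly absorb persistent changes
```

Candidate person regions from this mask would then be confirmed by the skin-color filter, so that static bright objects or shadows are not approached.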
@inproceedings{lastraAIIA07, author = {Lastra, A. and Pretto, A. and Tonello, S. and Menegatti, E.}, title = {Robust color-based skin detection for an interactive robot}, booktitle = {Proc. of the 10th Congress of the Italian Association for Artificial Intelligence (AI*IA '07)}, year = {2007}, pages = {507--518} }
We propose to use an omnidirectional camera mounted on a mobile robot to perform a sort of scan matching. The omnidirectional vision system finds the distances of the closest color transitions in the environment, mimicking the way laser rangefinders detect the closest obstacles. The similarity of our sensor with classical rangefinders allows the use of practically unmodified Monte Carlo algorithms, with the additional advantage of being able to easily detect occlusions caused by moving obstacles.
We tested our system in both the RoboCup environment and in an unmodified office environment. In addition, we assessed the robustness of the system to sensor occlusions caused by other moving robots. The localization system runs in real-time on low-cost hardware.
@article{MenegattiTRO06, title={Omnidirectional vision scan matching for robot localization in dynamic environments}, author={Menegatti, E. and Pretto, A. and Scarpa, A. and Pagello, E.}, journal={IEEE Transactions on Robotics}, volume={22}, number={3}, month={June}, year={2006}, pages={523--535} }
@inproceedings{MenegattiIROS04-OMCL, author = {Menegatti, E. and Pretto, A. and Pagello, E.}, title = {Testing Omnidirectional Vision-Based Monte-Carlo Localization under Occlusion}, booktitle = {Proc. of the {IEEE/RSJ} International Conference on Intelligent Robots and Systems ({IROS})}, year = {2004}, pages = {2487--2494} }
@inproceedings{MenegattiRobocup04, author = {Menegatti, E. and Pretto, A. and Pagello, E.}, title = {A New Omnidirectional Vision Sensor for Monte-Carlo Localization}, booktitle = {Proc. of the International RoboCup Symposium}, year = {2004} }