260 Pages
English

Visual tracking and grasping of a dynamic object [electronic resource]: from the human example to an autonomous robotic system / Michael Sorg


Lehrstuhl für Realzeit-Computersysteme
Visual Tracking and Grasping of a Dynamic Object:
From the Human Example to an Autonomous Robotic
System
Michael Sorg
Complete reprint of the dissertation approved by the Fakultät für Elektrotechnik und Informationstechnik of the Technische Universität München for the award of the academic degree of a Doktor-Ingenieur (Dr.-Ing.).
Chair: Univ.-Prof. Dr.-Ing. Klaus Diepold
Examiners of the dissertation: 1. Univ.-Prof. Dr.-Ing. Georg Färber
2. Hon.-Prof. Dr.-Ing. Gerd Hirzinger
The dissertation was submitted to the Technische Universität München on 19.02.2003 and was accepted by the Fakultät für Elektrotechnik und Informationstechnik on 16.07.2003.
München, 11.11.2002
First of all I want to thank my advisor Prof. Georg Färber for having
given me the opportunity to work on a truly thrilling topic. His manner of
not pushing me in a certain direction, but leaving enough room to develop
my own ideas, to try things whose outcome was unsure, and to decide many
things “on my own”, was very valuable for me. I learned a lot that I will
be able to use in the future. Many thanks to Prof. Gerd Hirzinger for his
spontaneous promise to act as a reviewer of this thesis.
But this would not have been possible without Alexa Hauck. Having already
been my advisor for my master’s thesis, she sparked my fascination with
“hand-eye coordination” and finally developed valuable ideas and a plan
for this thesis. Besides these “hard facts”, she was the best advisor and
colleague I can think of. It was always fun and very motivating to work
together. I often wonder where I would be now if I hadn’t knocked on her
door ...
Special thanks go to Thomas Schenk and Andreas Häussler from the
“neuro” team. Besides the fact that they brought “light into the dark”
when we were discussing neuroscientific literature, their interest in
robotics and in my work was a special motivation and gave me the feeling
that I (and my students) were doing something valuable.
This work would never have been possible without all the students working
with me during their diploma theses. They developed really great ideas
and provided all the necessary pieces to let MinERVA catch. Many thanks
to Christian Maier, Hans Oswald, Georg Selzle, Jan Leupold, Thomas Maier,
Jean-Charles Beauverger and Sonja Glas. That many of them ended up as my
colleagues underlines what a “good job” they did.
At the lab I want to thank all the colleagues from the Robot Vision Group
for providing a really good atmosphere. Special thanks go to Georg Passig,
who not only supported me with any problem concerning the robot but always
had time to discuss any other problem concerning work and “the world”.
Thanks to all the people of the Schafkopfrunde. This was (and is!) great
fun.
Again, special thanks go to Johanna Rüttinger. Without her debugging
thousands of lines of other people’s code, integrating new code, and
finally providing huge amounts of experimental data, the experimental
part of this thesis would have been poor. Or, to be more precise: without
her help I think MinERVA would never have caught anything!
Last but not least I want to thank my parents and my brother. Not knowing
exactly what I was doing, but always trusting that I would do it right,
was very pleasant.
Michael Sorg

Abstract
This thesis describes a robotic hand-eye system capable of visually tracking a moving
object and reaching out to grasp it with a robotic manipulator. A considerable number
of successful methods for these tasks has been published, also recently, and impressive
demonstrations have been given. Nevertheless, there is still one system that is superior
to all of them: the human. Humans perform catching tasks with a high degree of accuracy,
robustness and flexibility. This thesis therefore investigates results from neuroscience
and applies them to design a robotic hand-eye system for grasping
a moving object. From experimental data on human catching movements it can be derived
that humans perform several subtasks during catching: tracking the target object,
predicting the future target trajectory, determining an interaction point in space and
time, and executing an interceptive arm movement. These subtasks run in parallel, and
the coordination between “hand and eye” is reactive: humans can easily adapt and correct
their interceptive movement, triggered either by (sudden) changes in the target’s
trajectory or by refinement of the predicted object trajectory and the hand-target
interaction point.
Transferring knowledge gained by neuroscientists to robotics is often difficult, since
the underlying physical systems are very different. Nevertheless, there exist interesting
models and experimental data that offer the possibility of such a transfer. In this
thesis, biological concepts are deployed for two of the subtasks named above: visual
tracking, and the execution and timing of the interceptive catching movement.
For the tracking subtask, the visual cues used are closely related to those found in the
human brain: form, color and motion (optic flow). Through an analysis of human visual
processing from the eye up to the visual cortex, three main concepts could be identified:
parallel information flow, pre-attentive processing and reentry of information. These
mechanisms allow humans to make optimal use of the presented visual information before
attention is directed at a certain stimulus. This can be seen as a form of image
pre-processing. Integrating these concepts into a robotic hand-eye system noticeably
improves image pre-processing, both for still images and in tracking tasks.
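These three mechanisms can be illustrated with a minimal sketch (the maps, weights and shapes below are invented for illustration and are not the thesis’s implementation): feature maps for color, form and motion are computed in parallel and fused pre-attentively into a single saliency map before attention selects a stimulus.

```python
import numpy as np

def normalize(m):
    """Scale a feature map to [0, 1]; flat maps stay zero."""
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

def preattentive_fusion(color_map, form_map, motion_map, weights=(1.0, 1.0, 1.0)):
    """Fuse independently computed (parallel) feature maps into one
    saliency map, before any attention is applied to a stimulus."""
    maps = [normalize(m) for m in (color_map, form_map, motion_map)]
    fused = sum(w * m for w, m in zip(weights, maps))
    return normalize(fused)

# Toy example: the object region scores high in all three cues.
h, w = 8, 8
color = np.zeros((h, w)); color[2:4, 2:4] = 1.0    # color-similarity blob
form = np.zeros((h, w)); form[2:4, 2:4] = 0.8      # contour response
motion = np.zeros((h, w)); motion[2:4, 2:4] = 0.5  # optic-flow magnitude
saliency = preattentive_fusion(color, form, motion)
peak = np.unravel_index(np.argmax(saliency), saliency.shape)  # attention target
```

Because all three cues agree on the object region, the fused map concentrates saliency there; attention then only needs to inspect the peak, which is the pre-processing benefit the text describes.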
For the determination of hand-target interaction points and the timing of the arm
movement relative to the target motion, a human-like behavior is adopted. Based on
experimental data, a four-phase model is developed for the determination of interaction
points and the generation of appropriate via-points for a robotic manipulator performing
reach-to-catch motions. This model is designed for flexibility: depending on the current
object motion (and its prediction), the via-points are adapted and the interceptive
movement is corrected during motion execution.
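As a rough sketch of the underlying idea (assuming a constant-velocity prediction and a fixed hand speed; the function and its parameters are illustrative, not the model developed in the thesis), an interception point can be chosen as the earliest predicted target position the hand can reach no later than the target, and recomputed every control cycle as the prediction is refined.

```python
import numpy as np

def interception_point(target_pos, target_vel, hand_pos, hand_speed,
                       dt=0.05, horizon=3.0):
    """Scan a constant-velocity prediction of the target and return the
    earliest predicted point (and its time) that the hand, moving at
    hand_speed, can reach no later than the target arrives there."""
    for t in np.arange(dt, horizon, dt):
        predicted = target_pos + target_vel * t            # predicted target position
        time_to_reach = np.linalg.norm(predicted - hand_pos) / hand_speed
        if time_to_reach <= t:
            return predicted, t
    return None, None  # no reachable interception within the horizon

# Target moving along x at 0.3 m/s, 0.5 m in front of a hand moving at 1 m/s.
target, vel = np.array([0.0, 0.5]), np.array([0.3, 0.0])
hand = np.array([0.0, 0.0])
point, t_hit = interception_point(target, vel, hand, hand_speed=1.0)
```

Re-running this every cycle with the latest prediction mirrors the reactive adaptation described above: when the target trajectory changes, the interception point (and hence the via-points derived from it) shifts accordingly.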
The validity of these concepts is investigated thoroughly in simulations. Together with
modules for target object prediction (using autoregressive models), for the determination
of grasping points, and for robot arm motion control, a robotic hand-eye system is
demonstrated that proves its practicability in real experiments performed on the
experimental hand-eye system MinERVA.

Contents
1 Introduction
  1.1 Motivation
  1.2 Context
  1.3 Contributions and Limitations
  1.4 Organization of the Dissertation
2 Neuroscience
  2.1 Vision
    2.1.1 Anatomy of the Human Visual System
    2.1.2 Models of Human Visual Processing
      2.1.2.1 Parallel Information Flow and Reentry of Information
      2.1.2.2 Feature Maps and Integration of Information: Visual Attention
    2.1.3 Summary
  2.2 Hand-Target Interaction
    2.2.1 Interaction with a Static Target: Reaching
    2.2.2 Models of Human Reaching Movements
    2.2.3 Interaction with a Moving Target: Catching
    2.2.4 Models for Human Catching Movements
      2.2.4.1 Movement Initiation
      2.2.4.2 On-line Control of Hand Movement
    2.2.5 Summary
  2.3 Discussion
3 Robotic Hand-Eye Coordination
  3.1 Internal Models
    3.1.1 Models of the Hand-Eye System
    3.1.2 Models of the Object to be Grasped
    3.1.3 Models of Object Motion
  3.2 Vision
    3.2.1 Tracking
      3.2.1.1 Contour-based Tracking
      3.2.1.2 Color-based Tracking
      3.2.1.3 Motion-based Tracking
    3.2.2 Sensor Fusion and Integration
    3.2.3 Grasp Determination
  3.3 Motion Reconstruction and Prediction
    3.3.1 Prediction with Auto-regressive Models
      3.3.1.1 Global AR Model (least squares)
      3.3.1.2 Local AR Model (maximum likelihood)
    3.3.2 Nearest Neighbor Predictions
  3.4 Hand-Target Interaction
    3.4.1 Interaction with a Static Target
      3.4.1.1 Positioning
      3.4.1.2 Reaching and Grasping
    3.4.2 Interaction with a Moving Target
      3.4.2.1 Tracking
      3.4.2.2 Catching and Hitting
  3.5 Summary
  3.6 Discussion
4 Hand-Eye System and Interaction with a Moving Target
  4.1 Internal Models
      4.1.0.3 Automatic Initialization of B-spline Contour Models
    4.1.1 Discussion
  4.2 Tracking of Moving Objects
    4.2.1 Contour-Based Tracking
    4.2.2 Color-Based Tracking
    4.2.3 Motion-Based Tracking
    4.2.4 Discussion
  4.3 Sensor Fusion and Integration
    4.3.1 Sensor Preprocessing and Fusion: Pre-attentive Processing
    4.3.2 Probability Based Sensor Integration: Attentive Processing
      4.3.2.1 Modified ICONDENSATION Algorithm
    4.3.3 Discussion
  4.4 Determination of Grasping Points
    4.4.1 Search and Tracking of Grasps
    4.4.2 Discussion
  4.5 Object Motion Reconstruction and Prediction
    4.5.1 Average ARM Prediction
    4.5.2 Discussion
  4.6 Robot Arm Motion Control
    4.6.1 Human Trajectory Generation
      4.6.1.1 Static, Double-Step and Dynamic Targets
    4.6.2 Robotic Trajectory Generation
      4.6.2.1 Determination and Control of Hand's Position
      4.6.2.2 Determination and Control of Hand's Orientation
      4.6.2.3 Collision Detection and Workspace
    4.6.3 Discussion
  4.7 Interaction Point Determination and Intermediate Target Calculation
    4.7.1 Open Questions and Hypotheses
    4.7.2 Four Phase Model of Hand Motion towards a Moving Target
      4.7.2.1 Approach Phase
      4.7.2.2 Adaption Phase
      4.7.2.3 Contact Phase
      4.7.2.4 Follow Phase
    4.7.3 Discussion
  4.8 Implementation
    4.8.1 System Preliminaries
    4.8.2 State Automaton and Timing Charts
5 Simulations, Experimental Validation and Results
  5.1 Tracking with Color, Form and Motion
    5.1.1 Color Tracking
    5.1.2 Form Tracking (CONDENSATION Algorithm)
    5.1.3 Motion Tracking
    5.1.4 Modified ICONDENSATION
    5.1.5 Reentry of Color in Form Path
  5.2 Prediction of Target Motion
    5.2.1 Simulation: Comparison NN, Global ARM, Local ARM
    5.2.2 Real Tracking: Average ARM
  5.3 Simulation of Hand-Target Interaction
    5.3.1 Control of Position
  5.4 Real Robot Experiments
    5.4.1 Experimental Setup