Movies of autonomous and semiautonomous robots
Two-mm precision is achieved with uncalibrated cameras and a rather crude
on-board arm. The key is mobile
camera-space manipulation. Maneuver
objectives are formulated and realized in the reference frames of the two
on-board cameras – much the same as a human forklift operator controls closure
of fork with pallet gap in his own visual frame of reference. But no human could maneuver the degrees of
freedom of the mechanism based on simultaneous visual information from the two
entirely different vantage points of two widely separated, onboard
cameras. And no human could terminate
with the 2mm precision needed to insert two small forks into the gaps of the
“pallets” of this demonstration. The
artificial system, however, makes use of visual cues, circular marks precisely
located on each pallet – marks that are recognized
automatically and positively in the reference frames of the on-board
cameras. The cameras can be seen on
small mounts on opposite sides of the flat mobile platform.
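The idea of formulating a maneuver objective in camera space can be made concrete with a minimal Python sketch. It is not the estimation scheme used in the demonstration. It assumes, purely for illustration, that each camera is described by a simple affine model (image point = A·p + b) whose coefficients are already known, and it solves for the single physical point whose images best match the target location in both cameras at once. The function names and all numbers are hypothetical.

```python
def project(model, p):
    """Affine camera model (an assumption): image point = A*p + b."""
    A, b = model
    return tuple(sum(A[r][k] * p[k] for k in range(3)) + b[r] for r in range(2))


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve_target_point(models, targets):
    """Least-squares 3D point whose images match the targets in every camera."""
    rows, rhs = [], []
    for (A, b), t in zip(models, targets):
        for r in range(2):
            rows.append(A[r])          # one equation per image coordinate
            rhs.append(t[r] - b[r])
    # Normal equations N p = g for the over-determined system (4 equations, 3 unknowns)
    N = [[sum(row[i] * row[j] for row in rows) for j in range(3)] for i in range(3)]
    g = [sum(row[i] * v for row, v in zip(rows, rhs)) for i in range(3)]
    d = det3(N)
    if abs(d) < 1e-12:
        raise ValueError("camera views are too similar to fix the point in 3D")
    solution = []
    for c in range(3):                 # Cramer's rule, one column at a time
        M = [[g[i] if j == c else N[i][j] for j in range(3)] for i in range(3)]
        solution.append(det3(M) / d)
    return tuple(solution)


# Two hypothetical cameras with widely separated vantage points
cam1 = ([[100.0, 0.0, 0.0], [0.0, 100.0, 0.0]], (320.0, 240.0))  # looks along z
cam2 = ([[0.0, 0.0, 100.0], [0.0, 100.0, 0.0]], (320.0, 240.0))  # looks along x

# Where the pallet cue appears in each camera (generated here from a known point)
cue_point = (0.5, -0.2, 1.0)
targets = [project(cam1, cue_point), project(cam2, cue_point)]

fork_goal = solve_target_point([cam1, cam2], targets)
```

Because the objective is stated as image coordinates in both cameras, neither camera alone fixes the point in three dimensions; it is the two separated vantage points together that do.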
“Teach-Repeat” is the norm in industry for controlling a typical robotic arm. But this mode carries with it a steep price:
for the taught motion to work, each new copy of the “workpiece” must be returned to the prototype position
used by the human teacher, an active and often costly procedure. Extending
teach-repeat to wheeled navigation is even more useful: it replaces the
active burden of precise workpiece prepositioning with the passive need
merely to keep walls, sinks, etc., relative to which navigation must occur, in
the same place as they were when the paths of interest were initially
taught. There is some additional
engineering required, however, in order to realize precision/reliability as
required for floor maintenance, or, as with this demo, autonomous wheelchair navigation. Estimation,
based upon a combination of odometry and visual detection of wall cues, is used in this
instance. It
is precise and versatile. And it
leaves open the possibility of applying ultrasound sensors to steer around
objects that were not present when the maneuver was taught. In fact, retaining a compressed form of the
record of the rather imprecise ultrasound echo during the teaching event allows
for use of a template comparison during repeat. This in turn permits very close
approach to objects, such as toilets, that must be neared as part of the maneuver
requirement, while at the same time recognizing newly introduced objects
in more open space that must be avoided.
This distinction will be useful for floor maintenance as well as the
automatically guided wheelchair.
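The template comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the taught record is taken to be a list of ultrasound ranges sampled along the path, the repeat pass samples ranges at the same path stations, and the tolerance value is hypothetical.

```python
def classify_echo(taught_range_m, repeat_range_m, tolerance_m=0.15):
    """Compare one repeat-pass echo with the taught template at the same station.

    A range close to (or longer than) the taught one is treated as expected,
    however short it is; a range clearly shorter than the template suggests
    an object that was not present during teaching.
    """
    if repeat_range_m < taught_range_m - tolerance_m:
        return "new_obstacle"
    return "expected"


def scan_path(taught_ranges_m, repeat_ranges_m, tolerance_m=0.15):
    """Return the indices of path stations where a new obstacle is indicated."""
    return [i for i, (t, r) in enumerate(zip(taught_ranges_m, repeat_ranges_m))
            if classify_echo(t, r, tolerance_m) == "new_obstacle"]


# Hypothetical record: open hallway, then a close approach to a fixture
taught = [3.0, 3.0, 2.8, 0.25, 0.20]
repeat = [3.1, 1.2, 2.7, 0.24, 0.22]   # something new appears at station 1
blocked_stations = scan_path(taught, repeat)
```

The point of comparing against the template, rather than against a fixed minimum distance, is that a 0.2 m echo near a taught fixture is accepted while a 1.2 m echo in a formerly open hallway is flagged.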
Moreover, in the event that all paths are blocked, the machine can
identically reverse course (moving backward for example where the taught path
took the rider forward) so the likelihood of being stranded is very low. The paradigm allows, as shown in the video,
for complex paths that may include multiple direction changes as well as pure
and near pivoting. It can and should be made available
to individuals with severe disabilities, including sight impairment, who
otherwise could not navigate through the halls of educational institutions,
etc. This video shows the level of user
participation needed, both
in terms of user designation of the desired terminus and in terms of new-path
establishment or “teaching”, together with the high “repeat” precision. With simple
adjustments of the permanent placement of on-board cameras’ pointing direction,
and associated EKF observation equations, cues could be placed upon ceilings or
floors as well as walls and furnishings,
or upon walls and ceilings. Note that it is best, when high precision of
tracking is required, such as shown in the “This video … “ link, to rely upon
wall cues. Where ceiling cues are relied upon, even small listing of the chair
between teach and repeat episodes will cause significant tracking error. On the other hand, the high precision of
tracking inherent with wall cues in
close quarters is extremely robust to real-world effects such as the tilting of
chairs with pneumatic tires.
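The combination of odometry with visual detection of wall cues can be illustrated with a small extended Kalman filter written in plain Python. This is a sketch under assumptions of our own choosing, not the filter used on the chair: the state is planar pose (x, y, heading), the chair is modeled as a differential drive, and each camera detection is reduced to a single bearing toward a cue at a known position. All noise values are hypothetical.

```python
import math


def wrap(angle):
    """Wrap an angle to (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def predict(state, P, d_left, d_right, wheel_base, q):
    """Odometry step: propagate pose and covariance from wheel travel."""
    x, y, th = state
    d = 0.5 * (d_left + d_right)             # travel of the chair centre
    dth = (d_right - d_left) / wheel_base    # change in heading
    thm = th + 0.5 * dth                     # mid-step heading
    new_state = (x + d * math.cos(thm), y + d * math.sin(thm), th + dth)
    # Jacobian of the motion model with respect to the state
    F = [[1.0, 0.0, -d * math.sin(thm)],
         [0.0, 1.0, d * math.cos(thm)],
         [0.0, 0.0, 1.0]]
    FP = [[sum(F[i][k] * P[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    # P <- F P F^T + q I  (q is an assumed process-noise level)
    P_new = [[sum(FP[i][k] * F[j][k] for k in range(3)) + (q if i == j else 0.0)
              for j in range(3)] for i in range(3)]
    return new_state, P_new


def update_bearing(state, P, cue_xy, measured_bearing, r):
    """Observation step: correct the pose using the bearing to a known wall cue."""
    x, y, th = state
    dx, dy = cue_xy[0] - x, cue_xy[1] - y
    rng2 = dx * dx + dy * dy
    predicted = wrap(math.atan2(dy, dx) - th)
    # Observation Jacobian for bearing = atan2(dy, dx) - heading
    H = [dy / rng2, -dx / rng2, -1.0]
    PHt = [sum(P[i][k] * H[k] for k in range(3)) for i in range(3)]
    S = sum(H[i] * PHt[i] for i in range(3)) + r   # scalar innovation variance
    K = [v / S for v in PHt]                       # Kalman gain (3x1)
    innovation = wrap(measured_bearing - predicted)
    new_state = tuple(s + k * innovation for s, k in zip(state, K))
    HP = [sum(H[k] * P[k][j] for k in range(3)) for j in range(3)]
    P_new = [[P[i][j] - K[i] * HP[j] for j in range(3)] for i in range(3)]
    return new_state, P_new


# Hypothetical example: the estimate has a 0.1 rad heading error; the true pose is the origin
P0 = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.05]]
cue = (2.0, 1.0)
true_bearing = math.atan2(1.0, 2.0)
corrected, P1 = update_bearing((0.0, 0.0, 0.1), P0, cue, true_bearing, r=1e-4)
```

Moving cues from walls to ceilings or floors would, in a filter of this kind, change only the observation function and its Jacobian `H`; the odometry step is unaffected.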
Three useful elements enable
this project. 1. Highly accurate force
control can be achieved without force sensing through the use of
estimation-based positioning, CSM, with its robust precision relative to – in
this case – a laser-spot-characterized surface. Combined with approximate
knowledge of the contacting combination’s compliance normal to the surface,
position control relative to the surface ensures robust force control even in
the presence of erratic frictional forces.
Such real-world frictional forces typically make the alternative –
direct feedback of sensed contact force – difficult. The force control is achieved by specifying
and commanding via camera-space manipulation the interference of the contacting
surfaces. (An “interference” of zero
would be consistent with the sanding surface and the treated surface just
touching, with zero force.) A very small
amount of interference – 4mm in this case – achieves the needed force range in
the direction normal to the sanded surface. The high precision afforded by CSM
is important in this context because the contacting objects need to be stiff in
order to control position of the tool in the plane that at any instant is
tangent to the sanded surface. 2. User
designation of the boundaries of surface regions that are to be treated can be
specified intuitively and transferred robustly to the system’s participating,
uncalibrated cameras through the autonomous pan/tilt direction shown in the
video. “Laser spots” falling on the
surface are detected robustly using image differencing. The “matching” of laser spots so detected by the
participant, uncalibrated cameras refers to identifying each spot across the many
consecutively acquired images (taken prior to the robot’s appearance in the scene)
in which laser spots are cast onto the target surface, as many as may be
needed. A large number of such
image acquisitions, with the laser spots directed toward new surface
locations each time, results in a highly precise correspondence, or mapping, of
spot-center locations among the participant two-dimensional camera spaces.
3. A preferred method of image analysis is this
spot matching as applied to surface characterization in the context of
CSM. Not only does this allow eventual
robust and precise control of the robot in order to address physically and precisely
the surface of interest without calibration of robot or cameras; it also
carries with it an inherent advantage relative to edge detection, in part
because an edge may more usefully be construed as the intersection of two
surfaces.
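The first element above, force control through commanded interference, comes down to simple arithmetic once the compliance normal to the surface is approximately known. The Python sketch below assumes a linear spring model, which the text does not state, and uses a hypothetical stiffness value chosen only to make the numbers easy to read.

```python
def normal_force(stiffness_n_per_m, interference_m):
    """Normal contact force for a commanded interference (linear spring assumption).

    Zero or negative interference means the surfaces are just touching or apart,
    so no force is transmitted.
    """
    return stiffness_n_per_m * max(interference_m, 0.0)


def force_error(stiffness_n_per_m, position_error_m):
    """Force deviation caused by a positioning error normal to the surface."""
    return stiffness_n_per_m * abs(position_error_m)


# Hypothetical compliance of the contacting combination: 2500 N/m
k = 2500.0
commanded = normal_force(k, 0.004)     # the 4 mm interference quoted in the text
deviation = force_error(k, 0.001)      # effect of a 1 mm positioning error
```

Under this model the force error scales directly with positioning error normal to the surface, which is why precise positioning relative to the laser-spot-characterized surface can stand in for a force sensor.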
Jesse Batsche’s Undergraduate Research Project
represents a very nice extension of the above. Our longer-term goals entail extending this
to exact, large-scale surface replication using a mix of abrasion and cutting. Unlike existing CNC machines which rely upon
precise fixturing and kinematic calibration of the cutting tool, surface
replication in this instance merely entails recovering the profile of laser
spots in a group of stationary but uncalibrated cameras between a prototype
exposure and a carved/abraded-to-shape replica.
Semiautonomous robotic digging supervised
and directed by a remote user (Movie), by Sam Chen, Alec Hirshauer, Erin
Mulholland, Lydia Szeligowski, Biao Zhang, Shenwei Zhu. In the spring of 2006 several students
contributed to this NASA-sponsored project.
The movie illustrates one important aspect of their work – the ability
to specify geometric characteristics to a digging robot and carry out those
instructions without a human being on-site and without system calibration. Although this particular robot is
fixed-based, it is important to note that nothing in this illustration requires
a fixed-base robot. In other words, none
of the elements or components – cameras, robot, laser pointers, pan/tilt units
– require or assume prior calibration.
Hence, all of the accuracy transfers directly to any remote site to
which the system may have been transported.
The human supervisor, moreover, can specify input or instructions
without being physically on-site, which is the de facto requirement of
human-in-the-loop control (think of a backhoe operator).
This is a user-interactive demonstration, based on a series of actual CSM
experiments. The objective here is to
convey the powerful prospect of human supervision – in this case of an inventory-sorting
exercise – as achieved using “point and click” combined with automatic
laser-spot convergence. Begin the
exercise by selecting large or small videos, and then click on either of the two
boxes that are fully uncovered (i.e. not partially occluded) in the largest
image. Then let the demonstration go on
from there to see what happens, before you are prompted to click again. This same general strategy can be adapted to
serve all kinds of remote human supervisory control. Because of CSM it is precise and robust;
there is no need to worry about terminal precision in 3D or even the
possibility of cameras shifting at the remote site: If incoming data become
mutually incompatible with respect to the next robot-joint-level commands
needed to culminate the maneuver as instructed, the system “knows it” and will
either prompt for new supervisory instructions or back away and begin the
robot-maneuver approach afresh. Human
selection of one or several surface-point junctures on an image can be used to
define, in a way that both human supervisor and machine “understand,” a very
wide range of real-world tasks.
Many surface treatments could be achieved on large, arbitrarily located
bodies using artificial mechanical dexterity.
The problems are: 1. specifying to the system the domain of surface
space across which the maneuver must occur; and 2. controlling the internal
degrees of freedom of the robot so as to achieve, precisely, the needed
end-member/tool motion and to complete the process exhaustively across the selected surface
region. The
former is here shown using human “point-and-click” supervision, and the latter
with CSM – combined with multiple laser-spot samples reflected off the surfaces
of interest in all participant, uncalibrated cameras.
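Laser-spot samples of the kind relied upon here are detected by image differencing. A minimal Python sketch of that step follows. It assumes two grayscale frames from the same stationary camera, one taken with the laser off and one with it on, stored as nested lists of pixel intensities; the threshold is hypothetical, and the intensity-weighted centroid is one common way to locate a spot center, not necessarily the one used in these experiments.

```python
def detect_spot(frame_off, frame_on, threshold=50):
    """Locate a laser spot as the intensity-weighted centroid of the difference image.

    Returns (x, y) in pixel coordinates, or None if no pixel brightened by more
    than the threshold.
    """
    total = sum_x = sum_y = 0.0
    for y, (row_off, row_on) in enumerate(zip(frame_off, frame_on)):
        for x, (before, after) in enumerate(zip(row_off, row_on)):
            diff = after - before
            if diff > threshold:       # keep only pixels the laser brightened
                total += diff
                sum_x += diff * x
                sum_y += diff * y
    if total == 0.0:
        return None
    return (sum_x / total, sum_y / total)


# Hypothetical 5x5 frames: uniform background, spot straddling two pixels
background = [[10] * 5 for _ in range(5)]
with_laser = [row[:] for row in background]
with_laser[2][3] = 210
with_laser[2][2] = 110

spot = detect_spot(background, with_laser)
```

Running the same detection in each participating camera for each new laser direction yields the matched spot centers from which the correspondence among camera spaces is built.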
Two millimeters, or one pixel in the scale of the controlling, uncalibrated-camera
images: That is the margin for error in
engaging each bag in order to follow with a taught, repeat action to place that
bag onto the nozzle without it ripping.
This is achieved using precisely printed-on “visual cues” together with CSM for
controlling the degrees of freedom of the robot. All six axes are controlled during each
bag-engagement event, fully controlled in 3D using two ceiling-mounted,
uncalibrated cameras.
http://www.nd.edu/NDInfo/Research/sskaar/Home.html
Several
videos as well as supplementary technical explanation can be found at the above
address.
Three of these videos show early experiments with a particular problem – one
whose resistance to calibration-based solution was, some say, the final
straw for 1980s experiments with the workerless factory. Each of the three illustrates a different
feature of CSM. (The visual cues on wheel and brakeplate
were, importantly, NOT available to engineers trying to address this problem in
the 1980s. However, we believe that the
indicated technology could be implemented without these cues using structured
light, which was used then.)
1. The "wheel load" task (shown at four times normal speed in the
movie) requires high translational precision (within about one mm) and
rotational precision (roughly one degree about any axis) in order to be completed
successfully.
(Quicktime Video 2.0 Mb or MPEG Video 1.7 Mb)
This performance is achieved regularly and reliably, despite
large and arbitrary movement of the "workpiece" (in this case, the
brake plate). Precision in this case
comes despite a required panning and tilting of controlling cameras. Information from before the pan/tilt motion must be
retained, however, for the maneuver to succeed (without reestablishing the preplanned
motion). This is accomplished by retaining certain mathematical properties of
the “view parameters” prior to the pan/tilt movement, and incorporating into
them certain knowledge of the camera pan/tilt angular movement.
2. Sometimes the nominal geometry of
the grasp differs from the reality significantly. The method must be adaptable to grasp
variability of this sort.
(Quicktime Video 2.0 Mb or MPEG Video 1.7 Mb)
In the wheel-load movie above, a very large and arbitrary one-dimensional
rotational "shift" in the grasp of the wheel is applied, requiring
explicit modeling and estimation of this degree of freedom (rather than relying
on "absorption" of grasp-description errors, as is accomplished
automatically in the estimation process for much smaller shifts in grasp, e.g.
the pallet-stacking project discussed above).
3. In the movie below, another remarkable
capability of the method is illustrated:
(Quicktime Video 2.0 Mb or MPEG Video 1.7 Mb)
Control
success is achieved with no loss of precision or reliability despite the fact
that camera 2 "sees" the event through a common bedroom mirror. Only
a "mirror-image" transformation of camera-space requirements is
necessary to realize this capability. This kind of performance is possible
because, with camera-space manipulation, the maneuver objectives are specified
and realized in the reference frames (minimum of two) of the camera-sensors.
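One way to picture the "mirror-image" transformation is as a horizontal flip of the image coordinates from the camera that views the scene through the mirror. The Python sketch below assumes exactly that, with a hypothetical image width; the text does not spell out the form the transformation took in the experiment.

```python
def mirror_camera_point(point_px, image_width_px):
    """Flip a camera-space point horizontally, undoing the reversal a flat mirror introduces."""
    u, v = point_px
    return (image_width_px - 1 - u, v)


# Hypothetical cue location seen by camera 2 in a 640-pixel-wide image
seen_in_mirror = (100, 50)
as_if_direct = mirror_camera_point(seen_in_mirror, 640)
```

Because the maneuver objective is stated in each camera's own image coordinates, a change of this kind affects only camera 2's camera-space requirements; nothing about the robot or camera 1 needs to be revised.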
Contact
information: Please write to: Dept. of Aerospace and Mechanical Engr; Univ.
of Notre Dame; Notre Dame IN 46556. Or
call (574) 631-6676. Or email skaar.1@nd.edu.
This page is maintained by S. B. Skaar. Send me
your inquiries.