The NASA photograph illustrates one obviously difficult, fairly exotic, and currently unmet need to employ robots.
Autonomous, robotic construction of a truss in a remote, possibly hostile environment, however, is a challenge to the current capability of robotics in much the same way as hundreds of more everyday tasks also challenge the current state of the art.
To understand the nature of this challenge, consider one of the surprisingly few situations where robots are now used routinely in industry: spot-welding of the frame of an automobile.
A great deal of the cost and difficulty of implementing this task satisfactorily lies not with the robot itself but rather with the dedicated jigs and fixtures needed to constrain each consecutive workpiece. In particular, provided the grasp of the welding gun does not shift, the robot can return the tip of its tool to the same location in three-dimensional physical space from copy to consecutive copy. The critical condition for maneuver success is therefore the ability to locate the juncture of interest of each workpiece to within whatever tolerance the weldment at hand requires.
This ability depends on solid, rigid, specialized, dedicated, and typically costly fixtures. Even with such fixtures in place, some variability can be expected in the frame-relative position of each weldment.
Contrast this with the wheel mounting of the nearby video, a maneuver which uses camera-space manipulation.
(Quicktime Video 2.0 Mb or MPEG Video 1.7 Mb)
The "wheel load" task (shown at four times normal speed in the movie) requires high translational precision (within about one mm) and rotational precision (roughly one degree about any axis) in order to be completed successfully. (click here for more detail about the videotaped wheel-load experiment) This performance is achieved regularly and reliably, despite large and arbitrary movement of the "workpiece" (in this case, the brake plate).
Not surprisingly, where assembly of this kind occurs in practice, a human being is present and central to the procedure. As discussed in the next section, in fact, the philosophical basis of camera-space manipulation shares certain aspects with human manipulation. This second section, titled philosophy, also discusses the important distinctions between camera-space manipulation and the two alternative approaches, "calibration" and "visual servoing", which are directed toward similar goals.
The next section, titled camera-space objectives, provides a discussion of the procedure used to convert a desired position/orientation sequence of the two bodies of interest to a target "camera-space" sequence of objectives.
After "camera-space objectives", the next section discusses the artificial cues which are used in many (but not all, see below) of the camera-space manipulation experiments conducted to date. The image-analysis speed and versatility which are available with these particular insignias stem from certain invariant properties of their projections.
Following the discussion of artificial cues, we present a discussion titled camera-space kinematics. This section is centered on the question of how best to exploit the camera-specific stream of data, acquired during the approach of the manipulable body towards its objective, to refine the local understanding of the "kinematic" relationship between joint-rotation position and the appearance of any given point on the manipulated body in the image plane (or in "camera space").
The next segment, titled resolving theta, discusses the way in which results from the previous three segments can be applied to command internal joint-coordinate positions (or position sequences) which will result in the desired relative motion of the bodies of interest.
Numerical problems are avoided in the estimation procedures discussed above because the pin-hole camera model is replaced by its asymptotic limit for infinite camera distances. A way of exploiting approximate knowledge of finite camera-to-workpiece distances while still preserving the numerical stability of the orthographic model is discussed in the section titled flattening.
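As a rough illustration of the limit mentioned above, the following sketch compares a pinhole projection with an orthographic one as the camera is moved farther away, with the focal length scaled so that the magnification stays fixed. The function names, the sample points, and the choice of holding magnification constant are assumptions made for this illustration; they are not taken from the flattening section.

```python
def pinhole(point, focal_length):
    """Perspective projection of a camera-frame point (X, Y, Z), with Z > 0."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def orthographic(point, scale):
    """Orthographic projection: depth is ignored, only a scale is applied."""
    x, y, _ = point
    return (scale * x, scale * y)


def max_discrepancy(points, depth, scale=1.0):
    """Largest image-plane gap between the two models at a nominal depth.

    Each point is (X, Y, dz), where dz is the offset from the nominal depth.
    The focal length grows with depth so the magnification stays at `scale`.
    """
    focal_length = scale * depth
    worst = 0.0
    for x, y, dz in points:
        px, py = pinhole((x, y, depth + dz), focal_length)
        ox, oy = orthographic((x, y, dz), scale)
        worst = max(worst, abs(px - ox), abs(py - oy))
    return worst


# Hypothetical workpiece points: (X, Y, depth offset from the nominal distance).
SAMPLE_POINTS = [(1.0, 1.0, 0.5), (-1.0, 0.5, -0.5), (0.5, -1.0, 0.25)]

if __name__ == "__main__":
    for depth in (10.0, 100.0, 1000.0):
        print(depth, max_discrepancy(SAMPLE_POINTS, depth))
```

In this toy setup the gap between the two models shrinks as the nominal depth grows, which is the sense in which the orthographic model is the asymptotic limit of the pinhole model.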
Because of the algorithmic structure of camera-space manipulation, a natural computer-processing parallelism can be exploited. The "division of labor" among the several computational tasks is well separated, requiring the transfer of relatively little data and little or no coordination of timing among the tasks. The discussion of this matter is contained in the section titled parallel processing.
An important class of human-controlled manipulators - fork lifts - which might benefit from automation is an example of a "nonholonomic" system.
(Quicktime Video 1.8 Mb or MPEG Video 1.5 Mb)
Camera-space manipulation, as described above, is applicable to "holonomic" systems only (i.e. systems where the position of the maneuvered body is "path independent" and a function of current joint position only). The substantial work that is necessary to extend camera-space manipulation to nonholonomic forklifts is discussed in the section titled nonholonomic camera-space manipulation.
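The distinction between holonomic and nonholonomic systems can be made concrete with a small sketch. A planar two-link arm stands in for a holonomic manipulator, and a differential-drive vehicle stands in for the forklift. Both models, the function names, and the command format are simplifying assumptions for illustration only; an actual forklift is steered differently.

```python
import math


def arm_tip(theta1, theta2, link1=1.0, link2=1.0):
    """Tip position of a planar two-link arm: a function of joint angles only."""
    x = link1 * math.cos(theta1) + link2 * math.cos(theta1 + theta2)
    y = link1 * math.sin(theta1) + link2 * math.sin(theta1 + theta2)
    return (x, y)


def drive(commands, track=1.0):
    """Integrate a differential-drive vehicle through a list of commands.

    Each command is ("forward", distance) or ("turn", angle_in_radians).
    Returns the final pose (x, y, heading) and the total travel of the
    (left, right) wheels, which play the role of joint positions.
    """
    x = y = heading = 0.0
    left = right = 0.0
    for kind, amount in commands:
        if kind == "forward":
            x += amount * math.cos(heading)
            y += amount * math.sin(heading)
            left += amount
            right += amount
        elif kind == "turn":
            # Turning in place: the wheels travel equal and opposite distances.
            heading += amount
            left -= amount * track / 2.0
            right += amount * track / 2.0
        else:
            raise ValueError("unknown command: %r" % (kind,))
    return (x, y, heading), (left, right)


if __name__ == "__main__":
    quarter = math.pi / 2.0
    print(arm_tip(0.3, 0.4))  # depends only on the current joint angles
    print(drive([("forward", 1.0), ("turn", quarter)]))
    print(drive([("turn", quarter), ("forward", 1.0)]))
```

The two drive sequences end with the same total wheel travel but at different positions, so the vehicle's position is path dependent; the arm's tip position, by contrast, is fixed by the current joint angles alone.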
Another extension to camera-space manipulation is required if efficient advantage is to be taken of the ability to pan and tilt participating cameras.
The essential idea of the section titled pan-tilt is that camera information acquired prior to camera movement can (and often must) be applied to the estimation of the new camera-space kinematics following each camera's move. Such application greatly reduces the amount of new information needed to restore an adequate understanding of the new camera-space kinematics. The ability to exploit information acquired prior to camera movement has far-reaching implications. One of them is that precision no longer has to be traded against workspace breadth.
Although camera-space manipulation is robust with respect to reasonably small errors in the kinematic models, including error in the grasp geometry, special provisions may be made when large grasp uncertainty is present. The explicit modeling and estimation of certain unknown aspects of the grasp geometry is discussed in the section titled grasp estimation.
In many applications, the workpiece is not stationary, but rather is moving autonomously through the manipulator's workspace. An example of this is conveyor-belt transportation of the workpiece. The section titled moving target discusses an approach for estimating the camera-space position history of the moving workpiece based on some knowledge of the nature of the physical motion of the workpiece.
(Quicktime Video 1.4 Mb or MPEG Video 1.0 Mb)
The video above, for example, illustrates how a sequence of projectile positions in camera-space is used in conjunction with a simple projectile-motion model to forecast the camera-space projectile trajectory in time to allow the arm to intercept the ball in camera space. The computer screen sequence indicates two arcs: one is the actual camera-space trace of the projectile, and the other is the system forecast of this trace following information acquired up through the indicated projectile position. For more detailed information about this movie, click here.
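The forecasting idea can be sketched as follows. The sketch assumes that each camera-space coordinate of the projectile is well described by a quadratic in time (constant acceleration, no drag, and a projection close to orthographic), and it fits that quadratic by ordinary least squares. The function names and the sample numbers are hypothetical, and the estimator used in the actual experiment may differ.

```python
def fit_quadratic(times, values):
    """Least-squares fit of values ~ a + b*t + c*t**2; returns (a, b, c)."""
    # Normal equations: sums of powers of t on the left, moments on the right.
    s = [sum(t ** k for t in times) for k in range(5)]
    r = [sum(v * t ** k for t, v in zip(times, values)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [r[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the 3x4 system.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [x - factor * y for x, y in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def forecast(times, xs, ys, t_future):
    """Predict the camera-space position (x, y) at time t_future."""
    ax, bx, cx = fit_quadratic(times, xs)
    ay, by, cy = fit_quadratic(times, ys)
    return (ax + bx * t_future + cx * t_future ** 2,
            ay + by * t_future + cy * t_future ** 2)


if __name__ == "__main__":
    # Hypothetical early samples of the ball in camera space (pixels, seconds).
    times = [0.0, 0.1, 0.2, 0.3, 0.4]
    xs = [10.0 + 50.0 * t for t in times]
    ys = [200.0 - 120.0 * t + 150.0 * t * t for t in times]
    print(forecast(times, xs, ys, 0.8))
```

Refitting as each new sample arrives would give a forecast arc that updates along the trajectory, in the spirit of the second arc shown on the computer screen.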
Although image analysis has a long-standing history in its own right, its application to camera-space manipulation is special in the sense that, when analyzing any new image, great advantage can be taken of the very quantities which camera-space manipulation must estimate anyway from previously analyzed images.
[Images: wheel to mount on brake plate; high-gradient pixels in wheel image]
The section titled natural features discusses one way of achieving a mutual or "synergistic" relationship between image analysis of "natural" features and the estimation of camera-space kinematics. "Natural feature" locations, in this work, are found from the high-gradient picture elements, as indicated in the pictures above.
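A minimal sketch of what "high-gradient picture elements" could mean in practice is given below. It assumes a grayscale image stored as a list of rows, central-difference gradients, and a fixed threshold on gradient magnitude. The function name, the threshold, and the test image are hypothetical; the natural features section may use a different gradient operator or selection rule.

```python
def high_gradient_pixels(image, threshold):
    """Return (row, col) of interior pixels whose gradient magnitude is large.

    `image` is a list of equal-length rows of grayscale intensities.
    Gradients are central differences; border pixels are skipped.
    """
    rows, cols = len(image), len(image[0])
    hits = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                hits.append((r, c))
    return hits


if __name__ == "__main__":
    # Hypothetical 5x6 image with a vertical step edge between columns 2 and 3.
    step_image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
    print(high_gradient_pixels(step_image, threshold=25.0))
```

On the step-edge image only the pixels adjacent to the edge exceed the threshold, which is the kind of sparse pixel set suggested by the second picture.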
Prior to the summary section, a penultimate section titled error analysis examines the propagation of error from initial data acquisition to final three-dimensional positioning. It also discusses analytical and experimental evidence for the robustness of the method to kinematic/camera modeling error.
Part I of this monograph is made up of the sections mentioned above. To best understand Camera Space Manipulation, begin the investigation with Philosophy.