To evaluate grasping metrics for human grasps, the poses of a human hand and of an object grasped by the hand are detected. For this purpose, single-view RGB images are processed.
Both rotations and translations are estimated from images without depth information. For this purpose, a neural network called kypt_transformer by Hampali et al. is used. Its output is then used to generate surface meshes, based on the MANO model mesh for the hand and the YCB object meshes for the object.
To generate the MANO mesh, additional steps are required, since the output of the kypt_transformer does not match the MANO parameterization. Inverse kinematics is applied in the form of a neural network called IKNet to estimate the angles of the hand joints. The hand shape parameters are estimated with particle swarm optimization to fit the shape and size of the detected hand.
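The shape fitting can be sketched with a minimal particle swarm optimizer. The objective below is a toy stand-in (a quadratic around a fixed target vector) for the real objective, which would measure the discrepancy between the mesh generated from candidate shape parameters and the detected hand; all parameter values are illustrative assumptions.

```python
import numpy as np

def pso_minimize(objective, dim, n_particles=30, n_iters=100,
                 bounds=(-3.0, 3.0), w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best particle swarm optimizer."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))   # particle positions
    v = np.zeros((n_particles, dim))              # particle velocities
    pbest = x.copy()                              # personal best positions
    pbest_f = np.array([objective(p) for p in x])
    gbest = pbest[np.argmin(pbest_f)].copy()      # global best position
    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([objective(p) for p in x])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest, pbest_f.min()

# Toy stand-in for the real objective: distance between the mesh generated
# from candidate shape parameters and the detected hand. Here the "optimal"
# shape parameters are simply a fixed target vector.
target = np.array([0.5, -1.0, 0.25])
shape, err = pso_minimize(lambda b: np.sum((b - target) ** 2), dim=3)
```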
To generate the object mesh, an untransformed YCB mesh matching the detected object is selected. Since the neural network outputs the object rotation as a vector of six values, a rotation matrix is computed by splitting this vector into two three-dimensional vectors, normalizing and orthogonalizing them, and taking their cross product to obtain the third column. Applying the rotation matrix and a translation vector sets the pose of the object mesh.
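The conversion from the six-value rotation representation to a rotation matrix can be sketched as follows; this is the standard Gram-Schmidt construction for the 6D representation and may differ in detail from the implementation used here.

```python
import numpy as np

def rotation_from_6d(r6):
    """Build a rotation matrix from a 6D rotation representation,
    i.e. the first two (unnormalized) columns of the matrix."""
    a1, a2 = r6[:3], r6[3:]
    b1 = a1 / np.linalg.norm(a1)            # first column: normalize
    a2_orth = a2 - np.dot(b1, a2) * b1      # Gram-Schmidt orthogonalization
    b2 = a2_orth / np.linalg.norm(a2_orth)  # second column: normalize
    b3 = np.cross(b1, b2)                   # third column: cross product
    return np.column_stack([b1, b2, b3])

R = rotation_from_6d(np.array([1.0, 0.1, 0.0, 0.0, 2.0, 0.3]))
```

The result is always a valid rotation matrix (orthonormal columns, determinant +1), which is the main reason this representation is preferred over direct regression of nine matrix entries.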
The meshes can be visualized using the open3d library. The visualizations show that the estimated hand and object poses are not entirely accurate, but they provide a good approximation of the scene, especially given that monocular RGB images are the only input.
To increase the contact area between the meshes, the object meshes can additionally be scaled.
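Scaling about the mesh centroid enlarges the object without moving it; a minimal sketch on raw vertex arrays (the actual pipeline operates on mesh objects, and the scale factor below is an assumed value):

```python
import numpy as np

def scale_mesh_vertices(vertices, factor):
    """Scale mesh vertices about their centroid so the mesh grows
    in place instead of drifting away from the origin."""
    centroid = vertices.mean(axis=0)
    return centroid + factor * (vertices - centroid)

# tetrahedron as a stand-in for an object mesh, scaled up by 5 %
verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
scaled = scale_mesh_vertices(verts, 1.05)
```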
Forces and moments are estimated using the hydroelastic contact model. Instead of simulating deformation with a full finite element method, this model assigns each body a precomputed pressure (potential) field that increases with the penetration depth into the body. When two bodies overlap, the contact surface is the set of points at which both pressure fields are equal, and contact forces are obtained by integrating the pressure over this surface.
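The equal-pressure idea can be illustrated in one dimension with two linear pressure fields; the moduli and surface positions below are arbitrary assumed values:

```python
import numpy as np

# 1D illustration of the hydroelastic (pressure field) idea: each body
# carries a pressure that grows linearly with penetration depth, and the
# contact "surface" is where the two pressures are equal.
k_a, k_b = 1e5, 4e5          # stiffness-like moduli (assumed values)
surf_a, surf_b = 1.0, 0.8    # body A ends at x=1.0, body B starts at x=0.8

p_a = lambda x: k_a * (surf_a - x)   # pressure inside body A
p_b = lambda x: k_b * (x - surf_b)   # pressure inside body B

# equal-pressure point inside the overlap region [0.8, 1.0]:
# k_a * (surf_a - x) = k_b * (x - surf_b)
x_c = (k_a * surf_a + k_b * surf_b) / (k_a + k_b)
pressure_c = p_a(x_c)
```

Note that the contact point lies closer to the surface of the stiffer body B, matching the intuition that a stiff body is penetrated less than a soft one.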
For this purpose, the surface meshes are converted into volumetric tetrahedral meshes using fTetWild and Gmsh.
The potentials at the vertices of the resulting tetrahedra are then computed using the distance3d library.
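Given potentials at the tetrahedron vertices, the field value at any interior point follows by barycentric (linear) interpolation; the sketch below is independent of the distance3d API, whose internals may differ:

```python
import numpy as np

def potential_at_point(tet_vertices, vertex_potentials, point):
    """Linearly interpolate per-vertex potentials inside a tetrahedron
    using barycentric coordinates."""
    v0, v1, v2, v3 = tet_vertices
    # solve for the barycentric coordinates of `point` w.r.t. v1, v2, v3
    T = np.column_stack([v1 - v0, v2 - v0, v3 - v0])
    l123 = np.linalg.solve(T, point - v0)
    bary = np.concatenate([[1.0 - l123.sum()], l123])
    return bary @ vertex_potentials

# unit tetrahedron with potential 1 at one vertex and 0 elsewhere
tet = np.array([[0.0, 0, 0], [1.0, 0, 0], [0.0, 1, 0], [0.0, 0, 1]])
phi = np.array([1.0, 0.0, 0.0, 0.0])
val = potential_at_point(tet, phi, tet.mean(axis=0))  # value at centroid
```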
Then, the contact areas and the forces generated between the meshes are estimated, based on the elastic moduli of the hand and the object and on the friction coefficient.
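The resultant contact force then follows by integrating the pressure over the triangulated contact surface; a simplified sketch with per-triangle constant pressure (moments and friction terms omitted, pressure values assumed):

```python
import numpy as np

def contact_force(triangles, pressures):
    """Sum pressure * area * normal over the triangles of a contact
    surface to obtain the resultant contact force vector."""
    total = np.zeros(3)
    for (a, b, c), p in zip(triangles, pressures):
        n = np.cross(b - a, c - a)   # 2 * area * unit normal
        total += p * 0.5 * n
    return total

# two triangles forming a unit square in the x-y plane, uniform pressure
tris = np.array([
    [[0.0, 0, 0], [1.0, 0, 0], [0.0, 1, 0]],
    [[1.0, 0, 0], [1.0, 1, 0], [0.0, 1, 0]],
])
force = contact_force(tris, pressures=[2000.0, 2000.0])
```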
These forces are later used to evaluate the grasp metrics. The following metrics are planned to be tested: force closure, Ferrari-Canny, grasp isotropy, minimum singular value, wrench volume, wrench resistance, and grasp polygon volume.
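As an example of such a metric, grasp isotropy can be computed from the singular values of the grasp matrix; the sketch below assumes hard-finger point contacts and illustrative contact locations, not the actual contacts produced by the pipeline:

```python
import numpy as np

def grasp_matrix(contact_points):
    """Grasp matrix for hard-finger point contacts: each contact can apply
    a full 3D force, contributing three wrench columns [e_j; p x e_j]."""
    cols = []
    for p in contact_points:
        for e in np.eye(3):
            cols.append(np.concatenate([e, np.cross(p, e)]))
    return np.column_stack(cols)

def grasp_isotropy(G):
    """Ratio of smallest to largest singular value of the grasp matrix;
    values close to 1 mean wrenches can be applied uniformly in all
    directions, 0 means some wrench direction cannot be resisted."""
    s = np.linalg.svd(G, compute_uv=False)
    return s.min() / s.max()

# three non-collinear contacts on a unit cube corner region
points = np.array([[1.0, 0, 0], [0.0, 1, 0], [0.0, 0, 1]])
q = grasp_isotropy(grasp_matrix(points))
```

The minimum singular value metric listed above uses the same decomposition, taking the smallest singular value directly instead of the ratio.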