PreAfford: Universal Affordance-Based Pre-Grasping for Diverse Objects and Environments

Kairui Ding 1, Boyuan Chen 1, Ruihai Wu 2, Yuyang Li 3, Zongzheng Zhang 1, Huan-ang Gao 1, Siqi Li 1, Yixin Zhu 3, Guyue Zhou 1,4, Hao Dong 2, Hao Zhao †1
1 Institute for AI Industry Research (AIR), Tsinghua University
2 CFCS, School of Computer Science, Peking University
3 Institute for Artificial Intelligence, Peking University
4 School of Vehicle and Mobility, Tsinghua University
† Indicates Corresponding Author

Demonstration Video of PreAfford.

Abstract

Robotic manipulation of ungraspable objects with two-finger grippers is challenging due to the paucity of graspable features, and traditional pre-grasping techniques, which reposition objects by leveraging external aids such as table edges, do not adapt well across object categories and scenes. To address this, we introduce PreAfford, a novel pre-grasping planning framework that uses a point-level affordance representation and a relay training approach to generalize across a broad range of environments and object types, including those previously unseen. Evaluated on the ShapeNet-v2 dataset, PreAfford improves grasping success rates by 69%, and real-world experiments validate its practicality. This work offers a robust and adaptable solution for manipulating ungraspable objects.

Introduction & Method

Introduction to PreAfford. Adopting a relay training paradigm, two successive modules collaborate to grasp otherwise ungraspable objects by exploiting different environmental features (edge, slope, slot, and wall).


Framework of PreAfford. The framework consists of a pre-grasping module and a grasping module, each containing an affordance network, a proposal network, and a critic network. During inference, the two modules process point clouds to devise pre-grasping and grasping strategies. During training, the grasping module generates rewards that supervise the pre-grasping module, a scheme we call relay training.
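To make the module structure concrete, below is a minimal PyTorch-style sketch of one module (affordance, proposal, and critic networks) and of the relay signal through which the grasping module supervises the pre-grasping module. This is an illustrative sketch under stated assumptions, not the authors' implementation: the per-point MLP encoder, feature sizes, 6-D action parameterization, and the exact way the relay reward supervises the pre-grasping critic are placeholders chosen for brevity (the paper uses a point-cloud backbone and training details not reproduced here).

# Minimal sketch of the per-module structure described above and of the relay
# signal from grasping to pre-grasping. Illustrative only; encoder, dimensions,
# and the supervision target are assumptions, not the authors' implementation.
import torch
import torch.nn as nn

def mlp(in_dim, hidden, out_dim):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

class AffordanceModule(nn.Module):
    """One PreAfford module (same structure for pre-grasping and grasping)."""
    def __init__(self, feat_dim=128, act_dim=6):
        super().__init__()
        self.encoder = mlp(3, 128, feat_dim)           # per-point feature (stand-in for a point-cloud backbone)
        self.affordance = mlp(feat_dim, 128, 1)        # per-point affordance score
        self.proposal = mlp(feat_dim, 128, act_dim)    # action proposal at a contact point
        self.critic = mlp(feat_dim + act_dim, 128, 1)  # scores a (point, action) pair

    def forward(self, points):                         # points: (N, 3)
        feats = self.encoder(points)
        scores = self.affordance(feats).squeeze(-1)    # (N,) affordance map
        best = scores.argmax()                         # pick the highest-affordance point
        action = self.proposal(feats[best])
        value = self.critic(torch.cat([feats[best], action]))
        return scores, action, value

pre_grasp, grasp = AffordanceModule(), AffordanceModule()
cloud = torch.rand(1024, 3)                            # toy scene point cloud

afford_map, push_action, pre_value = pre_grasp(cloud)
moved_cloud = cloud + 0.01 * torch.randn_like(cloud)   # stand-in for simulating the push in a physics engine
with torch.no_grad():
    _, _, relay_reward = grasp(moved_cloud)            # how graspable the scene looks after the push
loss = nn.functional.mse_loss(pre_value, relay_reward) # relay: grasping score supervises the pre-grasping critic
loss.backward()
print(afford_map.shape, push_action.shape, float(loss))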

Results

Qualitative results. Objects from both training and testing categories are evaluated across four distinct scenarios (edge, slot, slope, and wall). Affordance maps highlight potentially effective interaction areas. PreAfford yields reasonable pre-grasping and grasping strategies across object categories and scenes, for both seen and unseen objects.
Multi-feature environment adaptability. (a) Rendered image of the multi-feature environment. (b) Point cloud and affordance heat map, showcasing reasonable pre-grasping policies for objects at different locations.
Table: Comparison with baselines. Pre-grasping alone improves grasping success rates by 52.9%, and the closed-loop strategy adds a further 16.4%, averaged over the test object categories. A push in a random direction and a push towards the geometric center are included as baselines; the arithmetic behind these figures is retraced after the table.
Setting                 |         Train object categories         |          Test object categories
                        |  Edge   Wall  Slope   Slot  Multi   Avg. |  Edge   Wall  Slope   Slot  Multi   Avg.
W/o pre-grasping        |   2.3    3.8    4.3    3.4    4.0    3.6 |   6.1    2.3    2.9    5.7    6.0    4.6
Random-direction Push   |  21.6   10.3    6.4   16.8   18.1   14.6 |  24.9   17.2   12.1   18.4   23.0   19.1
Center-point Push       |  32.5   23.7   40.5   39.2   39.0   35.0 |  25.1   17.4   28.0   30.2   21.5   24.4
Ours w/o closed-loop    |  67.2   41.5   58.3   76.9   63.6   61.5 |  56.4   37.3   62.6   75.8   55.4   57.5
Ours                    |  81.4   43.4   73.1   83.5   74.1   71.1 |  83.7   47.6   80.5   83.0   74.6   73.9
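As a quick sanity check on the caption's figures, the gains can be retraced from the "Avg." column of the test object categories (this mapping is our reading of the table, not an official derivation); the ~69% figure quoted in the abstract is the overall gap between the full method and no pre-grasping.

# Retracing the reported gains from the test-category "Avg." column above
# (our reading of the table; values rounded to one decimal as reported).
no_pre      = 4.6    # W/o pre-grasping
open_loop   = 57.5   # Ours w/o closed-loop
closed_loop = 73.9   # Ours (with closed-loop)

print(round(open_loop - no_pre, 1))       # 52.9 -> gain from pre-grasping alone
print(round(closed_loop - open_loop, 1))  # 16.4 -> further gain from the closed-loop strategy
print(round(closed_loop - no_pre, 1))     # 69.3 -> overall gain, quoted as ~69% in the abstract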
Real-world experiment figures. Four pre-grasping cases are shown with their affordance maps: (a) moving a tablet to a table edge, (b) pushing a plate against a wall, (c) pushing a keyboard up onto a slope, and (d) sliding a tablet into a slot.
Table: Real-world experiment results. For both direct grasping (without pre-grasping) and grasping after pre-grasping, we perform two experiments on each object in each experimental scene; success rates are reported below as percentages.
Setting             |             Seen categories             |            Unseen categories
                    |  Edge   Wall  Slope   Slot  Multi   Avg. |  Edge   Wall  Slope   Slot  Multi   Avg.
W/o pre-grasping    |     0      0      0      0      0      0 |    10      0      5      0      0      3
With pre-grasping   |    70     45     80     90     85     74 |    80     30     75     90     85     72

BibTeX

If you find our work useful in your research, please consider citing:
@misc{ding2024preafford,
      title={PreAfford: Universal Affordance-Based Pre-Grasping for Diverse Objects and Environments}, 
      author={Kairui Ding and Boyuan Chen and Ruihai Wu and Yuyang Li and Zongzheng Zhang and Huan-ang Gao and Siqi Li and Yixin Zhu and Guyue Zhou and Hao Dong and Hao Zhao},
      year={2024},
      eprint={2404.03634},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}