
AdaWorldPolicy: World-Model-Driven Diffusion Policy with Online Adaptive Learning for Robotic Manipulation

*Note: All robotic arm operations are automatically generated through model inference and shown at 10x speed.


We first show the real-world rollout results of our AdaWorldPolicy (without AdaOL) on the four tasks under in-domain settings.

Four Tasks under In-Domain Settings


Task 1: Sweep Coffee Beans

Pick up the broom and sweep the coffee beans on the table into the dustpan.

Task 2: Long-horizon Pick-and-Place

Pick up the wooden egg, and then place it into the egg carton.


Task 3: Pour Water with Transparent Cups

Pick up the measuring cup and pour the water into a glass cup.

Task 4: Wipe Writings on Whiteboard

Wipe the writing off the whiteboard.



Next, we show the real-world rollout results of our AdaWorldPolicy (with AdaOL) on the four tasks under four domain shifts.

Task 1: Sweep Beans under 4 Domain Shifts


Shift A: Change tablecloth

Shift B: Add distractors

Shift C: Change objects

Shift D: Add random lighting

*Note: Consistent with the evaluation protocol mentioned in the appendix, the task is considered successful if no more than 3 beans (or corn kernels) remain on the table.

*Note C: In Task 1 - Shift C, we replace coffee beans with corn kernels.
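
The Task 1 success criterion stated in the notes above (no more than 3 beans or corn kernels left on the table) can be sketched as a simple threshold check. This is a minimal illustration only: the function names `is_sweep_success` and `success_rate` are hypothetical and not taken from the authors' code, and the per-trial counts in the usage lines are made-up placeholders, not reported results.

```python
from typing import Sequence

# Threshold taken from the note above: at most 3 beans (or corn kernels) may remain.
MAX_REMAINING = 3


def is_sweep_success(remaining: int, max_remaining: int = MAX_REMAINING) -> bool:
    """Return True if a single sweep trial counts as successful."""
    if remaining < 0:
        raise ValueError("remaining count cannot be negative")
    return remaining <= max_remaining


def success_rate(remaining_counts: Sequence[int]) -> float:
    """Fraction of trials that satisfy the sweep success criterion."""
    if not remaining_counts:
        raise ValueError("at least one trial is required")
    successes = sum(is_sweep_success(r) for r in remaining_counts)
    return successes / len(remaining_counts)


# Illustrative placeholder counts only (hypothetical, not measured data).
example_counts = [0, 3, 4, 10]
print(success_rate(example_counts))
```

Keeping the threshold as a named constant makes it easy to reuse the same check for the corn-kernel variant in Shift C, since the note applies the same limit to both.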


Task 2: Place Egg under 4 Domain Shifts


Shift A: Change tablecloth

Shift B: Add distractors

Shift C: Change objects

Shift D: Add random lighting

*Note: In Task 2, the wooden egg starts on either a plate or a spoon. The first row shows the results with the egg on a plate, and the second row shows the results with the egg on a spoon.

*Note C: In Task 2 - Shift C, we replace wooden eggs with Kinder Joy eggs.


Task 3: Pour Water under 4 Domain Shifts


Shift A: Change tablecloth

Shift B: Add distractors

Shift C: Change objects

Shift D: Add random lighting

*Note A: In Task 3 - Shift A, all methods failed when the table was fully covered with a new tablecloth. Manipulating transparent objects is itself a challenging task. We therefore use the tablecloth only as the background to produce more meaningful results.

*Note B: In Task 3 - Shift B, as in Shift A, all methods failed when colorful distractors were used. We therefore use transparent distractors instead.

*Note C: In Task 3 - Shift C, we not only change the target cup (from a smaller one to a larger one with a different texture) but also increase the volume of water from 160 ml to 260 ml.


Task 4: Wipe Whiteboard under 4 Domain Shifts


Shift A: Change tablecloth

Shift B: Add distractors

Shift C: Tilt the whiteboard

Shift D: Add random lighting

*Note C: In Task 4 - Shift C, we tilt the whiteboard upward by a height of 35 mm without changing the appearance of any object.