GR00T-Mimic Workflow: Step 1 – Data Recording

Key Message
The GR00T-Mimic workflow begins with collecting a small number of human teleoperation demonstrations. These demonstrations serve as the foundation for generating large-scale training data for robot learning in an efficient and scalable way.
Summary
- Data Recording Process
  - A human operator remotely controls the robot, and the demonstration data is recorded in Isaac Lab.
  - This teleoperated data forms the foundation for robot learning (a schematic recording loop is sketched below).
- Various Teleoperation Interfaces Supported
  - Supports multiple input devices such as keyboard, SpaceMouse, and Apple Vision Pro.
  - Other control interfaces can also be integrated into Isaac Lab with little effort.
- Generate Large-Scale Training Data from Minimal Demonstrations
  - Typically, only 5–20 demonstrations are needed to generate up to 1,000 training trajectories.
  - A skilled operator can complete the initial data collection in roughly 10–30 minutes.
- Key Advantages
  - Minimizes the human teleoperation workload
  - Easily adaptable to different robots and environments
  - High-quality initial data enables large-scale learning
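
To make the recording step concrete, here is a minimal sketch of a teleop-recording loop in Python. All of the classes are stand-ins (Isaac Lab ships its own device readers and environment wrappers, whose APIs differ); only the loop structure (poll the device, step the simulation, log the observation-action pair) mirrors the workflow described above.

```python
"""Minimal sketch of a teleop-recording loop (NOT Isaac Lab's actual API)."""
import numpy as np

class DummyTeleopDevice:
    """Stand-in for a teleop input device; returns a 6-DoF delta-pose command."""
    def read(self) -> np.ndarray:
        return np.zeros(6)  # dx, dy, dz, droll, dpitch, dyaw

class DummyEnv:
    """Stand-in for a simulated robot environment."""
    def reset(self) -> np.ndarray:
        return np.zeros(10)  # placeholder observation
    def step(self, action: np.ndarray):
        return np.zeros(10), False  # next observation, done flag

def record_demo(env, device, max_steps=500):
    """Record one demonstration as paired observation/action arrays."""
    obs_buf, act_buf = [], []
    obs = env.reset()
    for _ in range(max_steps):
        action = device.read()        # human command for this step
        obs_buf.append(obs)
        act_buf.append(action)
        obs, done = env.step(action)  # advance the simulation
        if done:
            break
    return np.stack(obs_buf), np.stack(act_buf)

if __name__ == "__main__":
    demos = [record_demo(DummyEnv(), DummyTeleopDevice()) for _ in range(10)]
    np.savez("demos.npz",
             **{f"demo_{i}_obs": o for i, (o, _) in enumerate(demos)},
             **{f"demo_{i}_act": a for i, (_, a) in enumerate(demos)})
    print(f"recorded {len(demos)} demonstrations")
```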
GR00T-Mimic Workflow: Step 2 – Data Annotation

Key Message
Data annotation is the process of segmenting the teleoperation demonstrations into subtasks suitable for learning. This can be done manually or automatically and is a critical step in converting raw human demonstrations into structured training data.
Summary
- Segmenting Teleoperation Datasets
  - Teleoperated demonstrations are segmented into subtasks (a minimal representation is sketched after this list).
  - Examples: Grasp, Pour, Place, each separated at boundaries between robot actions.
- Automation vs. Manual Annotation
  - Simple tasks can be segmented automatically.
  - Manual annotation is recommended for complex tasks.
- Time Required
  - With a small number of demonstrations, annotation typically takes less than 10 minutes.
- Key Advantages
  - Converts demonstration data into a format suitable for robot learning
  - Increases training efficiency and ensures data quality
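
As a sketch of what a subtask annotation can look like, the snippet below marks the end step of each subtask and derives per-subtask index ranges from those marks. The field names and the split_demo helper are illustrative, not the Isaac Lab Mimic on-disk format.

```python
from dataclasses import dataclass

@dataclass
class SubtaskAnnotation:
    name: str       # e.g. "grasp", "pour", "place"
    end_step: int   # index of the last step belonging to this subtask

def split_demo(num_steps: int, annotations: list[SubtaskAnnotation]) -> dict:
    """Slice a demo of num_steps into per-subtask (start, end) index ranges."""
    segments, start = {}, 0
    for ann in sorted(annotations, key=lambda a: a.end_step):
        segments[ann.name] = (start, ann.end_step)
        start = ann.end_step + 1
    assert start == num_steps, "last subtask must end at the final step"
    return segments

# A 400-step demo annotated purely by subtask end steps:
anns = [SubtaskAnnotation("grasp", 119),
        SubtaskAnnotation("pour", 260),
        SubtaskAnnotation("place", 399)]
print(split_demo(400, anns))
# -> {'grasp': (0, 119), 'pour': (120, 260), 'place': (261, 399)}
```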
GR00T-Mimic Workflow: Step 3 – Data Generation

Key Message
The data generation phase is the core process in which new robot training data is automatically generated based on teleoperation demonstrations. Using Isaac Lab’s parallel simulation capabilities, large datasets can be generated quickly, significantly improving learning speed and efficiency.
Summary
- Automated Data Generation
  - New robot motion data is generated automatically, in varied situations and environments, from the teleoperation demonstrations.
  - By varying initial conditions, diverse demonstrations can be produced (a toy version of this re-targeting is sketched below).
- Parallel Simulation Support
  - Dozens to hundreds of environments can run in parallel within Isaac Lab.
  - This enables large-scale data generation for the same task, dramatically speeding up learning.
- Time Efficiency
  - Compared to generating trajectories one at a time, parallel processing can be over 100× faster.
- Key Advantages
  - Rapid and scalable dataset creation
  - Supports learning complex and diverse tasks and environments
  - Essential for efficient robot training
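
The snippet below is a toy stand-in for this re-targeting step: one recorded reach is adapted to 64 parallel environments with randomized object positions. Real Isaac Lab Mimic / MimicGen generation transforms demonstration segments with full SE(3) object poses and verifies success in simulation; a plain translation is used here only to show the batched structure.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_ENVS = 64  # environments simulated in parallel

# Source demo: a straight-line reach toward an object at a known position.
src_object_pos = np.array([0.5, 0.0, 0.1])
src_traj = np.linspace([0.0, 0.0, 0.3], src_object_pos, num=100)  # (T, 3)

# Randomize the object position per environment (new initial conditions).
new_object_pos = src_object_pos + rng.uniform(-0.1, 0.1, size=(NUM_ENVS, 3))

# Re-target the demo segment to each new object pose (translation only here).
offsets = new_object_pos - src_object_pos                # (N, 3)
new_trajs = src_traj[None, :, :] + offsets[:, None, :]   # (N, T, 3)

print(new_trajs.shape)  # (64, 100, 3): 64 candidate trajectories at once
```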
GR00T-Mimic Workflow Example: Data Generation + Learning

Key Message
Using the GR00T-Mimic workflow, a small number of human demonstrations can be expanded into a large training dataset (e.g., 1,000 trajectories), enabling robots to successfully learn visuomotor tasks such as block stacking.
Summary
- Collecting Teleoperation Demonstrations
  - A human operator performs a task (e.g., stacking blocks) 10 times through teleoperation.
  - These demonstrations become the foundation of robot learning.
- Large-Scale Training Data Generation
  - Using GR00T-Mimic, the 10 demonstrations are expanded into 1,000 synthetic trajectories.
  - Enables large-scale training with minimal human effort.
- Robot Learning and Performance Verification
  - The generated data is used to train a BCRNN visuomotor policy.
  - The trained robot successfully performs the original task (e.g., stacking blocks).
- Key Advantages
  - Minimal human demonstrations → large-scale training data
  - High efficiency, fast training cycles, scalable learning
  - The entire workflow (data generation, learning, validation) is completed within Isaac Lab
What is BCRNN? (Behavior Cloning Recurrent Neural Network)
- A model combining behavior cloning (learning directly from human demonstrations) with a recurrent neural network (for sequential data).
- Suited to learning continuous robot action sequences such as "grasp → move → place"; a minimal sketch follows below.
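
A minimal BCRNN sketch in PyTorch: an LSTM consumes an observation sequence and a linear head predicts the action at each step, trained with a mean-squared behavior-cloning loss. Dimensions and the single-step training loop are illustrative; production implementations (e.g., robomimic's BC-RNN) add image encoders and richer action heads.

```python
import torch
import torch.nn as nn

class BCRNNPolicy(nn.Module):
    """Recurrent behavior-cloning policy: obs sequence -> action sequence."""
    def __init__(self, obs_dim=10, act_dim=7, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs_seq):   # obs_seq: (batch, T, obs_dim)
        h, _ = self.rnn(obs_seq)
        return self.head(h)       # predicted actions: (batch, T, act_dim)

policy = BCRNNPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

# One behavior-cloning step on a random stand-in batch of demo sequences.
obs = torch.randn(32, 20, 10)        # 32 sequences of 20 steps
expert_act = torch.randn(32, 20, 7)  # matching expert actions
loss = nn.functional.mse_loss(policy(obs), expert_act)
opt.zero_grad()
loss.backward()
opt.step()
print(f"BC loss: {loss.item():.3f}")
```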
GR00T-Mimic Workflow Example: GR-1 Humanoid Robot Task Annotation

Key Message
Using the GR-1 humanoid robot as an example, the GR00T-Mimic workflow decomposes a robot task into subtasks (e.g., grasping, stacking, pouring) and marks the end time of each subtask to generate structured training datasets.
Summary
- Example Task: GR-1 Robot Performing Grasp, Stack, Pour
  - The robot performs different actions with each arm.
    - Right arm: Idle → Grasp (yellow cylinder) → Pass to left arm
    - Left arm: Grasp (red beaker) → Pour → Move to blue container
- Defining Subtask Boundaries
  - Each subtask is bounded by a start moment and an end moment.
  - E.g., end of right-arm idle, end of grasp, end of pouring.
- Core Principle: Annotate the "End Moment"
  - When annotating data, the moment a subtask ends is what matters most.
  - These marked end points are critical for training and trajectory generation (an illustrative annotation structure is sketched below).
- Key Advantages
  - Even complex tasks can be clearly defined via subtasks
  - Enables intuitive and structured creation of robot training data
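
For illustration, the bimanual annotation above could be captured as a per-arm list of (subtask, end step) pairs. The step indices below are made up; only the structure is meant to mirror the GR-1 example.

```python
# Hypothetical end-step annotations for the two arms of the GR-1 demo.
annotations = {
    "right_arm": [("idle", 80),
                  ("grasp_yellow_cylinder", 210),
                  ("pass_to_left", 300)],
    "left_arm":  [("grasp_red_beaker", 150),
                  ("pour", 340),
                  ("move_to_blue_container", 420)],
}

for arm, subtasks in annotations.items():
    for name, end_step in subtasks:
        print(f"{arm}: '{name}' ends at step {end_step}")
```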