Learning AI-Powered Robotics: My First Workflow Using Gemini Robotics-ER 1.5
Google's Gemini Robotics-ER 1.5 brings AI reasoning to robotics, so I decided to test it on a real-world challenge: can AI help organize a messy hallway full of scattered shoes?
Phase 1: Intelligent Scene Analysis
The first step is helping the robot understand its environment. Powered by Gemini Robotics-ER 1.5, the robot analyzes the scene and identifies every shoe with contextual awareness. The AI labels each shoe by type, color, size, left/right foot, and pair matching:
SCENE_ANALYSIS_PROMPT = """
Identify all shoes in this hallway. For each shoe provide:
- Location coordinates
- Shoe type (boot, sneaker, dress shoe, flat)
- Color and size category
- Left or right foot identification
- Pair matching

The label returned should be an identifying name for the object detected.
The answer should follow the json format: [{"point": <point>, "label": <label1>}, ...].
The points are in [y, x] format normalized to 0-1000.
"""
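Here is roughly how I send that prompt, together with a photo of the hallway, to the model. This is a minimal sketch using the google-genai Python SDK; the model id, image filename, and API-key setup are my assumptions, so check the current Gemini API docs before running it.

import json
from google import genai
from google.genai import types

# Assumes GEMINI_API_KEY is set in the environment.
client = genai.Client()

with open("hallway.jpg", "rb") as f:  # assumed filename for the hallway photo
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed model id; verify against the docs
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        SCENE_ANALYSIS_PROMPT,
    ],
)

# The prompt asks for a JSON list of {"point": [y, x], "label": ...} entries.
# Depending on settings, the text may need code-fence stripping before parsing.
shoes = json.loads(response.text)
for shoe in shoes:
    print(shoe["label"], shoe["point"])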
Result (pseudo-code): a structured JSON map of all shoes with coordinates and labels.
[
  {"point": [650, 180], "label": "brown boot left"},
  {"point": [650, 320], "label": "brown boot right"},
  {"point": [750, 180], "label": "black dress shoe left"},
  {"point": [750, 320], "label": "black dress shoe right"},
  {"point": [850, 180], "label": "black oxford left"},
  {"point": [850, 320], "label": "black oxford right"},
  {"point": [550, 500], "label": "brown flat left"},
  {"point": [680, 500], "label": "brown flat right"},
  {"point": [420, 650], "label": "blue canvas sneaker left"},
  {"point": [580, 650], "label": "blue canvas sneaker right"},
  {"point": [250, 750], "label": "black dress shoe left"},
  {"point": [420, 750], "label": "white sneaker left"},
  {"point": [580, 750], "label": "white sneaker right"},
  {"point": [320, 400], "label": "brown boot top shelf"},
  {"point": [480, 400], "label": "brown boot top shelf"},
  {"point": [680, 300], "label": "black slip-on middle shelf"},
  {"point": [520, 300], "label": "black slip-on middle shelf"}
]
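Since the points come back in [y, x] order and normalized to 0-1000, they have to be rescaled before they can be mapped onto the actual camera frame. A small helper sketch, assuming a hypothetical 1280x720 capture resolution:

def to_pixels(point, width=1280, height=720):
    # Gemini returns points as [y, x] normalized to 0-1000; rescale to pixels.
    y_norm, x_norm = point
    return (round(x_norm / 1000 * width), round(y_norm / 1000 * height))  # (x_px, y_px)

print(to_pixels([650, 180]))  # first detection above -> (230, 468)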
Phase 2: Planning & Robot Trajectory
The AI creates a placement strategy (heavy boots on the bottom shelf, everyday shoes at eye level, dress shoes on top) and generates detailed movement commands with trajectories that existing robot hardware can execute.
ORGANIZATION_WITH_TRAJECTORY_PROMPT = """
Create a complete organization plan with robotic movement trajectories.

Requirements:
- Weight distribution (heavy boots on bottom shelf)
- Accessibility (frequently worn shoes at eye level)
- Aesthetics (group similar colors/styles)
- Shelf capacity and dimensions
- Smooth robotic arm trajectories for each movement

For each shoe placement action, generate:
1. Pick-up trajectory from current floor position
2. Transport trajectory avoiding obstacles
3. Placement trajectory to target shelf location

The points should be labeled by order of the trajectory, from '0' (start point at left hand) to <n> (final point).
The answer should follow the json format: [{"point": <point>, "label": <label1>}, ...].
The points are in [y, x] format normalized to 0-1000.
Output format:
{
  "organization_plan": [
    {
      "action": "organize_pair",
      "shoes": ["shoe_id_left", "shoe_id_right"],
      "target_shelf": "bottom/middle/top",
      "target_position": [y, x],
      "reasoning": "placement logic",
      "trajectory_sequence": [
        {
          "phase": "pickup",
          "shoe": "shoe_id_left",
          "path": [
            {"point": [y0, x0], "label": "0"},
            ...
            {"point": [y15, x15], "label": "15"}
          ]
        },
        {
          "phase": "transport",
          "path": [...],
          "clearance_height": 150
        },
        {
          "phase": "placement",
          "path": [...]
        }
      ]
    }
  ]
}
"""
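Once the model returns a plan in this format, the trajectories still have to be handed to the robot. The sketch below parses the plan and walks each trajectory phase in order; move_arm_to is a hypothetical stand-in for whatever motion interface your robot hardware exposes, not part of Gemini Robotics-ER 1.5.

import json

def move_arm_to(y, x, z=0):
    # Hypothetical placeholder: replace with real robot motion commands.
    print(f"moving arm to y={y}, x={x}, z={z}")

def execute_plan(plan_json: str) -> None:
    plan = json.loads(plan_json)
    for step in plan["organization_plan"]:
        print(f"{step['action']}: {step['shoes']} -> {step['target_shelf']} ({step['reasoning']})")
        for phase in step["trajectory_sequence"]:
            # Transport phases carry a clearance height for lifting over obstacles.
            z = phase.get("clearance_height", 0)
            for waypoint in phase["path"]:
                y, x = waypoint["point"]
                move_arm_to(y, x, z)

# execute_plan(response.text)  # response from the trajectory prompt above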
Result (pseudo-code): a movement plan with reasoning and detailed trajectories for the robot to execute.
{
  "organization_plan": [
    {
      "action": "organize_pair",
      "shoes": ["brown_boot_left", "brown_boot_right"],
      "target_shelf": "bottom",
      "target_position": [580, 200],
      "reasoning": "Heavy boots require bottom shelf for stability",
      "trajectory_sequence": [
        {
          "phase": "pickup_left",
          "shoe": "brown_boot_left",
          "path": [
            {"point": [650, 180], "label": "0"},
            {"point": [640, 175], "label": "1"},
            {"point": [625, 165], "label": "2"},
            {"point": [610, 150], "label": "3"},
            {"point": [595, 135], "label": "4"},
            {"point": [580, 120], "label": "5"},
            {"point": [565, 110], "label": "6"},
            {"point": [550, 105], "label": "7"},
            {"point": [535, 100], "label": "8"},
            {"point": [520, 98], "label": "9"},
            {"point": [505, 97], "label": "10"},
            {"point": [490, 96], "label": "11"},
            {"point": [475, 95], "label": "12"},
            {"point": [460, 95], "label": "13"},
            {"point": [445, 95], "label": "14"},
            {"point": [430, 95], "label": "15"}
          ]
        }
        ...

This is exciting technology that represents a leap toward making robots more intuitive and capable. I'm looking forward to learning more about Gemini Robotics-ER 1.5.

