🤖 DroidAgent

DroidRun's DroidAgent combines LLM-based reasoning with task execution to control Android devices from natural-language goals.

📚 Core Components

The DroidAgent architecture consists of:

  • Planning System: Optional planning capabilities provided by PlannerAgent
  • Execution System: CodeActAgent for executing tasks
  • Tool System: Modular tools for Android and iOS device control
  • Reflection System: Optional evaluation of each task's outcome, used to inform re-planning

🔄 Execution Flow

1. Goal Setting: The user provides a natural-language task such as "Open settings and enable dark mode".

2. Planning (with reasoning): If reasoning=True, PlannerAgent breaks the goal down into smaller tasks.

3. Task Execution: CodeActAgent executes each task using the appropriate tools.

4. Task Evaluation (with reflection): If reflection=True, the agent evaluates each task's execution and may offer advice that feeds into the result analysis.

5. Result Analysis: The agent analyzes the results and determines the next steps.

6. Error Handling: Failed tasks can be retried, or the plan can be adjusted.

🛠️ Available Tools

The DroidAgent has access to these core tools:

🎯 Planning System

When reasoning=True, the agent plans the goal with PlannerAgent before executing it:

agent = DroidAgent(
    goal="Configure device settings",
    llm=llm,
    tools=tools,
    reasoning=True  # Enable planning
)

Features:

  • Step Planning: Break down complex tasks
  • Error Recovery: Handle unexpected situations
  • Optimization: Choose efficient approaches
  • Verification: Validate results

🔍 Tracing Support

The tracing system helps monitor execution:

# Start Phoenix server first
# Run 'phoenix serve' in a separate terminal

agent = DroidAgent(
    goal="Your task",
    llm=llm,
    tools=tools,
    enable_tracing=True  # Enable Phoenix tracing
)

For detailed information, see the Execution Tracing documentation.

⚙️ Configuration

from droidrun import AdbTools, DroidAgent

# Load tools
tools = AdbTools(serial="device_id")

# Create agent (`llm` is a language model instance created beforehand)
agent = DroidAgent(
    goal="Open Settings and enable dark mode",
    llm=llm,                        # Language model
    tools=tools,                    # Tool provider
    max_steps=15,                   # Maximum planning steps
    timeout=1000,                   # Overall timeout
    reasoning=True,                 # Enable planning
    reflection=True,                # Enable reflection
    enable_tracing=True,            # Execution tracing
    debug=False,                    # Debug mode
    save_trajectories=False,        # Save trajectory data for debugging (GIF and logs)
)

# Run the agent (await requires an async context, e.g. inside a coroutine)
result = await agent.run()
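Because agent.run() is awaited, a plain script needs an event loop around it. A minimal sketch of such an entry point follows. StubAgent is a hypothetical stand-in so the snippet runs without a device or LLM; in real use you would construct DroidAgent as configured above in its place.

```python
import asyncio


class StubAgent:
    """Hypothetical stand-in for DroidAgent so the sketch runs without a device."""

    def __init__(self, goal: str):
        self.goal = goal

    async def run(self) -> dict:
        await asyncio.sleep(0)  # pretend to do asynchronous work
        return {"success": True, "reason": "", "steps": 1,
                "task_history": [self.goal]}


async def main() -> dict:
    # In real use, build DroidAgent(goal=..., llm=llm, tools=tools, ...) here.
    agent = StubAgent(goal="Open Settings and enable dark mode")
    return await agent.run()


if __name__ == "__main__":
    result = asyncio.run(main())
    print(result["success"])
```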

🛠️ Execution Modes

DroidAgent supports three execution modes:

Direct Execution (reasoning=False)

agent = DroidAgent(
    goal="Take a screenshot",
    llm=llm,
    tools=tools,
    reasoning=False  # No planning
)

  • Treats the goal as a single task
  • Executes it directly using CodeActAgent
  • Suitable for simple, straightforward tasks

Planning Mode (reasoning=True)

agent = DroidAgent(
    goal="Find and install Twitter app",
    llm=llm,
    tools=tools,
    reasoning=True  # With planning
)

  • PlannerAgent creates a step-by-step plan
  • Handles complex, multi-step tasks
  • Adaptively updates the plan based on results

Reflection Mode (reasoning=True, reflection=True)

agent = DroidAgent(
    goal="Find the cheapest hotel in New York in the Booking.com app and send it to my colleague on WhatsApp",
    llm=llm,
    tools=tools,
    reasoning=True,  # With planning
    reflection=True  # With additional reasoning
)

  • Planner creates a step-by-step plan
  • Actor executes the subtasks
  • Reflector evaluates the Actor's subtask results to inform re-planning
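The three modes differ only in the reasoning and reflection flags. The helper below is a small sketch for picking them. mode_flags is a hypothetical name, and the rule that reflection requires reasoning is an assumption drawn from the mode's heading, not a documented constraint.

```python
def mode_flags(multi_step: bool, verify_subtasks: bool = False) -> dict:
    """Return the reasoning/reflection keyword arguments for a mode.

    Assumption: reflection is only used together with planning, as in the
    "Reflection Mode (reasoning=True, reflection=True)" example.
    """
    if verify_subtasks and not multi_step:
        raise ValueError("reflection is shown only with reasoning=True")
    return {"reasoning": multi_step, "reflection": verify_subtasks}
```

The result can be unpacked into the constructor, for example DroidAgent(goal=..., llm=llm, tools=tools, **mode_flags(multi_step=True)).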

📊 Execution Results

The agent returns detailed execution results:

result = await agent.run()

# Check success
if result["success"]:
    print("Goal completed successfully!")
else:
    print(f"Failed: {result['reason']}")

# Access execution details
print(f"Steps executed: {result['steps']}")
print(f"Task history: {result['task_history']}")
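When results are logged or reported, it can help to fold these fields into one line. The following sketch assumes the result behaves like a dictionary with the keys shown above (success, reason, steps). summarize_result is a hypothetical helper, not part of DroidRun.

```python
def summarize_result(result: dict) -> str:
    """Build a one-line summary from a result dictionary.

    Uses .get() so that a missing key does not raise; which keys are always
    present is an assumption, not something the source guarantees.
    """
    steps = result.get("steps", "unknown")
    if result.get("success"):
        return f"Goal completed in {steps} steps"
    reason = result.get("reason") or "no reason given"
    return f"Failed after {steps} steps: {reason}"
```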

💡 Best Practices

  1. Use Planning for Complex Tasks

    • Set reasoning=True for multi-step operations
    • Direct mode is faster for simple tasks

  2. Enable Reflection When Needed

    • Use for UI-heavy interactions
    • Provides better screen understanding

  3. Set Appropriate Timeouts

    • Adjust based on task complexity
    • Consider device performance

  4. Handle Errors Properly

    • Check result['task_history'] when debugging failures

  5. Memory Usage

    • Use tools.remember() for important information
    • The agent preserves context between planning iterations
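For the timeout advice, one option is an outer deadline around the awaited run, in addition to the agent's own timeout parameter. This is a sketch using the standard library's asyncio.wait_for. run_with_deadline and slow_run are hypothetical names, and the shape of the returned failure dictionary is an assumption modeled on the result keys shown earlier.

```python
import asyncio
from typing import Awaitable


async def run_with_deadline(run: Awaitable[dict], seconds: float) -> dict:
    """Await a run, converting an overrun into a failure-style result."""
    try:
        return await asyncio.wait_for(run, timeout=seconds)
    except asyncio.TimeoutError:
        return {"success": False, "reason": f"timed out after {seconds} s"}


async def slow_run() -> dict:
    """Stand-in for agent.run() that takes a short while to finish."""
    await asyncio.sleep(0.2)
    return {"success": True, "reason": ""}
```

In real use the first argument would be agent.run(). Slower devices or longer plans call for a larger deadline.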