DroidAgent
Understanding the DroidAgent system in DroidRun
DroidRun's DroidAgent system combines LLM-based reasoning with task execution to control Android devices.
Core Components
The DroidAgent architecture consists of:
- Planning System: Optional planning capabilities provided by PlannerAgent
- Execution System: CodeActAgent for executing tasks
- Tool System: Modular tools for Android and iOS device control
- Reflection System: Optional extra evaluation of whether each task succeeded, used to strengthen reasoning
Execution Flow
Goal Setting
The user provides a natural language task like "Open settings and enable dark mode"
Planning (With Reasoning)
If reasoning=True, PlannerAgent breaks down the goal into smaller tasks
Task Execution
CodeActAgent executes each task using the appropriate tools
Task Evaluation (With Reflection)
If reflection=True, the agent evaluates how each task was executed and can optionally give advice for the analysis that follows
Result Analysis
The agent analyzes results and determines next steps
Error Handling
Failed tasks can be retried or the plan adjusted
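The flow above can be sketched as a small loop. This is a minimal illustration, not DroidRun's implementation: run_goal, TaskRecord, demo_planner, and flaky_executor are hypothetical names, the planner and executor are plain stand-in functions with no device or LLM behind them, and the reflection step is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TaskRecord:
    description: str
    attempts: int = 0
    succeeded: bool = False


def run_goal(
    goal: str,
    planner: Callable[[str], List[str]],
    executor: Callable[[str], bool],
    reasoning: bool = False,
    max_retries: int = 1,
) -> List[TaskRecord]:
    """Plan (optionally), execute each task, and retry failed tasks."""
    # Without reasoning, the whole goal is treated as a single task.
    tasks = planner(goal) if reasoning else [goal]
    history: List[TaskRecord] = []
    for description in tasks:
        record = TaskRecord(description)
        # One initial attempt plus up to max_retries retries.
        while record.attempts <= max_retries and not record.succeeded:
            record.attempts += 1
            record.succeeded = executor(description)
        history.append(record)
        if not record.succeeded:
            break  # stop here so the caller can adjust the plan
    return history


# Hypothetical stand-ins for the planner and the executor.
def demo_planner(goal: str) -> List[str]:
    return [part.strip() for part in goal.split(" and ")]


_seen = set()


def flaky_executor(task: str) -> bool:
    # Fails the first time it sees a task and succeeds on the retry.
    if task in _seen:
        return True
    _seen.add(task)
    return False


history = run_goal(
    "Open settings and enable dark mode",
    planner=demo_planner,
    executor=flaky_executor,
    reasoning=True,
)
for record in history:
    print(record.description, record.attempts, record.succeeded)
```

Stopping at the first task that still fails after its retries is one possible design choice; it leaves room for a planner to adjust the remaining steps instead of continuing blindly.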
Available Tools
The DroidAgent has access to these core tools:
Planning System
When reasoning is enabled, the agent uses advanced planning:
Features:
- Step Planning: Break down complex tasks
- Error Recovery: Handle unexpected situations
- Optimization: Choose efficient approaches
- Verification: Validate results
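As a rough illustration of step planning, error recovery, and verification, the sketch below models a plan as a list of steps with a status. The Plan and Step classes, their methods, and the status strings are assumptions made for this example; they are not taken from PlannerAgent.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    description: str
    status: str = "pending"  # "pending", "completed", or "failed"


@dataclass
class Plan:
    goal: str
    steps: List[Step] = field(default_factory=list)

    def next_step(self) -> Optional[Step]:
        """Return the first step that is still pending, if any."""
        return next((s for s in self.steps if s.status == "pending"), None)

    def revise(self, new_descriptions: List[str]) -> None:
        """Error recovery: keep completed steps and replace everything else."""
        done = [s for s in self.steps if s.status == "completed"]
        self.steps = done + [Step(d) for d in new_descriptions]

    def verified(self) -> bool:
        """Verification: the goal counts as met only if every step completed."""
        return bool(self.steps) and all(s.status == "completed" for s in self.steps)


plan = Plan(
    goal="Open settings and enable dark mode",
    steps=[Step("Open settings"), Step("Tap 'Display'"), Step("Enable dark mode")],
)

plan.next_step().status = "completed"  # "Open settings" worked
plan.next_step().status = "failed"     # "Tap 'Display'" did not

# Replace the failed and remaining steps with a different route.
plan.revise(["Search settings for 'dark mode'", "Enable dark mode"])
print([s.description for s in plan.steps])
print(plan.verified())
```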
Tracing Support
The tracing system helps monitor execution:
For detailed information, see the Execution Tracing documentation.
Configuration
Execution Modes
DroidAgent supports three execution modes:
Direct Execution (reasoning=False)
- Treats goal as a single task
- Directly executes using CodeActAgent
- Suitable for simple, straightforward tasks
Planning Mode (reasoning=True)
- PlannerAgent creates step-by-step plan
- Handles complex, multi-step tasks
- Adaptively updates plan based on results
Reflection Mode (reasoning=True, reflection=True)
- Planner creates step-by-step plan
- Actor executes subtasks
- Reflector evaluates subtasks from Actor for re-planning
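The flag combinations above can be read as a small lookup. The sketch below is only an illustration: ModeConfig, mode_name, and active_roles are hypothetical helpers, and the role labels are plain strings rather than DroidRun classes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModeConfig:
    reasoning: bool = False
    reflection: bool = False


def mode_name(config: ModeConfig) -> str:
    """Map the two flags onto the modes described above."""
    if config.reasoning and config.reflection:
        return "reflection"
    if config.reasoning:
        return "planning"
    # Assumption: reflection without reasoning is not described in the text,
    # so this sketch falls back to direct execution.
    return "direct"


def active_roles(config: ModeConfig) -> List[str]:
    """List which roles take part in a run for the given flags."""
    roles = {
        "direct": ["actor"],
        "planning": ["planner", "actor"],
        "reflection": ["planner", "actor", "reflector"],
    }
    return roles[mode_name(config)]


for cfg in (
    ModeConfig(),
    ModeConfig(reasoning=True),
    ModeConfig(reasoning=True, reflection=True),
):
    print(mode_name(cfg), active_roles(cfg))
```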
Execution Results
The agent returns detailed execution results:
Best Practices
- Use Planning for Complex Tasks
  - Enable reasoning for multi-step operations
  - Direct mode is faster for simple tasks
- Enable Reflection When Needed
  - Use for UI-heavy interactions
  - Provides better screen understanding
- Set Appropriate Timeouts
  - Adjust based on task complexity
  - Consider device performance
- Handle Errors Properly
  - Check task_history for debugging
- Memory Usage
  - Use tools.remember() for important information
  - Agent preserves context between planning iterations
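The timeout and error-handling advice can be combined in one pattern: bound the run with a timeout, then look through task_history when it fails. The sketch below uses asyncio.wait_for from the Python standard library around a stand-in coroutine. fake_agent_run, run_with_timeout, failed_tasks, and the shape of the result dictionary (including the status and timed_out fields) are assumptions for illustration; only the task_history name comes from the text above.

```python
import asyncio
from typing import Any, Dict, List


async def fake_agent_run(delay: float) -> Dict[str, Any]:
    """Stand-in for an agent run; the result shape is hypothetical."""
    await asyncio.sleep(delay)
    return {
        "success": False,
        "task_history": [
            {"description": "Open settings", "status": "completed"},
            {"description": "Enable dark mode", "status": "failed"},
        ],
    }


async def run_with_timeout(delay: float, timeout: float) -> Dict[str, Any]:
    """Bound the run so a slow device cannot block the caller forever."""
    try:
        return await asyncio.wait_for(fake_agent_run(delay), timeout=timeout)
    except asyncio.TimeoutError:
        return {"success": False, "task_history": [], "timed_out": True}


def failed_tasks(result: Dict[str, Any]) -> List[str]:
    """Collect the descriptions of tasks that did not complete."""
    return [
        task["description"]
        for task in result.get("task_history", [])
        if task["status"] == "failed"
    ]


# A run that finishes in time but reports a failed task.
result = asyncio.run(run_with_timeout(delay=0.01, timeout=1.0))
print(failed_tasks(result))

# A run that exceeds its timeout.
slow = asyncio.run(run_with_timeout(delay=0.5, timeout=0.05))
print(slow.get("timed_out", False))
```

A longer timeout suits multi-step goals or slower devices; the values here are arbitrary and chosen only so the example runs quickly.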