Tools
- 14 abstract methods for device interaction (UI, apps, state, memory)
- @ui_action decorator for automatic trajectory recording
- Consistent interface across platforms (Android, iOS)
Quick Reference
14 Abstract Methods: get_state(), get_date(), tap_by_index(), swipe(), drag(), input_text(), back(), press_key(), start_app(), take_screenshot(), list_packages(), get_apps(), remember(), get_memory(), complete(), _extract_element_coordinates_by_index()
- save_trajectories: "none" | "step" | "action" - Controls automatic screenshot capture
- memory: List[str] - Stores remembered information
- finished, success, reason - Task completion state
- @Tools.ui_action - Captures screenshots when save_trajectories="action"
Architecture
The Tools architecture follows a 2-layer pattern:
- Abstract Layer (tools.py): Defines the Tools abstract base class with method signatures and the @ui_action decorator
- Implementation Layer: Platform-specific implementations
  - adb.py: AdbTools for Android devices using ADB and Portal app (TCP or content provider)
  - ios.py: IOSTools for iOS devices using HTTP API to communicate with Portal app
- Tools ABC: Defines 14 abstract methods that all implementations must provide
- @ui_action decorator: Automatically captures screenshots and UI states when save_trajectories="action"
- Portal integration: Android uses PortalClient (TCP or content provider), iOS uses HTTP API
- Consistent API across platforms
- Easy addition of new device types
- Type safety and IDE support
- Automatic trajectory recording for debugging
- Clear contract for implementing new tools
Common Interface
All Tools implementations must provide these methods:
UI Interaction
- tap_by_index(index: int) -> str - Tap element by index
- swipe(start_x: int, start_y: int, end_x: int, end_y: int, duration_ms: int = 300) -> bool - Swipe gesture
- drag(start_x: int, start_y: int, end_x: int, end_y: int, duration_ms: int = 3000) -> bool - Drag gesture
- input_text(text: str, index: int = -1, clear: bool = False) -> str - Text input
- back() -> str - Back navigation
- press_key(keycode: int) -> str - Key press
App Management
- start_app(package: str, activity: str = "") -> str - Launch app
- list_packages(include_system_apps: bool = False) -> List[str] - List packages
- get_apps(include_system_apps: bool = True) -> List[Dict[str, Any]] - Get apps with labels
State Retrieval
- get_state() -> Dict[str, Any] - Get UI state and accessibility tree
- get_date() -> str - Get device date/time
- take_screenshot() -> Tuple[str, bytes] - Capture screenshot
Memory and Completion
- remember(information: str) -> str - Store context information
- get_memory() -> List[str] - Retrieve stored memory
- complete(success: bool, reason: str) -> None - Mark task complete
Internal Helpers
- _extract_element_coordinates_by_index(index: int) -> Tuple[int, int] - Extract element coordinates
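The contract idea behind this interface can be illustrated with a reduced sketch. MiniTools and InMemoryTools below are hypothetical stand-ins that cover only five of the listed signatures; they are not the real Tools class, and the return strings are assumptions made for illustration. The sketch shows that a subclass missing any abstract method cannot be instantiated.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class MiniTools(ABC):
    """Reduced, hypothetical stand-in for the Tools ABC (5 methods only)."""

    @abstractmethod
    def get_state(self) -> Dict[str, Any]: ...

    @abstractmethod
    def tap_by_index(self, index: int) -> str: ...

    @abstractmethod
    def remember(self, information: str) -> str: ...

    @abstractmethod
    def get_memory(self) -> List[str]: ...

    @abstractmethod
    def complete(self, success: bool, reason: str) -> None: ...


class InMemoryTools(MiniTools):
    """Implements every abstract method without touching a device."""

    def __init__(self) -> None:
        self.memory: List[str] = []
        self.finished = False
        self.success = None
        self.reason = None

    def get_state(self) -> Dict[str, Any]:
        return {"elements": []}  # placeholder state, not the real format

    def tap_by_index(self, index: int) -> str:
        return f"Tapped element {index}"

    def remember(self, information: str) -> str:
        self.memory.append(information)
        return f"Remembered: {information}"

    def get_memory(self) -> List[str]:
        return list(self.memory)

    def complete(self, success: bool, reason: str) -> None:
        self.finished, self.success, self.reason = True, success, reason


class IncompleteTools(MiniTools):
    """Implements only one method, so instantiation must fail."""

    def get_state(self) -> Dict[str, Any]:
        return {}


try:
    IncompleteTools()
    incomplete_error = None
except TypeError as exc:
    incomplete_error = exc  # abstract methods are still missing

tools = InMemoryTools()
tools.remember("login screen reached")
tools.complete(success=True, reason="done")
```

Because Python enforces the abstract methods at instantiation time, a partially implemented platform fails early rather than at the first missing call.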
Decorator: @ui_action
The @ui_action decorator automatically records a trajectory when save_trajectories="action" is enabled. It captures screenshots and UI states after each UI action for debugging and analysis.
Usage:
- Method executes normally and returns its result
- After method returns, decorator checks if self.save_trajectories == "action"
- If true, it looks for step_screenshots and step_ui_states in caller’s global scope (using sys._getframe(1))
- If these lists exist, appends current screenshot (self.take_screenshot()[1]) and UI state (self.get_state())
- Enables action replay and debugging by building a complete trajectory
The Tools instance must have its save_trajectories attribute set to "action" for the decorator to capture screenshots. Other values ("none", "step") will skip automatic capture.
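The capture steps described above can be sketched as follows. This is a re-implementation written for illustration under the stated behavior, not the library's actual source: ui_action here is a local function, and FakeTools with its return values is a hypothetical stand-in.

```python
import sys
from functools import wraps


def ui_action(func):
    """Illustrative stand-in for Tools.ui_action (not the library source)."""

    @wraps(func)
    def wrapper(self, *args, **kwargs):
        result = func(self, *args, **kwargs)  # run the action first
        if getattr(self, "save_trajectories", "none") == "action":
            # Look up the recording lists in the caller's global scope.
            caller_globals = sys._getframe(1).f_globals
            screenshots = caller_globals.get("step_screenshots")
            ui_states = caller_globals.get("step_ui_states")
            if screenshots is not None and ui_states is not None:
                screenshots.append(self.take_screenshot()[1])
                ui_states.append(self.get_state())
        return result

    return wrapper


class FakeTools:
    """Hypothetical device-free tools object used to exercise the decorator."""

    def __init__(self, save_trajectories="none"):
        self.save_trajectories = save_trajectories

    def take_screenshot(self):
        return ("PNG", b"fake-png-bytes")

    def get_state(self):
        return {"elements": []}  # placeholder, not the real state format

    @ui_action
    def back(self):
        return "Pressed back"


# Lists the decorator looks for in the caller's globals.
step_screenshots = []
step_ui_states = []

recording = FakeTools(save_trajectories="action")
result = recording.back()  # captured: one screenshot and one UI state

silent = FakeTools(save_trajectories="step")
silent.back()  # not captured: mode is not "action"
```

Only the call made with save_trajectories="action" adds an entry to each list; the "step" instance returns its result without recording anything.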
Standard decorated methods in AdbTools:
- _extract_element_coordinates_by_index() - Extract element coordinates
- swipe() - Swipe gesture
- drag() - Drag gesture
- input_text() - Text input
- back() - Back button
- press_key() - Key press
- start_app() - Launch app
- complete() - Mark task complete
tap_by_index() is NOT decorated with @ui_action in the current implementation. The decorator is only applied to the internal _extract_element_coordinates_by_index() method that tap_by_index() calls.
Custom Tool Integration
You can extend Tools to add platform-specific functionality or create custom tool implementations.
Extending Existing Tools
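As a minimal sketch of extending an existing implementation, the example below subclasses a stub and adds one convenience method built on swipe(). StubAdbTools is a hypothetical stand-in that records swipes instead of talking to a device, and scroll_down() with its default screen size is an assumption made for illustration; with the real library you would subclass AdbTools instead.

```python
from typing import List, Tuple


class StubAdbTools:
    """Hypothetical stand-in for AdbTools; records swipes in a list."""

    def __init__(self) -> None:
        self.swipes: List[Tuple[int, int, int, int, int]] = []

    def swipe(self, start_x: int, start_y: int, end_x: int, end_y: int,
              duration_ms: int = 300) -> bool:
        self.swipes.append((start_x, start_y, end_x, end_y, duration_ms))
        return True


class ScrollingTools(StubAdbTools):
    """Adds a custom action on top of the inherited swipe()."""

    def scroll_down(self, width: int = 1080, height: int = 1920) -> bool:
        # Swipe upward along the vertical centre line to scroll content down.
        x = width // 2
        return self.swipe(x, int(height * 0.75), x, int(height * 0.25))


tools = ScrollingTools()
scrolled = tools.scroll_down()
```

The new method reuses the inherited signature, so anything that already works with swipe() keeps working on the subclass.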
Creating New Tool Implementations
To create a custom Tools implementation, you must implement all 14 abstract methods.
Helper Function: describe_tools
Parameters:
- tools: Tools - The Tools instance to describe
- exclude_tools: Optional[List[str]] - List of tool names to exclude from the description
Returns:
- Dict[str, Callable[..., Any]] - Dictionary mapping tool names to their methods
- UI interaction: swipe, input_text, press_key, tap_by_index, drag
- App management: start_app, list_packages
- State management: remember, complete
- get_state() - Called internally by agents
- take_screenshot() - Called internally by agents
- get_memory() - Accessed directly by agents
- back() - Typically handled by agents or wrapped in other tools
Tool Communication with Agents
Tools instances are passed to agents and provide the atomic actions for device control. Agents call these methods directly or wrap them for LLM function calling.
How Agents Use Tools:
Tools are used via describe_tools():
The describe_tools() function extracts callable methods for LLM function calling.
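The name-to-method mapping can be sketched roughly as below. describe_tools_sketch, EXPOSED_NAMES, and StubTools are hypothetical names written for this example; the real describe_tools() may select and build its dictionary differently.

```python
from typing import Any, Callable, Dict, List, Optional


class StubTools:
    """Hypothetical device-free tools object."""

    def swipe(self, start_x: int, start_y: int, end_x: int, end_y: int,
              duration_ms: int = 300) -> bool:
        return True

    def input_text(self, text: str, index: int = -1, clear: bool = False) -> str:
        return f"Typed {text!r}"

    def remember(self, information: str) -> str:
        return f"Remembered: {information}"

    def get_state(self) -> Dict[str, Any]:
        return {}


# Assumed subset of names exposed to the LLM; get_state is left out on purpose.
EXPOSED_NAMES = ["swipe", "input_text", "remember"]


def describe_tools_sketch(
    tools: Any, exclude_tools: Optional[List[str]] = None
) -> Dict[str, Callable[..., Any]]:
    """Map tool names to bound methods, skipping excluded names."""
    excluded = set(exclude_tools or [])
    return {
        name: getattr(tools, name)
        for name in EXPOSED_NAMES
        if name not in excluded
    }


tool_map = describe_tools_sketch(StubTools(), exclude_tools=["remember"])
reply = tool_map["input_text"]("hello")  # call a tool by name
```

An agent can then look up the function chosen by the LLM by name and call it with the arguments the model supplied.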
Platform Comparison
Feature | AdbTools (Android) | IOSTools (iOS) |
---|---|---|
Connection | ADB + Portal (USB/TCP) | HTTP (Portal app) |
Element indexing | ✅ Full support | ✅ Full support |
Text input | ✅ Unicode + index/clear | ⚠️ Unicode only (no index/clear) |
Screenshots | ✅ Fast (PNG) | ✅ Fast (PNG) |
Swipe | ✅ Precise coordinates | ⚠️ Direction-based |
Drag | ✅ Full support | ❌ Not implemented |
Back button | ✅ Hardware key (keycode 4) | ❌ Not implemented |
get_date() | ✅ Device date/time | ❌ Not implemented |
get_apps() | ✅ All packages with labels | ❌ Not implemented |
App packages | ✅ All packages | ⚠️ Limited to configured |
Key codes | ✅ Full Android keycodes | ⚠️ Limited (HOME/ACTION/CAMERA) |
State retrieval | ✅ Accessibility tree + phone state | ✅ Accessibility tree + phone state |
_extract_element_coordinates_by_index() | ✅ Implemented | ❌ Not implemented |
Best Practices
1. Always call get_state() before tap_by_index()
The tap_by_index() method relies on cached UI elements from the last get_state() call.
2. Use remember() for important context
3. Enable trajectory recording for debugging
4. Handle platform differences
5. Use complete() to signal task finish
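Practices 1, 2, and 5 above can be sketched together with a device-free stub. StubTools, its element format, and its error strings are assumptions made for illustration, not the behavior of AdbTools or IOSTools.

```python
from typing import Any, Dict, List


class StubTools:
    """Hypothetical in-memory stand-in for a Tools implementation."""

    def __init__(self) -> None:
        self.memory: List[str] = []
        self.finished = False
        self.success = None
        self.reason = None
        self._cached_elements: List[Dict[str, Any]] = []

    def get_state(self) -> Dict[str, Any]:
        # Refresh the element cache that tap_by_index() reads from.
        self._cached_elements = [
            {"index": 0, "text": "Settings"},
            {"index": 1, "text": "Wi-Fi"},
        ]
        return {"elements": self._cached_elements}

    def tap_by_index(self, index: int) -> str:
        if not self._cached_elements:
            return "Error: no cached UI elements, call get_state() first"
        for element in self._cached_elements:
            if element["index"] == index:
                return f"Tapped element {index} ({element['text']})"
        return f"Error: no element with index {index}"

    def remember(self, information: str) -> str:
        self.memory.append(information)
        return f"Remembered: {information}"

    def complete(self, success: bool, reason: str) -> None:
        self.finished, self.success, self.reason = True, success, reason


tools = StubTools()
stale_result = tools.tap_by_index(1)   # cache is empty: error string
tools.get_state()                      # practice 1: refresh state first
fresh_result = tools.tap_by_index(1)   # now the index resolves
tools.remember("Wi-Fi settings opened")                          # practice 2
tools.complete(success=True, reason="Reached Wi-Fi settings")    # practice 5
```

The first tap fails with an error string rather than an exception, which is why checking the returned text matters when the state may be stale.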
Error Handling
Tools methods use consistent error handling patterns: String returns with error messages.
Advanced: Stacking Decorators
You can combine @Tools.ui_action with custom decorators.
Place @Tools.ui_action closest to the function definition so it executes last (captures screenshot after all other decorators).
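The effect of decorator order can be seen in a small sketch. Both decorators below are hypothetical stand-ins: capture_after mimics only the "capture after the action" part of @Tools.ui_action by appending to an event list, and log_call is an example custom decorator.

```python
from functools import wraps


def capture_after(func):
    """Simplified stand-in for ui_action: records a capture after the action."""

    @wraps(func)
    def wrapper(self, *args, **kwargs):
        result = func(self, *args, **kwargs)
        self.events.append("capture")
        return result

    return wrapper


def log_call(func):
    """Example custom decorator with logic before and after the call."""

    @wraps(func)
    def wrapper(self, *args, **kwargs):
        self.events.append("log:before")
        result = func(self, *args, **kwargs)
        self.events.append("log:after")
        return result

    return wrapper


class StubTools:
    """Hypothetical tools object that records the order of events."""

    def __init__(self) -> None:
        self.events = []

    @log_call        # outer decorator: runs first on the way in
    @capture_after   # inner decorator: closest to the function definition
    def back(self) -> str:
        self.events.append("action")
        return "Pressed back"


tools = StubTools()
result = tools.back()
order = list(tools.events)
```

With the capture decorator innermost, the custom decorator's pre-call logic runs before the action, and the capture happens directly after the action returns, before the outer decorator's post-call logic.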
See Also
- AdbTools API - Android implementation
- IOSTools API - iOS implementation
- DroidAgent API - Agent integration
- Custom Tools Guide - Creating custom tools