Raw Android device I/O via ADB + Portal.

AndroidDriver

class AndroidDriver(DeviceDriver)
Raw Android device I/O via ADB and the Droidrun Portal app. AndroidDriver provides low-level communication with Android devices through ADB (Android Debug Bridge) and talks to the Droidrun Portal app in either TCP or content provider mode. It declares its capabilities in the supported set; unsupported methods raise NotImplementedError.

AndroidDriver.__init__

def __init__(
    serial: str | None = None,
    use_tcp: bool = False,
    remote_tcp_port: int = 8080,
) -> None
Initialize the AndroidDriver instance. Arguments:
  • serial str | None - Device serial number (e.g., “emulator-5554”, “192.168.1.100:5555”). If None, auto-detects the first available device.
  • use_tcp bool - Whether to prefer TCP communication (default: False). TCP is faster but requires port forwarding. Falls back to content provider mode if TCP fails.
  • remote_tcp_port int - TCP port for Portal app communication on device (default: 8080)
Usage:
from droidrun.tools import AndroidDriver

# Auto-detect device
driver = AndroidDriver()

# Specific device
driver = AndroidDriver(serial="emulator-5554")

# TCP mode (faster communication, requires port forwarding)
driver = AndroidDriver(serial="emulator-5554", use_tcp=True)
Supported methods:
AndroidDriver.supported = {
    "tap", "swipe", "input_text", "press_key", "drag",
    "start_app", "install_app", "screenshot",
    "get_ui_tree", "get_date", "get_apps", "list_packages",
}
Notes:
  • Automatically sets up the Droidrun Portal keyboard on connect() via setup_keyboard()
  • Creates a PortalClient instance that handles TCP/content provider communication
  • The device serial can be an emulator name, a USB serial, or a TCP/IP address:port
  • Call connect() or ensure_connected() before using any other method

Lifecycle Methods

AndroidDriver.connect

async def connect() -> None
Establish connection to the device. Discovers the ADB device, creates a PortalClient, and sets up the Portal keyboard. Usage:
driver = AndroidDriver(serial="emulator-5554")
await driver.connect()

AndroidDriver.ensure_connected

async def ensure_connected() -> None
Connect if not already connected. Safe to call multiple times.
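The "safe to call multiple times" contract can be pictured with a small stand-in class. This is only a sketch of the general pattern, not Droidrun's implementation: SketchDriver, its _connected flag, and the connect_calls counter are hypothetical names introduced for illustration.

```python
import asyncio


class SketchDriver:
    """Hypothetical stand-in mimicking the connect / ensure_connected contract."""

    def __init__(self) -> None:
        self._connected = False  # assumed internal flag, for illustration only
        self.connect_calls = 0   # counts how often connect() actually ran

    async def connect(self) -> None:
        # A real driver would discover the device and set up the Portal here.
        self.connect_calls += 1
        self._connected = True

    async def ensure_connected(self) -> None:
        # Connect on the first call only; later calls are no-ops.
        if not self._connected:
            await self.connect()


async def demo() -> int:
    driver = SketchDriver()
    await driver.ensure_connected()
    await driver.ensure_connected()
    return driver.connect_calls
```

With this pattern, helper functions can call ensure_connected() defensively without worrying about repeating the connection setup.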

Input Action Methods

AndroidDriver.tap

async def tap(x: int, y: int) -> None
Tap at absolute pixel coordinates on the device screen. Arguments:
  • x int - X coordinate
  • y int - Y coordinate
Usage:
await driver.tap(540, 960)

AndroidDriver.swipe

async def swipe(
    x1: int,
    y1: int,
    x2: int,
    y2: int,
    duration_ms: float = 1000,
) -> None
Swipe from (x1, y1) to (x2, y2). Arguments:
  • x1 int - Starting X coordinate
  • y1 int - Starting Y coordinate
  • x2 int - Ending X coordinate
  • y2 int - Ending Y coordinate
  • duration_ms float - Duration of swipe in milliseconds (default: 1000)
Usage:
# Swipe up (scroll down content)
await driver.swipe(540, 1500, 540, 500, duration_ms=300)

# Swipe left
await driver.swipe(800, 960, 200, 960, duration_ms=250)
Notes:
  • Duration is converted to seconds internally (dividing by 1000)
  • Includes an async sleep matching the swipe duration for UI settling
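The two notes above can be sketched as follows. This is an illustration of the unit conversion and the settling wait only; swipe_timing_sketch is a hypothetical name, and the actual call that sends the gesture to the device is left out.

```python
import asyncio


async def swipe_timing_sketch(duration_ms: float) -> float:
    """Convert a millisecond duration to seconds and wait that long."""
    duration_s = duration_ms / 1000  # e.g. 300 ms becomes 0.3 s
    # A real driver would issue the swipe to the device at this point.
    await asyncio.sleep(duration_s)  # let the UI settle for the swipe duration
    return duration_s
```

Because the wait is described as part of swipe() itself, callers probably do not need an extra sleep of their own after each swipe.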

AndroidDriver.input_text

async def input_text(text: str, clear: bool = False) -> bool
Type text into the currently focused field. Arguments:
  • text str - Text to input. Supports Unicode and special characters.
  • clear bool - Whether to clear existing text before inputting (default: False)
Returns:
  • bool - True if input succeeded, False otherwise
Usage:
await driver.tap(540, 300)  # Focus text field first
success = await driver.input_text("Hello World")

# Clear existing text and input new text
success = await driver.input_text("New text", clear=True)
Notes:
  • Uses the Droidrun Portal app keyboard for reliable text input via PortalClient
  • Supports Unicode characters and special characters

AndroidDriver.press_key

async def press_key(keycode: int) -> None
Send a single key-event to the device. Common keycodes:
  • 3: HOME
  • 4: BACK
  • 66: ENTER
  • 67: DELETE
Full keycode reference: Android KeyEvent Documentation. Arguments:
  • keycode int - Android keycode to press
Usage:
await driver.press_key(66)  # Press enter
await driver.press_key(3)   # Press home
await driver.press_key(4)   # Press back
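To avoid bare numbers in calling code, the keycodes listed above can be wrapped in an IntEnum. Key is a hypothetical helper defined here, not something the source says Droidrun exports.

```python
from enum import IntEnum


class Key(IntEnum):
    """Named aliases for the keycodes listed above (hypothetical helper)."""

    HOME = 3
    BACK = 4
    ENTER = 66
    DELETE = 67
```

IntEnum members are integers, so a call such as await driver.press_key(Key.ENTER) passes the same value as press_key(66).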

AndroidDriver.drag

async def drag(
    x1: int,
    y1: int,
    x2: int,
    y2: int,
    duration: float = 3.0,
) -> None
Drag from (x1, y1) to (x2, y2). Arguments:
  • x1 int - Starting X coordinate
  • y1 int - Starting Y coordinate
  • x2 int - Ending X coordinate
  • y2 int - Ending Y coordinate
  • duration float - Duration of drag in seconds (default: 3.0)
Notes:
  • Currently raises NotImplementedError (declared in supported set but not yet implemented)
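Since drag() currently raises NotImplementedError, calling code may want a fallback. The sketch below tries drag() and falls back to swipe(), converting seconds to milliseconds. drag_or_swipe and StubDriver are hypothetical names; the stub exists only so the example runs without a device. A swipe is not guaranteed to behave like a true drag (for example, there is no initial hold), so treat this as an approximation.

```python
import asyncio


async def drag_or_swipe(driver, x1: int, y1: int, x2: int, y2: int,
                        duration: float = 3.0) -> str:
    """Try drag(); fall back to swipe() if drag is not implemented."""
    try:
        await driver.drag(x1, y1, x2, y2, duration)
        return "drag"
    except NotImplementedError:
        # drag() takes seconds, swipe() takes milliseconds.
        await driver.swipe(x1, y1, x2, y2, duration_ms=duration * 1000)
        return "swipe"


class StubDriver:
    """Stand-in used only to make the sketch runnable without a device."""

    def __init__(self) -> None:
        self.calls = []

    async def drag(self, x1, y1, x2, y2, duration=3.0):
        raise NotImplementedError

    async def swipe(self, x1, y1, x2, y2, duration_ms=1000):
        self.calls.append((x1, y1, x2, y2, duration_ms))
```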

App Management Methods

AndroidDriver.start_app

async def start_app(package: str, activity: str | None = None) -> str
Launch an application on the device. If activity is not provided, automatically resolves the main/launcher activity using cmd package resolve-activity. Arguments:
  • package str - Package name (e.g., “com.android.settings”, “com.google.android.apps.messaging”)
  • activity str | None - Optional activity name (e.g., “.Settings”). If None, auto-detects the main launcher activity.
Returns:
  • str - Result message indicating success or error
Usage:
# Auto-detect main activity
result = await driver.start_app("com.android.settings")

# Specific activity
result = await driver.start_app("com.android.settings", ".Settings")

AndroidDriver.install_app

async def install_app(path: str, **kwargs) -> str
Install an APK on the device. Arguments:
  • path str - Path to the APK file on the local machine
  • reinstall bool - Keyword argument passed via **kwargs. Whether to reinstall if the app already exists (default: False)
  • grant_permissions bool - Keyword argument passed via **kwargs. Whether to grant all permissions automatically (default: True)
Returns:
  • str - Result message indicating success or error
Usage:
result = await driver.install_app("/path/to/app.apk")
result = await driver.install_app("/path/to/app.apk", reinstall=True)

AndroidDriver.list_packages

async def list_packages(include_system: bool = False) -> List[str]
Return installed package names. Arguments:
  • include_system bool - Whether to include system apps (default: False)
Returns:
  • List[str] - List of package names

AndroidDriver.get_apps

async def get_apps(include_system: bool = True) -> List[Dict[str, str]]
Return installed apps as list of dicts with ‘package’ and ‘label’ keys. Arguments:
  • include_system bool - Whether to include system apps (default: True)
Returns:
  • List[Dict[str, str]] - List of dictionaries containing ‘package’ and ‘label’ keys
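Given the documented return shape, a label-to-package lookup is a one-line transformation. packages_by_label is a hypothetical helper, and the sample list is illustrative data in the documented shape, not real device output.

```python
from typing import Dict, List


def packages_by_label(apps: List[Dict[str, str]]) -> Dict[str, str]:
    """Map lower-cased app labels to package names."""
    return {app["label"].lower(): app["package"] for app in apps}


# Illustrative data only; a real list would come from `await driver.get_apps()`.
sample = [
    {"package": "com.android.settings", "label": "Settings"},
    {"package": "com.android.chrome", "label": "Chrome"},
]
```

If two apps share a label, the later entry wins in this sketch; keep the full list when that distinction matters.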

State and Observation Methods

AndroidDriver.screenshot

async def screenshot(hide_overlay: bool = True) -> bytes
Capture the current screen as raw PNG bytes. Arguments:
  • hide_overlay bool - Whether to hide Portal app overlay elements during screenshot (default: True)
Returns:
  • bytes - Raw PNG image data
Usage:
png_bytes = await driver.screenshot()
with open("screenshot.png", "wb") as f:
    f.write(png_bytes)
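Because the return value is raw PNG data, the screen size can be read straight from the image header without an imaging library. The sketch below relies only on the PNG file layout (8-byte signature followed by the IHDR chunk); png_size is a hypothetical helper, not part of AndroidDriver.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> Tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of PNG data."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    # Width and height are big-endian 32-bit integers at bytes 16..23.
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

This is handy for sanity-checking tap coordinates against the actual screen dimensions, e.g. width, height = png_size(png_bytes).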

AndroidDriver.get_ui_tree

async def get_ui_tree() -> Dict[str, Any]
Return the raw UI / accessibility tree from the device. Returns a dictionary containing both the accessibility tree and phone state data from the Portal app. Returns:
  • Dict[str, Any] - Raw UI tree data from the device

AndroidDriver.get_date

async def get_date() -> str
Get the current date and time on the device. Returns:
  • str - Date and time string from device
Usage:
date = await driver.get_date()
print(f"Device date: {date}")
# Example output: Device date: Thu Jan 16 14:30:25 UTC 2025
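If the device returns a string shaped like the example above, it can be turned into a datetime as sketched below. The format is assumed from that single example and may differ between devices or locales; parse_device_date is a hypothetical helper, and it drops the timezone token rather than interpreting it.

```python
from datetime import datetime


def parse_device_date(text: str) -> datetime:
    """Parse a string like 'Thu Jan 16 14:30:25 UTC 2025' into a naive datetime."""
    # Expected tokens: weekday, month, day, time, timezone, year.
    _weekday, month, day, clock, _tz, year = text.split()
    # The timezone token is ignored, so the result carries no tzinfo.
    return datetime.strptime(f"{month} {day} {clock} {year}", "%b %d %H:%M:%S %Y")
```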

Properties

Instance variables:
  • device - ADB device instance (from async_adbutils)
  • portal - PortalClient instance for device communication (TCP or content provider mode)
  • supported - Set of supported method names for capability checking

Notes

  • Portal app required: The Droidrun Portal app must be installed and accessibility service enabled on the device
  • TCP vs Content Provider: TCP is faster but requires port forwarding (adb forward tcp:8080 tcp:8080). Content provider is the fallback mode using ADB shell commands.
  • Capability checking: Check "method_name" in driver.supported to determine if a method is available before calling it
  • Async-only: All methods are async and must be awaited
  • No element resolution: AndroidDriver provides raw device I/O only. Element resolution (by index) is handled by UIState via the StateProvider layer.
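The capability check described above can be wrapped in a small helper. This is a sketch under stated assumptions: call_if_supported and TapOnlyStub are hypothetical names, and the stub stands in for a real driver so the example runs without a device. Note that membership in supported is not a full guarantee, since drag is listed there but currently raises NotImplementedError.

```python
import asyncio
from typing import Any, Tuple


async def call_if_supported(driver: Any, name: str,
                            *args: Any, **kwargs: Any) -> Tuple[bool, Any]:
    """Call driver.<name>(...) only if the driver lists it in `supported`.

    Returns (called, result); result is None when the call was skipped.
    """
    if name not in driver.supported:
        return False, None
    result = await getattr(driver, name)(*args, **kwargs)
    return True, result


class TapOnlyStub:
    """Stand-in driver that supports only tap(), for illustration."""

    supported = {"tap"}

    def __init__(self) -> None:
        self.taps = []

    async def tap(self, x: int, y: int) -> None:
        self.taps.append((x, y))
```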

Example Workflow

import asyncio
from droidrun.tools import AndroidDriver

async def main():
    # Initialize driver
    driver = AndroidDriver(serial="emulator-5554", use_tcp=True)
    await driver.connect()

    # Start Chrome app
    result = await driver.start_app("com.android.chrome")
    print(result)

    # Get UI tree (raw data)
    tree = await driver.get_ui_tree()

    # Tap at coordinates
    await driver.tap(540, 300)

    # Input text
    await driver.input_text("Droidrun framework")

    # Press enter
    await driver.press_key(66)

    # Take screenshot
    png_bytes = await driver.screenshot()
    with open("search_result.png", "wb") as f:
        f.write(png_bytes)

asyncio.run(main())
Note: For higher-level interactions with element indexing and structured results, use action functions with ActionContext (see DroidAgent) rather than calling the driver directly.