UI Actions - Core UI interaction tools for iOS device control.

IOSTools

class IOSTools(Tools)

Core UI interaction tools for iOS device control.

IOSTools.__init__

def __init__(url: str, bundle_identifiers: List[str] = []) -> None

Initialize the IOSTools instance.

Arguments:

  • url - URL of the iOS device, used to send requests to it.
  • bundle_identifiers - List of bundle identifiers to include in the list of packages

IOSTools.get_state

async def get_state(serial: Optional[str] = None) -> List[Dict[str, Any]]

Get all clickable UI elements from the iOS device using the accessibility API.

Arguments:

  • serial - Optional device URL (not used for iOS, uses instance URL)

Returns:

List of dictionaries containing UI elements extracted from the device screen

IOSTools.tap_by_index

async def tap_by_index(index: int, serial: Optional[str] = None) -> str

Tap on a UI element by its index.

This function uses the cached clickable elements to find the element with the given index and tap on its center coordinates.

Arguments:

  • index - Index of the element to tap
  • serial - Optional device serial (not used for iOS, uses instance URL)

Returns:

Result message
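As an illustration of how the list returned by get_state might feed tap_by_index, the sketch below looks up an element index by its visible label. The helper name find_index_by_text and the "index" and "text" dictionary keys are assumptions made for this example; the actual element schema is not specified in this reference.

```python
from typing import Any, Dict, List, Optional


def find_index_by_text(elements: List[Dict[str, Any]], text: str) -> Optional[int]:
    """Return the index of the first element whose label contains `text`.

    Hypothetical helper: assumes each element dict carries "index" and
    "text" keys, which this reference does not confirm.
    """
    needle = text.lower()
    for element in elements:
        label = str(element.get("text", ""))
        # Case-insensitive substring match on the assumed "text" key.
        if needle in label.lower() and "index" in element:
            return element["index"]
    return None


# Sample data shaped like the assumed get_state() output.
sample_elements = [
    {"index": 0, "text": "Cancel"},
    {"index": 3, "text": "Send Message"},
]
send_index = find_index_by_text(sample_elements, "send")
```

With a real instance, the result could then be passed on, for example `await tools.tap_by_index(send_index)`, after checking that it is not None.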

IOSTools.tap

async def tap(index: int) -> str

Tap on a UI element by its index.

This function uses the cached clickable elements from the last get_clickables call to find the element with the given index and tap on its center coordinates.

Arguments:

  • index - Index of the element to tap

Returns:

Result message

IOSTools.swipe

async def swipe(
    start_x: int,
    start_y: int,
    end_x: int,
    end_y: int,
    duration_ms: int = 300
) -> bool

Performs a straight-line swipe gesture on the device screen. To perform a hold (long press), set the start and end coordinates to the same values and increase the duration as needed.

Arguments:

  • start_x - Starting X coordinate
  • start_y - Starting Y coordinate
  • end_x - Ending X coordinate
  • end_y - Ending Y coordinate
  • duration_ms - Duration of swipe in milliseconds (not used in iOS API)

Returns:

Bool indicating success or failure
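The two gestures described above can be expressed as keyword arguments for swipe. This is a minimal sketch: the helper names long_press_args and scroll_down_args are hypothetical, and the screen size is supplied by the caller. Since duration_ms is documented as not used in the iOS API, the hold time passed here may have no effect on iOS.

```python
from typing import Dict


def long_press_args(x: int, y: int, hold_ms: int = 1000) -> Dict[str, int]:
    """Keyword arguments for swipe() describing a hold at (x, y)."""
    # Start and end coordinates are identical, as the swipe description suggests.
    return {"start_x": x, "start_y": y, "end_x": x, "end_y": y, "duration_ms": hold_ms}


def scroll_down_args(width: int, height: int, duration_ms: int = 300) -> Dict[str, int]:
    """Keyword arguments for an upward swipe along the vertical center line."""
    x = width // 2
    return {
        "start_x": x,
        "start_y": height * 3 // 4,  # begin three quarters of the way down
        "end_x": x,
        "end_y": height // 4,        # finish one quarter of the way down
        "duration_ms": duration_ms,
    }
```

A call would then look like `await tools.swipe(**long_press_args(120, 300))`, where tools is an IOSTools instance.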

IOSTools.input_text

async def input_text(text: str, serial: Optional[str] = None) -> str

Input text on the iOS device.

Arguments:

  • text - Text to input. Can contain spaces, newlines, and special characters including non-ASCII.
  • serial - Optional device serial (not used for iOS, uses instance URL)

Returns:

Result message

IOSTools.back

async def back() -> str

IOSTools.press_key

async def press_key(keycode: int) -> str

Press a key on the iOS device.

iOS Key codes:

  • 0: HOME
  • 4: ACTION
  • 5: CAMERA

Arguments:

  • keycode - iOS keycode to press
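To avoid passing bare integers, the key codes listed above can be wrapped in an enumeration. The IOSKey name is an assumption for this sketch and is not part of IOSTools.

```python
from enum import IntEnum


class IOSKey(IntEnum):
    """Named constants for the key codes listed for press_key."""

    HOME = 0
    ACTION = 4
    CAMERA = 5
```

Because IntEnum members are integers, a member can be passed wherever keycode is expected, for example `await tools.press_key(IOSKey.HOME)`.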

IOSTools.start_app

async def start_app(package: str, activity: str = "") -> str

Start an app on the iOS device.

Arguments:

  • package - Bundle identifier (e.g., “com.apple.MobileSMS”)
  • activity - Optional activity name (not used on iOS)

IOSTools.take_screenshot

async def take_screenshot() -> Tuple[str, bytes]

Take a screenshot of the iOS device. This function captures the current screen and adds the screenshot to context in the next message. It also stores the screenshot, with a timestamp, in the screenshots list for later GIF creation.

IOSTools.get_phone_state

async def get_phone_state(serial: Optional[str] = None) -> Dict[str, Any]

Get the current phone state including current activity and keyboard visibility.

Arguments:

  • serial - Optional device serial number (not used for iOS)

Returns:

Dictionary with current phone state information

IOSTools.list_packages

async def list_packages(include_system_apps: bool = True) -> List[str]

IOSTools.remember

async def remember(information: str) -> str

Store important information to remember for future context.

This information will be included in future LLM prompts to help maintain context across interactions. Use this for critical facts, observations, or user preferences that should influence future decisions.

Arguments:

  • information - The information to remember

Returns:

Confirmation message

IOSTools.get_memory

def get_memory() -> List[str]

Retrieve all stored memory items.

Returns:

List of stored memory items

IOSTools.complete

def complete(success: bool, reason: str = "")

Mark the task as finished.

Arguments:

  • success - Indicates if the task was successful.
  • reason - Reason for the success or failure
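The following sketch shows how several of the documented methods might be combined into one flow. It runs against StubIOSTools, a hypothetical stand-in that mirrors the documented signatures and records calls instead of contacting a device; it is not the real implementation. The element keys returned by the stub and the compose_message function are likewise assumptions for illustration.

```python
import asyncio
from typing import Any, Dict, List, Optional


class StubIOSTools:
    """Hypothetical stand-in that mirrors a few documented IOSTools signatures."""

    def __init__(self) -> None:
        self.calls: List[str] = []
        self.memory: List[str] = []
        self.finished: Optional[bool] = None

    async def start_app(self, package: str, activity: str = "") -> str:
        self.calls.append(f"start_app:{package}")
        return "started"

    async def get_state(self, serial: Optional[str] = None) -> List[Dict[str, Any]]:
        self.calls.append("get_state")
        # Assumed element shape; the real schema is not documented here.
        return [{"index": 0, "text": "Compose"}]

    async def tap_by_index(self, index: int, serial: Optional[str] = None) -> str:
        self.calls.append(f"tap_by_index:{index}")
        return "tapped"

    async def input_text(self, text: str, serial: Optional[str] = None) -> str:
        self.calls.append(f"input_text:{text}")
        return "typed"

    async def remember(self, information: str) -> str:
        self.memory.append(information)
        return "remembered"

    def get_memory(self) -> List[str]:
        return list(self.memory)

    def complete(self, success: bool, reason: str = "") -> None:
        self.finished = success


async def compose_message(tools: Any, body: str) -> None:
    """Open the Messages app, tap the first element, type text, then finish."""
    await tools.start_app("com.apple.MobileSMS")
    elements = await tools.get_state()
    await tools.tap_by_index(elements[0]["index"])
    await tools.input_text(body)
    await tools.remember(f"Typed message: {body}")
    tools.complete(True, "Message text entered")


stub = StubIOSTools()
asyncio.run(compose_message(stub, "hello"))
```

Passing the tools object in as a parameter keeps the flow testable: the same function could presumably be given a real IOSTools instance in place of the stub.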
