Lesson 1 of 6

Lesson 01 — Appium 2 Architecture

Title: What Appium is and how it works under the hood

Description: A beginner-friendly introduction to Appium 2 — what it is, how its three-part architecture works, and how to install everything you need to start automating mobile apps.

Why it matters for QA: Most beginners struggle with Appium setup because they don't understand the architecture. Once you see how the pieces fit together, installation errors, missing drivers, and confusing log messages all start to make sense.


1. What is Appium?

Mobile regression testing becomes slow and inconsistent when every scenario is checked by hand. Appium lets you automate those checks by sending commands to a real device, emulator, or simulator from test code.

Appium is a cross-platform automation server for mobile UI testing. It can tap buttons, enter text, perform gestures, read element state, and verify mobile flows across Android and iOS.

What makes Appium special compared to other tools:

  • It works with both Android and iOS apps from a single codebase
  • It supports native apps (built with Swift/Kotlin), hybrid apps (web inside a shell), and mobile web (browser on phone)
  • You write tests in your favorite language — JavaScript, Python, Java, Ruby — Appium doesn't care
  • It does not require modifying your app to test it (no special SDK to add)

What you can automate with Appium:

┌──────────────────┬──────────────────┬──────────────────┐
│   Native Apps    │   Hybrid Apps    │    Mobile Web    │
│  (Swift/Kotlin)  │   (Cordova...)   │     (Chrome)     │
└──────────────────┴──────────────────┴──────────────────┘
                  Both Android and iOS

2. The three-part architecture

This is the most important thing to understand about Appium. It has three main parts that talk to each other:

┌──────────────┐     HTTP      ┌───────────────┐     Native     ┌──────────────┐
│              │   Requests    │               │   Automation   │              │
│    CLIENT    │ ────────────► │ APPIUM SERVER │ ─────────────► │    DEVICE    │
│ (your test)  │               │               │   via Driver   │ (Android/iOS)│
│              │ ◄──────────── │               │ ◄───────────── │              │
└──────────────┘   Responses   └───────────────┘                └──────────────┘

Let's break down each part:

Part 1: The Client (your test code)

The client is the code you write. It might be a JavaScript file using WebdriverIO, a Python script using Appium-Python-Client, or a Java class using the Appium Java client (which builds on Selenium's Java bindings).

Your client code does things like:

    // Your test code: this is the "client"
    await $("~loginButton").click();
    await $("~usernameField").setValue("testuser@email.com");

The client doesn't know anything about Android or iOS directly. It just sends commands over HTTP to the Appium server and waits for responses.

In practice, the client is your test framework layer. It describes the desired action, but it does not directly control Android or iOS internals.
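
To make "sends commands over HTTP" concrete, here is a minimal sketch of roughly what a client library does with a selector such as `~loginButton`. The helper names (`toLocator`, `findElementRequest`, `clickRequest`) are hypothetical and are not part of WebdriverIO; only the `accessibility id` strategy name and the endpoint paths come from the WebDriver protocol as Appium uses it.

```javascript
// Hypothetical helpers: a simplified model of what a client library does
// internally. None of these functions exist in WebdriverIO itself.

// Translate a WebdriverIO-style selector into a WebDriver locator.
// Only the "~" (accessibility id) shorthand is modelled here.
function toLocator(selector) {
  if (selector.startsWith("~")) {
    return { using: "accessibility id", value: selector.slice(1) };
  }
  throw new Error(`Selector not covered by this sketch: ${selector}`);
}

// Describe the HTTP request that asks the server to find an element.
function findElementRequest(sessionId, selector) {
  return {
    method: "POST",
    path: `/session/${sessionId}/element`,
    body: toLocator(selector),
  };
}

// Describe the HTTP request that clicks an element found earlier.
function clickRequest(sessionId, elementId) {
  return {
    method: "POST",
    path: `/session/${sessionId}/element/${elementId}/click`,
    body: {},
  };
}

// "abc123" and "element-1" are placeholder IDs; real ones come from the server.
console.log(findElementRequest("abc123", "~loginButton"));
console.log(clickRequest("abc123", "element-1"));
```

Nothing in these request descriptions mentions Android or iOS, which is the point: the platform-specific work happens further down the chain.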

Part 2: The Appium Server

The Appium server is a program that runs on your computer (or on a remote server). It:

  1. Receives HTTP requests from your client
  2. Figures out which driver should handle the request
  3. Forwards the command to that driver
  4. Returns the result back to your client

You start it by running appium in your terminal. By default it listens on port 4723.

The server is the routing layer. It validates session requests, receives commands, and forwards them to the driver that owns the active session.
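
The four responsibilities above can be pictured with a toy model. This is not Appium's source code: `ToyServer`, its methods, and the fake driver are made up for illustration, and lowercasing the automation name is just a simplification chosen for this sketch.

```javascript
// A toy model of the server's routing job. All names here are hypothetical.
class ToyServer {
  constructor() {
    this.drivers = new Map();  // automationName (lowercase) -> driver
    this.sessions = new Map(); // sessionId -> driver that owns the session
    this.nextId = 1;
  }

  // Similar in spirit to installing a driver before starting the server.
  installDriver(automationName, driver) {
    this.drivers.set(automationName.toLowerCase(), driver);
  }

  // Receive a new-session request and work out which driver should own it.
  createSession(capabilities) {
    const name = String(capabilities["appium:automationName"] || "").toLowerCase();
    const driver = this.drivers.get(name);
    if (!driver) {
      throw new Error(`No installed driver for automationName "${name}"`);
    }
    const sessionId = `session-${this.nextId++}`;
    this.sessions.set(sessionId, driver);
    return sessionId;
  }

  // Forward a command to the owning driver and hand its result back.
  handle(sessionId, command) {
    const driver = this.sessions.get(sessionId);
    if (!driver) {
      throw new Error(`Unknown session: ${sessionId}`);
    }
    return driver.execute(command);
  }
}

// Fake driver that only reports what it was asked to do.
const fakeAndroidDriver = {
  execute: (command) => `uiautomator2 handled ${command}`,
};

const server = new ToyServer();
server.installDriver("UiAutomator2", fakeAndroidDriver);
const id = server.createSession({
  platformName: "Android",
  "appium:automationName": "UiAutomator2",
});
console.log(server.handle(id, "click")); // prints "uiautomator2 handled click"
```

The model also shows why a missing driver surfaces as a session-creation error: the server has nothing to route the request to.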

Part 3: The Driver

The driver is a plugin that knows how to talk to a specific platform. The Android driver knows how to communicate with Android. The iOS driver knows how to communicate with iOS.

In Appium 2, drivers are installed separately — you pick and install only what you need.

The driver is the platform adapter. UiAutomator2 knows how to automate Android; XCUITest knows how to automate iOS.

Real example of what happens when you tap a button:

  1. Your test: $("~loginButton").click()
  2. Client sends: POST /session/abc123/element/{elementId}/click
  3. Server routes: "This is a UiAutomator2 session" → sends to Android driver
  4. Driver sends: the click to its helper server on the Android emulator (over a connection set up through ADB), which taps the element, e.g. at x:150, y:320
  5. Android taps: The button is tapped on the device
  6. Response: OK → driver → server → client → your test continues

3. The W3C WebDriver protocol — why Appium uses a standard

Appium communicates using the W3C WebDriver protocol. This is an industry standard (like a common language) that was originally created for browser automation (Selenium uses it too).

Why does this matter for you?

Because if you already know Selenium for web testing, Appium will feel very familiar. The commands, the session concept, and the element-finding methods work in much the same way.

The protocol defines things like:

  • How to start a new session (POST /session)
  • How to find an element (POST /session/{id}/element)
  • How to click an element (POST /session/{id}/element/{elementId}/click)
  • How to end a session (DELETE /session/{id})

Selenium (web)                         Appium (mobile)
      │                                       │
      └──── Both use W3C WebDriver Protocol ──┘
                          │
      Same HTTP commands, same concept of sessions,
      same element interaction model

This is great because tools, libraries, and knowledge transfer between web and mobile testing.
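
The protocol also standardizes what comes back. As a minimal sketch (the helper names `unwrap` and `elementIdFrom` are hypothetical, and the sample bodies are hand-written rather than captured from a server): every response body wraps its payload in `value`, an element is identified by a fixed key defined in the W3C spec, and errors carry `error` and `message` fields.

```javascript
// The W3C WebDriver spec identifies element references by this fixed key.
const W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf";

// Unwrap a WebDriver response body: the payload sits under "value",
// and error payloads carry "error" and "message" fields.
// Real clients also look at the HTTP status code; this sketch does not.
function unwrap(responseBody) {
  const value = responseBody.value;
  if (value && typeof value === "object" && typeof value.error === "string") {
    throw new Error(`${value.error}: ${value.message}`);
  }
  return value;
}

// Pull the element ID out of a "find element" response.
function elementIdFrom(responseBody) {
  const value = unwrap(responseBody);
  const id = value && value[W3C_ELEMENT_KEY];
  if (typeof id !== "string") {
    throw new Error("Response does not contain a W3C element reference");
  }
  return id;
}

// Sample bodies written by hand for this sketch.
const found = { value: { [W3C_ELEMENT_KEY]: "00000000-0000-0000-0000-000000000001" } };
const notFound = { value: { error: "no such element", message: "not found", stacktrace: "" } };

console.log(elementIdFrom(found));
try {
  elementIdFrom(notFound);
} catch (err) {
  console.log(err.message); // prints "no such element: not found"
}
```

The ID extracted here is what fills the `{elementId}` slot in the click endpoint listed above.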


4. Appium 1 vs Appium 2 — what changed

If you've seen older Appium tutorials, you might notice things look different. Here's why:

| What changed      | Appium 1                                   | Appium 2                                            |
|-------------------|--------------------------------------------|-----------------------------------------------------|
| Drivers           | Bundled inside Appium                      | Separate plugins, installed individually            |
| Install           | npm install -g appium installs everything  | Install Appium first, then each driver separately   |
| Capability prefix | deviceName                                 | appium:deviceName (vendor prefix required)          |
| Plugin system     | No plugins                                 | Extensible plugin architecture                      |
| Updates           | Everything updates together                | Drivers update independently                        |

What this means in practice: In Appium 2, you first install the Appium server, then you install only the drivers you need. This keeps things modular and allows drivers to update on their own schedule.
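
The capability prefix change is the one most likely to break capabilities copied from an older tutorial. A minimal sketch of the rename, assuming the rule that names defined by the W3C WebDriver spec stay bare and everything else gets the `appium:` prefix (`toAppium2Caps` is a hypothetical helper, and `Pixel_7` is an example value, not something from this lesson):

```javascript
// Capability names defined by the W3C WebDriver spec; these stay unprefixed.
const STANDARD_CAPS = new Set([
  "browserName", "browserVersion", "platformName", "acceptInsecureCerts",
  "pageLoadStrategy", "proxy", "setWindowRect", "timeouts",
  "strictFileInteractability", "unhandledPromptBehavior",
]);

// Hypothetical migration helper: add "appium:" to Appium 1 style names.
// Names that already contain a ":" are assumed to be prefixed and left alone.
function toAppium2Caps(caps) {
  const result = {};
  for (const [name, value] of Object.entries(caps)) {
    const needsPrefix = !STANDARD_CAPS.has(name) && !name.includes(":");
    result[needsPrefix ? `appium:${name}` : name] = value;
  }
  return result;
}

console.log(toAppium2Caps({
  platformName: "Android",
  automationName: "UiAutomator2",
  deviceName: "Pixel_7", // example value only
}));
```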


5. Installing Appium and drivers

You need Node.js installed first (version 16 or higher). Then:

    # Install Appium globally on your machine
    npm install -g appium

    # Verify Appium is installed correctly
    appium --version

    # See what drivers are available to install
    appium driver list

    # Install the Android driver (UiAutomator2)
    appium driver install uiautomator2

    # Install the iOS driver (XCUITest)
    appium driver install xcuitest

    # Confirm your drivers are installed
    appium driver list --installed

    # Start the Appium server (runs on port 4723 by default)
    appium

After running appium, you'll see output like this:

    [Appium] Welcome to Appium v2.x.x
    [Appium] Non-default server args:
    [Appium] Appium REST http interface listener started on 0.0.0.0:4723

This means your server is running and ready to accept commands.
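
Besides reading the terminal output, you can ask the server directly through the protocol's `GET /status` endpoint. The sketch below assumes the response body has the shape `{ value: { ready, message } }` described by the WebDriver spec, the default port, and no custom base path (older tutorials often add `/wd/hub`, which was the Appium 1 default). `parseStatus` and `checkServer` are hypothetical helper names, and `checkServer` relies on the global `fetch` available in Node 18 and later.

```javascript
// Interpret the body returned by GET /status.
// Assumed shape: { value: { ready: boolean, message: string, ... } }.
function parseStatus(body) {
  const value = (body && body.value) || {};
  return { ready: value.ready === true, message: String(value.message || "") };
}

// Ask a running server whether it is ready (needs Node 18+ for global fetch).
async function checkServer(baseUrl = "http://127.0.0.1:4723") {
  const response = await fetch(`${baseUrl}/status`);
  return parseStatus(await response.json());
}

// Example with a hand-written body, so it runs without a server.
console.log(parseStatus({ value: { ready: true, message: "ready" } }));
```

With a server running, `checkServer().then(console.log)` should report `ready: true`. A refused connection usually means nothing is listening at that address.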

For Android testing, you also need:

  • Android Studio installed
  • An Android Virtual Device (AVD) or a physical Android phone with USB debugging enabled
  • ANDROID_HOME environment variable pointing to your Android SDK
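
A missing environment variable is easy to catch before a test run. A minimal sketch (`missingEnvVars` is a hypothetical helper; the variable list is passed in, so it checks only what this lesson names):

```javascript
// Report which required environment variables are unset or empty.
function missingEnvVars(env, required) {
  return required.filter((name) => !env[name] || env[name].trim() === "");
}

const missing = missingEnvVars(process.env, ["ANDROID_HOME"]);
if (missing.length > 0) {
  console.log(`Not set: ${missing.join(", ")}`);
} else {
  console.log(`ANDROID_HOME = ${process.env.ANDROID_HOME}`);
}
```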

For iOS testing (macOS only):

  • Xcode installed from the Mac App Store
  • Xcode Command Line Tools (xcode-select --install)

6. Appium Inspector — your visual exploration tool

Before writing locators in code, you need to know what elements exist in your app. Appium Inspector is a visual tool that connects to your running app and shows you its element tree.

Think of it like browser DevTools, but for your mobile app.

What you can do with Appium Inspector:

  • See every element in your app (buttons, text fields, labels, etc.)
  • Click on any element and see its properties (accessibility id, resource-id, class name, etc.)
  • Test locator strategies before writing them in code
  • Generate capability JSON for your test setup

How to use it:

  1. Download Appium Inspector from github.com/appium/appium-inspector
  2. Make sure your Appium server is running (appium in terminal)
  3. Make sure your device or emulator is running with your app open
  4. Open Appium Inspector, enter your capabilities (we'll cover these in Lesson 02)
  5. Click "Start Session" and explore your app

Appium Inspector workflow:

Start Appium        Open your          Launch           Inspect
Server         →    device/emu    →    Inspector   →    elements
(appium cmd)        (Android/iOS)      (enter caps)     (click around)

Important tip: Use Inspector as a starting point to discover element attributes. Don't copy-paste XPath from Inspector directly into your tests — we'll talk about better locator strategies in Lesson 03.


7. Mental model — what happens when you run a test

Let's trace through the complete lifecycle of a single test to cement your understanding:

Step 1: You run your test file
        node test.js

Step 2: Your client code creates capabilities (settings)
        and sends a POST /session request to http://localhost:4723

Step 3: Appium server receives the request,
        reads "platformName: Android" and "automationName: UiAutomator2",
        and starts the Android driver

Step 4: The Android driver connects to your emulator/device via ADB,
        installs the app (if needed), and launches it

Step 5: Appium returns a session ID back to your client
        e.g. { sessionId: "a1b2c3d4-..." }

Step 6: Your test sends element-finding and interaction commands
        using that session ID; each command goes
        Client → Server → Driver → Device

Step 7: When the test finishes, your client sends
        DELETE /session/a1b2c3d4
        The app closes, the session ends

Step 8: Appium logs everything. If something goes wrong,
        the logs tell you exactly which step failed
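
Seen purely as HTTP traffic, a one-tap test boils down to four requests. The sketch below only builds and prints that plan; it does not contact a server. `lifecyclePlan` is a hypothetical helper, and the session and element IDs are placeholders that a real run would read from the server's responses.

```javascript
// Build the ordered list of HTTP requests behind a one-tap test.
// sessionId and elementId are placeholders: in a real run they come from
// the responses to the first and second requests.
function lifecyclePlan(capabilities, sessionId, elementId) {
  return [
    {
      step: "create session",
      method: "POST",
      path: "/session",
      body: { capabilities: { alwaysMatch: capabilities, firstMatch: [{}] } },
    },
    {
      step: "find element",
      method: "POST",
      path: `/session/${sessionId}/element`,
      body: { using: "accessibility id", value: "loginButton" },
    },
    {
      step: "click element",
      method: "POST",
      path: `/session/${sessionId}/element/${elementId}/click`,
      body: {},
    },
    {
      step: "end session",
      method: "DELETE",
      path: `/session/${sessionId}`,
      body: null,
    },
  ];
}

const plan = lifecyclePlan(
  { platformName: "Android", "appium:automationName": "UiAutomator2" },
  "a1b2c3d4",
  "element-1",
);
for (const request of plan) {
  console.log(`${request.step}: ${request.method} ${request.path}`);
}
```

When a run fails, matching the failing request in the Appium log against this sequence is a quick way to tell whether the problem is in session creation, element lookup, or the interaction itself.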


Quick recap

| Concept             | What it is                                                                  |
|---------------------|-----------------------------------------------------------------------------|
| Appium              | A tool for automating mobile app testing via code                           |
| Client              | Your test code that sends commands                                          |
| Appium Server       | The middleman that routes commands to the right driver                      |
| Driver              | The platform-specific plugin (UiAutomator2 for Android, XCUITest for iOS)   |
| W3C WebDriver       | The standard HTTP protocol the client and server use to communicate         |
| Appium Inspector    | A visual tool to explore your app's element tree                            |
| Appium 2 key change | Drivers are separate plugins; install only what you need                    |

Practice exercises

  1. Install Appium on your machine. Run appium --version and confirm it prints a version number.

  2. Install the UiAutomator2 driver and run appium driver list --installed to confirm it appears.

  3. Start the Appium server and take a screenshot of the terminal output showing it's listening on port 4723.

  4. Open Appium Inspector and explore its interface. Don't connect to a device yet — just get familiar with where to enter capabilities and the "Start Session" button.

  5. Draw the architecture diagram from memory (on paper or in a note). Label: Client, Appium Server, Driver, Device, and the arrows between them. This forces you to internalize the flow.


Next: Lesson 02 — Sessions and capabilities (lab02.md)