Lesson 1 of 6
Lesson 01 — Appium 2 Architecture
Title: What Appium is and how it works under the hood
Description: A beginner-friendly introduction to Appium 2 — what it is, how its three-part architecture works, and how to install everything you need to start automating mobile apps.
Why it matters for QA: Most beginners struggle with Appium setup because they don't understand the architecture. Once you see how the pieces fit together, installation errors, missing drivers, and confusing log messages all start to make sense.
1. What is Appium?
Mobile regression testing becomes slow and inconsistent when every scenario is checked by hand. Appium lets you automate those checks by sending commands to a real device, emulator, or simulator from test code.
Appium is a cross-platform automation server for mobile UI testing. It can tap buttons, enter text, perform gestures, read element state, and verify mobile flows across Android and iOS.
What makes Appium special compared to other tools:
- It works with both Android and iOS apps from a single codebase
- It supports native apps (built with Swift/Kotlin), hybrid apps (web inside a shell), and mobile web (browser on phone)
- You write tests in your favorite language — JavaScript, Python, Java, Ruby — Appium doesn't care
- It does not require modifying your app to test it (no special SDK to add)
What you can automate with Appium:
┌─────────────────────────────────────────────┐
│ Native Apps │ Hybrid Apps │ Mobile Web │
│ (Swift/Kotlin) │ (Cordova...) │ (Chrome) │
└─────────────────────────────────────────────┘
Both Android and iOS
2. The three-part architecture
This is the most important thing to understand about Appium. It has three main parts that talk to each other:
┌──────────────┐ HTTP ┌───────────────┐ Native ┌──────────────┐
│ │ Requests │ │ Automation │ │
│ CLIENT │ ────────────► │ APPIUM SERVER │ ─────────────► │ DEVICE │
│ (your test) │ │ │ via Driver │ (Android/iOS)│
│ │ ◄──────────── │ │ ◄───────────── │ │
└──────────────┘ Responses └───────────────┘ └──────────────┘
Let's break down each part:
Part 1: The Client (your test code)
The client is the code you write. It might be a JavaScript file using WebdriverIO, a Python script using Appium-Python-Client, or a Java class using the Appium Java client (which builds on Selenium's Java bindings).
Your client code does things like:
// Your test code — this is the "client"
await $("~loginButton").click();
await $("~usernameField").setValue("testuser@email.com");
The client doesn't know anything about Android or iOS directly. It just sends commands over HTTP to the Appium server and waits for responses.
In practice, the client is your test framework layer. It describes the desired action, but it does not directly control Android or iOS internals.
Part 2: The Appium Server
The Appium server is a program that runs on your computer (or on a remote server). It:
- Receives HTTP requests from your client
- Figures out which driver should handle the request
- Forwards the command to that driver
- Returns the result back to your client
You start it by running appium in your terminal. By default it listens on port 4723.
The server is the routing layer. It validates session requests, receives commands, and forwards them to the driver that owns the active session.
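The routing idea can be sketched as a toy model in plain JavaScript. This is not Appium's real implementation: `ToyServer`, `registerDriver`, and the fake driver objects are hypothetical names used only to show how a session ID ties each later command to one driver.

```javascript
// Toy model of the routing layer. Illustrative only, not Appium's code.
class ToyServer {
  constructor() {
    this.drivers = new Map();  // automationName (lowercase) -> driver object
    this.sessions = new Map(); // sessionId -> driver that owns the session
    this.nextId = 1;
  }

  // Comparable in spirit to installing a driver before starting the server.
  registerDriver(automationName, driver) {
    this.drivers.set(automationName.toLowerCase(), driver);
  }

  // Pick a driver from the requested capabilities and open a session.
  createSession(capabilities) {
    const name = String(capabilities["appium:automationName"] || "").toLowerCase();
    const driver = this.drivers.get(name);
    if (!driver) {
      throw new Error(`No installed driver for automationName "${name}"`);
    }
    const sessionId = `session-${this.nextId++}`;
    this.sessions.set(sessionId, driver);
    return sessionId;
  }

  // Forward a command to whichever driver owns the session.
  handle(sessionId, command) {
    const driver = this.sessions.get(sessionId);
    if (!driver) {
      throw new Error(`Unknown session "${sessionId}"`);
    }
    return driver.execute(command);
  }
}

const server = new ToyServer();
server.registerDriver("UiAutomator2", { execute: (cmd) => `android handled ${cmd}` });
server.registerDriver("XCUITest", { execute: (cmd) => `ios handled ${cmd}` });

const id = server.createSession({
  platformName: "Android",
  "appium:automationName": "UiAutomator2",
});
console.log(server.handle(id, "click")); // android handled click
```

Requesting a session for a driver that was never registered throws, which mirrors the "missing driver" class of setup errors beginners often hit.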
Part 3: The Driver
The driver is a plugin that knows how to talk to a specific platform. The Android driver knows how to communicate with Android. The iOS driver knows how to communicate with iOS.
In Appium 2, drivers are installed separately — you pick and install only what you need.
The driver is the platform adapter. UiAutomator2 knows how to automate Android; XCUITest knows how to automate iOS.
Real example of what happens when you tap a button:
1. Your test: $("~loginButton").click()
2. Client sends: POST /session/abc123/element/{elementId}/click
3. Server routes: "This is a UiAutomator2 session" → sends to Android driver
4. Driver sends: the click to its UiAutomator2 helper server on the Android device (connected over ADB)
5. Android taps: The button is tapped on the device
6. Response: OK → driver → server → client → your test continues
3. The W3C WebDriver protocol — why Appium uses a standard
Appium communicates using the W3C WebDriver protocol. This is an industry standard (like a common language) that was originally created for browser automation (Selenium uses it too).
Why does this matter for you?
Because if you already know Selenium for web testing, Appium will feel very familiar. The commands, the session concept, and the element-finding methods work in much the same way.
The protocol defines things like:
- How to start a new session (POST /session)
- How to find an element (POST /session/{id}/element)
- How to click an element (POST /session/{id}/element/{elementId}/click)
- How to end a session (DELETE /session/{id})
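To make those four endpoints concrete, here is a minimal sketch that builds the request descriptors without sending anything. The helper name `webdriverRequests` is hypothetical, and the `accessibility id` locator strategy is an Appium extension rather than part of the core W3C set. Client libraries such as WebdriverIO build these requests for you, so this is only for reading logs with more confidence.

```javascript
// Build WebDriver request descriptors without sending them.
// `webdriverRequests` is a hypothetical helper, not part of any library.
function webdriverRequests(baseUrl) {
  const enc = encodeURIComponent;
  return {
    newSession: (capabilities) => ({
      method: "POST",
      url: `${baseUrl}/session`,
      body: { capabilities: { alwaysMatch: capabilities } },
    }),
    findElement: (sessionId, using, value) => ({
      method: "POST",
      url: `${baseUrl}/session/${enc(sessionId)}/element`,
      body: { using, value },
    }),
    click: (sessionId, elementId) => ({
      method: "POST",
      url: `${baseUrl}/session/${enc(sessionId)}/element/${enc(elementId)}/click`,
      body: {}, // the click command takes an empty JSON object
    }),
    deleteSession: (sessionId) => ({
      method: "DELETE",
      url: `${baseUrl}/session/${enc(sessionId)}`,
    }),
  };
}

const wd = webdriverRequests("http://localhost:4723");
console.log(wd.findElement("abc123", "accessibility id", "loginButton"));
console.log(wd.click("abc123", "el-1"));
```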
Selenium (web) Appium (mobile)
│ │
└──── Both use W3C WebDriver Protocol ────┘
│
Same HTTP commands,
same concept of sessions,
same element interaction model
This is great because tools, libraries, and knowledge transfer between web and mobile testing.
4. Appium 1 vs Appium 2 — what changed
If you've seen older Appium tutorials, you might notice things look different. Here's why:
| What changed | Appium 1 | Appium 2 |
|---|---|---|
| Drivers | Bundled inside Appium | Separate plugins, installed individually |
| Install | npm install -g appium installs everything | Install Appium first, then each driver separately |
| Capability prefix | deviceName | appium:deviceName (vendor prefix required) |
| Plugin system | No plugins | Extensible plugin architecture |
| Updates | Everything updates together | Drivers update independently |
What this means in practice: In Appium 2, you first install the Appium server, then you install only the drivers you need. This keeps things modular and allows drivers to update on their own schedule.
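The capability prefix change is easy to see in code. The sketch below assumes the rule that capability names defined by the W3C WebDriver specification stay unprefixed and everything else gets the `appium:` vendor prefix. The helper name `toAppium2Capabilities`, the device name, and the app path are all hypothetical.

```javascript
// Capability names defined by the W3C WebDriver spec; these stay unprefixed.
const W3C_STANDARD_CAPS = new Set([
  "browserName", "browserVersion", "platformName", "acceptInsecureCerts",
  "pageLoadStrategy", "proxy", "setWindowRect", "timeouts",
  "strictFileInteractability", "unhandledPromptBehavior",
]);

// Hypothetical helper: convert Appium 1 style capabilities to prefixed form.
function toAppium2Capabilities(caps) {
  const result = {};
  for (const [key, value] of Object.entries(caps)) {
    // Keys that already carry a vendor prefix (contain ":") are left alone.
    const needsPrefix = !W3C_STANDARD_CAPS.has(key) && !key.includes(":");
    result[needsPrefix ? `appium:${key}` : key] = value;
  }
  return result;
}

const prefixed = toAppium2Capabilities({
  platformName: "Android",
  automationName: "UiAutomator2",
  deviceName: "Pixel_7",        // hypothetical device name
  "appium:app": "/path/to/app.apk", // hypothetical path, already prefixed
});
console.log(prefixed);
```

Some client libraries add the prefix for you, so check your client's documentation before adding a helper like this to a real project.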
5. Installing Appium and drivers
You need Node.js installed first (version 16 or higher). Then:
# Install Appium globally on your machine
npm install -g appium
# Verify Appium is installed correctly
appium --version
# See what drivers are available to install
appium driver list
# Install the Android driver (UiAutomator2)
appium driver install uiautomator2
# Install the iOS driver (XCUITest)
appium driver install xcuitest
# Confirm your drivers are installed
appium driver list --installed
# Start the Appium server (runs on port 4723 by default)
appium
After running appium, you'll see output like this:
[Appium] Welcome to Appium v2.x.x
[Appium] Non-default server args:
[Appium] Appium REST http interface listener started on 0.0.0.0:4723
This means your server is running and ready to accept commands.
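You can also confirm the server is reachable from code by calling the WebDriver `/status` endpoint. The sketch below is a hedged example: `isServerReady` is a hypothetical helper, the default relies on the `fetch` built into Node.js 18 or newer, and treating a missing `ready` field as "ready" is an assumption rather than documented behavior.

```javascript
// Ask a WebDriver server whether it is up. `fetchImpl` is injectable so the
// function can be tried without a real server.
async function isServerReady(baseUrl, fetchImpl = globalThis.fetch) {
  try {
    const response = await fetchImpl(`${baseUrl}/status`);
    if (!response.ok) return false;
    const payload = await response.json();
    // Assumption: a missing `ready` field counts as ready, because an
    // HTTP 200 from /status already shows the server is listening.
    return payload?.value?.ready !== false;
  } catch {
    return false; // connection refused, invalid JSON, fetch unavailable, ...
  }
}

// Usage (needs a running Appium server and Node.js 18+):
// isServerReady("http://localhost:4723").then((ok) => console.log("ready:", ok));
```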
For Android testing, you also need:
- Android Studio installed
- An Android Virtual Device (AVD) or a physical Android phone with USB debugging enabled
- ANDROID_HOME environment variable pointing to your Android SDK
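A missing SDK variable is a common cause of Android setup errors, so a quick check can save time. This is a small sketch: `findAndroidSdk` is a hypothetical helper, and the fallback to `ANDROID_SDK_ROOT` is an assumption based on older setups that use that name.

```javascript
// Return the Android SDK location from an environment object, or null.
// ANDROID_SDK_ROOT is checked as a fallback (assumption: older setups use it).
function findAndroidSdk(env) {
  const value = env.ANDROID_HOME || env.ANDROID_SDK_ROOT || "";
  return value.trim() === "" ? null : value;
}

const sdk = findAndroidSdk(process.env);
console.log(sdk ? `Android SDK: ${sdk}` : "ANDROID_HOME is not set");
```

This only checks that the variable is set. It does not verify that the path exists or contains the SDK tools.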
For iOS testing (macOS only):
- Xcode installed from the Mac App Store
- Xcode Command Line Tools (xcode-select --install)
6. Appium Inspector — your visual exploration tool
Before writing locators in code, you need to know what elements exist in your app. Appium Inspector is a visual tool that connects to your running app and shows you its element tree.
Think of it like browser DevTools, but for your mobile app.
What you can do with Appium Inspector:
- See every element in your app (buttons, text fields, labels, etc.)
- Click on any element and see its properties (accessibility id, resource-id, class name, etc.)
- Test locator strategies before writing them in code
- Generate capability JSON for your test setup
How to use it:
- Download Appium Inspector from github.com/appium/appium-inspector
- Make sure your Appium server is running (appium in terminal)
- Make sure your device or emulator is running with your app open
- Open Appium Inspector, enter your capabilities (we'll cover these in Lesson 02)
- Click "Start Session" and explore your app
Appium Inspector workflow:
Start Appium Open your Launch Inspect
Server → device/emu → Inspector → elements
(appium cmd) (Android/iOS) (enter caps) (click around)
Important tip: Use Inspector as a starting point to discover element attributes. Don't copy-paste XPath from Inspector directly into your tests — we'll talk about better locator strategies in Lesson 03.
7. Mental model — what happens when you run a test
Let's trace through the complete lifecycle of a single test to cement your understanding:
Step 1: You run your test file
node test.js
Step 2: Your client code creates capabilities (settings) and
sends a POST /session request to http://localhost:4723
Step 3: Appium server receives the request, reads "platformName: Android"
and "automationName: UiAutomator2", and starts the Android driver
Step 4: The Android driver connects to your emulator/device via ADB,
installs the app (if needed), and launches it
Step 5: Appium returns a session ID back to your client
e.g. { sessionId: "a1b2c3d4-..." }
Step 6: Your test sends element-finding and interaction commands using
that session ID — each command goes Client → Server → Driver → Device
Step 7: When the test finishes, your client sends DELETE /session/a1b2c3d4
The app closes, the session ends
Step 8: Appium logs each request and response, so if something goes
wrong, the server log usually shows which step failed
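The lifecycle above can be sketched as one function that creates a session, finds and clicks an element, and always ends the session. This is an illustration under assumptions: `runLoginTap` and the injected `send(method, path, body)` transport are hypothetical, and `fakeSend` returns canned responses so the flow can run without Appium or a device. In a real project a client library such as WebdriverIO handles these calls.

```javascript
// Run one find-and-click flow against a WebDriver-compatible server.
// `send(method, path, body)` is a hypothetical transport returning parsed JSON.
async function runLoginTap(send, capabilities) {
  // Steps 2-5: create the session and read the session ID from the response.
  const created = await send("POST", "/session", {
    capabilities: { alwaysMatch: capabilities },
  });
  const sessionId = created.value.sessionId;
  try {
    // Step 6: find the element, then click it, reusing the session ID.
    const found = await send("POST", `/session/${sessionId}/element`, {
      using: "accessibility id",
      value: "loginButton",
    });
    // W3C responses key the element ID by this fixed identifier.
    const elementId = found.value["element-6066-11e4-a52e-4f735466cecf"];
    await send("POST", `/session/${sessionId}/element/${elementId}/click`, {});
  } finally {
    // Step 7: always end the session, even if a command failed.
    await send("DELETE", `/session/${sessionId}`);
  }
  return sessionId;
}

// Fake transport that records calls and returns canned responses.
const calls = [];
async function fakeSend(method, path) {
  calls.push(`${method} ${path}`);
  if (path === "/session") return { value: { sessionId: "a1b2c3d4" } };
  if (path.endsWith("/element")) {
    return { value: { "element-6066-11e4-a52e-4f735466cecf": "el-1" } };
  }
  return { value: null };
}

runLoginTap(fakeSend, {
  platformName: "Android",
  "appium:automationName": "UiAutomator2",
}).then(() => console.log(calls.join("\n")));
```

Putting the DELETE call in `finally` is a deliberate choice: a session left open after a failed test can block the next run on the same device.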
Quick recap
| Concept | What it is |
|---|---|
| Appium | A tool for automating mobile app testing via code |
| Client | Your test code that sends commands |
| Appium Server | The middleman that routes commands to the right driver |
| Driver | The platform-specific plugin (UiAutomator2 for Android, XCUITest for iOS) |
| W3C WebDriver | The standard HTTP protocol the client and server use to communicate |
| Appium Inspector | A visual tool to explore your app's element tree |
| Appium 2 key change | Drivers are separate plugins — install only what you need |
Practice exercises
- Install Appium on your machine. Run appium --version and confirm it prints a version number.
- Install the UiAutomator2 driver and run appium driver list --installed to confirm it appears.
- Start the Appium server and take a screenshot of the terminal output showing it's listening on port 4723.
- Open Appium Inspector and explore its interface. Don't connect to a device yet. Just get familiar with where to enter capabilities and the "Start Session" button.
- Draw the architecture diagram from memory (on paper or in a note). Label: Client, Appium Server, Driver, Device, and the arrows between them. This forces you to internalize the flow.
Next: Lesson 02 — Sessions and capabilities (lab02.md)