Automating Complex Gestures with the W3C Actions API

January 26, 2026 · 12 min read · API Testing

Published on August 8, 2018 · Updated on May 31, 2022 · by Jonathan Lipps

Mobile apps often involve the use of touch gestures, sometimes in a very complex fashion. Compared to web apps, touch gestures are common and crucial for mobile apps. Even navigating a list of items requires a flick or a swipe on mobile, and often the difference between these two actions can make a meaningful difference in the behavior of an app.

A Long Time Ago ...

The story of Appium's support for these gestures is as complex as the gestures themselves can be! (Feel free to skip this section if you want to get straight to the details...) Because Appium has the goal of compatibility with WebDriver, we've also had to evolve in step with some of the changes in the WebDriver protocol. When Appium was first developed, there were two different ways of getting at actions using WebDriver's existing JSON Wire Protocol. These APIs were designed for automating web browsers, driven by a mouse, so needless to say they didn't map well to the more generally gestural world of mobile app use. To make things worse, the iOS and Android automation technologies Appium was built on top of did not expose useful general gesture primitives. They each exposed their own platform-specific API, with commands like swipe that took different parameters and behaved differently to each other, as well as to the intent of the JSON Wire Protocol.

Appium thus faced two challenges: an inadequate protocol spec, and an inadequate and varying set of basic mobile APIs provided by Apple and Google. Our approach was to implement the JSON Wire Protocol as faithfully as possible, but also to provide direct access to the platform-specific APIs, via the executeScript command. We defined a special prefix, mobile:, and implemented access to these non-standard APIs behind that prefix. So, users could run commands like driver.executeScript("mobile: swipe", args) to directly trigger the iOS-specific swipe method provided by Apple. It was a bit hacky, but gave users control over whether they wanted to stick to Appium's implementation of the standard JSON Wire Protocol, or gain direct access to the platform-specific methods.

Meanwhile, the Appium team learned that the Selenium project was working on a better, more general specification for touch actions. This new API was proposed as part of the new W3C WebDriver Spec, which was under heavy development at the time. The Appium team then implemented this new API, and gave Appium clients another way to automate touch actions which we thought would be the standard going forward. Unfortunately, this was an erroneous assumption. Appium was too quick to implement this new Actions spec: the spec itself changed and was recently ratified in a different incarnation than what Appium had originally supported. At least the specification is better now!

A Happy Confluence

That brings us to today. Over the years the mobile automation technologies Appium uses have themselves evolved, and clever people have discovered new APIs that allow Appium to perform totally (or almost totally) arbitrary gestures. The W3C WebDriver Spec is also now an official thing, including the most recent incarnation of the Actions API. The confluence of these two factors means that, since Appium 1.8, it's been possible for Appium to support the W3C Actions API for complex and general gestures, for example in drawing a picture (which is what we are going to do in this edition).

Why do we care about the W3C API specifically? Apart from Appium's desire to match the official WebDriver standard, the Appium clients are built directly on top of the Selenium WebDriver clients. As the Selenium clients change to accommodate only the W3C APIs, Appium will need to support them or risk getting out of phase with the updated clients.

The Actions API

The W3C Actions API is very general, which also makes it abstract and a bit hard to understand. Basically, it has the concept of input sources, many of which can exist, and each of which must be of a certain type (like key or pointer), potentially a subtype (like mouse, pen, touch), and have a certain id (like "default mouse"). Pointer inputs can register actions like pointerMove, pointerUp, pointerDown, and pause. By defining one (or more) pointer inputs, each with a set of actions and corresponding parameters, we can define pretty much any gesture you like.

Conceptually, for example, a "zoom" gesture consists of two pointer input sources, each of which would register a series of actions:

Pointer 1 (type touch, id "forefinger")
- pointerMove to zoom origin coordinate, with no duration
- pointerDown
- pointerMove to a coordinate diagonally up and right, with duration X
- pointerUp

Pointer 2 (type touch, id "thumb")
- pointerMove to zoom origin coordinate, with no duration
- pointerDown
- pointerMove to a coordinate diagonally down and left, with duration X
- pointerUp

These input sources, along with their actions, get bundled up into one JSON object and sent to the server when you call driver.perform(). The server then unpacks the input sources and actions and interprets them appropriately, each input source's actions being played at the same time (each action taking up one "tick" of virtual time, to keep actions synchronized across input sources).
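To make that JSON object a little more concrete, here is a minimal sketch in plain Java, with no Appium or Selenium dependency, that models the two-finger zoom as nested maps and lists. The field names (type, id, parameters, pointerType, actions, duration, button) follow the pointer action format in the W3C WebDriver spec; the class name ZoomPayloadSketch, the helper methods, the coordinates, and the 500 ms duration are all made up for illustration and are not part of any client library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Appium or Selenium: models the zoom gesture
// as plain maps shaped like the W3C "actions" payload.
class ZoomPayloadSketch {

    // One pointer action; "extra" holds alternating key/value pairs.
    static Map<String, Object> action(String type, Object... extra) {
        Map<String, Object> a = new LinkedHashMap<>();
        a.put("type", type);
        for (int i = 0; i < extra.length; i += 2) {
            a.put((String) extra[i], extra[i + 1]);
        }
        return a;
    }

    // One touch input source: move to start, press, drag to end, lift.
    static Map<String, Object> touchSource(String id, int startX, int startY,
                                           int endX, int endY, int durationMs) {
        List<Map<String, Object>> actions = new ArrayList<>();
        actions.add(action("pointerMove", "duration", 0, "x", startX, "y", startY));
        actions.add(action("pointerDown", "button", 0));
        actions.add(action("pointerMove", "duration", durationMs, "x", endX, "y", endY));
        actions.add(action("pointerUp", "button", 0));

        Map<String, Object> params = new LinkedHashMap<>();
        params.put("pointerType", "touch");

        Map<String, Object> source = new LinkedHashMap<>();
        source.put("type", "pointer");
        source.put("id", id);
        source.put("parameters", params);
        source.put("actions", actions);
        return source;
    }

    // Both fingers start at the zoom origin and move apart over the same duration.
    // Screen y grows downward, so "up" means subtracting from y.
    static List<Map<String, Object>> zoom(int originX, int originY, int distance, int durationMs) {
        List<Map<String, Object>> sources = new ArrayList<>();
        sources.add(touchSource("forefinger", originX, originY,
                originX + distance, originY - distance, durationMs)); // up and right
        sources.add(touchSource("thumb", originX, originY,
                originX - distance, originY + distance, durationMs)); // down and left
        return sources;
    }

    public static void main(String[] args) {
        // Print each input source with its four actions, one per tick.
        for (Map<String, Object> source : zoom(200, 400, 100, 500)) {
            System.out.println(source);
        }
    }
}
```

Each source carries exactly four actions, so the two fingers stay lined up tick by tick: both move, both press, both drag, both lift.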

Example: Let's Draw a Surprised Face

Let's take a look at some actual Java code. Because the W3C Actions API is so new, there aren't a whole lot of helper methods in the Java client we can use to make our life easier. The helper methods which do exist are pretty boring, basically implementing moving to and tapping on elements, with code like:

Actions actions = new Actions(driver);
actions.click(element);
actions.perform();

But this is the kind of thing we can pretty much do already, without the Actions API. What about something cool, like drawing arbitrary shapes? Let's teach Appium to draw some circles so we can play around with a "surprised face" picture (just to keep things simple; as an exercise to the reader it would be interesting to augment the draw method to be able to also draw half-circles, so that our face could be more smiley and less surprised).

If we're going to draw some circles, the first thing we'll need is some math, so we can get the coordinates for points along a circle:

private Point getPointOnCircle(int step, int totalSteps, Point origin, double radius) {
    // Angle (in radians) of this step around the circle
    double theta = 2 * Math.PI * ((double) step / totalSteps);
    // Offset from the origin, rounded down to whole pixels
    int x = (int) Math.floor(Math.cos(theta) * radius);
    int y = (int) Math.floor(Math.sin(theta) * radius);
    return new Point(origin.x + x, origin.y + y);
}

The idea here is that we're going to define a circle by an origin coordinate, a radius, and a number of "steps": how fine-grained our circle should be. If we pass in a value of 4 for totalSteps, for example, our circle will actually be a square! The greater the number of steps, the more perfect a circle it will appear. Then we use the magic of trigonometry to determine, for a given iteration ("step"), which point our "finger" should be on.
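As a quick check of the "4 steps makes a square" claim, here is a small standalone sketch of the same math. It swaps the Selenium Point for a plain int array so it runs without any dependencies; the class name CirclePointsSketch and the origin and radius values are just illustrative assumptions.

```java
// Standalone sketch: same trigonometry as getPointOnCircle, but returning
// {x, y} as an int array so it runs without the Selenium Point class.
class CirclePointsSketch {

    static int[] pointOnCircle(int step, int totalSteps, int originX, int originY, double radius) {
        double theta = 2 * Math.PI * ((double) step / totalSteps);
        int x = (int) Math.floor(Math.cos(theta) * radius);
        int y = (int) Math.floor(Math.sin(theta) * radius);
        return new int[] {originX + x, originY + y};
    }

    public static void main(String[] args) {
        int totalSteps = 4;
        // Step 0 and step totalSteps land on (nearly) the same point, closing the shape.
        for (int step = 0; step <= totalSteps; step++) {
            int[] p = pointOnCircle(step, totalSteps, 220, 450, 150);
            System.out.println("step " + step + ": (" + p[0] + ", " + p[1] + ")");
        }
    }
}
```

With four steps the finger only visits the right-most, bottom-most, left-most, and top-most points of the circle, so the straight strokes between them form a square standing on one corner. Because floating-point results are passed through Math.floor, a coordinate can land one pixel short of the ideal value, which is invisible in a finger painting but worth knowing if you ever assert on exact positions.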

Now we need to use this method to actually do some drawing with Appium:

private void drawCircle(AppiumDriver driver, Point origin, double radius, int steps) {
    Point firstPoint = getPointOnCircle(0, steps, origin, radius);

    // A single virtual finger and the sequence of actions it will perform
    PointerInput finger = new PointerInput(Kind.TOUCH, "finger");
    Sequence circle = new Sequence(finger, 0);

    // Move to the starting point instantly, then touch the screen
    circle.addAction(finger.createPointerMove(NO_TIME, VIEW, firstPoint.x, firstPoint.y));
    circle.addAction(finger.createPointerDown(MouseButton.LEFT.asArg()));

    // Trace the circle one step at a time, ending back at the starting point
    for (int i = 1; i < steps + 1; i++) {
        Point point = getPointOnCircle(i, steps, origin, radius);
        circle.addAction(finger.createPointerMove(STEP_DURATION, VIEW, point.x, point.y));
    }

    // Lift the finger and send the whole sequence to the server
    circle.addAction(finger.createPointerUp(MouseButton.LEFT.asArg()));
    driver.perform(Arrays.asList(circle));
}
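For a rough sense of what one call to this method puts on the wire, the sketch below counts the actions in the sequence and adds up the nominal stroke time. The values used here for NO_TIME and STEP_DURATION are assumptions chosen for illustration (the real constants live elsewhere in the test class), and CircleTimingSketch is a made-up name.

```java
import java.time.Duration;

// Back-of-the-envelope numbers for one drawCircle call.
class CircleTimingSketch {

    // Assumed values, for illustration only.
    static final Duration NO_TIME = Duration.ZERO;
    static final Duration STEP_DURATION = Duration.ofMillis(20);

    // Initial move + pointer down + one move per step + pointer up.
    static int actionCount(int steps) {
        return 1 + 1 + steps + 1;
    }

    // Only the per-step moves carry a non-zero duration in this sequence.
    static Duration strokeDuration(int steps) {
        return NO_TIME.plus(STEP_DURATION.multipliedBy(steps));
    }

    public static void main(String[] args) {
        int steps = 30;
        System.out.println("actions: " + actionCount(steps));
        System.out.println("nominal stroke time: " + strokeDuration(steps).toMillis() + " ms");
    }
}
```

The real gesture will take somewhat longer than the nominal figure, since the server and device add their own overhead to every tick. The trade-off is still visible: more steps give a rounder circle but a slower stroke, unless the per-step duration is reduced to compensate.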

In this drawCircle method we see the use of the low-level Actions API in the Java client. Using the PointerInput class we create a virtual "finger" to do the drawing, and a Sequence of actions corresponding to that input, which we will populate as we go on. From here on out we're just calling methods on our input to create specific actions, for example moving, touching the pointer to the screen, and lifting the pointer up. (In doing this we utilize some timing constants defined elsewhere.) Finally, we hand the sequence off to the driver to perform! This method is a perfectly general way of drawing a circle with Appium using the W3C Actions API. But it is not yet enough to draw a surprised face. For that, we need to specify which circles we need to draw, at which coordinates:

public void drawFace() {
    // Centers of each face component, positioned relative to the head
    Point head = new Point(220, 450);
    Point leftEye = head.moveBy(-50, -50);
    Point rightEye = head.moveBy(50, -50);
    Point mouth = head.moveBy(0, 50);

    // Arguments: driver, center, radius, number of steps
    drawCircle(driver, head, 150, 30);
    drawCircle(driver, leftEye, 20, 20);
    drawCircle(driver, rightEye, 20, 20);
    drawCircle(driver, mouth, 40, 20);
}

Here we simply define the center points of our various face components (head, eyes, and mouth), and then draw a circle of an appropriate size and with an appropriate number of steps so that it kind of looks like a face. But of course all of this is only going to work if we can find an app that will recognize our gestures as an attempt to draw something! Luckily, the "ApiDemos" app that is freely available from Google has such a view inside of it. So we can start an Appium session on this app and navigate directly to the .graphics.FingerPaint activity. Once we do this, we get the masterpiece below:

[Image: the face drawn in the FingerPaint view]

Go ahead and check out the full test class on GitHub to see how all the boilerplate looks or to run this example yourself. Obviously, this is a toy example, but it shows the power of the Actions API to do pretty much any sort of gesture. It's certainly suitable for the typical kinds of gestures employed in most mobile apps. But there's clearly lots more creativity to be unlocked here. What will you do with the Actions API? Let me know and maybe I'll showcase your creation in a future edition!
