Capturing Audio Output During Testing: Part 2

January 30, 2026 · 11 min read · Testing Guide


Published on May 22, 2019 · Updated on March 30, 2022 · by Jonathan Lipps

HeadSpin's custom Bluetooth board and analysis API allow testing voice assistants, validating streaming media, and working with voice calls on real devices.

Testing Voice Assistant

Previously, we looked at how to capture audio output during testing. Now, it's time to look at how to verify that the audio matches our expectations! The very fact that we have expectations means they need to be conveyed in audio form. In other words, what we want to do is take the audio we've captured from a particular test run, and assert that it is in some way similar to another audio file that we already have available. We can call this latter audio file the "baseline" or "gold standard", against which we will be running our tests.

Context

In the first part, the audio we care about verifying is a sound snippet from one of my band's old songs. So, what we need to do is save a snippet we know to be "good", so that future runs of the test can be compared against it. I've gone ahead and copied such a snippet into the resources directory of the Appium Pro project. The state of our test as we left it in the previous part was that we had captured audio and verified that it had been saved, but had not done anything with it. Here's where we are starting out from now, with our new test class that inherits from the old test class:

import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Paths;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class Edition070_Audio_Verification extends Edition069_Audio_Capture {

    // load the known-good baseline snippet from the resources directory
    private File getReferenceAudio() throws URISyntaxException {
        URL refImgUrl = getClass().getClassLoader().getResource("Edition070_Reference.wav");
        return Paths.get(refImgUrl.toURI()).toFile();
    }

    @Test
    @Override
    public void testAudioCapture() throws Exception {
        WebDriverWait wait = new WebDriverWait(driver, 10);

        // navigate to band homepage
        driver.get("http://www.splendourhyaline.com");

        // click the amazon store icon for the first album
        WebElement store = wait.until(ExpectedConditions.presenceOfElementLocated(
            By.cssSelector("img[src='/img/store-amazon.png']")
        ));
        driver.executeScript("window.scrollBy(0, 100);");
        store.click();

        // start playing a sample of the first track
        WebElement play = wait.until(ExpectedConditions.presenceOfElementLocated(
            By.xpath("//div[@data-track-number='2']//div[@data-action='dm-playable']")
        ));
        driver.executeScript("window.scrollBy(0, 150);");

        // start the song sample
        play.click();

        // start an ffmpeg audio capture of system audio. Replace with a path and device id
        // appropriate for your system (list devices with `ffmpeg -f avfoundation -list_devices true -i ""`)
        File audioCapture = new File("/Users/jlipps/Desktop/capture.wav");
        captureForDuration(audioCapture, 10000);
    }
}

What we now need to do is assert that our audioCapture File in some sense matches our gold standard. But how on earth would we do that?

Audio file similarity

As a naive approach, we could assume that similar WAV files might be similar on a byte level. We could try to use something like an MD5 hash of the file and compare it with our gold standard. This, however, will not work. Unless the WAV files are exactly the same, the MD5 hashes will most likely be completely unrelated. We could get slightly more sophisticated, and actually read the WAV file as a stream of bytes and compare each byte of our captured WAV with the baseline WAV. This approach, unfortunately, is also doomed to fail! Tiny differences in sound would lead to huge differences on a byte level. Also, if the timing of the two WAV files differs by even a single sample (and audio is sampled many thousands of times per second), the bytes will no longer line up, and our comparison will be utter garbage.
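
To make the problem concrete, here is a minimal sketch of both naive checks. It does not read a real WAV file: the byte array is a hypothetical stand-in for raw sample data (a simple repeating ramp), and the class and method names are made up for illustration. It flips one bit to show how the MD5 digest changes, then drops the first sample to show how a one-sample offset defeats a position-by-position byte comparison.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class NaiveAudioComparison {

    // Hex-encoded MD5 digest of a byte array
    static String md5Hex(byte[] data) {
        try {
            StringBuilder hex = new StringBuilder();
            for (byte b : MessageDigest.getInstance("MD5").digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Number of positions at which two equal-length arrays hold the same byte
    static int matchingBytes(byte[] a, byte[] b) {
        int matches = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] == b[i]) {
                matches++;
            }
        }
        return matches;
    }

    // Hypothetical stand-in for sample data: a ramp that wraps every 256 values
    static byte[] fakeSamples(int length) {
        byte[] samples = new byte[length];
        for (int i = 0; i < length; i++) {
            samples[i] = (byte) i;
        }
        return samples;
    }

    // Same signal, but starting one sample later (the last slot is padded with 0)
    static byte[] dropFirstSample(byte[] samples) {
        byte[] shifted = new byte[samples.length];
        System.arraycopy(samples, 1, shifted, 0, samples.length - 1);
        return shifted;
    }

    public static void main(String[] args) {
        byte[] original = fakeSamples(1000);

        byte[] tweaked = original.clone();
        tweaked[500] ^= 1; // flip a single bit in a single sample
        System.out.println("original md5: " + md5Hex(original));
        System.out.println("tweaked md5:  " + md5Hex(tweaked));

        byte[] shifted = dropFirstSample(original);
        System.out.println("bytes still aligned after a one-sample shift: "
            + matchingBytes(original, shifted) + " of " + original.length);
    }
}
```

Real recordings are messier than a ramp, but the effect is the same: neither check degrades gracefully as the audio drifts, which is the property we need from a similarity measure.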

What we will do instead is take advantage of work that has been done in the world of audio fingerprinting. Fingerprinting is what lies behind services like Shazam or last.fm that can detect what song is being played even though it might be recorded through your phone's microphone. Fingerprinting is a complex algorithm that takes into account various acoustic properties of WAV file segments and produces what is essentially a hash of a piece of audio. The crucial thing is that similar audio files will produce more similar hashes, so they can actually be fruitfully compared with one another.

Chromaprint

The fingerprinting library we will use is called Chromaprint, and you will need to download the appropriate version for your system. Just like with ffmpeg, we will run the Chromaprint binary as a Java subprocess. The way we'd run it outside of Java, on the command line, would be like this:

# in the chromaprint directory
./fpcalc -raw /path/to/audio.wav

This will produce output that corresponds to the fingerprint of the audio file. Using the -raw flag means we get the raw numeric output rather than the base64-encoded output (which is nice and small but makes the comparison between fingerprints less effective). Running from the terminal, the output will look something like:

DURATION=9
FINGERPRINT=1663902633,1696875689,1709563129,1688656920,1688648712,1688644617,1688708873,1692705291,1684312587,1814528570,1832305258,1863766762,3993379306,3993380074,3997640938,3998550234,4275505354,4271783130,4137558234,1988825290,1451753930,1586000842,1586029546,3599411178,3603677163,3595079099,3590884507,1444650121,1448785096,1985647688,1985647640,1985581096,1983422504,1983438904,2000576584,1949176009,1949172170,1950220986,1954481834,1958288106,1958292202,1958247034,1957227082,1957211979,2011735113,1982377992,2116780040,2117763128,1041928232,1041948712,907960360,907878456,1728520216,1812338040,1812336376,1818103480

But we need to run this from Java, so we create a handy class that encapsulates all this fingerprinting business, including running Chromaprint's fpcalc binary. It will also be responsible for parsing the response and storing it in a way that makes comparison easy:

import java.io.File;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import me.xdrop.fuzzywuzzy.FuzzySearch;
import org.zeroturnaround.exec.ProcessExecutor;

class AudioFingerprint {
    // path to the fpcalc binary; replace with the location on your system
    private static String FPCALC = "/Users/jlipps/Desktop/chromaprint/fpcalc";

    private String fingerprint;

    AudioFingerprint(String fingerprint) {
        this.fingerprint = fingerprint;
    }

    public String getFingerprint() {
        return fingerprint;
    }

    // similarity score between 0 and 100 (100 is a perfect match)
    public double compare(AudioFingerprint other) {
        return FuzzySearch.partialRatio(this.getFingerprint(), other.getFingerprint());
    }

    public static AudioFingerprint calcFP(File wavFile) throws Exception {
        // run fpcalc as a subprocess and collect its output
        String output = new ProcessExecutor()
            .command(FPCALC, "-raw", wavFile.getAbsolutePath())
            .readOutput(true).execute()
            .outputUTF8();

        // pull the comma-separated values out of the FINGERPRINT= line
        Pattern fpPattern = Pattern.compile("^FINGERPRINT=(.+)$", Pattern.MULTILINE);
        Matcher fpMatcher = fpPattern.matcher(output);

        String fingerprint = null;
        if (fpMatcher.find()) {
            fingerprint = fpMatcher.group(1);
        }

        if (fingerprint == null) {
            throw new Exception("Could not get fingerprint via Chromaprint fpcalc");
        }

        return new AudioFingerprint(fingerprint);
    }
}

Basically, what's going on here is that we are setting a path to the fpcalc binary, and then using the ProcessExecutor Java library (from the good folks at ZeroTurnaround) to make executing fpcalc very easy. We then use regular expression matching on the output to extract a fingerprint from an audio file. Most of the code here is simply Java class boilerplate and regular expression logic!
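
The parsing step can be tried on its own, without fpcalc installed. The following is a small sketch that applies the same regular expression to a hard-coded string shaped like the output shown earlier (truncated to three values). The class and method names are hypothetical, and the conversion to long values is an extra that the class above does not perform; it is included only to show that the raw values are unsigned 32-bit numbers and can exceed the range of a Java int.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FpcalcOutputParser {

    private static final Pattern FP_PATTERN =
        Pattern.compile("^FINGERPRINT=(.+)$", Pattern.MULTILINE);

    // Returns the comma-separated fingerprint values from raw fpcalc output
    static String extractFingerprint(String output) {
        Matcher m = FP_PATTERN.matcher(output);
        if (!m.find()) {
            throw new IllegalArgumentException("No FINGERPRINT line in output");
        }
        return m.group(1).trim();
    }

    // Splits the fingerprint into numbers; long is used because values
    // such as 4275505354 do not fit in a signed 32-bit int
    static long[] toValues(String fingerprint) {
        String[] parts = fingerprint.split(",");
        long[] values = new long[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Long.parseLong(parts[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        // Sample text in the shape of `fpcalc -raw` output, shortened for the example
        String output = "DURATION=9\nFINGERPRINT=1663902633,1696875689,4275505354\n";
        String fingerprint = extractFingerprint(output);
        System.out.println(fingerprint);
        System.out.println(toValues(fingerprint).length + " values");
    }
}
```

Keeping the parsing separate from the subprocess call like this also makes it easy to unit test the regular expression against saved output.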

Comparing fingerprints

The most crucial bit is the compare method, where we are making use of something called the Levenshtein distance between strings to figure out how similar two audio fingerprints actually are. To this end I'm using a library called JavaWuzzy (a port of the useful Python library FuzzyWuzzy), which contains the important algorithms so I don't need to worry about implementing them. The response of my call to the partialRatio method is a number between 0 and 100, where 100 is a perfect match and 0 signifies no matching segments at all.
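
For readers who have not met Levenshtein distance before, here is a rough sketch of the idea in plain Java. It is not JavaWuzzy's implementation and does not reproduce partialRatio, which scores the best-matching substring and uses its own formula. This version simply counts the single-character edits needed to turn one string into the other and scales that count to a 0 to 100 score. The class and method names are hypothetical.

```java
public class LevenshteinSketch {

    // Minimum number of single-character insertions, deletions, and
    // substitutions needed to turn a into b (two-row dynamic programming)
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // building b's prefix from an empty string
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting all of a's prefix
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                    Math.min(curr[j - 1] + 1, prev[j] + 1),
                    prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Simple normalized score: 100 for identical strings, 0 when every
    // character of the longer string would have to change
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 100.0;
        }
        return 100.0 * (longest - distance(a, b)) / longest;
    }

    public static void main(String[] args) {
        // Two short fingerprint prefixes that differ in a single value
        String fp1 = "1663902633,1696875689,1709563129";
        String fp2 = "1663902633,1696875689,1688656920";
        System.out.println("distance:   " + distance(fp1, fp2));
        System.out.println("similarity: " + similarity(fp1, fp2));
    }
}
```

Because the score falls gradually as the strings diverge, it gives us the tolerance for small differences that hashes and byte comparisons lack.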

All we need to do then is hook this class up into our test so that we can fingerprint both our newly captured audio and the baseline audio, and then run the comparison. In my experiments, I was able to achieve a value of about 75 for a correct comparison, whereas other song snippets came in at an appropriately low value, say 45. Of course, you'll want to determine through experimentation what your similarity threshold should be, based on the particular audio domain, clip duration, and so on.

Hooking in the new code is relatively easy (starting from the point in the test method where we get the audioCapture file populated with the new audio):

// now we calculate the fingerprint of the freshly-captured audio...
AudioFingerprint fp1 = AudioFingerprint.calcFP(audioCapture);

// as well as the fingerprint of our baseline audio...
AudioFingerprint fp2 = AudioFingerprint.calcFP(getReferenceAudio());

// and compare the two
double comparison = fp1.compare(fp2);

// finally, we assert that the comparison is sufficiently strong
Assert.assertThat(comparison, Matchers.greaterThanOrEqualTo(70.0));

Here, I've added a helper method called getReferenceAudio() to get me the baseline audio File object from the resources directory. And notice the assertion in the final line, which turns this bit of automation into a bona fide test of audio similarity!

So, when all is said and done, it is possible to test audio with Appium and Java (and since we are using ffmpeg and Chromaprint as subprocesses, the same technique can be used in any other programming language as well). This is relatively unexplored territory, though, so I would expect there to be a certain amount of potential flakiness for this kind of test. That being said, the Chromaprint fingerprinting algorithm is used commercially and appears to be quite solid, so at the end of the day the quality of the test will depend on the quality, duration, and type of your audio. Please do let me know if you put this into practice, as I'd love to hear any case reports of this technique. And don't forget to check out the full code sample on GitHub, to see everything in context. Happy testing, and happy listening! Oh, and in case you really need to know: yes, my band will be coming out with a new studio album very soon!

Author's Profile

Jonathan Lipps

Author's Profile

Piali Mazumdar

Lead, Content Marketing, HeadSpin Inc.

Piali is a dynamic and results-driven Content Marketing Specialist with 8+ years of experience in crafting engaging narratives and marketing collateral across diverse industries. She excels in collaborating with cross-functional teams to develop innovative content strategies and deliver compelling, authentic, and impactful content that resonates with target audiences and enhances brand authenticity.
