Maestro is the fastest and easiest way to create end-to-end UI tests for your mobile app — no matter whether it’s built with native or cross-platform technologies. Maestro isn’t an “Android testing framework” or “Flutter testing framework” — it’s a “UI” testing framework, thanks to the fact that it depends only on accessibility information.

In this article you’ll learn about a problem that our users faced when testing Flutter apps — and how we solved it by making a contribution to the Flutter open source project!

A word about Flutter

For those of you who aren’t familiar with Flutter, it’s a cross-platform framework for building apps using the Dart language. Almost every piece of UI in Flutter (whether a button, a list, or a whole screen) is called a widget.

A unique characteristic of Flutter is that it works like a game engine: the widgets are rendered directly onto a canvas. This has many advantages, but since no native components are used, the semantics information describing the UI cannot be provided automatically, as it is in the default UI frameworks provided by Android and iOS. Instead, Flutter widgets construct the whole semantics information tree themselves.

Semantics in Flutter

That’s where a component of Flutter called AccessibilityBridge comes into play. It translates the virtual semantics tree that lives inside the Flutter app into a native semantics tree that is understood by screen readers such as TalkBack or VoiceOver, and by accessibility-based testing frameworks like Maestro!

Of course this is simplified (Flutter maintains accessibility information in another tree called semantics tree), but the gist remains the same.

Thanks to Flutter’s excellent support for accessibility, Maestro has always worked great with it.

Keys in Flutter

If you’ve ever used a reactive framework (like React or Flutter), you’re probably familiar with the concept of keys. Every widget in Flutter accepts an optional “key” argument. Keys are used by the framework to control how one widget replaces another — a case that happens often in editable lists, for example. A key can also be used to uniquely identify a widget on the screen, although it’s not its primary purpose.

Problem statement

A few months ago, a Maestro user filed an interesting feature request on GitHub. He wanted Maestro to be able to interact with a Flutter app on the basis of keys:

In other words, he wanted to be able to write this code in Maestro flows:

- tapOn:
    flutterKey: add_comment
- tapOn:
    flutterKey: comment_text_field
- inputText: Mobile UI Testing, Simplified.

This feature request turned out to be important for Maestro users — it quickly reached 30 likes, becoming the second most-upvoted issue on our GitHub repo!

With such strong feedback from the community, we decided to look into it. Fortunately, the issue author also shared his own early draft implementation, built with the flutter_driver package.

This is what open-source maintainers love the most — a high-quality feature request with some draft implementation from the author. Even though in the end we went with a different approach, that draft code made the initial evaluation much quicker. Huge kudos to Eduardo!

Enter: flutter_driver

flutter_driver is Flutter’s original solution to end-to-end UI testing. To use it, the developer has to add a dependency on the flutter_driver package and initialize it during app startup. Then they write the test script in Dart, which is executed on the host machine. The test script communicates with the running app by sending JSON-RPC commands (such as tap or enterText) over WebSockets.

Since the driver script talks with the same Dart VM that runs the Flutter app, it can do all sorts of things, such as simulating taps and entering text. This also includes finding widgets by key:

void main() async {
  final driver = await FlutterDriver.connect();
  await driver.tap(find.byValueKey('add_comment'));
  await driver.close();
}

An important limitation of flutter_driver is that it can only interact with UI rendered by Flutter — i.e. native UI elements such as permission dialogs are out of reach. Fortunately, this is not a problem since Maestro already interacts with native UI by using semantic information.

So far so good — where’s the catch?

What’s not OK with flutter_driver

For a few years now, flutter_driver has been slowly phased out by the Flutter team. There are no signs of it being removed (likely because many existing tests, including inside Google, depend on it), but it receives almost no support, bugfixes, or development. This can be seen in the large number of unaddressed issues.

Flutter developers are encouraged to migrate from flutter_driver to the integration_test plugin. The integration_test plugin addresses some of flutter_driver’s pain points (like being fully black-box, and communicating only by sending JSONs back and forth), but introduces a few new ones, and it still isn’t a viable solution for any serious UI testing. It also hasn’t received any significant update since it was released in Flutter 2.0 in Q1 2021.

Now you understand why we didn’t see flutter_driver as a stable foundation. But apart from the stability and maintenance standpoints, there were other serious downsides.

Doesn’t work in release builds

flutter_driver works only in apps built in debug and profile mode. This limitation comes from the fact that it’s implemented as a Dart VM service extension — and such service extensions are disabled in release builds. Changing this behavior would require forking the package and ditching service extensions in favor of e.g. a plain old HTTP server.

Requires modification of app code

The app developer has to include flutter_driver in the pubspec.yaml file, and add some initialization code. This voids one of Maestro’s most loved features — blissfully simple setup, or rather the lack of it.
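To make that cost concrete, here is a minimal sketch of what each app under test would typically have to declare. It assumes the common setup where flutter_driver is pulled in from the Flutter SDK as a dev dependency; the exact section and the rest of the file depend on the project.

```yaml
# pubspec.yaml (fragment, illustrative)
# Assumption: flutter_driver is used only from a separate test entrypoint,
# so it is listed under dev_dependencies rather than dependencies.
dev_dependencies:
  flutter_driver:
    sdk: flutter
```

On top of this, the app needs an instrumented entrypoint that enables the driver extension before calling runApp, which is exactly the kind of per-app setup Maestro tries to avoid.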

Context switching

Even if we added the flutterKey argument to selectors, it’d only work with the Flutter app being currently tested, thus forcing Maestro users to keep track of the test’s current “context” (“is my Flutter app currently running?”). Definitely not intuitive enough.

Simple cases like tap or inputText would likely work just fine, but what about more complex ones? One of Maestro’s great features is its powerful selector system — you can write commands as complex as:

- tapOn:
    below: "View above that has this text"
    above:
        id: "view_below_id"
    leftOf: "View to the right has this text"
    rightOf: "View to the left has this text"
    containsChild: "Text in a child view"
    childOf:
        - id: "buy-now"
    containsDescendants:
        - id: "title_id"
          text: "A descendant view has id 'title_id' and this text"
        - "Another descendant view has this text"

How could we make it work with flutter_driver? There’s no obvious answer. Some fancy tree merging? Remember that Maestro aims to be the simplest and most effective testing framework?

Bad precedent

Overall, this approach would require us to “teach” Maestro about Flutter’s existence, and we really don’t want to do that. Remember that one of Maestro’s killer features is that it works equally well no matter the technology you write your app in? It does so by using only the semantics information produced by your app (to learn more about how it works under the hood, check out this article).

Adding special support for Flutter would be a bad precedent — we’d deviate away from the accessibility-only approach and have to maintain additional code that exists just for the sake of supporting Flutter.

At the drawing board, again

We weren’t happy with the flutter_driver approach.

Just from studying the draft implementation provided by Eduardo (the original issue author) and doing some thought experiments (informed by my own previous experience building test tooling for Flutter), we had already found many potential problems.

We took a step back. Are we even trying to solve the right problem?

At the beginning I showed how semantics information from the Flutter app is converted by AccessibilityBridge into native semantics. It’s a complex system, but from the app developer’s perspective it comes down to using the Semantics widget to wrap widgets to be annotated with semantic information:

Semantics(
  label: 'A weird red box widget',
  child: InkWell(
    onTap: () => launchUrl(Uri.parse('https://youtu.be/YlUKcNNmywk?t=148')),
    child: Container(width: 100, height: 100, color: Colors.red),
  ),
)

The Semantics.label field is transformed by AccessibilityBridge into the accessibility label, which is called text in Maestro’s selectors:

- tapOn:
    text: "A weird red box widget"

The problem with accessibility label is that it’s read out to users by screen readers — well, that’s actually its purpose — and you don’t want to replace helpful content with some weird identifiers for use by UI testing. Don’t be evil!

Fortunately, native semantics on both Android and iOS also support an accessibility identifier. It works like the accessibility label, but exists only for UI testing. Screen readers won’t read it out.

Maestro has always supported interaction on the basis of semantics identifier:

- tapOn:
    id: login_button

So why doesn’t it work in Flutter? The answer is simple: the Semantics widget doesn’t have a property for a semantics identifier.

A quick Google search also revealed the Flutter issue aptly named “Missing accessibility id for testing purposes”.

It was the most-upvoted Flutter issue related to testing. Created in May 2018, it accumulated 129 upvotes. Turns out we were not the only ones to be annoyed by missing accessibility identifier!

If we solved this problem, we’d benefit not only Maestro users, but literally everyone who is serious about testing their Flutter apps.

Plan B — let’s fix Flutter

It took a while to go through that issue: 120 comments over 5 years is quite a read! Having done it, I already had a pretty good idea of what kind of change was needed. The Semantics widget must get an identifier property:

Semantics(
  identifier: 'tap_me',
  label: 'A weird red box widget',
  child: InkWell(
    onTap: () => launchUrl(Uri.parse('https://youtu.be/YlUKcNNmywk?t=147')),
    child: Container(width: 100, height: 100, color: Colors.red),
  ),
)

To tap on the above widget by semantics identifier:

- tapOn:
    id: "tap_me"
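Put together, a complete flow might look like the sketch below. The file name and appId are hypothetical placeholders. The one thing that matters is that the id value matches the string passed to Semantics.identifier exactly, underscore included.

```yaml
# flow.yaml (hypothetical file name)
# Assumption: com.example.app stands in for your app's real identifier.
appId: com.example.app
---
- launchApp
# Check that the widget exposing the identifier is on screen...
- assertVisible:
    id: "tap_me"
# ...then tap it by identifier rather than by its screen-reader label.
- tapOn:
    id: "tap_me"
```

Because the selector goes through the regular id field, nothing in the flow is Flutter-specific. The same file would work against a native view that exposes the same identifier.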

Pros and cons

The new “fix Flutter approach” seemed enticing. The advantages were obvious:

High reward — things will just start to work
Stays true to Maestro’s philosophy — zero setup, ease of use, and simplicity
Has none of the problems of flutter_driver

There were also quite a few uncertainties:

Requires a (potentially complex) contribution to an external project
A contribution that project owners may not even be willing to accept
Even if accepted, it’ll only be generally available in the next major Flutter release, which happens every quarter
App developers will be forced to upgrade their Flutter version to take advantage of it

Fortunately this feature request wasn’t urgent, so we decided to take our time and do it right. We could always fall back and reconsider the “flutter_driver approach” if the “fix Flutter approach” failed.

Implementation

The most fun part!

Even though the feature we contributed seems simple, it required changes in all layers of Flutter — the Framework, the Engine, and the Android and iOS embedders — and writing some code using a whole zoo of programming languages — Dart, Java, Objective-C++, and C++.

The Flutter project is split into two major repositories — the Framework (flutter/flutter) and the Engine (flutter/engine).

Making changes in the Framework was fast and pleasant, while the Engine was a whole different beast. Hundreds of thousands of lines of code. Waiting an hour to compile didn’t make things fun, though fortunately incremental builds took only a minute or two.

The Flutter project also enforces that each and every PR must be green before being merged to master. This caused difficulties because we had to change some internal APIs on the Framework ↔ Engine boundary, which resulted in red CI.

This intentional circular dependency enforces that Engine changes which aren’t compatible with the Framework cannot be merged.

We worked around that by introducing a temporary API and carefully adding the identifier property. This resulted in 5 pull requests:

The process was quite involved, but it makes perfect sense in a huge project like Flutter.

Results

Success! 🎉

The new Semantics.identifier property is present in native accessibility tree on Android and iOS and works seamlessly in both Maestro and Maestro Studio.

Sort of accidentally we also solved the #1 most-upvoted testing issue in Flutter — and we did that for users of all mobile UI testing frameworks, not just Maestro! So if you’re still using Appium to test Flutter apps and don’t want to switch to Maestro just yet, you can still benefit from our contribution.

The community was also happy that a 5-year-old issue has been fixed 🤠

Try it out!

Semantics.identifier is already available on the latest stable Flutter 3.19 release:

$ flutter channel stable
$ flutter upgrade
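If you want the project itself to state that it relies on this property, one option is a Flutter version constraint in pubspec.yaml. This is only a sketch: the lower bound assumes 3.19.0 is the first stable release you want to support, and your existing environment section will contain other entries.

```yaml
# pubspec.yaml (fragment, illustrative)
# Assumption: 3.19.0 is the minimum Flutter version, per the release noted above.
environment:
  flutter: ">=3.19.0"
```

With this in place, teammates on an older Flutter version get a version-solving error from pub instead of a confusing compile error about a missing identifier parameter.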

We have also updated our Flutter documentation — it now not only explains how to use Semantics.identifier, but it’s much more extensive and helpful 🙌

That’s all for now — until the next time!

If you haven’t yet tried Maestro to test your Flutter app, do it — we promise it’ll blow your mind. Get started here and make sure to join us on Slack!

If you’re curious about how the iOS driver works under the hood, check out this article.

The power of open-source. Making Maestro work better with Flutter