Tag Archives: User Experience

Sensing Force-Based Gestures on the Pixel 4



Touch input has traditionally focussed on two-dimensional finger pointing. Beyond tapping and swiping gestures, long pressing has been the main alternative path for interaction. However, a long press is sensed with a time-based threshold where a user’s finger must remain stationary for 400–500 ms. By its nature, a time-based threshold has negative effects for usability and discoverability as the lack of immediate feedback disconnects the user’s action from the system’s response. Fortunately, fingers are dynamic input devices that can express more than just location: when a user touches a surface, their finger can also express some level of force, which can be used as an alternative to a time-based threshold.

While a variety of force-based interactions have been pursued, sensing touch force requires dedicated hardware sensors that are expensive to design and integrate. Further, research indicates that touch force is difficult for people to control, and so most practical force-based interactions focus on discrete levels of force (e.g., a soft vs. firm touch) — which do not require the full capabilities of a hardware force sensor.

For a recent update to the Pixel 4, we developed a method for sensing force gestures that allowed us to deliver a more expressive touch interaction experience By studying how the human finger interacts with touch sensors, we designed the experience to complement and support the long-press interactions that apps already have, but with a more natural gesture. In this post we describe the core principles of touch sensing and finger interaction, how we designed a machine learning algorithm to recognise press gestures from touch sensor data, and how we integrated it into the user experience for Pixel devices.

Touch Sensor Technology and Finger Biomechanics
A capacitive touch sensor is constructed from two conductive electrodes (a drive electrode and a sense electrode) that are separated by a non-conductive dielectric (e.g., glass). The two electrodes form a tiny capacitor (a cell) that can hold some charge. When a finger (or another conductive object) approaches this cell, it ‘steals’ some of the charge, which can be measured as a drop in capacitance. Importantly, the finger doesn’t have to come into contact with the electrodes (which are protected under another layer of glass) as the amount of charge stolen is inversely proportional to the distance between the finger and the electrodes.
Left: A finger interacts with a touch sensor cell by ‘stealing’ charge from the projected field around two electrodes. Right: A capacitive touch sensor is constructed from rows and columns of electrodes, separated by a dielectric. The electrodes overlap at cells, where capacitance is measured.
The cells are arranged as a matrix over the display of a device, but with a much lower density than the display pixels. For instance, the Pixel 4 has a 2280 × 1080 pixel display, but a 32 × 15 cell touch sensor. When scanned at a high resolution (at least 120 Hz), readings from these cells form a video of the finger’s interaction.
Slowed touch sensor recordings of a user tapping (left), pressing (middle), and scrolling (right).
Capacitive touch sensors don’t respond to changes in force per se, but are tuned to be highly sensitive to changes in distance within a couple of millimeters above the display. That is, a finger contact on the display glass should saturate the sensor near its centre, but will retain a high dynamic range around the perimeter of the finger’s contact (where the finger curls up).

When a user’s finger presses against a surface, its soft tissue deforms and spreads out. The nature of this spread depends on the size and shape of the user’s finger, and its angle to the screen. At a high level, we can observe a couple of key features in this spread (shown in the figures): it is asymmetric around the initial contact point, and the overall centre of mass shifts along the axis of the finger. This is also a dynamic change that occurs over some period of time, which differentiates it from contacts that have a long duration or a large area.
Touch sensor signals are saturated around the centre of the finger’s contact, but fall off at the edges. This allows us to sense small deformations in the finger’s contact shape caused by changes in the finger’s force.
However, the differences between users (and fingers) makes it difficult to encode these observations with heuristic rules. We therefore designed a machine learning solution that would allow us to learn these features and their variances directly from user interaction samples.

Machine Learning for Touch Interaction
We approached the analysis of these touch signals as a gesture classification problem. That is, rather than trying to predict an abstract parameter, such as force or contact spread, we wanted to sense a press gesture — as if engaging a button or a switch. This allowed us to connect the classification to a well-defined user experience, and allowed users to perform the gesture during training at a comfortable force and posture.

Any classification model we designed had to operate within users’ high expectations for touch experiences. In particular, touch interaction is extremely latency-sensitive and demands real-time feedback. Users expect applications to be responsive to their finger movements as they make them, and application developers expect the system to deliver timely information about the gestures a user is performing. This means that classification of a press gesture needs to occur in real-time, and be able to trigger an interaction at the moment the finger’s force reaches its apex.

We therefore designed a neural network that combined convolutional (CNN) and recurrent (RNN) components. The CNN could attend to the spatial features we observed in the signal, while the RNN could attend to their temporal development. The RNN also helps provide a consistent runtime experience: each frame is processed by the network as it is received from the touch sensor, and the RNN state vectors are preserved between frames (rather than processing them in batches). The network was intentionally kept simple to minimise on-device inference costs when running concurrently with other applications (taking approximately 50 µs of processing per frame and less than 1 MB of memory using TensorFlow Lite).
An overview of the classification model’s architecture.
The model was trained on a dataset of press gestures and other common touch interactions (tapping, scrolling, dragging, and long-pressing without force). As the model would be evaluated after each frame, we designed a loss function that temporally shaped the label probability distribution of each sample, and applied a time-increasing weight to errors. This ensured that the output probabilities were temporally smooth and converged towards the correct gesture classification.

User Experience Integration
Our UX research found that it was hard for users to discover force-based interactions, and that users frequently confused a force press with a long press because of the difficulty in coordinating the amount of force they were applying with the duration of their contact. Rather than creating a new interaction modality based on force, we therefore focussed on improving the user experience of long press interactions by accelerating them with force in a unified press gesture. A press gesture has the same outcome as a long press gesture, whose time threshold remains effective, but provides a stronger connection between the outcome and the user’s action when force is used.
A user long pressing (left) and firmly pressing (right) on a launcher icon.
This also means that users can take advantage of this gesture without developers needing to update their apps. Applications that use Android’s GestureDetector or View APIs will automatically get these press signals through their existing long-press handlers. Developers that implement custom long-press detection logic can receive these press signals through the MotionEvent classification API introduced in Android Q.

Through this integration of machine-learning algorithms and careful interaction design, we were able to deliver a more expressive touch experience for Pixel users. We plan to continue researching and developing these capabilities to refine the touch experience on Pixel, and explore new forms of touch interaction.

Acknowledgements
This project is a collaborative effort between the Android UX, Pixel software, and Android framework teams.

Source: Google AI Blog


Toward Human-Centered Design for ML Frameworks



As machine learning (ML) increasingly impacts diverse stakeholders and social groups, it has become necessary for a broader range of developers — even those without formal ML training — to be able to adapt and apply ML to their own problems. In recent years, there have been many efforts to lower the barrier to machine learning, by abstracting complex model behavior into higher-level APIs. For instance, Google has been developing TensorFlow.js, an open-source framework that lets developers write ML code in JavaScript to run directly in web browsers. Despite the abundance of engineering work towards improving APIs, little is known about what non-ML software developers actually need to successfully adopt ML into their daily work practices. Specifically, what do they struggle with when trying modern ML frameworks, and what do they want these frameworks to provide?

In “Software Developers Learning Machine Learning: Motivations, Hurdles, and Desires,” which received a Best Paper Award at the IEEE conference on Visual Languages and Human-Centric Computing (VL/HCC), we share our research on these questions and report the results from a large-scale survey of 645 people who used TensorFlow.js. The vast majority of respondents were software or web developers, who were fairly new to machine learning and usually did not use ML as part of their primary job. We examined the hurdles experienced by developers when using ML frameworks and explored the features and tools that they felt would best assist in their adoption of these frameworks into their programming workflows.

What Do Developers Struggle With Most When Using ML Frameworks?
Interestingly, by far the most common challenge reported by developers was not the lack of a clear API, but rather their own lack of conceptual understanding of ML, which hindered their ability to successfully use ML frameworks. These hurdles ranged from the initial stages of picking a good problem to which they could apply TensorFlow.js (e.g., survey respondents reported not knowing “what to apply ML to, where ML succeeds, where it sucks”), to creating the architecture of a neural net (e.g., “how many units [do] I have to put in when adding layers to the model?”) and knowing how to set and tune parameters during model training (e.g., “deciding what optimizers, loss functions to use”). Without a conceptual understanding of how different parameters affect outcomes, developers often felt overwhelmed by the seemingly infinite space of parameters to tune when debugging ML models.

Without sufficient conceptual support, developers also found it hard to transfer lessons learned from “hello world” API tutorials to their own real-world problems. While API tutorials provide syntax for implementing specific models (e.g., classifying MNIST digits), they typically don't provide the underlying conceptual scaffolding necessary to generalize beyond that specific problem.

Developers often attributed these challenges to their own lack of experience in advanced mathematics. Ironically, despite the abundance of non-experts tinkering with ML frameworks nowadays, many felt that ML frameworks were intended for specialists with advanced training in linear algebra and calculus, and thus not meant for general software developers or product managers. This semblance of imposter syndrome may be fueled by the prevalence of esoteric mathematical terminology in API documentation, which may unintentionally give the impression that an advanced math degree is necessary for even practical integration of ML into software projects. Though math training is indeed beneficial, the ability to grasp and apply practical concepts (e.g., a model’s learning rate) to real-world problems does not require an advanced math degree.

What Do Developers Want From ML Frameworks?
Developers who responded to our survey wanted ML frameworks to teach them not only how to use the API, but also the unspoken idioms that would help them to effectively apply the framework to their own problems.

Pre-made Models with Explicit Support for Modification
A common desire was to have access to libraries of canonical ML models, so that they could modify an existing template rather than creating new ones from scratch. Currently, pre-trained models are being made more widely available in many ML platforms, including TensorFlow.js. However, in their current form, these models do not provide explicit support for novice consumption. For example, in our survey, developers reported substantial hurdles transferring and modifying existing model examples to their own use cases. Thus, the provision of pre-made ML models should also be coupled with explicit support for modification.

Synthesize ML Best Practices into Just-in-Time Hints
Developers also wished frameworks could provide ML best practices, i.e., practical tips and tricks that they could use when designing or debugging models. While ML experts may acquire heuristics and go-to strategies through years of dedicated trial and error, the mere decision overhead of “which parameter should I try tuning first?” can be overwhelming for developers who aren't ML experts. To help narrow this broad space of decision possibilities, ML frameworks could embed tips on best practices directly into the programming workflow. Currently, visualizations like TensorBoard and tfjs-vis make it possible to help see what's going on inside of their models.

Coupling these with just-in-time strategic pointers, such as whether to adapt a pre-trained model or to build one from scratch, or diagnostic checks, like practical tips to “decrease learning rate” if the model is not converging, could help users acquire and make use of practical strategies. These tips could serve as an intermediate scaffolding layer that helps demystify the math theory underlying ML into developer-friendly terms.

Support for Learning-by-Doing
Finally, even though ML frameworks are not traditional learning platforms, software developers are indeed treating them as lightweight vehicles for learning-by-doing. For example, one survey respondent appreciated when conceptual support was tightly interwoven into the framework, rather than being a separate resource: “...the small code demos that you can edit and run right there. Really helps basic understanding.” Another explained that “I prefer learning by doing, so I would like to see more tutorials, examples” embedded into ML frameworks. Some found it difficult to take a formal online course, and would rather learn in bite-sized pieces through hands-on tinkering: “Due to the rest of life, I have to fit learning into small 5-15 minute blocks.”

Given these desires to learn-by-doing, ML frameworks may need to more clearly distinguish between a spectrum of resources aimed at different levels of expertise. Although many frameworks already have “hello world” tutorials, to properly set expectations these frameworks could more explicitly differentiate between API (syntax-specific) onboarding and ML (conceptual) onboarding.

Looking Forward
Ultimately, as the frontiers of ML are still evolving, providing practical, conceptual tips for software developers and creating a shared reservoir of community-curated best practices can benefit ML experts and novices alike. Hopefully, these research findings pave the way for more user-centric designs of future ML frameworks.

Acknowledgements
This work would not have been possible without Yannick Assogba, Sandeep Gupta, Lauren Hannah-Murphy, Michael Terry, Ann Yuan, Nikhil Thorat, Daniel Smilkov, Martin Wattenberg, Fernanda Viegas, and members of PAIR and TensorFlow.js.

Source: Google AI Blog


Toward Human-Centered Design for ML Frameworks



As machine learning (ML) increasingly impacts diverse stakeholders and social groups, it has become necessary for a broader range of developers — even those without formal ML training — to be able to adapt and apply ML to their own problems. In recent years, there have been many efforts to lower the barrier to machine learning, by abstracting complex model behavior into higher-level APIs. For instance, Google has been developing TensorFlow.js, an open-source framework that lets developers write ML code in JavaScript to run directly in web browsers. Despite the abundance of engineering work towards improving APIs, little is known about what non-ML software developers actually need to successfully adopt ML into their daily work practices. Specifically, what do they struggle with when trying modern ML frameworks, and what do they want these frameworks to provide?

In “Software Developers Learning Machine Learning: Motivations, Hurdles, and Desires,” which received a Best Paper Award at the IEEE conference on Visual Languages and Human-Centric Computing (VL/HCC), we share our research on these questions and report the results from a large-scale survey of 645 people who used TensorFlow.js. The vast majority of respondents were software or web developers, who were fairly new to machine learning and usually did not use ML as part of their primary job. We examined the hurdles experienced by developers when using ML frameworks and explored the features and tools that they felt would best assist in their adoption of these frameworks into their programming workflows.

What Do Developers Struggle With Most When Using ML Frameworks?
Interestingly, by far the most common challenge reported by developers was not the lack of a clear API, but rather their own lack of conceptual understanding of ML, which hindered their ability to successfully use ML frameworks. These hurdles ranged from the initial stages of picking a good problem to which they could apply TensorFlow.js (e.g., survey respondents reported not knowing “what to apply ML to, where ML succeeds, where it sucks”), to creating the architecture of a neural net (e.g., “how many units [do] I have to put in when adding layers to the model?”) and knowing how to set and tune parameters during model training (e.g., “deciding what optimizers, loss functions to use”). Without a conceptual understanding of how different parameters affect outcomes, developers often felt overwhelmed by the seemingly infinite space of parameters to tune when debugging ML models.

Without sufficient conceptual support, developers also found it hard to transfer lessons learned from “hello world” API tutorials to their own real-world problems. While API tutorials provide syntax for implementing specific models (e.g., classifying MNIST digits), they typically don't provide the underlying conceptual scaffolding necessary to generalize beyond that specific problem.

Developers often attributed these challenges to their own lack of experience in advanced mathematics. Ironically, despite the abundance of non-experts tinkering with ML frameworks nowadays, many felt that ML frameworks were intended for specialists with advanced training in linear algebra and calculus, and thus not meant for general software developers or product managers. This semblance of imposter syndrome may be fueled by the prevalence of esoteric mathematical terminology in API documentation, which may unintentionally give the impression that an advanced math degree is necessary for even practical integration of ML into software projects. Though math training is indeed beneficial, the ability to grasp and apply practical concepts (e.g., a model’s learning rate) to real-world problems does not require an advanced math degree.

What Do Developers Want From ML Frameworks?
Developers who responded to our survey wanted ML frameworks to teach them not only how to use the API, but also the unspoken idioms that would help them to effectively apply the framework to their own problems.

Pre-made Models with Explicit Support for Modification
A common desire was to have access to libraries of canonical ML models, so that they could modify an existing template rather than creating new ones from scratch. Currently, pre-trained models are being made more widely available in many ML platforms, including TensorFlow.js. However, in their current form, these models do not provide explicit support for novice consumption. For example, in our survey, developers reported substantial hurdles transferring and modifying existing model examples to their own use cases. Thus, the provision of pre-made ML models should also be coupled with explicit support for modification.

Synthesize ML Best Practices into Just-in-Time Hints
Developers also wished frameworks could provide ML best practices, i.e., practical tips and tricks that they could use when designing or debugging models. While ML experts may acquire heuristics and go-to strategies through years of dedicated trial and error, the mere decision overhead of “which parameter should I try tuning first?” can be overwhelming for developers who aren't ML experts. To help narrow this broad space of decision possibilities, ML frameworks could embed tips on best practices directly into the programming workflow. Currently, visualizations like TensorBoard and tfjs-vis make it possible to help see what's going on inside of their models.

Coupling these with just-in-time strategic pointers, such as whether to adapt a pre-trained model or to build one from scratch, or diagnostic checks, like practical tips to “decrease learning rate” if the model is not converging, could help users acquire and make use of practical strategies. These tips could serve as an intermediate scaffolding layer that helps demystify the math theory underlying ML into developer-friendly terms.

Support for Learning-by-Doing
Finally, even though ML frameworks are not traditional learning platforms, software developers are indeed treating them as lightweight vehicles for learning-by-doing. For example, one survey respondent appreciated when conceptual support was tightly interwoven into the framework, rather than being a separate resource: “...the small code demos that you can edit and run right there. Really helps basic understanding.” Another explained that “I prefer learning by doing, so I would like to see more tutorials, examples” embedded into ML frameworks. Some found it difficult to take a formal online course, and would rather learn in bite-sized pieces through hands-on tinkering: “Due to the rest of life, I have to fit learning into small 5-15 minute blocks.”

Given these desires to learn-by-doing, ML frameworks may need to more clearly distinguish between a spectrum of resources aimed at different levels of expertise. Although many frameworks already have “hello world” tutorials, to properly set expectations these frameworks could more explicitly differentiate between API (syntax-specific) onboarding and ML (conceptual) onboarding.

Looking Forward
Ultimately, as the frontiers of ML are still evolving, providing practical, conceptual tips for software developers and creating a shared reservoir of community-curated best practices can benefit ML experts and novices alike. Hopefully, these research findings pave the way for more user-centric designs of future ML frameworks.

Acknowledgements
This work would not have been possible without Yannick Assogba, Sandeep Gupta, Lauren Hannah-Murphy, Michael Terry, Ann Yuan, Nikhil Thorat, Daniel Smilkov, Martin Wattenberg, Fernanda Viegas, and members of PAIR and TensorFlow.js.

Source: Google AI Blog


Toward Human-Centered Design for ML Frameworks



As machine learning (ML) increasingly impacts diverse stakeholders and social groups, it has become necessary for a broader range of developers — even those without formal ML training — to be able to adapt and apply ML to their own problems. In recent years, there have been many efforts to lower the barrier to machine learning, by abstracting complex model behavior into higher-level APIs. For instance, Google has been developing TensorFlow.js, an open-source framework that lets developers write ML code in JavaScript to run directly in web browsers. Despite the abundance of engineering work towards improving APIs, little is known about what non-ML software developers actually need to successfully adopt ML into their daily work practices. Specifically, what do they struggle with when trying modern ML frameworks, and what do they want these frameworks to provide?

In “Software Developers Learning Machine Learning: Motivations, Hurdles, and Desires,” which received a Best Paper Award at the IEEE conference on Visual Languages and Human-Centric Computing (VL/HCC), we share our research on these questions and report the results from a large-scale survey of 645 people who used TensorFlow.js. The vast majority of respondents were software or web developers, who were fairly new to machine learning and usually did not use ML as part of their primary job. We examined the hurdles experienced by developers when using ML frameworks and explored the features and tools that they felt would best assist in their adoption of these frameworks into their programming workflows.

What Do Developers Struggle With Most When Using ML Frameworks?
Interestingly, by far the most common challenge reported by developers was not the lack of a clear API, but rather their own lack of conceptual understanding of ML, which hindered their ability to successfully use ML frameworks. These hurdles ranged from the initial stages of picking a good problem to which they could apply TensorFlow.js (e.g., survey respondents reported not knowing “what to apply ML to, where ML succeeds, where it sucks”), to creating the architecture of a neural net (e.g., “how many units [do] I have to put in when adding layers to the model?”) and knowing how to set and tune parameters during model training (e.g., “deciding what optimizers, loss functions to use”). Without a conceptual understanding of how different parameters affect outcomes, developers often felt overwhelmed by the seemingly infinite space of parameters to tune when debugging ML models.

Without sufficient conceptual support, developers also found it hard to transfer lessons learned from “hello world” API tutorials to their own real-world problems. While API tutorials provide syntax for implementing specific models (e.g., classifying MNIST digits), they typically don't provide the underlying conceptual scaffolding necessary to generalize beyond that specific problem.

Developers often attributed these challenges to their own lack of experience in advanced mathematics. Ironically, despite the abundance of non-experts tinkering with ML frameworks nowadays, many felt that ML frameworks were intended for specialists with advanced training in linear algebra and calculus, and thus not meant for general software developers or product managers. This semblance of imposter syndrome may be fueled by the prevalence of esoteric mathematical terminology in API documentation, which may unintentionally give the impression that an advanced math degree is necessary for even practical integration of ML into software projects. Though math training is indeed beneficial, the ability to grasp and apply practical concepts (e.g., a model’s learning rate) to real-world problems does not require an advanced math degree.

What Do Developers Want From ML Frameworks?
Developers who responded to our survey wanted ML frameworks to teach them not only how to use the API, but also the unspoken idioms that would help them to effectively apply the framework to their own problems.

Pre-made Models with Explicit Support for Modification
A common desire was to have access to libraries of canonical ML models, so that they could modify an existing template rather than creating new ones from scratch. Currently, pre-trained models are being made more widely available in many ML platforms, including TensorFlow.js. However, in their current form, these models do not provide explicit support for novice consumption. For example, in our survey, developers reported substantial hurdles transferring and modifying existing model examples to their own use cases. Thus, the provision of pre-made ML models should also be coupled with explicit support for modification.

Synthesize ML Best Practices into Just-in-Time Hints
Developers also wished frameworks could provide ML best practices, i.e., practical tips and tricks that they could use when designing or debugging models. While ML experts may acquire heuristics and go-to strategies through years of dedicated trial and error, the mere decision overhead of “which parameter should I try tuning first?” can be overwhelming for developers who aren't ML experts. To help narrow this broad space of decision possibilities, ML frameworks could embed tips on best practices directly into the programming workflow. Currently, visualizations like TensorBoard and tfjs-vis make it possible to help see what's going on inside of their models.

Coupling these with just-in-time strategic pointers, such as whether to adapt a pre-trained model or to build one from scratch, or diagnostic checks, like practical tips to “decrease learning rate” if the model is not converging, could help users acquire and make use of practical strategies. These tips could serve as an intermediate scaffolding layer that helps demystify the math theory underlying ML into developer-friendly terms.

Support for Learning-by-Doing
Finally, even though ML frameworks are not traditional learning platforms, software developers are indeed treating them as lightweight vehicles for learning-by-doing. For example, one survey respondent appreciated when conceptual support was tightly interwoven into the framework, rather than being a separate resource: “...the small code demos that you can edit and run right there. Really helps basic understanding.” Another explained that “I prefer learning by doing, so I would like to see more tutorials, examples” embedded into ML frameworks. Some found it difficult to take a formal online course, and would rather learn in bite-sized pieces through hands-on tinkering: “Due to the rest of life, I have to fit learning into small 5-15 minute blocks.”

Given these desires to learn-by-doing, ML frameworks may need to more clearly distinguish between a spectrum of resources aimed at different levels of expertise. Although many frameworks already have “hello world” tutorials, to properly set expectations these frameworks could more explicitly differentiate between API (syntax-specific) onboarding and ML (conceptual) onboarding.

Looking Forward
Ultimately, as the frontiers of ML are still evolving, providing practical, conceptual tips for software developers and creating a shared reservoir of community-curated best practices can benefit ML experts and novices alike. Hopefully, these research findings pave the way for more user-centric designs of future ML frameworks.

Acknowledgements
This work would not have been possible without Yannick Assogba, Sandeep Gupta, Lauren Hannah-Murphy, Michael Terry, Ann Yuan, Nikhil Thorat, Daniel Smilkov, Martin Wattenberg, Fernanda Viegas, and members of PAIR and TensorFlow.js.

Source: Google AI Blog


Using Deep Learning to Improve Usability on Mobile Devices



Tapping is the most commonly used gesture on mobile interfaces, and is used to trigger all kinds of actions ranging from launching an app to entering text. While the style of clickable elements (e.g., buttons) in traditional desktop graphical user interfaces is often conventionally defined, on mobile interfaces it can still be difficult for people to distinguish tappable versus non-tappable elements due to the diversity of styles. This confusion can lead to false affordances (e.g., a feature that could be mistaken for a button) and a lack of discoverability that can lead to user frustration, uncertainty, and errors. To avoid this, interface designers can conduct a study or a visual affordance test to help clarify the tappability of items in their interfaces. However, such studies are time-consuming and their findings are often limited to a specific app or interface design.

In our CHI'19 paper, "Modeling Mobile Interface Tappability Using Crowdsourcing and Deep Learning", we introduced an approach for modeling the usability of mobile interfaces at scale. We crowdsourced a task to study UI elements across a range of mobile apps to measure the perceived tappability by a user. Our model predictions were consistent with the user group at the ~90% level, demonstrating that a machine learning model can be effectively used to estimate the perceived tappability of interface elements in their design without the need for expensive and time consuming user testing.
Predicting Tappability with Deep Learning
Designers often use visual properties such as the color or depth of an element to signify its availability for interaction on interfaces, e.g., the blue color and underline of a link. While these common signifiers are useful, it is not always clear when to apply them in each specific design setting. Furthermore, with design trends evolving, traditional signifiers are constantly being altered and challenged, potentially causing user uncertainty and mistakes.

To understand how users perceive this changing landscape, we analyzed the potential signifiers affecting tappability in real mobile apps—element type (e.g., check boxes, text boxes, etc.), location, size, color, and words. We started by crowdsourcing volunteers to label the perceived clickability of ~20,000 unique interface elements from ~3,500 apps. With the exception of text boxes, type signifiers yielded low uncertainty in user perceived tappability. The location signifier refers to the position of a feature on the screen and is informed by the common layout design in mobile apps, as demonstrated in the figure below.
Heatmaps displaying the accuracy of tappable and non-tappable elements by location, where warmer colors represent areas of higher accuracy. Users labeled non-tappable elements more accurately towards the upper center of the interface, and tappable elements towards the bottom center of the interface.
The impact of element size was relatively weak, but did indicate confusion in the case of large non-tappable elements. Users showed a tendency to bright colors and short word counts for tappable elements, though word semantics also played a significant role.

We used these labels to train a simple deep neural network that predicts the likelihood that a user will perceive an interface element as tappable versus non-tappable. For a given element of the interface, the model uses a range of features, including the spatial context of the element on the screen (location), the semantics and functionality of the element (words and type), and the visual appearance (size as well as raw pixels). The neural network model applies a convolutional neural network (CNN) to extract features from raw pixels, and uses learned semantic embeddings to represent text content and element properties. The concatenation of all these features are then fed to a fully-connected network layer, the output of which produces a binary classification of an element's tappability.

Evaluation of the Model
The model allowed us to automatically diagnose mismatches between the tappability of each interface element as perceived by a user—predicted by our model—and the intended or actual tappable state of the element specified by the developer or designer. In the example below, our model predicts that there is a 73% chance that a user would think the labels such as "Followers" or "Following" are tappable, while these interface elements are in fact not programmed to be tappable.
To understand how our model behaves compared to human users, particularly when there is ambiguity in human perception, we generated a second, independent dataset by crowdsourcing an effort among 290 volunteers to label each of 2,000 unique interface elements with respect to their perceived tappability. Each element was labeled independently by five different users. We found that more than 40% of the elements in our sample were labeled inconsistently by volunteers. Our model matches this uncertainty in human perception quite well, as demonstrated in the figure below.
The scatterplot of the tappability probability predicted by the model (the Y axis) versus the consistency in the human user labels (the X axis) for each element in the consistency dataset.
When users agree an element's tappability, our model tends to give a more definite answer—a probability close to 1 for tappable and close to 0 for not tappable. When workers are less consistent on an element (towards the middle of the X axis), our model is also less certain about the decision. Overall, our model achieved reasonable accuracy of matching human perception in identifying tappable UI elements with a mean precision of 90.2% and recall of 87.0%.

Predicting tappability is merely one example of what we can do with machine learning to solve usability issues in user interfaces. There are many other challenges in interaction design and user experience research where deep learning models can offer a vehicle to distill large, diverse user experience datasets and advance scientific understandings about interaction behaviors.

Acknowledgements
This research was a joint work of Amanda Swangson, summer intern at Google, and Yang Li, a Research Scientist in Deep Learning and Human Computer Interaction.

Source: Google AI Blog


User experience tips to help you design your app to engage users and drive conversions

By Jenny Gove, Senior Staff UX Researcher, Google Play

We know you work hard to acquire users and grow your customer base, which can be challenging in a crowded market. That's why we've heard from many of you that you find tools like store listing experiments and universal app campaigns are valuable. It's equally important to keep customers engaged from the beginning. Great design and delightful user experiences are fundamental to doing just that.

We partnered with AnswerLab to conduct comprehensive user experience research across a variety of verticals; including e-commerce, insurance, travel, food ordering, ticket sales and services, and financial management. The resulting insights may help you increase engagement and conversion by providing guidance on useful and usable functionality.

The best app experiences seamlessly guide users through their tasks with efficient navigation, search, forms, registration and purchasing. They provide great e-commerce facilities and integrate effective ordering and payment systems. Ultimately, an engaging app begins with attention to usability in all of these areas. Learn tips on:

  • Navigation & Exploration
  • In-App Search
  • Commerce & Conversions
  • Registration
  • Form Entry
  • Usability and Comprehension

You can read the full article, design your app to drive conversions, on the Android Developers website, complete with links to developer resources. Also get the Playbook for Developers app to stay up-to-date with features and best practices that will help you grow a successful business on Google Play.

How useful did you find this blogpost?

Three ways AdMob make UX a priority with Rewarded video

At AdMob, we know how important user experience is to creating a great app. Non-intrusive ads make for happy users and higher ad engagement rates, and we think rewarded videos are the next step in meeting and exceeding a users expectations for a great experience. Here are three ways AdMob can help improve your user experience with rewarded video ads.

1. Consistency

With reliable UX patterns, you’ll form clear breaks in your apps where users are expecting to see engaging ads. And when it comes to rewarded video, ads that appear at just the right moment will provide a mutual benefit to both user and publisher. For example, in a gaming app, you might want to reward your user with extra lives in exchange for watching a video at a moment when the game would otherwise end. Your users will thank you for it!

2. No tricks

A rewarded video is still an ad, so it is important to make the value exchange between user and publisher transparent. Users are presented with a clear description of the action required from them and what they will get in return, before choosing whether to opt-in to view the rewarded video. And while advertisers pay for an install, the option to install an app as a result of watching a rewarded video is not incentivized: the user is rewarded for viewing the message and the action to install is optional.

3. Optimum engagement

We use Firebase Analytics to help you understand your audience better. With Firebase, A/B testing is simple and can ensure you are getting the most out of your exchange with the user. Optimizing reward value, frequency of ads, and ad placement in app all contribute to a successful rewarded strategy that results in happy users.

Until next time, be sure to stay connected on all things AdMob by following our Twitter, LinkedIn and Google+ pages.

Source: Inside AdMob


The Native Way: 4 Ways to Make UX a Priority with Native Ads

At AdMob, we know how important user experience is to creating a great app. Consistent patterns, refreshing simplicity and polished, thoughtful design make for happy users and potentially higher ad engagement rates. Native advertising is the next step in meeting user’s UX expectations. Native ads are less jarring than traditional ads and fit in with app content more naturally, providing a better user experience. Now, that’s smart business.

Here are four ways you can make UX a priority both within your app and ad experience.

1. Consistency

Sometimes it’s good to be predictable. By sticking to consistent UX patterns in your app, you can help users focus less on navigating and more on your app’s content and value. Don’t surprise your user with new elements – stick to a steady user flow like swiping left on a news app to exit an article or navigate the headline feed. Use consistent design elements like dedicated font sizes, colors, buttons, and screen sizes.

Same goes for your users’ ad experience. With reliable UX patterns, you’ll form clear breaks in your apps where users are expecting to see engaging ads. In the example of the news app, you might also want to use that swipe left feature to allow users to seamlessly dismiss ads. This also applies to ad styling. For example, if you use 14px, Lato, bold, dark grey for prominent text, then use that font for your ad’s headline, as well. The result? Users will expect that styling is dedicated to important text.

2. Clarity

Nobody likes clutter. Simplicity in your app says you’re not wasting your user’s time by throwing every possible option at them, without thinking about what they really need. Simplify your app by uncluttering your screen, writing concise copy, keeping design legible and well-spaced and providing single call-to-action buttons where possible. Likewise, clean, beautiful, single call-to-action ads will communicate that you understand your users and help gain their trust.

3. No tricks

Let’s be clear, a native ad is still an ad. Don’t try to distort or overlap ad components (there’s no quicker way to offend a user!). At Google, we care about building trust in the app advertising ecosystem. For instance, we have extended our accidental click protections to native ad formats (including fast clicks and edge clicks) in order to prevent users from a slip of the finger and deliver greater value to advertisers.

4. Thoughtful design

Thoughtful details in your app are important to let your users know you care. That means sharp imagery, curated fonts, specced margins and quick loading times. It’s important to use the same level of care and polish for small details in your native ads design. Simple, intuitive and well-designed native ads can help you say; ‘thanks’ to your valuable user base.

Until next time, be sure to stay connected on all things AdMob by following our Twitter, LinkedIn and Google+ pages.

Posted by Chris Jones, Social Team, AdMob.

Source: Inside AdMob


How to measure translation quality in your user interfaces



Worldwide, there are about 200 languages that are spoken by at least 3 million people. In this global context, software developers are required to translate their user interfaces into many languages. While graphical user interfaces have evolved substantially when compared to text-based user interfaces, they still rely heavily on textual information. The perceived language quality of translated user interfaces (UIs) can have a significant impact on the overall quality and usability of a product. But how can software developers and product managers learn more about the quality of a translation when they don’t speak the language themselves?

Key information in interaction elements and content are mostly conveyed through text. This aspect can be illustrated by removing text elements from a UI, as shown in the the figure below.
Three versions of the YouTube UI: (a) the original, (b) YouTube without text elements, and (c) YouTube without graphic elements. It gets apparent how the textless version is stripped of the most useful information: it is almost impossible to choose a video to watch and navigating the site is impossible.
In "Measuring user rated language quality: Development and validation of the user interface Language Quality Survey (LQS)", recently published in the International Journal of Human-Computer Studies, we describe the development and validation of a survey that enables users to provide feedback about the language quality of the user interface.

UIs are generally developed in one source language and translated afterwards string by string. The process of translation is prone to errors and might introduce problems that are not present in the source. These problems are most often due to difficulties in the translation process. For example, the word “auto” can be translated to French as automatique (automatic) or automobile (car), which obviously has a different meaning. Translators might chose the wrong term if context is missing during the process. Another problem arises from words that behave as a verb when placed in a button or as a noun if part of a label. For example, “access” can stand for “you have access” (as a label) or “you can request access” (as a button).

Further pitfalls are gender, prepositions without context or other characteristics of the source text that might influence translation. These problems sometimes even get aggravated by the fact that translations are made by different linguists at different points in time. Such mistranslations might not only negatively affect trustworthiness and brand perception, but also the acceptance of the product and its perceived usefulness.

This work was motivated by the fact that in 2012, the YouTube internationalization team had anecdotal evidence which suggested that some language versions of YouTube might benefit from improvement efforts. While expert evaluations led to significant improvements of text quality, these evaluations were expensive and time-consuming. Therefore, it was decided to develop a survey that enables users to provide feedback about the language quality of the user interface to allow a scalable way of gathering quantitative data about language quality.

The Language Quality Survey (LQS) contains 10 questions about language quality. The first five questions form the factor “Readability”, which describes how natural and smooth to read the used text is. For instance, one question targets ease of understanding (“How easy or difficult to understand is the text used in the [product name] interface?”). Questions 6 to 9 summarize the frequency of (in)consistencies in the text, called “Linguistic Correctness”. The full survey can be found in the publication.

Case study: applying the LQS in the field

As the LQS was developed to discover problematic translations of the YouTube interface and allow focused quality improvement efforts, it was made available in over 60 languages and data were gathered for all these versions of the YouTube interface. To understand the quality of each UI version, we compared the results for the translated versions to the source language (here: US-English). We inspected first the global item, in combination with Linguistic Correctness and Readability. Second, we inspected each item separately, to understand which notion of Linguistic Correctness or Readability showed worse (or better) values. Here are some results:
  • The data revealed that about one third of the languages showed subpar language quality levels, when compared to the source language.
  • To understand the source of these problems and fix them, we analyzed the qualitative feedback users had provided (every time someone selected the lower two end scale points, pointing at a problem in the language, a text box was surfaced, asking them to provide examples or links to illustrate the issues).
  • The analysis of these comments provided linguists with valuable feedback of various kinds. For instance, users pointed to confusing terminology, untranslated words that were missed during translation, typographical or grammatical problems, words that were translated but are commonly used in English, or screenshots in help pages that were in English but needed to be localized. Some users also pointed to readability aspects such as sections with old fashioned or too formal tone as well as too informal translations, complex technical or legal wordings, unnatural translations or rather lengthy sections of text. In some languages users also pointed to text that was too small or criticized the readability of the font that was used.
  • In parallel, in-depth expert reviews (so-called “language find-its”) were organized. In these sessions, a group of experts for each language met and screened all of YouTube to discover aspects of the language that could be improved and decided on concrete actions to fix them. By using the LQS data to select target languages, it was possible to reduce the number of language find-its to about one third of the original estimation (if all languages had been screened).
LQS has since been successfully adapted and used for various Google products such as Docs, Analytics, or AdWords. We have found the LQS to be a reliable, valid and useful tool to approach language quality evaluation and improvement. The LQS can be regarded as a small piece in the puzzle of understanding and improving localization quality. Google is making this survey broadly available, so that everyone can start improving their products for everyone around the world.