Introduction to Attention Measurement

Background

The Playground xyz Attention Measurement process uses a machine learning technique called supervised learning to develop a model that predicts attention on ads from information collected via a JavaScript tag. To create this model we require a training data set with what is known as a target variable (the thing we want to predict) and a set of features (other data points) from which we make the prediction.
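
To make the shape of this training data concrete, the minimal sketch below builds a single hypothetical training record in Python; the column names are illustrative placeholders, not the model's actual features.

    # One hypothetical training record: several features plus the target
    # variable (observed gaze duration). All column names are placeholders.
    import pandas as pd

    training_data = pd.DataFrame([
        {
            "platform": "mobile",     # example feature
            "ad_format": "banner",    # example feature
            "hour_of_day": 20,        # example feature
            "scroll_count": 14,       # example feature
            "time_in_view_s": 6.2,    # example feature
            "gaze_duration_s": 1.8,   # target variable
        },
    ])

    features = training_data.drop(columns=["gaze_duration_s"])
    target = training_data["gaze_duration_s"]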

Machine learning techniques typically result in a model that is a complex mathematical artefact. We evaluate these models not by trying to understand their internal decision process, but by measuring their performance. This means the number one priority is ensuring the model predicts attention with sufficient accuracy for our clients.

This supervised machine learning approach is being increasingly used in advertising technology for large scale measurement. Recent examples include Invalid Traffic Filtering, Contextual Topic Classification, Collaborative Segments, and, to our knowledge, the Facebook shop visits metric.

What is the process?

The Playground xyz attention measurement process involves four essential steps:

  1. Collecting data through eye tracking panels

  2. Building an attention model from the eye tracking data

  3. Evaluating and refining the model for business-critical performance

  4. Deploying that model into a tag-based measurement solution.

Eye Tracking Panels

We collect data using digital workers who agree to allow us to capture facial images while they read media. Those facial images are fed through a machine learning algorithm that identifies gaze fixation points every 100 ms of their session. We overlay those gaze fixation points with data about the ad's position on their screen to determine the gaze duration for each ad they are exposed to.

The gaze duration measurement becomes the target variable for our attention model.
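
The sketch below illustrates, under simplified assumptions, how a gaze duration could be derived by intersecting the 100 ms fixation samples with an ad's on-screen bounding box; the data structures and values are placeholders rather than our production pipeline.

    # Estimate gaze duration for one ad: each fixation sample that falls
    # inside the ad's bounding box contributes 100 ms of gaze time.
    AD_BOX = {"left": 100, "top": 400, "right": 400, "bottom": 650}  # pixels
    SAMPLE_INTERVAL_S = 0.1  # one gaze fixation estimate every 100 ms

    # (x, y) gaze fixation points produced from the facial images
    fixations = [(120, 430), (150, 500), (90, 700), (300, 610), (310, 620)]

    def in_ad(x, y, box):
        """True if a fixation point lies inside the ad's bounding box."""
        return box["left"] <= x <= box["right"] and box["top"] <= y <= box["bottom"]

    gaze_duration_s = sum(SAMPLE_INTERVAL_S for x, y in fixations if in_ad(x, y, AD_BOX))
    print(f"Estimated gaze duration: {gaze_duration_s:.1f}s")  # 0.4s in this example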

To power the attention model we collect additional data points during the eye tracking session. These additional data points form the current 40 features used by the model, and fall into two categories: environmental and behavioural.

The environmental variables describe the page and other aspects of the way the panellist consumes the media. Examples include the platform it is being viewed on (mobile, desktop), the type and size of the ad (banner, MREC, etc.), the time of day of the browsing session, and more.

The behavioural data is collected through a continuous event stream that captures the panellists' interaction with the page. Examples include their scrolling, clicking and navigation behaviour, how the ad moves through the page over time, and more. This underlying event stream is aggregated to create behavioural features that describe the entire ad session.
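
As a rough illustration of this aggregation step, the sketch below collapses a hypothetical event stream into a handful of session-level behavioural features; the event schema and the chosen aggregates are illustrative assumptions, not the production feature set.

    # Aggregate a raw behavioural event stream into per-ad-session features.
    from statistics import mean

    # One ad session's events: (timestamp_s, event_type, ad_pixels_in_view)
    events = [
        (0.0, "page_load", 0),
        (1.2, "scroll", 15_000),
        (2.5, "scroll", 60_000),
        (4.0, "scroll", 60_000),
        (6.3, "click", 60_000),
        (8.0, "scroll", 0),
    ]

    behavioural_features = {
        "session_length_s": events[-1][0] - events[0][0],
        "scroll_count": sum(1 for _, kind, _ in events if kind == "scroll"),
        "click_count": sum(1 for _, kind, _ in events if kind == "click"),
        "mean_pixels_in_view": mean(pixels for _, _, pixels in events),
        "time_to_first_scroll_s": next(t for t, kind, _ in events if kind == "scroll"),
    }
    print(behavioural_features)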

Building the Attention Model

All of the features (environmental and behavioural) are collated into one record per ad exposure that we use to train a model that predicts the observed gaze duration from the eye tracking study. The machine learning algorithms we use are ensembles of tree models. This means we learn large numbers of independent tree structures and then combine these trees together to produce our final prediction. This machine learning technique has a well documented history of providing state-of-the-art performance on datasets with the properties of our attention data (non-deterministic target variable and heterogeneous input data).
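
The minimal sketch below shows this kind of modelling step using a random forest, one common ensemble of independent trees, trained on synthetic stand-in data; the actual algorithm, features and hyper-parameters we use in production are not specified here.

    # Fit an ensemble of independent trees to predict gaze duration,
    # holding out fresh data for evaluation (see the next section).
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # Synthetic stand-ins: one row per ad exposure, 40 features, gaze duration target.
    X = rng.normal(size=(5_000, 40))
    y = np.clip(X[:, 0] * 0.5 + rng.normal(size=5_000), 0, None)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    # Each tree is learned independently on a bootstrap sample; the forest
    # averages their outputs to produce the final prediction.
    model = RandomForestRegressor(n_estimators=500, random_state=0)
    model.fit(X_train, y_train)
    predicted = model.predict(X_test)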

We regularly retrain our models as we collect new panel data, onboard new ad formats and develop new features. As such the internal structure of the model changes over time. We manage that change by performing thorough evaluations of the performance of the model.

Evaluating and Refining the Model

Critical to the development of the attention model is the method we use to evaluate its performance. The method of evaluation informs our decisions about which features to add or remove and what changes we can make to improve performance.

We use the following key criteria to evaluate our models:

  1. Evaluate on fresh hold-out data (data the model did not use in training).

  2. Ensure a clean separation of high and low attention impressions.

  3. Ensure we obtain unbiased estimates of mean attention for specific ad formats and platforms.

The first criterion is best practice in predictive modelling: it means that the metrics we obtain during evaluation will reflect real performance on new data.

The second and third criteria relate to how our clients will use our models. Our clients need to know whether we can effectively identify a block of high- or low-performing inventory, and whether the mean attention metrics we report for campaigns and creatives are reliable.

We use two methods to evaluate and compare models against these criteria:

  1. Mean Attention Decile plots. These plots (like the one shown in Figure 1) illustrate the extent to which the model has ranked the impressions by attention. Each decile contains 10% of the test impressions after they have been ordered by predicted attention. We plot the mean predicted attention in each group against the mean observed attention for those impressions (a minimal sketch of this calculation follows this list). Figure 1 illustrates that we are able to make the required discrimination and that the mean attention within each group is well calibrated.

  2. Ad Format + Platform Breakdowns. We analyse performance on our test data by breaking it into ad format and platform specific groups. We then examine the decile plots within each group, which demonstrates that we can make the same discrimination on specific types of campaigns. Again we check how well calibrated the model is by comparing the mean predicted attention against the mean observed attention in these groupings.
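
The sketch below shows the decile calculation behind such a plot, run on synthetic data; all numbers and variable names are purely illustrative.

    # Rank impressions by predicted attention, split them into ten equal
    # groups, and compare mean predicted vs mean observed attention per group.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    observed = rng.gamma(shape=2.0, scale=1.0, size=10_000)          # stand-in gaze durations
    predicted = observed * 0.8 + rng.normal(scale=0.5, size=10_000)  # stand-in model output

    test = pd.DataFrame({"predicted": predicted, "observed": observed})
    test["decile"] = pd.qcut(test["predicted"], q=10, labels=False)  # 10 equal-sized groups

    calibration = (
        test.groupby("decile")[["predicted", "observed"]]
            .mean()
            .rename(columns={"predicted": "mean_predicted", "observed": "mean_observed"})
    )
    print(calibration)  # well-separated, well-calibrated decile means indicate a good model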

Deploying the Attention Model

To use the attention model at scale for measuring attention across campaigns we have two approaches:

  1. Attention Measurement Tag (AMT)
    We’ve developed a JavaScript tag that collects the same behavioural and contextual data as in the eye tracking study. This tag is attached to a live campaign and returns a raw event stream of user behaviour to our logging systems. The data collected is then aggregated in the same way as for model development, so that we have an identically structured set of features to feed into the model.

  2. Campaign Reports
    Campaign reports can also be used to score Attention. We offer a range of API or Cloud Transfer integrations to ingest the required reporting data to our logging systems. Campaign report scoring supports scoring of both live and historical impressions.

The aggregated event data is then scored with the Attention model to produce the Attention reporting and insights presented within the Attention Intelligence Platform.
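
As a rough sketch of this scoring step, the function below attaches model predictions to a table of aggregated impression features; the function name and the feature layout are assumptions for illustration only, and the features must match those used at training time.

    # Score aggregated impression features with a fitted attention model.
    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor

    def score_impressions(model: RandomForestRegressor, features: pd.DataFrame) -> pd.DataFrame:
        """Return the impressions with a predicted attention column attached.

        `features` must contain exactly the columns the model was trained on.
        """
        scored = features.copy()
        scored["predicted_attention_s"] = model.predict(features)
        return scored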