...

dataAvailable

States whether the classification request has already been processed. If processed data exists, Verity returns the results from the database. If not, Verity starts a new processing request.

status

The current processing status of the analysis request.
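
The dataAvailable and status fields together tell a caller whether results can be read immediately or a new processing request has just been queued. The sketch below is a minimal illustration, assuming a hypothetical endpoint URL, query-parameter names, and polling interval; none of these are part of the documented Verity contract.

    # Minimal polling sketch built on the documented dataAvailable and status
    # fields. Endpoint URL, parameter names, and the 5-second interval are
    # assumptions for illustration only.
    import time
    import requests

    VERITY_URL = "https://verity.example.com/classify"  # hypothetical endpoint

    def fetch_classification(page_url: str, api_key: str, attempts: int = 10) -> dict:
        params = {"pageUrl": page_url, "apiKey": api_key}  # illustrative parameter names
        response = {}
        for _ in range(attempts):
            response = requests.get(VERITY_URL, params=params).json()
            # Previously processed pages are returned straight from the database.
            if response.get("dataAvailable"):
                return response
            # Unsupported languages are reported through the status field.
            if response.get("status") == "NOT_SUPPORTED":
                return response
            # Otherwise Verity has started a new processing request; wait and retry.
            time.sleep(5)
        return response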

pageUrl

The URL of the page, video, image, or text analyzed by Verity, as applicable.

uuid

A unique identifier generated for the classification request.

languageCode

The standard ISO 639-1 code for the language of the content. Verity currently supports content in:

  • English

  • Japanese

Verity video analysis currently supports English only.

Note: If Verity detects an unsupported language, a status of NOT_SUPPORTED is returned.

iab v1

The IAB v1.0 categories identified for the page.

IAB v1.0 categories are widely adopted in programmatic and Real-Time Bidding (RTB) ad marketplaces and are organized into the following tiers:

  • Tier 1 identifies broad, top-level categories, such as Pets, defined with the following targeting depths:

    • Category/portal

    • Site section

    • Page

  • Tier 2 and greater identify more granular categories, such as Dogs, and are nested under Tier 1 categories. 

Refer to the Verity Taxonomy document for a listing of IAB v1 categories.

Verity video analysis does not support IAB v1.0 categories.

iab v2

The IAB v2.0 categories identified for the content.

The IAB defined a more granular content taxonomy in IAB Tech Lab Content Taxonomy v2.0 (released in 2017). IAB v2.0 defines additional content classifications and restructures existing IAB v1.0 classifications. 

Each IAB v2.0 category has a unique three-digit ID and is structured into a tiered hierarchy with up to four tiers of categories.

Refer to the Verity Taxonomy for a listing of IAB v2 categories.

keywords

The top Keywords identified for the content, listed in order of prominence.

safe

The final aggregated Brand Safety summary result for the content.  

If any threat classifications are identified with a high-risk level, the safe value is false and the content is considered unsafe.

If no threat classifications are identified, or only low-risk ones, the safe value is true and the content is considered safe.

threats

Threat categories are part of GumGum’s brand safety taxonomy. GumGum classifies content into nine threat categories. For a complete list of Threat category IDs and Names, refer to Threat Categories in the Verity Taxonomy document.

To detect possible threats, Verity analyzes and scores all the extracted content. Verity then correlates the scores to determine a per-category threat risk-level for the content.

Possible threat category risk-levels are:

  • VERY_HIGH

  • HIGH

  • MODERATE

  • LOW

  • VERY_LOW
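
The relationship between these per-category risk levels and the aggregated safe flag described above can be sketched as follows. The documentation only states that high-risk classifications make content unsafe; treating HIGH and VERY_HIGH as the high-risk levels, and the shape of the threat entries, are assumptions made for illustration.

    # Illustrative only: derive an aggregated safe flag from per-category
    # risk levels. Which levels count as high risk, and the shape of each
    # threat entry, are assumptions rather than documented behavior.
    HIGH_RISK_LEVELS = {"VERY_HIGH", "HIGH"}

    def is_safe(threats: list[dict]) -> bool:
        # threats is assumed to look like [{"name": "...", "risk": "LOW"}, ...]
        return not any(t.get("risk") in HIGH_RISK_LEVELS for t in threats)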

events

The Events classifier identifies seasonal events such as the Olympics (e.g. annual, bi-annual, 4-yearly events) for the purposes of contextual ad targeting. 

Verity lists up to five Event categories, in order of prominence. For a complete list of Event category IDs and Names, refer to Event Categories in the Verity Taxonomy document.

Verity video analysis does not support Events.

sentiments

Identifies and extracts opinions within digital content. 

The positive, neutral, and negative levels of sentiment expressed in the content are evaluated. For contextual targeting purposes, a sentiment level of neutral or positive is generally recommended.

processedAt

The date and time of the classification. 
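
Pulling the fields above together, a consumer of a completed classification might read them as in the sketch below. The nested shapes of the category, threat, and sentiment entries, and the exact key names for the IAB results, are assumptions for illustration rather than a verbatim Verity payload.

    # Reading the documented top-level fields from a parsed (hypothetical)
    # Verity classification response. Nested structures and the IAB key
    # names are assumed shapes, not confirmed by this document.
    def summarize(result: dict) -> None:
        print("Page:", result.get("pageUrl"))
        print("Request UUID:", result.get("uuid"))
        print("Language:", result.get("languageCode"))
        print("Processed at:", result.get("processedAt"))
        print("Brand safe:", result.get("safe"))
        print("Keywords:", result.get("keywords"))      # listed in order of prominence
        print("Sentiments:", result.get("sentiments"))
        for category in result.get("iabV2", []):         # key name assumed
            print("IAB v2 category:", category)
        for threat in result.get("threats", []):
            print("Threat:", threat)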

 

Classification

...

Approaches

Verity analyzes threats, contextual categories, keywords, and sentiment in different ways. The data models Verity implements vary by purpose and are fine-tuned and optimized on an ongoing basis.

...

IAB Content Categories v1 and v2

...

Content classifiers predict the likelihood that the given content belongs to one or more IAB categories.

...

Threats

...

Machine learning predicts threat categories by applying data models trained on collections of various kinds of threatening content. 

...

Events

...

Machine learning predicts event categories by applying data models trained on large-scale collections of event-related content pages.

...

Keywords

...

A set of rules derives, scores, and ranks the most important keywords from content.
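
As a loose illustration of the general approach (GumGum's actual rules are proprietary and more sophisticated), a minimal frequency-and-position heuristic for keyword ranking might look like this:

    # Toy rule-based keyword ranking: score terms by frequency, with a small
    # boost for terms whose first occurrence is early in the text. This is an
    # illustration of the technique, not Verity's actual rules.
    import re
    from collections import Counter

    STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with"}

    def rank_keywords(text: str, top_n: int = 10) -> list[str]:
        tokens = [t.lower() for t in re.findall(r"[A-Za-z]+", text)]
        candidates = [t for t in tokens if t not in STOPWORDS]
        counts = Counter(candidates)
        scores: dict[str, float] = {}
        for position, token in enumerate(candidates):
            if token not in scores:
                # Earlier first occurrence earns a slightly higher weight.
                scores[token] = counts[token] * (1.0 + 1.0 / (1 + position))
        ranked = sorted(scores, key=scores.get, reverse=True)
        return ranked[:top_n]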

...

Sentiments

Partners should be aware that, as with any machine learning technology, performance is highly dependent on the specific data set being analyzed; consequently, no single error rate or range exists. Verity handles proprietary data sets and cannot disclose proprietary partner result data.

Verity calculates and measures error rates in the form of Precision, Recall, F1, and F2 for each machine learning model. As part of this process, GumGum:

  • Engages in data annotation, leveraging human annotators to establish Ground Truth for various data sets.

  • Works with third-party vendors and research consultants to conduct relevancy testing.

The following sections outline the data models and scoring used for Brand Safety and Contextual Classification in Verity, and point to a relevant third-party study.

Brand Safety Classification and Scoring

Verity’s brand safety classification relies on GumGum’s threat data model. The threat model is trained on collections of various kinds of threatening content.

As brand safety and content classification serve different purposes, Verity considers different approaches for scoring brand safety versus content classification models. Both approaches use Recall scoring (for example, out of all the images of weapons in a data set, how many were identified as weapons) and Precision scoring (for example, out of all the images identified as weapons, how many actually were weapons).

Brand safety is a threat detection algorithm, so in this case Verity favors Recall over Precision. Data Scientists use Precision-Recall curves to maximize Recall with minimum loss in Precision, thereby maximizing the number of potential threats classified. 
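
For reference, the Precision, Recall, F1, and F2 measures mentioned above follow their standard definitions. The sketch below is a generic illustration of those formulas, not Verity code.

    # Standard definitions of the metrics named above.
    # tp = true positives, fp = false positives, fn = false negatives.
    def precision(tp: int, fp: int) -> float:
        return tp / (tp + fp) if (tp + fp) else 0.0

    def recall(tp: int, fn: int) -> float:
        return tp / (tp + fn) if (tp + fn) else 0.0

    def f_beta(tp: int, fp: int, fn: int, beta: float) -> float:
        p, r = precision(tp, fp), recall(tp, fn)
        if p == 0.0 and r == 0.0:
            return 0.0
        b2 = beta ** 2
        return (1 + b2) * p * r / (b2 * p + r)

    # F1 weights Precision and Recall equally; F2 (beta = 2) weights Recall
    # more heavily, which matches favoring Recall for threat detection.
    def f1(tp: int, fp: int, fn: int) -> float:
        return f_beta(tp, fp, fn, beta=1.0)

    def f2(tp: int, fp: int, fn: int) -> float:
        return f_beta(tp, fp, fn, beta=2.0)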

Verity results comprise confidence levels for each Threat category. The confidence level represents the risk potential of unsafe content within a page, video, image, or text string.

In traditional statistical measures, confidence in observed results may be assessed according to the number of samples involved in a test. Larger scale sampling leads to a higher confidence score. However, Verity confidence levels are not related to the quantity of sample data.

The goal of Verity threat levels is to determine whether it is safe to display ads on a given page or video. For example, a result “confidence”: “VERY_LOW” should be interpreted as Verity identifying a very low risk for that category within the content, with a high level of confidence. 

Contextual Classification and Scoring

Verity analyzes contextual categories, keywords, and sentiment results using various methods and data models, as outlined in the following table:

IAB Content Categories v1 and v2

Content classifiers predict the likelihood that the given content belongs to one or more IAB categories.

Events

Machine learning predicts event categories by applying data models trained on large-scale collections of event-related content pages.

Keywords

A set of rules derives, scores, and ranks the most important keywords.

Sentiments

Machine learning predicts the sentiment of each sentence within the content by applying models trained on content with varying tones of voice. Verity returns an aggregated breakdown of the proportion of sentences in the content that are positive, neutral, or negative (referred to as Document Level Sentiment Analysis).
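
Document Level Sentiment Analysis as described here can be illustrated with a short aggregation sketch. The per-sentence label names and the input shape are assumptions; only the idea of reporting proportions across sentences comes from the description above.

    # Illustrative document-level aggregation: given one predicted label per
    # sentence, report the proportion of each label. Label names and input
    # shape are assumptions for this sketch.
    from collections import Counter

    def aggregate_sentiment(sentence_labels: list[str]) -> dict[str, float]:
        counts = Counter(sentence_labels)
        total = sum(counts.values()) or 1
        return {label: counts.get(label, 0) / total
                for label in ("positive", "neutral", "negative")}

    # Example: aggregate_sentiment(["positive", "positive", "neutral", "negative"])
    # returns {"positive": 0.5, "neutral": 0.25, "negative": 0.25}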


...


Content classification is used for targeting purposes, so in this case Verity favors Precision over Recall. Data Scientists use Precision-Recall curves to maximize Precision with minimum loss in Recall, thereby maximizing the accuracy of the classified targets.
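
Threshold selection on a Precision-Recall curve can be sketched as below, using scikit-learn's precision_recall_curve. The recall floor of 0.8 is an arbitrary value chosen for illustration, not a Verity setting; the same pattern, with the roles of Precision and Recall swapped, applies to the brand safety case described earlier.

    # Sketch: pick the score threshold with the highest Precision among those
    # that still meet a minimum Recall, i.e. maximize Precision with a bounded
    # loss in Recall. The 0.8 recall floor is an illustrative assumption.
    import numpy as np
    from sklearn.metrics import precision_recall_curve

    def pick_threshold(y_true: np.ndarray, scores: np.ndarray,
                       min_recall: float = 0.8) -> float:
        precision, recall, thresholds = precision_recall_curve(y_true, scores)
        # precision and recall have one more element than thresholds; drop the
        # final point so all three arrays line up.
        precision, recall = precision[:-1], recall[:-1]
        eligible = recall >= min_recall
        if not eligible.any():
            # No threshold meets the recall floor; fall back to the best recall.
            return float(thresholds[np.argmax(recall)])
        best = np.argmax(np.where(eligible, precision, -np.inf))
        return float(thresholds[best])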

Contextual Intelligence Relevancy Study

GumGum participates in publicly available third-party media studies, such as the Comparison of Contextual Intelligence Vendors and Behavioral Targeting undertaken with the Dentsu Aegis Network in 2020. The study report found that:

GumGum Verity™ had the highest percentage of relevant pages across all four Contextual Intelligence vendors.

Partners may review the complete report via the following link: Understanding Contextual Relevance and Efficiency.

Verity and the 4A’s Brand Safety Floor

...