...
Once analysis is complete, Verity returns a detailed report featuring a brand safety score for the content, along with contextual targeting categories, prominent keywords, event categories, and sentiment categories. Verity supports the contextual targeting categories defined in the Interactive Advertising Bureau (IAB) Content Taxonomy v1.0 and v2.0.
...
Verity API Gateway: The Verity API Gateway receives a page URL request, authenticates the client request and passes the URL to the Verity API.
Verity API: The Verity API initiates the request and then orchestrates the Content Extractor, Text Analysis, and Image Analysis systems to extract the page data and perform the analyses.
Content Extractor: The Content Extractor accepts page requests sent by the Verity API from a queue. The Content Extractor loads the page URL, downloads the page title, metadata, and HTML, and saves them as a text string in the database. If a prominent image is identified for the page, the Content Extractor downloads and saves the image to the database with identification information for the associated page. The Content Extractor then passes the page URL and image information on for text and image analysis.
Text Analysis: The Text Analysis engine applies Natural Language Processing (NLP) for text classification (e.g. IAB and Threat categories) and information extraction (e.g. Keywords).
Image Analysis: The Image Analysis engine houses GumGum’s core Computer Vision capabilities in a modular architecture. The Image Analysis component passes images through multiple data models to determine their classification information.
Verity Report: The Verity API retrieves the text and image classification results, applies weighting and merging logic to the results, and returns the final Verity page report to the client.
...
Verity Video Analysis
Verity analyzes videos for the purposes of content-level contextual targeting and brand safety.
...
The Verity video analysis process involves the following core components:
...
Verity API Gateway: The Verity API Gateway receives a video URL request, authenticates the client request and passes the URL to the Verity API.
Verity API: The Verity API passes the request to the Video Transcribe component to orchestrate video transcription and optical character recognition.
Video Transcribe: Video Transcribe downloads the video from the request URL and stores the video. The Verity API then initiates a transcription job with the transcription service. If the video is in M3U8 format, it is transcoded prior to transcription. Once the transcription service finishes a job, it sends the results back to the object storage service, triggering a notification to the Verity API.
Verity API/OCR service: The Verity API checks whether the transcription results contain a sufficient sample of words. If not, the Verity API requests Video Transcribe to initiate an OCR job. Upon OCR job completion, the Verity API receives a notification and retrieves the OCR text results. The Verity API passes the concatenated text results (comprising transcription, OCR, and the client metadata title and description) to Verity Text Processing.
Verity Text Processing: The Text Processing engine processes the video transcription, OCR, client metadata title and description by applying Natural Language Processing (NLP) for text classification (e.g. IAB Content Categories v2.0 and Threat categories) and information extraction (e.g. Keywords).
Verity Report: The Verity API accepts the text analysis results, applies result weighting and merging logic, then returns the final video analysis Verity Report to the client.
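The transcription-then-OCR fallback in the steps above can be sketched roughly as follows. This is an illustration only, not GumGum's implementation: the function names, the `run_ocr` callback, and counting words by splitting on whitespace are all assumptions. The 50-word threshold is the figure this document gives for when OCR is included.

```python
from typing import Callable

# Threshold stated in this document: OCR is included when the
# transcription yields fewer than 50 words.
MIN_TRANSCRIPT_WORDS = 50


def needs_ocr(transcript: str, min_words: int = MIN_TRANSCRIPT_WORDS) -> bool:
    """Return True when the transcript has too few words on its own.

    Assumption: words are counted by splitting on whitespace.
    """
    return len(transcript.split()) < min_words


def build_text_input(
    transcript: str,
    title: str,
    description: str,
    run_ocr: Callable[[], str],
) -> str:
    """Concatenate transcription, optional OCR text, and client metadata.

    `run_ocr` is a hypothetical callback standing in for the OCR job;
    it is only invoked when the transcript is too short.
    """
    parts = [transcript]
    if needs_ocr(transcript):
        parts.append(run_ocr())
    parts.extend([title, description])
    # Skip empty pieces so missing metadata does not add blank lines.
    return "\n".join(part for part in parts if part)
```

Passing the OCR step as a callback keeps the expensive job lazy: it runs only for videos whose audio track carries little speech.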
...
Verity machine learning predicts threat categories by applying data models trained on collections of various kinds of threatening content. Verity’s sophisticated Computer Vision machine learning can identify threatening scenes, such as natural disasters or accidents. Object detection picks out potentially threatening objects within an image, such as weapons, exposed skin or drinks.
...
Verity predicts the sentiment of each sentence within content (referred to as Document Level Sentiment Analysis), and returns an aggregated breakdown of the proportion of sentences within content that are positive, neutral or negative. Sentiment thresholds are entirely up to the Publisher to set. Across the web, “neutral” is the most common primary sentiment classification.
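As a rough illustration of the aggregation described above, the sketch below turns per-sentence labels into a proportional breakdown. The function names and the lowercase label strings are assumptions made for this example; they are not taken from the Verity API.

```python
from collections import Counter
from typing import Dict, Iterable

# Assumption: the three sentiment classes are represented by these strings.
LABELS = ("positive", "neutral", "negative")


def aggregate_sentiment(sentence_labels: Iterable[str]) -> Dict[str, float]:
    """Return the proportion of sentences carrying each sentiment label."""
    counts = Counter(sentence_labels)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        # No recognised labels: report zero for every class.
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


def primary_sentiment(breakdown: Dict[str, float]) -> str:
    """Pick the label with the largest share (ties go to the earlier label)."""
    return max(LABELS, key=lambda label: breakdown[label])


breakdown = aggregate_sentiment(["neutral", "neutral", "positive", "negative"])
print(breakdown)                     # {'positive': 0.25, 'neutral': 0.5, 'negative': 0.25}
print(primary_sentiment(breakdown))  # neutral
```

A publisher-defined threshold could then be applied to the breakdown, for example by rejecting content whose negative share exceeds a chosen value.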
Verity Classification and Brand Safety Report
The Verity report includes complete brand safety, keyword, and categorization analysis data for the requested content. Each report contains the following analysis results:
dataAvailable | States whether the classification request has already been processed. If it has, Verity returns the results from the database. If not, Verity starts a new processing request. |
---|---|
status | The current processing status of the analysis request. |
pageUrl | The URL of the page or video analyzed by Verity, as applicable. |
languageCode | The standard ISO 639-1 code for the language of the content. Verity currently supports content in:
Verity video analysis currently supports English only. Note: If Verity detects an unsupported language, a status of NOT_SUPPORTED is returned. |
iab v1 | The IAB v1.0 categories identified for the page.
IAB v1.0 categories are widely adopted in programmatic and Real-Time-Bidding (RTB) ad marketplaces. IAB v1.0 categories are organized into the following tiers:
Refer to the Verity Taxonomy document for a listing of IAB v1 categories. Verity video analysis does not support IAB v1.0 categories. |
iab v2 | The IAB v2.0 categories identified for the content. The IAB defined a more granular content taxonomy in IAB Tech Lab Content Taxonomy v2.0 (released in 2017). IAB v2.0 defines additional content classifications and restructures existing IAB v1.0 classifications. Each IAB v2.0 category has a unique three-digit ID, and is structured into a tiered hierarchy with up to 4 tiers of categories. Refer to the Verity Taxonomy for a listing of IAB v2 categories. |
keywords | The top Keywords identified for the content, listed in order of prominence. |
safe | The final aggregated Brand Safety summary result for the content. If any threat classifications are identified with a high-risk level, the safe value is false and the content is considered unsafe. If no (or low-risk) threat classifications are identified, the safe value is true, and the content is considered safe. |
threats | Threat categories are part of GumGum’s brand safety taxonomy. GumGum classifies content into nine threat categories. For a complete list of Threat category IDs and Names, refer to Threat Categories in the Verity Taxonomy document. To detect possible threats, Verity analyzes and scores all the extracted content. Verity then correlates the scores to determine a per-category threat risk-level for the content. Possible threat category risk-levels are:
|
events | The Events classifier identifies recurring events (e.g. annual, bi-annual, or 4-yearly events such as the Olympics) for the purposes of contextual ad targeting. Verity lists up to five Event categories, in order of prominence. For a complete list of Event category IDs and Names, refer to Event Categories in the Verity Taxonomy document. Verity video analysis does not support Events. |
sentiments | Identifies and extracts opinions within digital content. The positive, neutral, and negative levels of sentiment expressed in the content are evaluated. For contextual targeting purposes, a sentiment level of neutral or positive is generally recommended. |
processedAt | The date and time of the classification. |
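
The relationship between the threats and safe fields can be sketched as below. This is a simplified reading of the rule stated in the table (any high-risk threat makes the content unsafe), not the actual Verity logic. The `risk` key, the `"HIGH"` and `"LOW"` label strings, and the shape of each threat entry are assumptions for illustration; consult the Verity API reference for the real field names and risk-level values.

```python
from typing import Iterable, Mapping

# Assumption: hypothetical label(s) that count as "high risk".
HIGH_RISK_LEVELS = frozenset({"HIGH"})


def is_safe(threats: Iterable[Mapping[str, str]]) -> bool:
    """Return False if any threat entry carries a high-risk level.

    Each entry is assumed to look like {"id": "GGT1", "risk": "HIGH"};
    both keys are hypothetical.
    """
    return not any(t.get("risk") in HIGH_RISK_LEVELS for t in threats)


print(is_safe([]))                                  # True
print(is_safe([{"id": "GGT5", "risk": "LOW"}]))     # True
print(is_safe([{"id": "GGT1", "risk": "HIGH"}]))    # False
```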
...
The 4A’s, the leading trade organization for marketing communications agencies, defines the Advertising Assurance Brand Safety Floor and Brand Suitability Framework (revised in May 2020). The following table details the mapping between the 4A’s Brand Safety Floor and GumGum’s threat categories.
4A’s Floor | GumGum’s Verity brand safety categories | ||
---|---|---|---|
Category | Definition | Category | |
1 Adult & Explicit Sexual Content | Illegal sale, distribution, and consumption of child pornography Explicit or gratuitous depiction of sexual acts, and/or display of genitals, real or animated | GGT4 | Sexual; sexually charged |
2 Arms & Ammunition | Promotion and advocacy of Sale of illegal arms, rifles, and handguns Instructive content on how to obtain, make, distribute, or use illegal arms Glamorization of illegal arms for the purpose of harm to others Use of illegal arms in unregulated environments | GGT1 | Violence and gore |
GGT2 | Illegal/criminal | ||
3 Crime & Harmful acts to individuals and Society and Human Rights Violations | Graphic promotion, advocacy, and depiction of willful harm and actual unlawful criminal activity – Explicit violations/demeaning offenses of Human Rights (e.g. human trafficking, slavery, self harm, animal cruelty etc.), | GGT1 | Violence and gore |
GGT2 | Illegal/criminal | ||
4 Death, Injury or Military Conflict | Promotion or advocacy of Death or Injury Incendiary content provoking, enticing, or evoking military aggression Live action footage/photos of military actions & genocide or other war crimes | GGT1 | Violence and gore |
GGT9 | Illness/medical | ||
5 Online piracy | Pirating, Copyright infringement, & Counterfeiting. | GGT8 | Malware |
Note: GumGum Verity classifies content that covers the topics of piracy, copyright infringement, or counterfeiting. Verity does not consider whether the content itself was pirated, counterfeited, or infringes on copyright. | |||
6 Hate speech & acts of aggression | Unlawful acts of aggression based on race, nationality, ethnicity, religious affiliation, gender, or sexual image or preference. Behavior or commentary that incites such hateful acts, including bullying. | GGT6 | Hate; hate speech, harassment and cyberbullying |
7 Obscenity and Profanity, including language, gestures, and explicitly gory, graphic or repulsive content intended to shock and disgust | Excessive use of profane language or gestures and other repulsive actions with the intent to shock, offend, or insult. | GGT5 | Obscene; profanity/vulgarity |
8 Illegal Drugs/Tobacco/ | Promotion or sale of illegal drug use – including abuse of prescription drugs. Federal jurisdiction applies, but allowable where legal local jurisdiction can be effectively managed.
Promotion and advocacy of tobacco and eCigarette (Vaping) & Alcohol use to minors. | GGT3 | Drugs and alcohol |
9 Spam or Harmful Content | Malware/Phishing. | GGT8 | Malware and phishing |
10 Terrorism | Promotion and advocacy of graphic terrorist activity involving defamation, physical and/or emotional harm of individuals, communities, and society. | GGT1 | Violence and gore (both text and image) |
11 Debated Sensitive Social Issue/ Violations of Human Rights | Insensitive, irresponsible and harmful treatment of debated social issues and related acts intended to demean a particular group or incite greater conflict. | GGT6 | Hate; hate speech, harassment and cyberbullying. |
GGT2 | Illegal; criminal | ||
The 4A’s floor categories do not map to this GumGum Threat category. | GGT7 | Disasters |
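
For readers who want to use the mapping programmatically, the table above can be transcribed into a lookup structure. The sketch below is one possible encoding; the variable and function names are hypothetical, while the category numbers and GGT IDs are copied from the table.

```python
from typing import Dict, List, Tuple

# 4A's Brand Safety Floor category number -> GumGum threat category IDs,
# transcribed from the mapping table in this document.
FLOOR_TO_THREATS: Dict[int, Tuple[str, ...]] = {
    1: ("GGT4",),
    2: ("GGT1", "GGT2"),
    3: ("GGT1", "GGT2"),
    4: ("GGT1", "GGT9"),
    5: ("GGT8",),
    6: ("GGT6",),
    7: ("GGT5",),
    8: ("GGT3",),
    9: ("GGT8",),
    10: ("GGT1",),
    11: ("GGT6", "GGT2"),
}


def floors_for_threat(threat_id: str) -> List[int]:
    """Return the 4A's floor categories that map to a GumGum threat ID."""
    return sorted(
        floor for floor, threats in FLOOR_TO_THREATS.items() if threat_id in threats
    )


print(floors_for_threat("GGT1"))  # [2, 3, 4, 10]
print(floors_for_threat("GGT7"))  # [] (Disasters has no 4A's floor equivalent)
```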
...
The Verity team constantly runs A/B testing to evaluate alternative data models and competitor results. On a quarterly basis, Verity also maintains a Rolling KPI quality check where URLs are collected randomly from Publisher domains and added to a Gold Standard Data Set.
The URLs are human-annotated for threat and contextual classifications using both individual annotators and data annotation platforms. The Verity team runs classification processes, checks the results, and determines remediation or enhancement steps.
...
Verity applies logic to identify the prominent image on a web page for analysis. Additional images on the page may be subject to image extraction limitations. Supported image formats are:
|
|
|
Video Data Analyzed
The Verity Video analysis pipeline processes and analyzes video content and metadata, specifically:
Audio
Transcription of the video's audio track. The maximum transcription length supported is 14400 seconds.
OCR
Text and cursive text detected in the video frames. OCR is included in the process when the video transcription
...
yields fewer than 50 words.
Metadata and Title
Page title and metadata.
Supported formats are MPEG-4, MOV, MP3, FLAC, and M3U8. The maximum video size is 2 GB.
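A client could pre-check a video against these limits before submitting it. The sketch below is a hypothetical helper, not part of Verity. It assumes the format can be inferred from the URL's file extension, that the extensions listed correspond to the formats named above, and that 2 GB means 2 × 1024³ bytes.

```python
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlparse

# Assumption: file extensions standing in for MPEG-4, MOV, MP3, FLAC, and M3U8.
SUPPORTED_EXTENSIONS = {".mp4", ".mov", ".mp3", ".flac", ".m3u8"}
MAX_SIZE_BYTES = 2 * 1024**3          # assumption: 2 GB read as binary gigabytes
MAX_TRANSCRIPTION_SECONDS = 14400     # limit stated in this document


def check_video(url: str, size_bytes: int, duration_seconds: float) -> List[str]:
    """Return a list of problems; an empty list means the limits are met."""
    problems: List[str] = []
    # Use only the URL path so query strings do not hide the extension.
    ext = PurePosixPath(urlparse(url).path).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'none'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file exceeds 2 GB")
    if duration_seconds > MAX_TRANSCRIPTION_SECONDS:
        problems.append("audio exceeds 14400 seconds")
    return problems


print(check_video("https://example.com/v/clip.mp4?x=1", 1_000_000, 60))  # []
```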
Verity Does Not Process User Information
...