Verity processes videos by applying Computer Vision analysis to sampled image frames and Natural Language Processing (NLP) analysis to text transcribed from the video’s audio track.
Verity analyzes the image and audio components of a video in parallel. Verity:
- Samples frames from the video and sends them for image analysis.
- Extracts the audio track from the video and sends it for speech-to-text transcription. Verity then enriches the transcribed text with metadata (such as the video title and description) and sends it for NLP analysis.
The results of these parallel analyses are merged using Verity’s proprietary merging logic.
Verity then returns a single, comprehensive analysis that accounts for both the auditory and visual elements of the video.
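The flow described above can be sketched in a few lines of Python. This is only an illustration of the overall shape (two branches run in parallel, then a merge), not Verity's implementation: every function name, the every-other-frame sampling, the toy data structures, and especially the merge step are assumptions made for the example. Verity's actual merging logic is proprietary and is not reproduced here.

```python
from concurrent.futures import ThreadPoolExecutor

# All names and data shapes below are hypothetical placeholders, not Verity's API.

def sample_frames(video, every_n=2):
    """Keep every n-th frame (an assumed sampling strategy)."""
    return video["frames"][::every_n]

def analyze_images(frames):
    """Stand-in for Computer Vision analysis: unique labels seen in the frames."""
    return {"visual_labels": sorted(set(frames))}

def transcribe(audio):
    """Stand-in for speech-to-text; the toy 'audio' is already text."""
    return audio

def analyze_text(text):
    """Stand-in for NLP analysis: unique lowercase terms."""
    return {"text_terms": sorted(set(text.lower().split()))}

def image_branch(video):
    # Branch 1: sample frames, then send them for image analysis.
    return analyze_images(sample_frames(video))

def audio_branch(video):
    # Branch 2: transcribe the audio, enrich with metadata, then run NLP.
    transcript = transcribe(video["audio"])
    enriched = " ".join([video["title"], video["description"], transcript])
    return analyze_text(enriched)

def merge(image_result, text_result):
    """Placeholder merge (a plain dict union); the real logic is proprietary."""
    return {**image_result, **text_result}

def process_video(video):
    # Run both branches concurrently, then merge their results.
    with ThreadPoolExecutor(max_workers=2) as pool:
        image_future = pool.submit(image_branch, video)
        audio_future = pool.submit(audio_branch, video)
        return merge(image_future.result(), audio_future.result())

video = {
    "title": "Harbor Tour",
    "description": "boats",
    "frames": ["boat", "water", "boat", "dock"],
    "audio": "welcome to the harbor",
}
result = process_video(video)
```

In this sketch the two branches share nothing until the merge, which is what allows them to run in parallel; the merged result carries both the visual labels and the text terms.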