Excerpt

Verity processes videos by applying computer vision analysis to sampled image frames, plus text analysis of the video’s transcribed audio track.

  • Verity uses audio transcription to convert audio files (standalone or extracted from video) from speech to text. The output of this process is then sent to Verity's NLP machine learning models for classification. For video analysis, Verity uses audio transcription for speech-to-text and also samples frames from the video content itself, sending the captured frames to Verity's Computer Vision models for classification. In parallel, the speech-to-text output is enriched with any available video metadata (title, description) and sent to Verity's NLP machine learning models for classification. The outputs of these parallel tracks are then merged by our proprietary merging logic, and a comprehensive analysis covering both the auditory and visual elements of the video is returned.
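
The two parallel tracks described above can be sketched roughly as follows. This is an illustrative outline only, not Verity's implementation: every name here (`Video`, `transcribe`, `sample_frames`, `classify_text`, `classify_frame`, `merge`, `analyze_video`) is hypothetical, the classifiers are keyword and lookup placeholders standing in for the NLP and Computer Vision models, and the merge step simply keeps the highest score per label because the actual merging logic is proprietary and not described here.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Video:
    title: str
    description: str
    frames: list[str]  # stand-in for decoded image frames
    audio: str         # stand-in for the audio track


# Placeholder lookup tables standing in for trained models (assumptions).
TEXT_KEYWORDS = {"football": "sports", "recipe": "food"}
FRAME_LABELS = {"stadium": {"sports": 0.8}, "kitchen": {"food": 0.7}}


def transcribe(audio: str) -> str:
    # Placeholder speech-to-text: the "audio" is already text in this sketch.
    return audio


def sample_frames(frames: list[str], every_n: int) -> list[str]:
    # Keep every n-th frame, starting with the first.
    return frames[::every_n]


def classify_text(text: str) -> dict[str, float]:
    # Placeholder NLP model: one label per matched keyword.
    lowered = text.lower()
    return {label: 1.0 for word, label in TEXT_KEYWORDS.items() if word in lowered}


def classify_frame(frame: str) -> dict[str, float]:
    # Placeholder Computer Vision model: direct lookup by frame name.
    return FRAME_LABELS.get(frame, {})


def text_track(video: Video) -> dict[str, float]:
    # Transcribe, enrich with title and description, then classify.
    transcript = transcribe(video.audio)
    enriched = " ".join([video.title, video.description, transcript])
    return classify_text(enriched)


def vision_track(video: Video, every_n: int = 2) -> dict[str, float]:
    # Classify each sampled frame and keep the best score per label.
    scores: dict[str, float] = {}
    for frame in sample_frames(video.frames, every_n):
        for label, score in classify_frame(frame).items():
            scores[label] = max(scores.get(label, 0.0), score)
    return scores


def merge(text_scores: dict[str, float], vision_scores: dict[str, float]) -> dict[str, float]:
    # Placeholder merge: highest score per label across both tracks.
    merged = dict(text_scores)
    for label, score in vision_scores.items():
        merged[label] = max(merged.get(label, 0.0), score)
    return merged


def analyze_video(video: Video) -> dict[str, float]:
    # Run the text and vision tracks in parallel, then merge their outputs.
    with ThreadPoolExecutor(max_workers=2) as pool:
        text_future = pool.submit(text_track, video)
        vision_future = pool.submit(vision_track, video)
        return merge(text_future.result(), vision_future.result())
```

In this sketch a standalone audio file would go through `text_track` only, while a video goes through both tracks before the merge.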

...

standalone audio tracks (or audio tracks extracted from video) by transcribing the audio to text. The speech-to-text output is enriched with any available metadata (such as title and description) then sent to Verity’s Natural Language Processing (NLP) machine learning models for classification.

...