
Verity processes audio and video using transcription services. For audio files (standalone or extracted from a video), Verity converts speech to text and sends the output to its NLP machine learning models for classification. Verity processes videos by applying Computer Vision analysis to sampled image frames and Natural Language Processing (NLP) analysis to text transcribed from the video’s audio track.

Verity runs parallel analysis of both image and audio components of a video. Verity:

  • Samples frames from the video and sends them for image analysis.

  • Extracts the audio track from the video and sends it for transcription from speech to text. Verity then enriches the text output with metadata (such as video title and description) and sends it for NLP analysis.

  • Merges the results of these parallel analyses using proprietary merging logic.

  • Returns a comprehensive analysis that considers both auditory and visual elements of the video.
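
The parallel flow described in the list above can be sketched roughly as follows. This is an illustrative outline only, not Verity's implementation: the `Video` container and every function name here are hypothetical, the classifiers and transcription step are trivial placeholders, and the merge step is a plain dictionary union standing in for the proprietary merging logic.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Video:
    # Hypothetical container; field names are assumptions for illustration.
    title: str
    description: str
    frames: list   # stand-in for decoded image frames
    audio: str     # stand-in for the extracted audio track


def sample_frames(video, every_n=2):
    # Keep every n-th frame (the real sampling strategy is not documented here).
    return video.frames[::every_n]


def classify_frames(frames):
    # Placeholder for the Computer Vision models.
    return {"visual": [f"label:{frame}" for frame in frames]}


def transcribe(audio):
    # Placeholder for speech-to-text.
    return audio.lower()


def classify_text(text):
    # Placeholder for the NLP models.
    return {"text": text.split()}


def visual_track(video):
    return classify_frames(sample_frames(video))


def audio_track(video):
    transcript = transcribe(video.audio)
    # Enrich the transcript with the video metadata before NLP analysis.
    enriched = " ".join([video.title, video.description, transcript])
    return classify_text(enriched)


def merge(visual, textual):
    # Placeholder: the actual merging logic is proprietary.
    return {**visual, **textual}


def analyze(video):
    # Run the visual and audio tracks in parallel, then merge their results.
    with ThreadPoolExecutor(max_workers=2) as pool:
        visual_future = pool.submit(visual_track, video)
        audio_future = pool.submit(audio_track, video)
        return merge(visual_future.result(), audio_future.result())
```

Calling `analyze(Video(...))` returns a single dictionary that combines the output of both tracks, mirroring the final "comprehensive analysis" step.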
