...

Status Message

Description

INITIATED

Once Verity has checked that the URL is properly formed and does not already exist in the database, the request is passed to the Verity classification systems and the status updates to INITIATED.

PROCESSING

The Verity classification system is processing the text and images on the specified page.

PROCESSED

The URL has been processed and the Verity analysis JSON is available. The analysis results have been stored.

ERROR

Page processing has been attempted and failed. The page URL is recorded in the Error Cache for 1 hour.

If another request to process the same URL is received within 1 hour, Verity returns an ERROR status (unless the ignoreCache flag is enabled).

After 1 hour, the ERROR status is cleared and Verity will process a new request for the URL. 

Several different conditions may result in an ERROR status message:

  • Unreachable page.

  • A processing module has returned a value other than a success status code.
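The one-hour Error Cache behavior described above can be sketched as follows. This is a minimal illustrative model only, not Verity's implementation: the `ErrorCache` class, its method names, and the `ignore_cache` parameter (standing in for the ignoreCache flag) are hypothetical.

```python
import time

ERROR_TTL_SECONDS = 60 * 60  # 1 hour, as stated in the ERROR description


class ErrorCache:
    """Hypothetical in-memory model of the Error Cache (illustration only)."""

    def __init__(self, ttl=ERROR_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock          # injectable so the expiry can be tested
        self._failed_at = {}         # url -> time the failure was recorded

    def record_failure(self, url):
        """Remember that processing this URL failed just now."""
        self._failed_at[url] = self._clock()

    def should_return_error(self, url, ignore_cache=False):
        """Return True if a new request should get ERROR without reprocessing."""
        if ignore_cache:
            return False             # the flag bypasses the cache entirely
        failed_at = self._failed_at.get(url)
        if failed_at is None:
            return False             # no recorded failure for this URL
        if self._clock() - failed_at >= self._ttl:
            del self._failed_at[url]  # entry expired; allow reprocessing
            return False
        return True
```

Under this model, a repeat request inside the hour gets ERROR immediately, while the same request with the flag set, or one made after the hour has elapsed, is processed again.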

NOT_SUPPORTED

The language of the page is not supported (see Language Support Grid). This status message may also be returned if Verity is unable to process the requested website.

INSUFFICIENT_CONTENT

Verity’s content extraction processes cannot extract sufficient relevant content from a page to adequately perform classification tasks across text and imagery.

INVALID

The HTTP URL request may be malformed, for example:

  • Incomplete URL.

  • Missing HTTP header.

  • Invalid domain-specific information.

PAGE_CONTENT_EXTRACTION_FAILED_WITH_403_FORBIDDEN

A website has blocked our web crawler from downloading content.

PAGE_CONTENT_EXTRACTION_FAILED_WITH_404_NOT_FOUND

Verity’s web crawler was not able to locate any content for the provided URL.

PAGE_CONTENT_EXTRACTION_FAILED_WITH_500_INTERNAL_SERVER_ERROR

The webpage returned an unknown error during a crawl attempt. Note: The PCE attempts up to three times to extract web content.

PAGE_CONTENT_EXTRACTION_FAILED_WITH_4XX

A generic 4XX response was received during a crawl attempt by PCE.

PAGE_CONTENT_EXTRACTION_FAILED_WITH_5XX

A generic 5XX response was received during a crawl attempt by PCE.
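A client that reads these status messages might group them by what to do next. The sketch below is one possible reading of the descriptions above, not an official client: the `Outcome` enum and `classify_status` function are hypothetical names, and treating every status other than INITIATED, PROCESSING, and PROCESSED as a failure is an assumption.

```python
from enum import Enum


class Outcome(Enum):
    WAIT = "wait"      # request is still moving through classification
    DONE = "done"      # analysis JSON is available
    FAILED = "failed"  # request did not produce an analysis


# Grouping inferred from the status descriptions (an assumption).
IN_PROGRESS = {"INITIATED", "PROCESSING"}
FAILURES = {
    "ERROR",
    "NOT_SUPPORTED",
    "INSUFFICIENT_CONTENT",
    "INVALID",
    "PAGE_CONTENT_EXTRACTION_FAILED_WITH_403_FORBIDDEN",
    "PAGE_CONTENT_EXTRACTION_FAILED_WITH_404_NOT_FOUND",
    "PAGE_CONTENT_EXTRACTION_FAILED_WITH_500_INTERNAL_SERVER_ERROR",
    "PAGE_CONTENT_EXTRACTION_FAILED_WITH_4XX",
    "PAGE_CONTENT_EXTRACTION_FAILED_WITH_5XX",
}


def classify_status(status: str) -> Outcome:
    """Map a status message to a coarse client-side outcome."""
    if status in IN_PROGRESS:
        return Outcome.WAIT
    if status == "PROCESSED":
        return Outcome.DONE
    if status in FAILURES:
        return Outcome.FAILED
    # Statuses not listed in this section are rejected rather than guessed at.
    raise ValueError(f"unknown status: {status!r}")
```

Raising on an unrecognized value, rather than defaulting to a failure, keeps the sketch from silently misclassifying statuses that are not covered in this section.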

...