Content Classification
Managing content for classification by Scope3
The Scope3 classification system is designed to allow the publisher or media owner to provide content for classification upon publication or update. The content owner can share any moderation or classification that has already been performed (based on pre-defined classification and moderation definitions shared with Scope3).
Processing time
Content classification can take a few seconds or even minutes during high-load periods. To see the current processing time and average lag, please visit the Scope3 API status page.
For priority processing (for instance, a front-page story), publishers may add a priority flag to the classification request.
Artifact ID
The artifact ID is a universal identifier of a piece of content. This could represent an article, a blog post, a board, an episode, or a chat. Some contexts - for instance a universal scroll - will have multiple pieces of content, and in these cases an array of artifact IDs will represent the content adjacent to an ad placement. The artifact ID will be shared with brands in reporting, and ideally is represented by a URL or path that can be accessed externally so brands can see the content their ads were adjacent to (recognizing that this will not always be possible).
Long-form video: Each scene or segment of a show, between ad breaks, can be represented as a separate piece of content. This allows an ad break to be described as adjacent to 2-4 scenes.
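As a minimal sketch of the segment model above, the snippet below builds one artifact ID per segment and lists the IDs adjacent to an ad break. The `#segment-N` suffix follows the example ID used later on this page; the helper names `segment_ids` and `adjacent_to_break` and the `window` parameter are illustrative assumptions, not part of any Scope3 API.

```python
def segment_ids(title_path: str, segment_count: int) -> list[str]:
    """Build one artifact ID per segment, e.g. /title/81344015#segment-1.

    The '#segment-N' suffix is an assumed naming scheme for illustration.
    """
    return [f"{title_path}#segment-{n}" for n in range(1, segment_count + 1)]


def adjacent_to_break(ids: list[str], break_after: int, window: int = 1) -> list[str]:
    """Return the artifact IDs within `window` segments on either side of an ad break.

    `break_after` is the 1-based number of the segment that the break follows.
    """
    start = max(0, break_after - window)
    end = min(len(ids), break_after + window)
    return ids[start:end]


ids = segment_ids("/title/81344015", 4)
# Ad break between segment 2 and segment 3, one segment on each side.
print(adjacent_to_break(ids, break_after=2))
```

With `window=2` the same break would be described by all four segments, which corresponds to the upper end of the 2-4 scene range.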
Describing content
Scope3 recommends that publishers share as much context about content as possible, including all images, videos, and other media assets that are present on the page or screen. Metadata such as title, summary, keywords, transcripts, and descriptions also helps produce accurate classification.
The Artifact Schema may be used to pass in rich information about content.
Publisher classifiers
Publishers often perform content moderation and classification both manually and algorithmically. Scope3 will model these classifiers in its internal taxonomy and compare them to its reference classifiers. For instance, a publisher may have a policy that prohibits articles about climate denial and instruct its editors to enforce this policy. This no-climate-denial classifier would be present on all articles from the publisher. Scope3 would compare the publisher's definition of climate denial to its own reference definition and incorporate this as a data point in its classification of the content according to advertiser brand values and preferences.
Scope3 Signals
Through the classification process, Scope3 will assign various signals to uploaded content. These signals may represent generic criteria like "not pornographic" or brand specific outputs like "meets Mastercard US guidelines". These signals will be returned to the publisher using the Get Classification API or through a cloud storage bucket.
Signals will be added and updated on a regular basis as brands are added to the Scope3 platform and/or change their criteria. We recommend refreshing signals for all active content every few days, and at least once a month.
Uploading content
Publishers may upload one or multiple pieces of content by sending newline-separated JSON (aka JSONL) to the File Upload API, or by uploading a file to a cloud storage bucket (configurable using CSP).
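The sketch below shows one way to serialize several content records as JSONL, one JSON object per line. The field names (`artifactID`, `title`, `summary`, `keywords`) are assumptions for illustration only; the Artifact Schema is the authoritative definition of the fields, and the upload call itself is not shown.

```python
import json


def to_jsonl(artifacts: list[dict]) -> str:
    """Serialize records as newline-separated JSON, one object per line.

    json.dumps without `indent` never emits a raw newline, so each record
    stays on a single line even if a field value contains line breaks.
    """
    return "".join(json.dumps(a, ensure_ascii=False) + "\n" for a in artifacts)


# Hypothetical records; field names are assumed, not taken from the Artifact Schema.
records = [
    {
        "artifactID": "/title/81344015#segment-1",
        "title": "Episode 1, opening scene",
        "keywords": ["drama", "pilot"],
    },
    {
        "artifactID": "/title/81344015#segment-2",
        "title": "Episode 1, second scene",
        "summary": "First line.\nSecond line.",
    },
]

payload = to_jsonl(records)
print(payload, end="")
```

The resulting string can be written to a file for a cloud storage bucket or sent as the body of an upload request.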
Getting signals back
Three options:
- Set up a cloud storage bucket where all updated content signals will be sent on a regular (~hourly) basis.
- Set up a webhook that will receive a POST request with updated signals.
- Poll the content signals API asynchronously to get the latest signals for all managed content.
Output Schema
Note that the Scope3 classification pipeline runs asynchronously and may run more than once for a piece of content (reflecting algorithmic updates or the addition of new brands). In other words:
- At 10:00, you upload /title/81344015#segment-1
- At 11:00, Scope3 writes a content signals file to your S3 bucket with 1,000 rows including the signals for this content ID
- At 15:00, Scope3 writes another content signals file, and it contains an updated set of signals for this content ID. These signals should replace the signals from the 11:00 file.
Field | Value | Example |
---|---|---|
artifactID | the artifact ID uploaded to Scope3 | /title/81344015#segment-1 |
language | the language for this classification. A row will be sent for each uploaded language | en |
timestamp | the timestamp of the classification, used to make sure the latest version of signals is applied | 2024-09-03 10:45:00 |
signals | an array of signals, represented as strings | ["bki3s", "sji38"] |
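The replacement rule described above (a later row for the same content supersedes an earlier one) can be sketched as follows. This assumes rows arrive as dictionaries with the four fields in the table, that signals are stored per artifact ID and language, and that timestamps use the `YYYY-MM-DD HH:MM:SS` format shown in the example; the function name `apply_signal_rows` and the in-memory `store` are illustrative.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed from the example value in the table


def apply_signal_rows(store: dict, rows: list[dict]) -> dict:
    """Merge signal rows into `store`, keeping only the newest row per key.

    The key is (artifactID, language), since one row is sent per language.
    A newer row replaces the stored signals entirely; it is not merged.
    """
    for row in rows:
        key = (row["artifactID"], row["language"])
        ts = datetime.strptime(row["timestamp"], TIMESTAMP_FORMAT)
        current = store.get(key)
        if current is None or ts >= current["timestamp"]:
            store[key] = {"timestamp": ts, "signals": list(row["signals"])}
    return store


store = {}
file_1100 = [{"artifactID": "/title/81344015#segment-1", "language": "en",
              "timestamp": "2024-09-03 10:45:00", "signals": ["bki3s", "sji38"]}]
file_1500 = [{"artifactID": "/title/81344015#segment-1", "language": "en",
              "timestamp": "2024-09-03 14:50:00", "signals": ["bki3s"]}]

apply_signal_rows(store, file_1100)
apply_signal_rows(store, file_1500)
print(store[("/title/81344015#segment-1", "en")]["signals"])
```

Comparing timestamps rather than relying on file arrival order means a file that is processed late cannot overwrite newer signals.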