Versions of a document

This page intends to create clarity on the concept of “versions” by describing the expected behaviour. Versions are used extensively for two main purposes:

  • Display in the front end. Any (unevaluated) version can be displayed in the front end.

  • Evaluations. When submitting feedback, it is always compared to an existing version. The default is the last prediction, but this can be modified by passing “cmp_version” in the /feedback endpoint.

Evaluating a specific version (using the /feedback endpoint) will not modify the original, but rather create a copy of whichever version is to be evaluated first.

 

Here are a couple of important properties every Version has:

  • source: this determines whether the Version originates from a submission of input (source = “human”) or whether it was generated internally by making predictions, or evaluating (source = “machine”).

  • is_evaluated: this determines whether the version is the result of comparing two versions with each other (evaluation) or not. This means that data to base statistics on should always have “is_evaluated” True.

  • name: this is a chosen name for the version. By default, the following names are “reserved” for internal use:

    • predicted: the original predictions made for a file (source=”machine”). Usually the first version in terms of timestamp

    • submitted: a version that is created by default when a user submits feedback (Pressing done in the front end or sending data to /feedback with default settings (name=”submitted”, evaluate=True))

    • saved: a version only created when a user “saves” a file

    • latest: the version name used for looking up the latest version

 

Here are some scenario’s and the expected versions after each:

Upload of a file: we expect 1 Version with:

"name": "predicted", "source": "machine", "is_evaluated": false,

Reprocess a predicted file. Two Versions with

"name": "predicted", "source": "machine", "is_evaluated": false,

if a different “version” <custom_version_name> is specified;

"name": <custom_version_name>, "source": "machine", "is_evaluated": false,

Submitting feedback with "name" <custom_version_name> and evaluate false. One new version added:

Submitting feedback with "name" <custom_version_name> and evaluate true. Two new versions:

and

Submitting feedback with "name" <custom_version_name> and evaluate true and cmp_version <custom_cmp_version_name>. Two new versions (as above):

and

(of which this last one is the evaluation that is the result of comparing custom_cmp_version_name and custom_version_name)

 

Some important takeaways:

  • Evaluated versions always have

  • The name of the evaluated version will always be the same as the submitted “feedback” that triggers the evaluation

  • A query on Versions with source “human” will return only (and all) feedback that was sent in, while a query with source machine can be both predictions (is_evaluated False) and evaluation (is_evaluated True)