# Train your models

To start automating your document processes, you need to train the relevant models. You can do this once you have sufficient labeled data (see below for more details).

## Model management

Go to Menu > Training > Model management.

<figure><img src="/files/VEwn2jp5G3oEUwktGL1O" alt=""><figcaption></figcaption></figure>

Once enough documents are labeled as training documents, you can train or retrain the model.

You can start the following trainings:

* Page management for a document type
* Entity extraction for a document type
* Document classification for the project

Note that page management and entity extraction trainings are always linked to a specific document type, while document classification trainings always run for the project as a whole.

In the overview of trainings, you can see the total number of documents and the number of new documents. New documents have not yet been used to train the model. This helps you decide whether a new training is worthwhile: if there are very few new documents, the model won't learn anything new, and triggering a training is unnecessary.

## Training a model <a href="#training-a-model" id="training-a-model"></a>

### Page management <a href="#page-management" id="page-management"></a>

You can only trigger a page management training if your training data contains at least 10 documents of the relevant document type that have more than one page. If your data contains more than one language, you need at least 5 such multi-page documents per language. Languages with fewer documents are discarded, so the model will not make accurate predictions for those languages.
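The thresholds above can be sketched as a small check you might run on your own export of training data. This is only an illustration, not part of the Metamaze product or API: the function name, the `(language, page_count)` input shape, and the result keys are all hypothetical. The sketch also reports the overall minimum and the per-language filter separately, because how the platform combines them is not stated here.

```python
from collections import Counter

MIN_MULTIPAGE_DOCS = 10    # overall minimum stated above
MIN_DOCS_PER_LANGUAGE = 5  # per-language minimum when several languages are present


def page_management_readiness(documents):
    """Summarize readiness for one document type.

    `documents` is an iterable of (language, page_count) tuples; this
    input shape is hypothetical and chosen only for the example.
    """
    # Only documents with more than one page count toward the thresholds.
    per_language = Counter(lang for lang, pages in documents if pages > 1)
    if len(per_language) > 1:
        # Several languages: keep only those that reach the per-language minimum.
        kept = {lang for lang, n in per_language.items() if n >= MIN_DOCS_PER_LANGUAGE}
    else:
        kept = set(per_language)
    total = sum(per_language.values())
    return {
        "multipage_documents": total,
        "meets_overall_minimum": total >= MIN_MULTIPAGE_DOCS,
        "kept_languages": sorted(kept),
        "discarded_languages": sorted(set(per_language) - kept),
    }
```

For example, 8 multi-page English documents and 3 multi-page French documents reach the overall minimum, but French would fall below the per-language minimum and be reported as discarded.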

When clicking the train button for page management trainings, the following pop-up window opens:

<figure><img src="/files/roKipYEM9Oy2h4rm6BlP" alt=""><figcaption></figcaption></figure>

In the overview, you can see the total number of documents for each document type. Choose the document type(s) for which you want to launch a training. You also have the option to choose which projects' documents to use for the training.

When you click the next button, you get an overview of your training. If everything looks correct, click the start training button to start a new training.

<figure><img src="/files/n4WM4RpecFgz4deMvXFh" alt=""><figcaption></figcaption></figure>

### Document classification <a href="#document-classification" id="document-classification"></a>

You can trigger a document classification training if there are at least 10 documents with status "Processed" in your training data. Additionally, you need a minimum of 5 documents for each of at least 2 different document types. If you have more than one language in your data, you need at least 5 documents per language, per type. Languages and document types with fewer documents are discarded, so the model will not make accurate predictions for those documents.
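As a rough illustration of these rules, the sketch below counts documents per type and per (type, language) pair. It is not Metamaze code: the function name, the `(status, document_type, language)` tuple shape, and the result keys are hypothetical, and counting a type as eligible based on its total across languages is an assumption made for this example.

```python
from collections import Counter

MIN_PROCESSED_DOCS = 10  # overall minimum of "Processed" documents
MIN_DOCS_PER_GROUP = 5   # minimum per type, and per (type, language) pair
MIN_TYPES = 2            # at least two document types must qualify


def classification_readiness(documents):
    """`documents` is an iterable of (status, document_type, language) tuples."""
    # Only documents with status "Processed" count.
    processed = [(doc_type, lang) for status, doc_type, lang in documents
                 if status == "Processed"]
    per_type = Counter(doc_type for doc_type, _ in processed)
    per_pair = Counter(processed)
    languages = {lang for _, lang in processed}

    eligible_types = sorted(t for t, n in per_type.items() if n >= MIN_DOCS_PER_GROUP)
    # The per-language rule only applies when several languages are present.
    if len(languages) > 1:
        discarded_pairs = sorted(p for p, n in per_pair.items() if n < MIN_DOCS_PER_GROUP)
    else:
        discarded_pairs = []
    return {
        "enough_documents": len(processed) >= MIN_PROCESSED_DOCS,
        "enough_types": len(eligible_types) >= MIN_TYPES,
        "eligible_types": eligible_types,
        "discarded_type_language_pairs": discarded_pairs,
    }
```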

Clicking the train button opens a pop-up in which you select the type of training you want to launch. See [Train](/project/training/model-management/train.md#document-classification) for more information.

<figure><img src="/files/pAKED5XM8CdNyTXwxeku" alt=""><figcaption></figcaption></figure>

When you click next, you get an overview of the training you are about to launch. Clicking 'Start Training' trains a new model based on the current training data.

If you want to exclude a document type from the document type prediction model, you need to specify this in the document type settings (see [Configure a document type](/getting-started/project-management/create-a-document-type.md)). You cannot exclude a document type through the model management module.

### Entity extraction

To trigger a training for entity extraction, you need at least 10 documents with status "Processed" for the relevant document type, and at least one entity with at least 10 annotations. The number of documents per language is irrelevant. Entities with fewer than 10 annotations are discarded.
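The entity extraction requirements can be expressed as a short check. This is a minimal sketch for reasoning about your own data, not a Metamaze API: the function name, its parameters, and the result keys are hypothetical.

```python
MIN_PROCESSED_DOCS = 10          # "Processed" documents for the document type
MIN_ANNOTATIONS_PER_ENTITY = 10  # annotations an entity needs to be trained


def entity_extraction_readiness(processed_document_count, annotations_per_entity):
    """`annotations_per_entity` maps an entity name to its annotation count."""
    trainable = sorted(name for name, n in annotations_per_entity.items()
                       if n >= MIN_ANNOTATIONS_PER_ENTITY)
    discarded = sorted(name for name, n in annotations_per_entity.items()
                       if n < MIN_ANNOTATIONS_PER_ENTITY)
    return {
        # Both conditions must hold: enough documents and at least one trainable entity.
        "can_train": processed_document_count >= MIN_PROCESSED_DOCS and bool(trainable),
        "trainable_entities": trainable,
        "discarded_entities": discarded,
    }
```

With 12 processed documents and the hypothetical counts `{"invoice_number": 12, "iban": 4}`, the check allows a training in which `invoice_number` is trained and `iban` is discarded.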

When clicking the "Train" button for entity extraction, a pop-up window will open.

<figure><img src="/files/4e6lrJs4i4aGCvs9ACRn" alt=""><figcaption></figcaption></figure>

Select the document type(s) for which you want to launch a training and the type of training you want.

You can choose to do a full or an incremental training. A full training includes all the documents of the relevant document type with status "Processed" in your training data, and trains a model from scratch. Depending on your training data size, a full training can take a long time (even more than a day).

An incremental training only includes the documents that have been added since the last training, and simply updates the previous model version. Incremental trainings are much faster than full trainings, and should be selected whenever possible.

Full trainings are recommended only if the training is the first training ever, if your previous model version had a very low accuracy (F1 score < 70%), or if you notice that incremental trainings have ceased to improve the quality of the predictions. In any other scenario, trigger an incremental training.
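The recommendation above amounts to a simple decision rule, sketched below. The function and parameter names are hypothetical and only illustrate the rule; the F1 score is assumed to be given as a fraction between 0 and 1, with `None` meaning that no model has been trained yet.

```python
from typing import Optional

LOW_F1_THRESHOLD = 0.70  # "very low accuracy" threshold stated above


def recommended_training_type(previous_f1: Optional[float],
                              incremental_still_improving: bool = True) -> str:
    """Return "full" or "incremental" following the guidance above."""
    if previous_f1 is None:
        return "full"  # first training ever
    if previous_f1 < LOW_F1_THRESHOLD:
        return "full"  # previous model version had a very low F1 score
    if not incremental_still_improving:
        return "full"  # incremental trainings no longer improve predictions
    return "incremental"
```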

You can also choose which projects' documents you want to include when training a new model. When you click next, you get an overview of your training. Clicking start training trains a new model.

<figure><img src="/files/SO5rlzsPAo4haWrnf2ZN" alt=""><figcaption></figcaption></figure>

## Testing and improving your models

At the end of a training, some suggested tasks will be added to the tasks module (see [Tasks](/project/training/tasks.md)). These tasks allow you to do two things:

* get an idea of how well your models perform by looking at real model predictions, without having to deploy your model to production
* improve your training data and model accuracy by executing the tasks

The cycle of annotating data, reviewing annotations and retraining your models can be repeated until the predictions of your models reach the desired accuracy. At that point, you can roll them out (deploy) to production to start automatic document processing. It is always possible to roll back and deploy an older model (see [Model management](/project/training/model-management.md) for more details).

Alternatively, you can decide to go straight to production, even though your models are not very accurate yet or even when you do not have any models. Thanks to the human intervention module (see [Validation](/project/production/validation.md)), your team will be able to correct model predictions or add missing predictions to documents **before** the output is sent back to your system, allowing you to keep the data in your system clean. The annotations and corrections that are added in human intervention will be used to improve your models, and over time, your document process will be fully automated.

