# Parsing

When extracting information from documents, Metamaze can parse data types such as dates and numbers. This has the following advantages:

* not needing to parse dates & numbers in your own application
* parsing rules that can be configured for each of your projects separately
* output provided by Metamaze being formatted the way you need it to be

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2FxMAZtdF2z9GlTZNfZF4O%2Fimage.png?alt=media&#x26;token=6bc727ed-7946-4244-a390-63a4be1c23d3" alt=""><figcaption></figcaption></figure>

## General behavior

Under the hood, the parser always tries to use the context within a document to parse ambiguous dates and numbers. It looks for unambiguous dates in the document, learns their format, and applies that format to the ambiguous dates and numbers in the same document.
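
The idea of learning a format from context can be illustrated with a minimal Python sketch. This is not Metamaze's implementation: the function names, the `-` separator, and the three-part input shape are assumptions made only for this example.

```python
from datetime import date

def infer_day_first(date_strings):
    """Return True (day first), False (month first), or None if undecidable.

    Hypothetical helper: assumes every string has the shape 'a-b-yyyy'.
    """
    for text in date_strings:
        a, b, _year = (int(part) for part in text.split("-"))
        if a > 12 and b <= 12:
            return True   # the first part can only be a day
        if b > 12 and a <= 12:
            return False  # the second part can only be a day
    return None

def parse_with_context(text, day_first):
    """Parse 'a-b-yyyy' using the order learned from the document."""
    if day_first is None:
        raise ValueError("no unambiguous date found in the document")
    a, b, year = (int(part) for part in text.split("-"))
    day, month = (a, b) if day_first else (b, a)
    return date(year, month, day)

# "25-12-2022" is unambiguous, so it decides how "01-03-2023" is read.
document_dates = ["25-12-2022", "01-03-2023"]
day_first = infer_day_first(document_dates)
parsed = parse_with_context("01-03-2023", day_first)
print(parsed)  # 2023-03-01
```

When no unambiguous date exists in the document, the sketch raises an error. In Metamaze, that is the situation the configurable rules described below are meant to cover.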

## Date parsing

Parsing dates can be a challenge, and the right approach can differ from project to project. Metamaze allows you to configure how dates are parsed.

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2FI1l3gf4gqWF9JtHGjHz0%2Fimage.png?alt=media&#x26;token=97e27cf4-7578-4317-abcf-42995afe2341" alt=""><figcaption></figcaption></figure>

### Missing data

When parts of a date are missing, you can define default rules on how to handle the situation.

When missing the day part of a date you can choose to:

* Go to human validation
* Use the first day of the month
* Use the last day of the month

When missing the year part of a date you can choose to:

* Go to human validation
* Use the closest year
* Use the current year
* Use the next year
* Use the previous year
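
To make the options above concrete, here is a small Python sketch of how such default rules could behave. The rule names and function signatures are hypothetical. Interpreting "closest year" as the year that places the date nearest to a reference date is an assumption, not documented Metamaze behavior.

```python
import calendar
from datetime import date

def fill_missing_day(year, month, rule):
    """Complete a date that has no day part (rule names are hypothetical)."""
    if rule == "first_day":
        return date(year, month, 1)
    if rule == "last_day":
        # monthrange returns (weekday of first day, number of days in month)
        return date(year, month, calendar.monthrange(year, month)[1])
    return None  # stands in for "go to human validation"

def fill_missing_year(day, month, rule, reference):
    """Complete a date that has no year part, relative to a reference date.

    Note: 29 February is not handled; date() raises ValueError for it
    in non-leap years.
    """
    offsets = {"current_year": 0, "next_year": 1, "previous_year": -1}
    if rule in offsets:
        return date(reference.year + offsets[rule], month, day)
    if rule == "closest_year":
        # Assumption: pick the year that puts the date nearest the reference.
        candidates = [date(reference.year + k, month, day) for k in (-1, 0, 1)]
        return min(candidates, key=lambda d: abs((d - reference).days))
    return None  # stands in for "go to human validation"

print(fill_missing_day(2024, 2, "last_day"))                          # 2024-02-29
print(fill_missing_year(20, 12, "closest_year", date(2023, 1, 5)))    # 2022-12-20
```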

### AI parsing

You can enable AI as a parsing fallback. The fallback can be used when parsing fails, when parsing stops on an ambiguous date, or both.

When the AI functionality is enabled, a text field is shown in which you can give extra instructions for parsing.

### Failed parsing

Sometimes the parsing just fails. Here you can decide how to deal with that situation:

* Go to human validation
* Make the entity value blank
* Remove the entity
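
The three options can be pictured with a short Python sketch. The entity dictionaries, the policy names, and the `needs_validation` flag are all hypothetical and only illustrate the difference between the options. They do not reflect Metamaze's actual output format.

```python
def apply_failure_policy(entities, policy):
    """Apply a failed-parsing policy to a list of entity dicts.

    Hypothetical shape: {"name": str, "value": parsed value or None}.
    A value of None marks an entity whose parsing failed.
    """
    result = []
    for entity in entities:
        if entity["value"] is not None:
            result.append(entity)  # parsed fine, keep unchanged
        elif policy == "human_validation":
            result.append({**entity, "needs_validation": True})
        elif policy == "blank":
            result.append({**entity, "value": ""})
        elif policy == "remove":
            continue  # drop the entity from the output
        else:
            raise ValueError(f"unknown policy: {policy}")
    return result

entities = [
    {"name": "invoice_date", "value": None},       # parsing failed
    {"name": "due_date", "value": "2023-03-01"},   # parsed successfully
]
print(apply_failure_policy(entities, "remove"))
# [{'name': 'due_date', 'value': '2023-03-01'}]
```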

### Parsing ambiguous two-part dates

Metamaze allows you to configure how to deal with two-part dates. You can treat the dates as being in one of the following formats:

* day - month
* month - year
* year - month
* month - day
* week - year
* year - week
* Closest to upload date

There is a special option "Stop" that allows you to exclude parsing options. When the parsing stops this way you can:

* Go to human validation
* Make the entity value blank

{% hint style="info" %}
**Closest to upload date:** chooses the candidate date that is closest to the upload date, e.g.\
\
Upload date: 01-01-2023\
Date on document: 01-03-2023\
The date is ambiguous because it can be read as 1 March 2023 or as 3 January 2023\
\
This rule chooses 3 January 2023 (03-01-2023), as it is the closer of the two possible dates to the upload date
{% endhint %}

### Parsing ambiguous three-part dates

Metamaze allows you to configure how to deal with three-part dates. You can treat the dates as being in one of the following formats:

* day - month - year
* month - day - year
* year - month - day
* Closest to upload date

There is a special option "Stop" that allows you to exclude parsing options. When the parsing stops this way you can:

* Go to human validation
* Make the entity value blank

{% hint style="info" %}
**Closest to upload date:** chooses the candidate date that is closest to the upload date, e.g.\
\
Upload date: 01-01-2023\
Date on document: 01-03-2023\
The date is ambiguous because it can be read as 1 March 2023 or as 3 January 2023\
\
This rule chooses 3 January 2023 (03-01-2023), as it is the closer of the two possible dates to the upload date
{% endhint %}
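
The "Closest to upload date" rule can be sketched in a few lines of Python. This is only an illustration of the rule as described in the hint, assuming a hypothetical function name and a `-` separated input with the year last.

```python
from datetime import date

def closest_to_upload(text, upload_date):
    """Pick the reading of 'a-b-yyyy' that lies nearest to the upload date."""
    a, b, year = (int(part) for part in text.split("-"))
    candidates = []
    for day, month in ((a, b), (b, a)):  # day-month and month-day readings
        try:
            candidates.append(date(year, month, day))
        except ValueError:
            pass  # this reading is not a valid calendar date
    if not candidates:
        return None
    return min(candidates, key=lambda d: abs((d - upload_date).days))

# Same values as in the hint: uploaded on 1 January 2023.
print(closest_to_upload("01-03-2023", date(2023, 1, 1)))  # 2023-01-03
```

If both readings are equally far from the upload date, this sketch returns the day-month reading because it is tried first. How a real parser breaks such ties is not specified here.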

### Test parser

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2FDE8Tb5L0S3SSLpWvgmqw%2Fimage.png?alt=media&#x26;token=cff97c61-c54f-44fa-b67a-56f635f4c000" alt=""><figcaption></figcaption></figure>

You are able to test your parser configuration. A default set of examples is provided, but you can fill in your own date value and press the "Test value" button to see how the parser parses your input.

## Number parsing

Parsing numbers can be a challenge, and the right approach can differ from project to project. Metamaze allows you to configure how numbers are parsed.

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2FQHoYDI6c200MmzK9JWnx%2Fimage.png?alt=media&#x26;token=cd4ab7b5-f947-420f-b3f1-ed85ca4b07cc" alt=""><figcaption></figcaption></figure>

### AI parsing

You can enable AI as a parsing fallback. The fallback can be used when parsing fails, when parsing stops on an ambiguous number, or both.

When the AI functionality is enabled, a text field is shown in which you can give extra instructions for parsing.

### Failed parsing

Sometimes the parsing just fails. Here you can decide how to deal with that situation:

* Go to human validation
* Make the entity value blank
* Remove the entity

### Parsing ambiguous numbers with decimals

Metamaze allows you to configure how to deal with ambiguous number formats. You can do the following:

* Treat decimal signs always as decimals
* Treat decimal signs always as thousand separators
* Go to human validation
* Make the entity value blank
* Set default settings
  * Use one of the following as a thousand separator
    * Dot
    * Comma
  * Use one of the following as a decimal separator
    * Dot
    * Comma
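
A value such as `1.234` shows why these settings matter: it means 1234 when the dot is a thousand separator and 1.234 when it is a decimal separator. The Python sketch below applies default separator settings to a string. The function and parameter names are hypothetical and this is not how Metamaze is implemented internally.

```python
from decimal import Decimal

def parse_number(text, thousand_sep=".", decimal_sep=","):
    """Parse a number string using explicitly configured separators."""
    if thousand_sep == decimal_sep:
        raise ValueError("thousand and decimal separators must differ")
    # Drop thousand separators, then normalise the decimal separator to a dot.
    cleaned = text.strip().replace(thousand_sep, "").replace(decimal_sep, ".")
    return Decimal(cleaned)

print(parse_number("1.234,56"))                                   # 1234.56
print(parse_number("1,234.56", thousand_sep=",", decimal_sep="."))  # 1234.56

# The same ambiguous input gives different results per configuration.
print(parse_number("1.234"))                                      # 1234
print(parse_number("1.234", thousand_sep=",", decimal_sep="."))   # 1.234
```

`Decimal` is used instead of `float` so that values such as amounts keep their exact decimal representation.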

### Test parser

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2FWh3Tv3v6c84hJSIm1a7j%2Fimage.png?alt=media&#x26;token=42de71d6-2faa-44bb-8425-6321dedb7d7e" alt=""><figcaption></figcaption></figure>

You are able to test your parser configuration. A default set of examples is provided, but you can fill in your own number value and press the "Test value" button to see how the parser parses your input.

{% hint style="warning" %}
The parser currently supports parsing natural language dates and numbers in English, French, and Dutch. If you require natural language parsing in other languages, please contact us via [getting-support](https://docs-old.app.metamaze.eu/getting-support "mention").
{% endhint %}
