# Tutorial for creating a new enrichment

In this tutorial, we'll define a new enrichment step by step for looking up a supplier based on a provided VAT number.

The complete code of this tutorial can be found on [GitHub](https://github.com/metamaze/public_enrichments/blob/main/python/SimpleLookupEnrichment/server.py).

## 1. Create and configure a simple test project

For this tutorial, we will assume you are familiar with working in the Python programming language.

1. Create a new project in Metamaze with a clear name like `Development project for testing enrichments`.
2. In that project, create a new document type and give it a name, like `Purchase Order`.
3. Add an entity to that document type called `VAT number`. Make this entity **required** so that all documents go to human validation, which is helpful for testing purposes.

Next, we will configure the enrichments API.

## 2. Create a new API endpoint for matching a supplier

First, we define a boilerplate Flask server with Bearer token authentication:

```python

import logging
from os import environ as env

# Basic Flask imports and configuration
from flask import Flask, jsonify, request
from flask_httpauth import HTTPTokenAuth

app = Flask(__name__)
auth = HTTPTokenAuth(scheme="Bearer")

# Define your Bearer token
BEARER_TOKEN = env.get("BEARER_TOKEN", "[optionally set bearer token here]")

@auth.verify_token
def verify_token(token):
    return bool(token == BEARER_TOKEN)
    
##### ADD API CALLS HERE LATER

if __name__ == "__main__":
    app.run(host="0.0.0.0", debug=True)

    # to debug locally, use the following command:
    # FLASK_APP=server.py FLASK_ENV=development flask run --port 5001
```

Then, let's add some example data.

```python
EXAMPLE_SUPPLIERS_DB = [
    ("ABC Company", "Kerkstraat 1, 1000 Brussel", "BE0123456789"),
    ("XYZ Corporation", "123 Main St, New York", "US987654321"),
    ("PQR Enterprises", "456 Elm St, Los Angeles", "US123456789"),
    ("LMN Corporation", "789 Oak St, Miami", "US543216789"),
    ("DEF Ltd", "10 Baker St, London", "GB987654321"),
    ("GHI SARL", "20 Rue de la Paix, Paris", "FR123456789"),
    ("JKL Srl", "Via Roma 1, Rome", "IT987654321"),
    ("MNO GmbH", "Hauptstraße 10, Berlin", "DE123456789"),
    ("RST S.L.", "Calle Mayor 5, Madrid", "ES987654321"),
    ("UVW Sp. z o.o.", "ul. Główna 15, Warsaw", "PL123456789"),
]
```

In real life, this data would be populated by reading from a database or a reference file.
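
As a rough illustration of the reference-file case, the sketch below reads the same three columns from a CSV file with Python's standard `csv` module. The header names (`name`, `address`, `vat_number`), the helper name `load_suppliers`, and the file name `suppliers.csv` mentioned afterwards are assumptions made for this example; none of them come from the tutorial's repository.

```python
import csv
import io


def load_suppliers(csv_file):
    """Read (name, address, VAT number) tuples from an open CSV file.

    Assumes a header row with the hypothetical column names
    `name`, `address` and `vat_number`.
    """
    reader = csv.DictReader(csv_file)
    return [(row["name"], row["address"], row["vat_number"]) for row in reader]


# In-memory stand-in for a real reference file, so the example runs as-is
sample = io.StringIO(
    "name,address,vat_number\n"
    'ABC Company,"Kerkstraat 1, 1000 Brussel",BE0123456789\n'
    'DEF Ltd,"10 Baker St, London",GB987654321\n'
)
EXAMPLE_SUPPLIERS_FROM_FILE = load_suppliers(sample)
```

With a file on disk, the same function could be called as `load_suppliers(open("suppliers.csv", newline="", encoding="utf-8"))`, ideally inside a `with` block so the file is closed afterwards.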

We'll transform the data to make it a bit easier to work with. Let's create a simple dictionary for each supplier and store them in a new `SUPPLIERS` list.

```python
SUPPLIERS = [
    {
        "id": vat_number,
        "company_name": company_name,
        "company_address": address,
        "company_vat_number": vat_number,
    }
    for company_name, address, vat_number in EXAMPLE_SUPPLIERS_DB
]
```

Great! Now, we are ready to define our first API call. We are going to use a `GET` request to the `/api/find-supplier` route, and re-use the token-based authentication we defined earlier.

```python
@app.route("/api/find-supplier", methods=["GET"])
@auth.login_required
def find_supplier():
    content = request.json
```

The body of the API call will contain the whole document ([reference](https://app.metamaze.eu/docs/index.html#tag/Enrichments/paths/~1enrichment-endpoint-configured-by-the-user-in-the-application/get)). In this case, we want to match based on the found `VAT number`, so let's first find the correct value:

```python
    # Find the first value of the entity with name "VAT number"
    vat_number = None
    for annotation in content["annotations"]:
        if annotation["entity"]["name"] == "VAT number":
            vat_number = annotation["text"]
            link = annotation["link"]
            break  # Stop at first match of VAT number

    # Stop if no VAT number was found
    if vat_number is None:
        return jsonify({"enrichments": []})
```

Make sure that the entity name in the code exactly matches the entity name you configured in Metamaze, including the casing.
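
To see why casing matters, here is a small standalone sketch of the same lookup. The helper name `first_entity_text` is made up for this example, and `example_body` is a trimmed-down stand-in that only contains the fields read in this tutorial; a real request body holds much more.

```python
def first_entity_text(content, entity_name):
    """Return the text of the first annotation whose entity name matches exactly."""
    for annotation in content.get("annotations", []):
        if annotation["entity"]["name"] == entity_name:
            return annotation["text"]
    return None


# Trimmed-down request body with only the fields this tutorial reads
example_body = {
    "annotations": [
        {"entity": {"name": "VAT number"}, "text": "BE 0123.456.789"},
    ]
}

found = first_entity_text(example_body, "VAT number")
not_found = first_entity_text(example_body, "vat number")  # casing differs, so no match
```

Because the comparison is a plain string equality, `"vat number"` does not match `"VAT number"` and the second call returns `None`.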

To improve matching rates, it often makes sense to make the lookup more robust than an exact string comparison. In real life, you would often even use fuzzy matching on multiple fields to find the correct record. For an example of how you could use fuzzy matching, see the example [FuzzyPurchaseOrderEnrichment](https://github.com/metamaze/public_enrichments/blob/main/typescript/FuzzyPurchaseOrderEnrichment/index.js) (TypeScript). In this case, we'll ignore all non-alphanumeric characters by adding:

```python
    vat_number = "".join([c for c in vat_number if c.isalnum()])
```
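
For comparison, a very small fuzzy lookup can be sketched with the standard library's `difflib`. This is not the approach taken in the linked example, only an illustration of the idea; the helper names, the extra upper-casing step, and the `cutoff` of 0.85 are arbitrary assumptions.

```python
import difflib


def normalize(value):
    """Keep only letters and digits, then upper-case (the upper-casing is an extra assumption)."""
    return "".join(c for c in value if c.isalnum()).upper()


def closest_vat_number(candidate, known_vat_numbers, cutoff=0.85):
    """Return the known VAT number most similar to `candidate`, or None."""
    matches = difflib.get_close_matches(
        normalize(candidate), known_vat_numbers, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


known = ["BE0123456789", "US987654321", "GB987654321"]

exact = closest_vat_number("be 0123.456.789", known)
one_typo = closest_vat_number("BE0123456780", known)  # last digit differs
no_match = closest_vat_number("NL000000000", known)
```

A lower `cutoff` tolerates more OCR or typing errors but also increases the risk of linking a document to the wrong supplier, so the threshold is worth tuning on real data.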

We'll find the first match in our reference data by using the `next` function:

```python
    supplier = next(
        (supplier for supplier in SUPPLIERS if supplier["company_vat_number"] == vat_number),
        None,
    )
```
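
Scanning the list is perfectly fine for ten suppliers. For larger reference data, one possible alternative is a dictionary keyed by VAT number, sketched below. The name `SUPPLIERS_BY_VAT` is an assumption for this example, and the supplier list is shortened so the snippet runs on its own.

```python
SUPPLIERS = [
    {"id": "BE0123456789", "company_name": "ABC Company", "company_vat_number": "BE0123456789"},
    {"id": "GB987654321", "company_name": "DEF Ltd", "company_vat_number": "GB987654321"},
]

# Build the index once at start-up; if a VAT number repeats, the last entry wins
SUPPLIERS_BY_VAT = {s["company_vat_number"]: s for s in SUPPLIERS}

supplier = SUPPLIERS_BY_VAT.get("GB987654321")  # the matching dictionary
unknown = SUPPLIERS_BY_VAT.get("XX000")         # None when nothing matches
```

Note the behavioural difference: `next` returns the first matching supplier, while the dictionary keeps the last one when VAT numbers are duplicated.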

Finally, all we need to do is return the found supplier object, if it exists:

```python
    if supplier:
        return jsonify({"enrichments": [{"name": "Find supplier", "value": [supplier], "link": link}]})
    else:
        return jsonify({"enrichments": []})
```

{% hint style="info" %}
We're only returning one match here. Later, you can extend that by returning multiple potential matches, or custom exception codes.
{% endhint %}
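
One possible way to return several candidates is sketched below. It assumes that, because `value` is already a list in the single-match response, it may hold more than one object; please confirm this against the API reference before relying on it. The helper name `build_enrichment_response`, the country-code rule, and the `"example-link"` placeholder are made up for illustration.

```python
def build_enrichment_response(matches, link, name="Find supplier"):
    """Build the response body with zero or more supplier objects in `value`."""
    if not matches:
        return {"enrichments": []}
    return {"enrichments": [{"name": name, "value": list(matches), "link": link}]}


suppliers = [
    {"id": "US987654321", "company_name": "XYZ Corporation"},
    {"id": "US123456789", "company_name": "PQR Enterprises"},
    {"id": "GB987654321", "company_name": "DEF Ltd"},
]

# Hypothetical rule: keep every supplier whose VAT number starts with the same country code
candidates = [s for s in suppliers if s["id"].startswith("US")]
response = build_enrichment_response(candidates, link="example-link")
```

Inside the Flask route, the returned dictionary would be wrapped in `jsonify(...)` as in the code above.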

Now, let's start up the Flask server by opening a terminal and running:

```bash
FLASK_APP=server.py FLASK_ENV=development flask run --port 5001
```
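
Before exposing the server, it can help to call the route locally. The sketch below builds a `GET` request with a JSON body using only the standard library. The body is a heavily simplified stand-in for what Metamaze sends, and the token `my-secret-token`, the `"example-link"` value, and the helper name `build_test_request` are placeholders you would replace with your own.

```python
import json
import urllib.request


def build_test_request(base_url, token, vat_text):
    """Build (but do not send) a GET request that mimics an enrichment call."""
    body = {
        "annotations": [
            {
                "entity": {"name": "VAT number"},
                "text": vat_text,
                "link": "example-link",  # placeholder, not a real Metamaze link
            }
        ]
    }
    return urllib.request.Request(
        url=f"{base_url}/api/find-supplier",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="GET",
    )


req = build_test_request("http://localhost:5001", "my-secret-token", "BE 0123.456.789")

# With the Flask server running, uncomment to send the request and print the reply:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```

If the token does not match `BEARER_TOKEN`, the server should answer with a `401` status instead of the enrichment JSON.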

To make sure Metamaze can access the little debug server running on your local machine, you can use a free tool like `ngrok`. For example, you could run `ngrok http 5001` and get output like this:

```
Session Status                online
...
Forwarding                    https://7032-109-135-42-38.ngrok-free.app -> http://localhost:5001
```

Grab the public URL (in this case `https://7032-109-135-42-38.ngrok-free.app`) and store it somewhere. We'll need it to configure the enrichment endpoint in the next step.

{% hint style="warning" %}
Note that `ngrok` tunnels are temporary and for debugging purposes only. If the session times out and you restart the tunnel, it will be on a different URL. You will need to change the URL in the enrichment settings too.
{% endhint %}

We have everything up and running now to start configuring Metamaze. Awesome work :tada:!

Remember, if you want the full code example, you can find it [here](https://github.com/metamaze/public_enrichments/blob/main/python/SimpleLookupEnrichment/server.py).

## 3. Configure the Find Supplier enrichment

1. In Metamaze, navigate to the Project Settings > Enrichments and click on the blue `+ Create` button to add a new enrichment.
2. Configure the General settings
   1. We'll give it the name `Find supplier`. Note that this needs to be exactly the same name as you are returning in the API call from before.
   2. Let's enable Human validation, and make the enrichment required. This will make it easier to debug.
   3. We'll link the document type "Purchase Order" we created in the set-up.

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2F9gR0hk2lAdqfAXDhR9zK%2FCleanShot%202023-11-21%20at%2016.05.09%402x.png?alt=media&#x26;token=1f479db2-88e0-4590-9902-951261d94020" alt=""><figcaption></figcaption></figure>

Then, we will define the Triggers.

1. Add a trigger "After entity extraction" of the document type "Purchase Order". This will make sure that the enrichment is triggered when entities are predicted automatically. Note that we didn't train a model in this tutorial.
2. Add a trigger "After labeling" of the entity `VAT number`. This will make sure that when we change an annotation manually, the enrichment will be re-triggered.

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2FphO2e2ljmUnqLQw2bOc7%2FCleanShot%202023-11-21%20at%2016.07.21%402x.png?alt=media&#x26;token=aa6b33a6-3896-4e6d-8bbb-a8fac52a3d87" alt=""><figcaption></figcaption></figure>

In the section "Value types", we will choose `Entries` since we are returning full objects, not just simple strings. We can define the columns that we want to show. On the left side, enter exactly the same names as the keys you return from the API. On the right side, we can give them user-friendly labels.

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2FvoWfHCNkR2447Rta5oG8%2FCleanShot%202023-11-21%20at%2016.08.21%402x.png?alt=media&#x26;token=3c67bf3a-d8c2-4723-b389-b1923a79e581" alt=""><figcaption></figcaption></figure>

Finally, we'll define the webhook. If you are using `ngrok`, make sure you use the live tunnel URL and append the route (`/api/find-supplier`) we defined in our code. Also make sure you enter the same Bearer token that your server expects.

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2F9s2usjKqPBAPDY5sra9M%2FCleanShot%202023-11-21%20at%2016.12.17%402x.png?alt=media&#x26;token=89fa8151-2673-4715-a5d1-e62ddd68ca08" alt=""><figcaption></figcaption></figure>

Click the `Create` button to finish your enrichment.

## 4. Upload a document and test

Navigate to your Production Uploads, and upload a new file. For example, we can use the [test file](https://github.com/metamaze/public_enrichments/blob/main/python/SimpleLookupEnrichment/Example%20document%20for%20enrichment.docx).

Since we have not trained any model, there will be no predictions at the start, and the document will look empty:

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2FMkbhGixhtWQELGFQqcpW%2FCleanShot%202023-11-21%20at%2016.17.33%402x.png?alt=media&#x26;token=b0235919-5eae-4596-8a7d-a1755ca31e98" alt=""><figcaption></figcaption></figure>

Add an annotation for the VAT number by clicking on the document and choosing the entity `VAT number`.

The enrichment will be triggered automatically, and you will see the result:

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2F66MerH2zhWIIsRjTQ0KT%2FCleanShot%202023-11-21%20at%2016.23.54%402x.png?alt=media&#x26;token=23343981-91db-46f6-a78d-7ce135c66567" alt=""><figcaption></figcaption></figure>

By clicking on the enrichment line, you can see all the details of the object too:

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2FGr4iS5wnFZ8XHYclYVUO%2FCleanShot%202023-11-21%20at%2016.24.23%402x.png?alt=media&#x26;token=55c9a773-cc88-4af3-a1fc-8fa4118f805e" alt=""><figcaption></figcaption></figure>

Congratulations, your first enrichment is complete :tada:!

If you look in the "All" tab, you'll notice that you can't search for suppliers here. We'll configure that in the next section.

## (Optional) Add a second API call to list *all* suppliers

### Add a new route to the Flask server

We can also add a second API call to list *all* the suppliers. In the code, add a new route on `/api/list-suppliers` by adding:

```python

@app.route("/api/list-suppliers", methods=["GET"])
@auth.login_required
def list_suppliers():
    return jsonify(SUPPLIERS)
```

The keys of the dictionaries contained in the `SUPPLIERS` list should be the same as the configured column names. These column names are set in the enrichment settings in the Metamaze platform.

For example, if you have a column called `company_name`, you should have a key called `company_name` in the dictionary too.

Here's an example of the response that we are returning:

```json
[
    {
        "id": "BE0123456789",
        "company_name": "ABC Company",
        "company_address": "Kerkstraat 1, 1000 Brussel",
        "company_vat_number": "BE0123456789"
    },
    ... // other objects
]
```
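
A quick sanity check for this key-to-column correspondence could look like the sketch below. The helper name `missing_columns` and the `configured` list are assumptions for this example; use the column names you actually entered in the enrichment settings.

```python
def missing_columns(rows, configured_columns):
    """Return, per row index, the configured column names that the row lacks."""
    problems = {}
    for index, row in enumerate(rows):
        missing = [column for column in configured_columns if column not in row]
        if missing:
            problems[index] = missing
    return problems


# Column names as they might be configured in the enrichment (assumed here)
configured = ["company_name", "company_address", "company_vat_number"]

rows = [
    {
        "id": "BE0123456789",
        "company_name": "ABC Company",
        "company_address": "Kerkstraat 1, 1000 Brussel",
        "company_vat_number": "BE0123456789",
    },
    {"id": "GB987654321", "company_name": "DEF Ltd"},  # incomplete on purpose
]

report = missing_columns(rows, configured)
```

An empty `report` means every row carries all configured columns; here the second row is flagged because two keys are absent.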

### Configure the options in the enrichment

Navigate back to the "Find supplier" enrichment in the Project Settings and go to the "Options" section. Fill in the new API route on the correct URL with the correct Bearer token.

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2Fvd3WN0MBYVCrQsqYxgca%2FCleanShot%202023-11-21%20at%2016.26.23%402x.png?alt=media&#x26;token=f79da77f-734f-4492-bf0b-a8977df5c7da" alt=""><figcaption></figcaption></figure>

Click Update.

### Test the options

Now, in the *All* tab, you will be able to search and select from a list of all suppliers:

<figure><img src="https://382487820-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-LuWehHupT-Y9kwe-PJb%2Fuploads%2F6kPUFjN0ShAWIzZyemWe%2FCleanShot%202023-11-21%20at%2016.28.02%402x.png?alt=media&#x26;token=0f745803-cd3e-4730-a176-48714b44994d" alt=""><figcaption></figcaption></figure>
