Entity extraction

Whatever NLU pipeline you use, the platform always comes with two entity extractors: one for custom entities and one for system entities.

Custom entities extractor

During training, the model learns to detect your custom entities by looking at:

  • the context: mostly the position of the entities in the utterances
  • the content: the reference values and their synonyms

Then, during parsing, the model will:

  • capture the entities found (if any) in the user request
  • return the reference value as metadata
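
The content side of this matching can be pictured as a lookup from synonyms to reference values. The sketch below is only an illustration under stated assumptions, not the platform's model: the entity definition, the synonyms, and the function names are hypothetical, and the context-based detection is ignored entirely.

```python
import re

# Hypothetical entity definition: each reference value with its synonyms.
FOOD_TYPE = {
    "italian food": ["pizza", "pasta", "lasagna"],
    "japanese food": ["sushi", "ramen"],
}

def build_lookup(entity):
    """Map every reference value and synonym (lowercased) to its reference value."""
    lookup = {}
    for reference, synonyms in entity.items():
        for term in [reference, *synonyms]:
            lookup[term.lower()] = reference
    return lookup

def find_entities(utterance, lookup):
    """Return (captured text, reference value) pairs found in the utterance."""
    found = []
    for term, reference in lookup.items():
        # Whole-word, case-insensitive match of the term in the utterance.
        pattern = rf"\b{re.escape(term)}\b"
        for match in re.finditer(pattern, utterance, flags=re.IGNORECASE):
            found.append((match.group(0), reference))
    return found

lookup = build_lookup(FOOD_TYPE)
print(find_entities("I love pizza", lookup))
```

The captured text plays the role of the string found in the request, and the second element of each pair is the reference value returned as metadata.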

For instance, here is a custom entity we created for food type detection:

[Image: the foodType custom entity]

We used this entity in the intent below:

[Image: the intent using the foodType custom entity]

Finding entities in the API output

Here is the result available in the API output for the request "I love pizza":

"foodType": [
	{
		"value": "italian food",
		"string": "pizza",
		"learned": false
	}
]

Some explanations on this format:

  • "foodType": the name of the custom entity
  • "value": the reference value for the detected entity
  • "string": the text captured by the extractor
  • "learned": tells whether the reference value is in the training dataset
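
As a rough illustration of how a client might read these fields, here is a minimal Python sketch. Only the "foodType" fragment comes from the example above; the enclosing JSON object and the helper name extract_custom_entities are assumptions made so the snippet is self-contained, and the real response may carry more fields.

```python
import json

# Assumed response body: the "foodType" fragment from the example above,
# wrapped in an object so that it is valid JSON on its own.
response_text = """
{
  "foodType": [
    {
      "value": "italian food",
      "string": "pizza",
      "learned": false
    }
  ]
}
"""

def extract_custom_entities(payload, entity_name):
    """Return (captured text, reference value, learned flag) for each match."""
    matches = payload.get(entity_name, [])  # empty list if the entity is absent
    return [(m["string"], m["value"], m["learned"]) for m in matches]

payload = json.loads(response_text)
food_types = extract_custom_entities(payload, "foodType")
for captured, reference, learned in food_types:
    print(f"{captured!r} -> {reference!r} (learned: {learned})")
```

Using payload.get with an empty default keeps the helper safe when the request contains no entity of that name.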

System entities extractor

The system entities extractor relies on a set of predefined entities that are not list-based but are still very helpful.
Here is a list of the available system entities that you can use in your intents:

  • numbers
  • phone numbers
  • amount of money
  • time and date
  • temperature
  • emails
  • urls
  • ...

Here is an output example for a date:

"time": [
	{
		"value": "2021-02-11 00:00:00 +00:00",
		"unit": "InstantTime",
		"string": "the 11/02/2021"
	}
]

Some explanations on this format:

  • "time": the name of the system entity
  • "value": the formatted value
  • "string": the raw text captured by the extractor before formatting
  • "unit": the unit of the entity, for instance euros for amounts of money or kelvin for temperatures
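
To show how the formatted value can be consumed, here is a minimal Python sketch that turns the "time" entity above into a datetime object. The enclosing JSON object, the helper name parse_time_entities, and the date format string are assumptions inferred from the single example value; other system entities may format their values differently.

```python
import json
from datetime import datetime, timezone

# Assumed response body: the "time" fragment from the example above,
# wrapped in an object so that it is valid JSON on its own.
response_text = """
{
  "time": [
    {
      "value": "2021-02-11 00:00:00 +00:00",
      "unit": "InstantTime",
      "string": "the 11/02/2021"
    }
  ]
}
"""

# Format inferred from the example value only (an assumption).
TIME_FORMAT = "%Y-%m-%d %H:%M:%S %z"

def parse_time_entities(payload):
    """Return (raw text, unit, timezone-aware datetime) for each time match."""
    results = []
    for match in payload.get("time", []):
        moment = datetime.strptime(match["value"], TIME_FORMAT)
        results.append((match["string"], match["unit"], moment))
    return results

payload = json.loads(response_text)
time_entities = parse_time_entities(payload)
for raw, unit, moment in time_entities:
    print(raw, unit, moment.isoformat())
```

Keeping the raw string next to the parsed value makes it easy to show users what was understood from their request.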