Designing Entities
Here are four questions you should ask yourself when designing your entities.
1) Am I reinventing the wheel?
We already provide system entities for numbers, times, dates, prices, etc.
Check them out (and use them) before reinventing the wheel.
Don't use numbers in your items
Using numbers in your entity may cause conflicts between the official system entity and your own custom entity. If you absolutely need numbers, you can combine custom and system entities in your intents.
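One way to catch this early is to scan your items before adding them to an entity. The sketch below is a hypothetical helper, not part of the platform: it assumes an entity is stored as a dictionary mapping each reference value to its list of synonyms, and it flags every value or synonym that contains a digit. The `rooms` entity is made up for illustration.

```python
import re

# Assumed layout (not the platform's own format): reference value -> synonyms.
Entity = dict[str, list[str]]


def items_with_digits(entity: Entity) -> list[str]:
    """Return every reference value or synonym that contains a digit."""
    flagged = []
    for value, synonyms in entity.items():
        for item in [value, *synonyms]:
            if re.search(r"\d", item):
                flagged.append(item)
    return flagged


# Hypothetical entity used only to exercise the helper.
rooms = {"Meeting room": ["room 12", "boardroom"], "Lobby": ["front desk"]}
print(items_with_digits(rooms))  # ['room 12']
```

Flagged items are candidates for being split into a custom entity plus a system number entity in the intent.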
2) Do I have enough items in my entity?
Because the entity extraction engine also uses a neural network, you should make sure to give it enough data to perform well.
5 items per entity
We recommend using at least 5 unique items per entity.
On top of that figure, synonyms are a bonus.
Avoid using only one item per entity
Because we have seen bots run into issues with this kind of entity, it is not possible to use single-item entities.
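These two rules can be checked mechanically. The following is a minimal sketch under stated assumptions: the entity is a dictionary mapping reference values to synonyms, an "item" means a reference value (synonyms are not counted, since they are described as a bonus), and the function name and messages are hypothetical.

```python
MIN_ITEMS = 5  # recommended minimum from the guideline above


def check_item_count(name, entity, minimum=MIN_ITEMS):
    """Return a warning string, or None when the entity has enough items."""
    # Count reference values only, ignoring case and surrounding spaces.
    unique_items = {value.strip().lower() for value in entity}
    count = len(unique_items)
    if count <= 1:
        return f"{name}: single-item entities cannot be used"
    if count < minimum:
        return f"{name}: only {count} unique items, at least {minimum} recommended"
    return None


movies = {"Star Wars": ["Darth Vader"], "Star Trek": ["Wesley"]}
print(check_item_count("movies", movies))
```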
3) Is the length of my items ok?
An entity is something you want to capture, so relative to the rest of the phrase it should not be dominant.
5 words per item
In our experience, you should keep your entity items and synonyms shorter than 5 words to prevent the model from diverging and over-capturing.
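A quick length check can flag items that risk over-capturing. This sketch assumes the same dictionary layout (reference value mapped to synonyms), counts words by splitting on whitespace, and reads "shorter than 5 words" as at most 4 words; the helper name and the sample entity are hypothetical.

```python
MAX_WORDS = 5  # guideline above: keep items shorter than 5 words


def overlong_items(entity, max_words=MAX_WORDS):
    """Return items whose whitespace-separated word count is max_words or more."""
    flagged = []
    for value, synonyms in entity.items():
        for item in [value, *synonyms]:
            if len(item.split()) >= max_words:
                flagged.append(item)
    return flagged


sample = {
    "Star Wars": [
        "Guerre des Etoiles",
        "a long time ago in a galaxy far far away",
    ]
}
print(overlong_items(sample))
```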
4) Am I leveraging enough synonyms?
For each reference value of your entity, consider adding synonyms. They improve the accuracy of the entity extraction as well as the natural-language feel of your conversation.
Here's an example.
movies:
  Star Wars:
    - Darth Vader
    - Dark Vador
    - Anakin
    - Luke
    - Skywalker
    - Guerre des Etoiles
  Star Trek:
    - Spoke
    - Jean-Luc Picard
    - Wesley
    - Entreprise
With the above "movies" entity:
- if the user inputs either "Star Wars", "Darth Vader" or "Guerre des Etoiles", then "Star Wars" is returned.
- if the user inputs either "Star Trek", "Spoke" or "Entreprise", then "Star Trek" is returned.
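The mapping from synonym to reference value can be pictured as a simple lookup. The sketch below only illustrates that mapping with an exact, case-insensitive match; it is not how the extraction engine works internally (the engine relies on a neural network), and the function names are hypothetical.

```python
movies = {
    "Star Wars": ["Darth Vader", "Dark Vador", "Anakin", "Luke",
                  "Skywalker", "Guerre des Etoiles"],
    "Star Trek": ["Spoke", "Jean-Luc Picard", "Wesley", "Entreprise"],
}


def build_lookup(entity):
    """Map every lowercased reference value and synonym to its reference value."""
    lookup = {}
    for value, synonyms in entity.items():
        for item in [value, *synonyms]:
            lookup[item.lower()] = value
    return lookup


def resolve(text, lookup):
    """Return the reference value for an exact, case-insensitive match, else None."""
    return lookup.get(text.strip().lower())


lookup = build_lookup(movies)
print(resolve("Guerre des Etoiles", lookup))  # Star Wars
print(resolve("Entreprise", lookup))          # Star Trek
print(resolve("Alien", lookup))               # None
```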
Entities advanced features
For SNP 1.06 and above
Please note that the features described in this section are available in the SNP 1.06 model and in later SNP models.
Again, while the default values for entity extraction work perfectly well, here are some advanced features you can tweak.
Accept unknown values
- Impact on intent classification: none
- Impact on entity extraction: If set to true, new values that the model was not trained on will be caught and returned
Matching strictness
- Impact on intent classification: none
- Impact on entity extraction: When matching entities, this setting behaves like the confidence threshold on intent classification. The higher the value, the stricter the matching.
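To give an intuition for how a strictness threshold behaves, here is a toy sketch. Everything in it is an assumption: the engine's actual scoring is not described here, so the example substitutes the string similarity ratio from Python's standard `difflib` module, and the function name, the 0-to-1 scale, and the threshold values are hypothetical.

```python
from difflib import SequenceMatcher


def match_entity(text, known_values, strictness):
    """Return the best-matching known value, or None if it scores below strictness.

    The similarity score is a stand-in for the engine's real confidence score.
    """
    best_value, best_score = None, 0.0
    for value in known_values:
        score = SequenceMatcher(None, text.lower(), value.lower()).ratio()
        if score > best_score:
            best_value, best_score = value, score
    # A higher strictness rejects more near-misses.
    return best_value if best_score >= strictness else None


values = ["Star Wars", "Star Trek"]
print(match_entity("Star Warz", values, strictness=0.8))
print(match_entity("Star Warz", values, strictness=0.95))
```

With the lower threshold the misspelled input is still matched to "Star Wars"; with the higher one it is rejected.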