Anonymization

Created by Sebastian David Garcia Saiz, Modified on Wed, 11 Jun at 3:56 PM by Sebastian David Garcia Saiz

Pangeanic Anonymization Solution (PAS) is built on Neural Networks (NN). These networks process text and output annotated content that identifies entities such as person names and addresses.


However, a Neural Network only identifies the patterns it was trained on, so it may miss new entities or fail to adapt to a specific use case.


To improve identification performance, PAS combines Neural Networks with:

  • Dictionaries

  • Rules


Anonymization Dictionaries

The simplest way to help the NN identify an entity is to declare it in a dictionary. These are called Anon Dictionaries as they list terms that can be anonymized.


For example, in a hospital setting where you want to ensure that all doctor names are anonymized, you can create an Anon Dictionary named DoctorsList. This file simply contains a list of names, one per line, in plain text format. The dictionary is used during the anonymization process to force detection.


Each Anon Dictionary is linked to an entity type. You can use an existing type, such as PER (Person Name), or create a new one (e.g., DOCTOR) and assign the list to it.
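
The dictionary idea above can be illustrated with a minimal sketch. It is not the PAS implementation: the function names, the sample names, and the whole-word, case-insensitive matching strategy are assumptions made for illustration. Only the one-term-per-line plain-text format and the DOCTOR entity type come from this article.

```python
import re

def load_anon_dictionary(content: str) -> list[str]:
    """Parse dictionary content: one term per line, blank lines ignored."""
    return [line.strip() for line in content.splitlines() if line.strip()]

def find_dictionary_entities(document: str, terms: list[str], entity_type: str):
    """Return (start, end, entity_type) spans for every whole-word term match.

    Hypothetical helper: PAS may match dictionary terms differently.
    """
    if not terms:
        return []
    # Try longer terms first so a full name wins over a shorter prefix of it.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in ordered) + r")\b",
        re.IGNORECASE,
    )
    return [(m.start(), m.end(), entity_type) for m in pattern.finditer(document)]

# Contents of a hypothetical DoctorsList file (invented names).
doctors_list = "Ana Ruiz\nJohn Whitaker\n"
terms = load_anon_dictionary(doctors_list)

text = "The patient was seen by Dr. John Whitaker and later by Ana Ruiz."
spans = find_dictionary_entities(text, terms, "DOCTOR")
found = [text[start:end] for start, end, _ in spans]
```

Here every listed name is reported as a DOCTOR span regardless of whether a neural model would have detected it, which is the "forced detection" behavior described above.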


Rules

Rules are based on Regular Expressions and are used to detect patterns in text. Learn more about regex syntax here: https://en.wikipedia.org/wiki/Regular_expression


Like Anon Dictionaries, each rule is associated with an existing or new entity type.


Rules are particularly useful for detecting:

  • Driving license numbers

  • Employee IDs

  • Case or file reference numbers

  • Bank account numbers
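
As a rough sketch of how such rules could work, the example below pairs each regular expression with an entity type. The identifier formats (EMP-000000, CASE/year/number), the entity type names, and the Rule class are invented for illustration; they are not PAS rule syntax, and real formats depend on the organization.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    """A hypothetical rule: a regex pattern tied to an entity type."""
    name: str
    entity_type: str
    pattern: re.Pattern

# Assumed formats, for illustration only.
RULES = [
    Rule("employee_id", "EMPLOYEE_ID", re.compile(r"\bEMP-\d{6}\b")),
    Rule("case_reference", "CASE_REF", re.compile(r"\bCASE/\d{4}/\d{1,5}\b")),
]

def apply_rules(document: str, rules: list[Rule]):
    """Return (start, end, entity_type) spans for every rule match, in text order."""
    spans = []
    for rule in rules:
        for match in rule.pattern.finditer(document):
            spans.append((match.start(), match.end(), rule.entity_type))
    return sorted(spans)

text = "Employee EMP-004217 opened CASE/2024/318 yesterday."
matches = [(text[start:end], entity) for start, end, entity in apply_rules(text, RULES)]
```

Identifiers with a fixed, predictable shape suit this approach because a pattern captures every possible value without listing them one by one, which a dictionary cannot do.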


Anonymization Profiles

When users anonymize a document via ECO, they must provide:

  • The language (e.g., English, Spanish, Japanese)

  • The list of entities to anonymize

  • The anonymization mode (e.g., redaction, pseudonymization)

It would be cumbersome to ask users to manually choose every dictionary and rule each time. To streamline the process, admins can create Anonymization Profiles.


An Anonymization Profile bundles dictionaries and rules under a memorable name. Users simply select the language, entity list, anonymization mode, and optionally one of the admin-defined profiles, which then applies the appropriate dictionaries and rules automatically.
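
The relationship between a request and a profile can be pictured with the following sketch. The class names, field names, and the sample profile are assumptions made for illustration and do not reflect the actual ECO or PAS data model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class AnonymizationProfile:
    """Hypothetical admin-defined bundle of dictionaries and rules."""
    name: str
    dictionaries: tuple = ()
    rules: tuple = ()

@dataclass
class AnonymizationRequest:
    """What a user supplies: language, entities, mode, and an optional profile."""
    language: str
    entities: list
    mode: str
    profile: Optional[AnonymizationProfile] = None

    def resources(self) -> dict:
        """Resolve the dictionaries and rules implied by the selected profile."""
        if self.profile is None:
            return {"dictionaries": [], "rules": []}
        return {
            "dictionaries": list(self.profile.dictionaries),
            "rules": list(self.profile.rules),
        }

# An admin defines the profile once (names are invented).
hospital = AnonymizationProfile(
    name="Hospital",
    dictionaries=("DoctorsList",),
    rules=("employee_id", "case_reference"),
)

# A user picks the profile by name instead of choosing each resource.
request = AnonymizationRequest(
    language="English",
    entities=["PER", "DOCTOR"],
    mode="redaction",
    profile=hospital,
)
resolved = request.resources()
```

Keeping the profile optional mirrors the description above: a request without a profile still carries the language, entity list, and mode, and simply resolves to no extra dictionaries or rules.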
