Purger Service#
Added in version 2.4.0.
The Purger service was introduced in ESS 2.4.0. It can be used as part of a workflow for deleting user data from ESS. This enables organizations using ESS to comply with legislative requirements, such as GDPR/CCPA and the right to have personal data deleted.
Warning
The Purger service permanently deletes a user’s data, so the operator must take great care to restrict access to it.
Purging User Data#
ESS’ Purger service allows an operator to delete all of a user’s data. This service exposes HTTP endpoints which can be called by trusted operator agents.
Purging Process#
The Purger service orchestrates the process of sending purge requests to each of the services configured as purgeable.
The process starts by validating the request and only continues if all services report that the request is valid. The purge process starts asynchronously and is only complete when all the purgeable services have completed their purge successfully.
The default list of services configured to be purged in a standard ESS deployment is shown below. For each service, there is a description of what would be purged and how it would validate the request.
| Service | Purged Data | Validation |
|---|---|---|
| | Data related to this WebID such as client credentials is deleted. | The WebID must be issued by this service. |
| | The WebID Profile Document is deleted. | The WebID must be hosted on this service. |
| | Metadata and resources within each storage are deleted. | All storages are hosted on this service and their data subject is the WebID. |
| | All access controls applying to each storage are deleted. | Not required. |
| | All index entries associated with each storage are deleted. | Not required. |
| | All credentials where the WebID is the subject are revoked and deleted. | Not required. |
Purger Service Endpoints#
By default, the ESS Purger service runs from the following root URL:
https://purger.<ESS Domain>
The ESS Purger service consists of the following endpoints:
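| Endpoint | Description |
|---|---|
| /purge | Start an async purge of a user’s data and return a Location header containing the status URI for the purge. |
| /purge/status/{id} | Check status of a purge. The purge is complete when the status is COMPLETED. |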
Start Purge#
The Purger service provides an endpoint which starts an async process that will purge all user data associated with an Agent from ESS.
Input#
| Endpoint | /purge |
| Method | POST |
| Authorization | An access token for a trusted agent that is allowed to call the purge endpoints. |
| Content-Type | application/json |
| Payload | A JSON object containing the WebID of the Agent to be purged and a list of all the Storages where the Agent is the Data Subject. All Storages must be supplied. |
The purge request payload is represented as JSON.
{
"webid": "<the WebID to purge>",
"storages": ["<a storage URI>", "<another storage URI...>"]
}
The operator is responsible for constructing this payload. This involves determining a user’s WebID and identifying the URIs of all storages associated with it. Every storage must be included so that none is orphaned once the WebID is deleted.
Purge request validation rules:
- The user identified by the WebID must be the data subject of every provided storage.
- Both the webid and the storages fields must be present.
- The storages list may not be empty.
- The values in webid and the storages list must be valid, absolute URIs.
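The structural rules above can be checked on the operator side before a request is sent. The following is a minimal Python sketch, not the Purger service's own validation: the function names are hypothetical, and "absolute URI" is narrowed here to "has a scheme and a host", which is an assumption. The data-subject rule cannot be checked locally and is left to the service.

```python
from urllib.parse import urlparse


def is_absolute_uri(value):
    """Return True if value parses as a URI with both a scheme and a host."""
    if not isinstance(value, str):
        return False
    try:
        parsed = urlparse(value)
    except ValueError:
        # urlparse rejects some malformed inputs (e.g. unbalanced brackets).
        return False
    return bool(parsed.scheme) and bool(parsed.netloc)


def validate_purge_payload(payload):
    """Return a list of problems; an empty list means the local checks passed."""
    problems = []

    if "webid" not in payload:
        problems.append("missing field: webid")
    elif not is_absolute_uri(payload["webid"]):
        problems.append("webid is not an absolute URI")

    if "storages" not in payload:
        problems.append("missing field: storages")
    else:
        storages = payload["storages"]
        if not isinstance(storages, list) or not storages:
            problems.append("storages must be a non-empty list")
        else:
            for uri in storages:
                if not is_absolute_uri(uri):
                    problems.append(f"storage is not an absolute URI: {uri!r}")

    return problems
```

Passing these checks does not guarantee that the service accepts the request; it only catches malformed payloads early.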
Example request:
POST /purge HTTP/1.1
Host: purger.example.com
Authorization: Bearer xxxxxxxx
Content-Type: application/json
{
"webid":"https://id.{ESS Domain}/alice",
"storages":[
"https://storage.{ESS Domain}/ead20d575gf8/",
"https://storage.{ESS Domain}/b506cb130798/"
]
}
Output#
If the purge request is valid, the Purger service starts the purge process asynchronously and the client receives a 201 response with a Location header containing the status URI for this purge.
Example response:
HTTP/1.1 201 Created
Content-Length: 0
Location: https://purger.{ESS Domain}/purge/status/b8ca941a-7b18-4458-85e1-5e14cb9dbb0f
Note
This endpoint is idempotent so a client can make the exact same purge request multiple times and it will behave in the same way as the initial purge request.
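As an illustration of the request and response described in this section, here is a minimal Python sketch using only the standard library. The helper names and the purger.example.com root URL are assumptions made for the example; how the access token is obtained is outside the scope of this sketch.

```python
import json
import urllib.request


def build_purge_request(purger_root, access_token, webid, storages):
    """Build (but do not send) the POST /purge request described above."""
    body = json.dumps({"webid": webid, "storages": list(storages)}).encode("utf-8")
    return urllib.request.Request(
        url=purger_root.rstrip("/") + "/purge",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def status_uri_from_response(status_code, headers):
    """Return the status URI from a 201 response, or None for any other status."""
    if status_code != 201:
        return None
    # Header names are case-insensitive, so normalise before looking up Location.
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("location")
```

The request could then be sent with urllib.request.urlopen(request), passing the response's status and headers to status_uri_from_response. Because the endpoint is idempotent, retrying the same request after a network error should be safe.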
Check Purge Status#
Once a purge process has started, the Purger service provides an endpoint that allows a client to determine when the purge has completed.
Input#
| Endpoint | /purge/status/{id} |
| Method | GET |
| Authorization | An access token for a trusted agent that is allowed to call the purge endpoints. |
Example request:
GET /purge/status/xyz HTTP/1.1
Host: purger.example.com
Authorization: Bearer xxxxxxxx
Output#
The response from this endpoint will be the status of the purge. The status field can be IN_PROGRESS, COMPLETED or FAILED.
- IN_PROGRESS indicates the client can continue to poll this endpoint until it changes to COMPLETED or FAILED.
- COMPLETED indicates that the purge task has successfully concluded across all services.
- FAILED indicates that one or more services were unable to complete the purge. ESS will create log and audit entries to indicate the nature of the problem (e.g. a timeout or other unexpected error) for further investigation.
Example response:
HTTP/1.1 200 OK
Content-Type: application/json
{
"id":"b8ca941a-7b18-4458-85e1-5e14cb9dbb0f",
"webid":"https://id.{ESS Domain}/alice",
"storages":[
"https://storage.{ESS Domain}/ead20d575gf8/",
"https://storage.{ESS Domain}/b506cb130798/"
],
"status":"COMPLETED",
"modified":"2025-02-27T10:36:21.036125360Z"
}
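A client typically polls the status endpoint until the purge reaches a terminal state. The following Python sketch shows one way to structure that loop. The fetch_status callable is a hypothetical, caller-supplied function that performs the GET request and returns the parsed JSON body; the poll interval and poll limit are illustrative values, not ESS defaults.

```python
import time


def wait_for_purge(fetch_status, poll_every=5.0, max_polls=100, sleep=time.sleep):
    """Poll until the purge reaches a terminal status and return that status.

    fetch_status is a caller-supplied function that returns the parsed JSON
    body of the status response as a dict.
    """
    for _ in range(max_polls):
        status = fetch_status()["status"]
        if status in ("COMPLETED", "FAILED"):
            return status
        if status != "IN_PROGRESS":
            raise ValueError(f"unexpected purge status: {status!r}")
        # Still running: wait before asking again.
        sleep(poll_every)
    raise TimeoutError("purge still IN_PROGRESS after the maximum number of polls")
```

Injecting sleep and fetch_status keeps the loop easy to test without network access or real delays.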
Backup Processing and Purging Data#
The ESS Purger does not require any changes to an established backup process. The service is idempotent, allowing purge requests to be submitted repeatedly for the same set of WebID(s) and Storage(s), even if a partial restore of ESS data has been performed.
Operators must retain a history of all purge requests submitted since the last backup so they can be replayed in the event of a backup restore operation. Failure to replay purge requests submitted after the last backup will result in data being restored into the live system, nullifying prior purge requests.
Recommendation: Perform a backup of all ESS data prior to submitting a purge request.
Recommendation: During a restore operation, limit ingress to only allow access to the Purger endpoints until the purge history has been successfully replayed and all https://purger.{ESS Domain}/purge/status/{id} requests return a status of COMPLETED.
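The history that operators retain can be as simple as a timestamped list of submitted payloads. The Python sketch below illustrates selecting the requests to replay after a restore. The record structure and function names are hypothetical; ESS does not prescribe how this history is stored.

```python
from datetime import datetime, timezone


def record_purge(history, webid, storages, submitted_at=None):
    """Append one purge request to the operator-maintained history list."""
    history.append({
        "webid": webid,
        "storages": list(storages),
        "submitted_at": submitted_at or datetime.now(timezone.utc),
    })


def requests_to_replay(history, last_backup_at):
    """Return payloads submitted at or after the restored backup, oldest first.

    Requests at exactly the backup time are included; replaying them is safe
    because purge requests are idempotent.
    """
    pending = [e for e in history if e["submitted_at"] >= last_backup_at]
    pending.sort(key=lambda e: e["submitted_at"])
    return [{"webid": e["webid"], "storages": e["storages"]} for e in pending]
```

Each returned payload would be resubmitted to the Start Purge endpoint and its status polled until it reports COMPLETED.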
Configuration#
As part of the installation process, Inrupt provides base Kustomize overlays and associated files that require deployment-specific configuration inputs.
The following configuration options are available for the service and may be set as part of updating the inputs for your deployment. The Inrupt-provided base Kustomize overlays may be using updated configuration values that differ from the default values.
Required#
- INRUPT_PURGER_PHASES_{phase}_PRIORITY#
Determines the order in which the purger phases are executed. Phases with lower numbers are purged before those with higher numbers. Each phase must have a unique priority; the service will not start if the configuration does not conform to this rule.
- INRUPT_PURGER_PHASES_{phase}_SERVICES_{service}_ENDPOINT#
The URL of the internal purgeable service for a phase. Multiple services can be included in a phase.
Multiple purgeable services can be configured using indexed properties. For example:
INRUPT_PURGER_PHASES_PHASE1_SERVICES_OPENID_ENDPOINT=https://ess-openid
INRUPT_PURGER_PHASES_PHASE1_SERVICES_WEBID_ENDPOINT=https://ess-webid
INRUPT_PURGER_PHASES_PHASE1_PRIORITY=1
INRUPT_PURGER_PHASES_PHASE1_DELAY=PT5M
INRUPT_PURGER_PHASES_PHASE2_SERVICES_PROVISION_ENDPOINT=https://ess-pod-provision
INRUPT_PURGER_PHASES_PHASE2_SERVICES_AUTHORIZATION_ENDPOINT=https://ess-authorization-acp
INRUPT_PURGER_PHASES_PHASE2_PRIORITY=2
etc.
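To check a phase configuration before deploying it, the naming scheme above can be parsed and ordered offline. The following Python sketch is an illustration only and is not how the Purger service reads its configuration. It assumes phase and service names consist of uppercase letters and digits without underscores, as in the example.

```python
import re

PRIORITY_RE = re.compile(r"^INRUPT_PURGER_PHASES_([A-Z0-9]+)_PRIORITY$")
ENDPOINT_RE = re.compile(
    r"^INRUPT_PURGER_PHASES_([A-Z0-9]+)_SERVICES_([A-Z0-9]+)_ENDPOINT$"
)


def ordered_phases(env):
    """Return [(phase, priority, {service: endpoint})] sorted by priority."""
    priorities, services = {}, {}
    for name, value in env.items():
        if match := PRIORITY_RE.match(name):
            priorities[match.group(1)] = int(value)
        elif match := ENDPOINT_RE.match(name):
            services.setdefault(match.group(1), {})[match.group(2)] = value

    # Mirror the documented rule: priorities must be unique across phases.
    if len(set(priorities.values())) != len(priorities):
        raise ValueError("each phase must have a unique priority")

    # PRIORITY is a required setting, so flag phases that define only services.
    missing = sorted(set(services) - set(priorities))
    if missing:
        raise ValueError(f"phases without a priority: {missing}")

    return [
        (phase, priority, services.get(phase, {}))
        for phase, priority in sorted(priorities.items(), key=lambda item: item[1])
    ]
```

Calling ordered_phases(dict(os.environ)) after importing os would apply the same check to the current environment.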
Note
The Purger application ships with a default purging sequence applicable to a default ESS deployment. Operators may need to override the default configuration to remove purgeable service configurations inapplicable to their particular deployments.
Optional Configuration#
- INRUPT_PURGER_BATCH_SIZE#
Default: 100
This config is used internally for the batch size for cleaning up completed purge data and processing stale in-progress purges.
- INRUPT_PURGER_CLEANUP_TASK_EVERY#
Default: PT5H
This config is used internally for the scheduled task that cleans up completed purge data.
- INRUPT_PURGER_IN_PROGRESS_TIMEOUT_SECONDS#
Default: 120
This config is used internally for finding stale in-progress purges that have not been progressing and need to be processed.
- INRUPT_PURGER_PROCESS_TASK_EVERY#
Default: PT5M
This config is used internally for the scheduled task that processes stale in-progress purges.
- INRUPT_PURGER_STATUS_RETENTION_WINDOW#
Default: P2D
This config is used internally for the retention of completed purge data.
- INRUPT_PURGER_NOTIFY_EVERY#
Default: PT30S
Internal config setting the rate at which the purger will notify listening processes of status updates.
- INRUPT_PURGER_PHASES_{phase}_DELAY#
This optional config introduces a time delay after a purge phase. For example, to allow access tokens issued by the OpenID service to expire before moving to the next purge phase, set this to the access token lifespan.
- INRUPT_PURGER_POLL_EVERY#
Default: PT5S
Rate at which the purger will check the ongoing purge statuses.
- INRUPT_PURGER_TIMEOUT#
Default: PT180M
Timeout for an individual purge task. Beyond this time, the purge will be considered failed.
- QUARKUS_LOG_LEVEL#
Default: INFO
Logging level.
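Several of the defaults above (for example PT5H, P2D and PT180M) appear to be ISO 8601 durations. As an aid to reading them, the following Python sketch converts the simple day, hour, minute and second forms into timedelta values. It is a simplified parser written for this illustration, not the one used by ESS, and it does not cover the full ISO 8601 syntax.

```python
import re
from datetime import timedelta

# Matches forms such as P2D, PT5H, PT30S or P1DT12H (whole numbers only).
DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Convert a simple ISO 8601 duration string into a timedelta."""
    match = DURATION_RE.match(text)
    if not match:
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    if not parts:
        # Strings such as "P" or "PT" carry no components at all.
        raise ValueError(f"empty duration: {text!r}")
    return timedelta(**parts)
```

For instance, parse_duration("PT180M") yields a three-hour timedelta, which is the default INRUPT_PURGER_TIMEOUT.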
Kafka Configuration#
Tip
See also ESS’ Kafka Configuration.
- INRUPT_KAFKA_AUDITV1EVENTSENCRYPTED_CIPHER_PASSWORD#
The strong cipher key to use when running auditing with encrypted messages.
Added in version 2.1.5.
- INRUPT_KAFKA_AUDITV1EVENTSPRODUCERENCRYPTED_CIPHER_PASSWORD#
The strong cipher key to use when running auditing with encrypted messages over the auditv1eventsproducerencrypted topic.
Added in version 2.2.0.
- KAFKA_BOOTSTRAP_SERVERS#
Default: localhost:9092
Comma-delimited list of Kafka broker servers for use by ESS services, including this service.
Setting KAFKA_BOOTSTRAP_SERVERS configures ESS to use the same Kafka instance(s) for all its Kafka message channels (e.g., solidresource and auditv1out message channels). This service uses the auditv1out message channel.
Note
Inrupt-provided overlays default to using KAFKA_BOOTSTRAP_SERVERS.
To use a different Kafka instance for the auditv1out channel, use specific message channel configuration.
See also ESS’ Kafka Configuration.
Service Configuration Logging#
- INRUPT_LOGGING_CONFIGURATION_PREFIX_ALLOW#
Default: inrupt
A comma-separated list of configuration property prefixes (case-sensitive) that determine which configurations are logged:
- If the list is empty, NO configuration property is logged.
- If a configuration property starts with a listed prefix (case-sensitive), the configuration property and its value are logged unless the configuration also matches a prefix in INRUPT_LOGGING_CONFIGURATION_PREFIX_DENY (which acts as a filter on the INRUPT_LOGGING_CONFIGURATION_PREFIX_ALLOW list).
As such, if the configuration matches a prefix in both INRUPT_LOGGING_CONFIGURATION_PREFIX_ALLOW and INRUPT_LOGGING_CONFIGURATION_PREFIX_DENY, INRUPT_LOGGING_CONFIGURATION_PREFIX_DENY takes precedence and the configuration is not logged. For example, if inrupt. is an allow prefix, but inrupt.kafka. is a deny prefix, all configurations that start with inrupt.kafka. are excluded from the logs.
When specifying the prefixes, you can use one of two formats:
- dot notation (e.g., inrupt.foobar.), or
- the MicroProfile Config environmental variables conversion value (e.g., INRUPT_FOOBAR_).
Warning
Use the same format for both INRUPT_LOGGING_CONFIGURATION_PREFIX_ALLOW and INRUPT_LOGGING_CONFIGURATION_PREFIX_DENY. For example, if you change the format of INRUPT_LOGGING_CONFIGURATION_PREFIX_ALLOW, change the format of INRUPT_LOGGING_CONFIGURATION_PREFIX_DENY as well.
Tip
To avoid allowing more configurations than desired, specify as much of the prefix as possible. If the prefix specifies the complete prefix term, include the term delineator. For example:
- If using dot notation, to match configuration properties of the form foobar.<xxxx>..., specify foobar. (including the dot .) instead of, for example, foo or foobar.
- If using converted form, to match configuration properties of the form FOOBAR_<XXXX>..., specify FOOBAR_ (including the underscore _) instead of, for example, FOO or FOOBAR.
Added in version 2.2.0.
- INRUPT_LOGGING_CONFIGURATION_PREFIX_DENY#
Default: inrupt.kafka
A comma-separated list of configuration name prefixes (case-sensitive) that determines which configurations (that would otherwise match the INRUPT_LOGGING_CONFIGURATION_PREFIX_ALLOW list) are not logged. That is, INRUPT_LOGGING_CONFIGURATION_PREFIX_DENY acts as a filter on INRUPT_LOGGING_CONFIGURATION_PREFIX_ALLOW. For example:
- If foobar. is an allowed prefix, to suppress foobar.private.<anything>, you can add foobar.private. to the deny list.
- If foobar. is not an allowed prefix, no property starting with foobar. is logged. As such, you do not need to add foobar.private to the deny list.
When specifying the prefixes, you can use one of two formats:
- dot notation (e.g., inrupt.foobar.), or
- the MicroProfile Config environmental variables conversion value (e.g., INRUPT_FOOBAR_).
Warning
Use the same format for both INRUPT_LOGGING_CONFIGURATION_PREFIX_ALLOW and INRUPT_LOGGING_CONFIGURATION_PREFIX_DENY. For example, if you change the format of INRUPT_LOGGING_CONFIGURATION_PREFIX_ALLOW, change the format of INRUPT_LOGGING_CONFIGURATION_PREFIX_DENY as well.
Added in version 2.2.0.
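The interaction between the allow and deny lists can be summarized as "logged only if some allow prefix matches and no deny prefix matches". The Python sketch below expresses that rule for dot-notation prefixes. The function names are hypothetical, and this is an illustration of the documented behavior rather than the ESS implementation.

```python
def parse_prefixes(value):
    """Split a comma-separated prefix list, ignoring blank entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def is_logged(name, allow, deny):
    """Return True if the property would be logged under the rules above."""
    # Prefix matching is case-sensitive, as str.startswith is.
    allowed = any(name.startswith(prefix) for prefix in allow)
    denied = any(name.startswith(prefix) for prefix in deny)
    # Deny takes precedence over allow; an empty allow list logs nothing.
    return allowed and not denied
```

With allow set to inrupt. and deny set to inrupt.kafka., a property such as inrupt.kafka.foo is excluded while other inrupt. properties are logged.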
Log Redaction#
- INRUPT_LOGGING_REDACTION_NAME_ACTION#
Default: REPLACE
Type of the redaction to perform. Supported values are:
| Action | Description |
|---|---|
| REPLACE | Default. Replaces the matching text with a specified replacement. |
| PLAIN | Leaves the matching field unprocessed. Only available if the redaction target is a field (i.e., INRUPT_LOGGING_REDACTION_{NAME}_FIELD). |
| DROP | Suppresses the matching field. Only available if the redaction target is a field (i.e., INRUPT_LOGGING_REDACTION_{NAME}_FIELD). |
| PRIORITIZE | Changes the log level of the matching message. |
| SHA256 | Replaces the matching text with its hash. |
- If the action is REPLACE (default), see also INRUPT_LOGGING_REDACTION_{NAME}_REPLACEMENT.
- If the action is PRIORITIZE, see also INRUPT_LOGGING_REDACTION_{NAME}_LEVEL.
For more information on log redaction, see Logging Redaction.
Added in version 2.2.0.
- INRUPT_LOGGING_REDACTION_NAME_ENABLED#
Default: true
A boolean that determines whether the redaction configurations with the specified INRUPT_LOGGING_REDACTION_{NAME}_ prefix are enabled.
For more information on log redaction, see Logging Redaction.
Added in version 2.2.0.
- INRUPT_LOGGING_REDACTION_NAME_EXCEPTION#
Fully qualified name of the exception class to match in the log messages (includes inner exception). Configure to target an exception message class.
For more information on log redaction, see Logging Redaction.
Added in version 2.2.0.
- INRUPT_LOGGING_REDACTION_NAME_FIELD#
Exact name of the field to match in the log messages. Configure to target a specific log message field for redaction.
For more information on log redaction, see Logging Redaction.
Added in version 2.2.0.
- INRUPT_LOGGING_REDACTION_NAME_LEVEL#
A new log level to use for the log message if the INRUPT_LOGGING_REDACTION_{NAME}_ACTION is PRIORITIZE.
Added in version 2.2.0.
- INRUPT_LOGGING_REDACTION_NAME_PATTERN#
A regex (see Java regex pattern) to match in the log messages. Configure to target log message text that matches a specified pattern.
For more information on log redaction, see Logging Redaction.
Added in version 2.2.0.
- INRUPT_LOGGING_REDACTION_NAME_REPLACEMENT#
Replacement text to use if the INRUPT_LOGGING_REDACTION_{NAME}_ACTION is REPLACE.
If unspecified, defaults to [REDACTED].
For more information on log redaction, see Logging Redaction.
Added in version 2.2.0.
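To illustrate what the pattern-based REPLACE and SHA256 actions do to a log message, here is a small Python sketch. It is not the ESS implementation: it uses Python's re module rather than Java regex (the two dialects differ in places), and the hex encoding of the SHA-256 hash is an assumption, since the output format is not specified here.

```python
import hashlib
import re


def redact(message, pattern, action="REPLACE", replacement="[REDACTED]"):
    """Apply a REPLACE or SHA256 style redaction to text matching pattern."""
    regex = re.compile(pattern)
    if action == "REPLACE":
        # A function is used so the replacement text is inserted literally.
        return regex.sub(lambda match: replacement, message)
    if action == "SHA256":
        return regex.sub(
            lambda match: hashlib.sha256(match.group(0).encode("utf-8")).hexdigest(),
            message,
        )
    raise ValueError(f"action not covered by this sketch: {action}")
```

The field-based actions (PLAIN, DROP) and PRIORITIZE act on log fields and levels rather than message text, so they are not covered by this sketch.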
Additional Information#
See also Quarkus Configuration Options.