Pruning
ESS provides a pruning feature to perform hard deletes of soft-deleted resources and orphan data.
Pruning (Hard Deletes)
Prune consists of two Kubernetes CronJobs:
One to delete “prunable” resources. Prunable resources are resources that have been marked for deletion (i.e., soft-deleted) and are past their INRUPT_STORAGE_PRUNE_RETENTION_WINDOW.
One to delete orphan data. Orphan data are resource data without associated metadata.
Important: Pruning operations may negatively affect performance. If possible, schedule the CronJobs to run at times when their impact is minimal. To configure the Prune jobs, see Modify Prune Configuration.
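The definition of a “prunable” resource above can be illustrated with a minimal Python sketch. This is not ESS code: the function name `is_prunable`, the `deleted_at` timestamp, and the strict "greater than" comparison at the window boundary are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_prunable(deleted_at, retention_window, now):
    """Return True if a soft-deleted resource is past its retention window.

    deleted_at is None for resources that were never soft-deleted
    (a hypothetical representation, not the ESS data model).
    """
    if deleted_at is None:
        return False  # never marked for deletion, so never prunable
    # Assumption: a resource becomes prunable strictly after the window elapses.
    return now - deleted_at > retention_window


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
window = timedelta(days=30)  # stands in for INRUPT_STORAGE_PRUNE_RETENTION_WINDOW

old_delete = datetime(2023, 12, 1, tzinfo=timezone.utc)
recent_delete = datetime(2024, 1, 15, tzinfo=timezone.utc)

print(is_prunable(old_delete, window, now))
print(is_prunable(recent_delete, window, now))
print(is_prunable(None, window, now))
```

Only the first resource was soft-deleted more than 30 days before `now`, so it is the only one this sketch treats as prunable.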
Configuration
Configuration to Prune Soft-Deleted Resources
For soft-deleted resources, Prune has the following configurations:
INRUPT_STORAGE_PRUNE_RETENTION_WINDOW
Required.
Determines which soft-deleted resources are “prunable”.
Specify the value in a format supported by the Java Duration.parse() method (an ISO-8601 duration, such as PT48H for 48 hours).
INRUPT_STORAGE_PRUNE_PRUNABLE_BATCH_SIZE
Required.
Limits the number of results returned when querying the metadata.
Set to an integer value.
INRUPT_STORAGE_PRUNE_ORPHAN_BATCH_SIZE
Required.
Set to 0 when pruning soft-deleted resources.
COM_INRUPT_STORAGE_METADATA_JDBC_CONNECTIONLIMITER_OPENCONNECTION_TIMEOUT_VALUE
Required.
Determines how long to keep the connection to the metadata database open.
Set to an integer value. Adjust the value to accommodate changes in
To configure the Prune jobs, see Modify Prune Configuration
Configuration to Prune Orphan Data
For orphan data, Prune has the following configurations:
INRUPT_STORAGE_PRUNE_RETENTION_WINDOW
Optional.
No impact on pruning orphan data.
INRUPT_STORAGE_PRUNE_PRUNABLE_BATCH_SIZE
Required.
Set to 0 when pruning orphan data.
INRUPT_STORAGE_PRUNE_ORPHAN_BATCH_SIZE
Required.
Determines the maximum number of data identifiers selected by Prune during orphan data cleanup.
Set to an integer value.
COM_INRUPT_STORAGE_METADATA_JDBC_CONNECTIONLIMITER_OPENCONNECTION_TIMEOUT_VALUE
Required.
Determines how long to keep the connection to the metadata database open.
Adjust the value to accommodate changes to INRUPT_STORAGE_PRUNE_ORPHAN_BATCH_SIZE.
To configure the Prune jobs, see Modify Prune Configuration
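The two configurations above differ mainly in which batch size is set to 0. The following Python sketch shows one way to sanity-check a set of environment values before applying them to a CronJob. The function `prune_mode` and the sample values are hypothetical, and rejecting a configuration where both batch sizes are non-zero is an assumption, since the text above only describes the two documented combinations.

```python
def prune_mode(env):
    """Classify a Prune job configuration from its environment values.

    Returns "soft-deleted" or "orphan"; raises ValueError for any other
    combination of batch sizes.
    """
    prunable = int(env["INRUPT_STORAGE_PRUNE_PRUNABLE_BATCH_SIZE"])
    orphan = int(env["INRUPT_STORAGE_PRUNE_ORPHAN_BATCH_SIZE"])

    if prunable > 0 and orphan == 0:
        # The retention window is required when pruning soft-deleted resources.
        if not env.get("INRUPT_STORAGE_PRUNE_RETENTION_WINDOW"):
            raise ValueError("INRUPT_STORAGE_PRUNE_RETENTION_WINDOW is required")
        return "soft-deleted"
    if orphan > 0 and prunable == 0:
        # The retention window is optional and has no effect on orphan data.
        return "orphan"
    raise ValueError("batch sizes do not match either documented configuration")


# Illustrative values only; choose batch sizes that suit your deployment.
soft_deleted_env = {
    "INRUPT_STORAGE_PRUNE_RETENTION_WINDOW": "PT720H",
    "INRUPT_STORAGE_PRUNE_PRUNABLE_BATCH_SIZE": "1000",
    "INRUPT_STORAGE_PRUNE_ORPHAN_BATCH_SIZE": "0",
}
orphan_env = {
    "INRUPT_STORAGE_PRUNE_PRUNABLE_BATCH_SIZE": "0",
    "INRUPT_STORAGE_PRUNE_ORPHAN_BATCH_SIZE": "500",
}

print(prune_mode(soft_deleted_env))
print(prune_mode(orphan_env))
```

The sketch does not validate the retention window format or the connection timeout value; it only checks which of the two jobs a given set of values corresponds to.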
Observability
Log messages for Pruning jobs share a consistent pattern in which the messageId has the prefix STORAGEPRUNE:
{
"timestamp":<value>,
"sequence":<value>,
"loggerClassName":<value>,
"loggerName":<value>,
"level":<value>,
"message": "STORAGEPRUNE<number>: <description>",
"threadName":<value>,
"threadId":<value>,
"hostName":<value>,
"processName":<value>,
"processId":<value>,
"messageId": "STORAGEPRUNE<number>"
// additional relevant fields, if any
}
For pruning jobs, the additional fields include:
an mdc (mapped diagnostic context) field that can be used for correlation;
various pruning metrics.
The following lists the pruning metrics that appear in the INFO level log messages (listed by messageId). Each entry gives the field name followed by its description.
STORAGEPRUNE000001 (associated with the pruning start process)
  retentionWindowMilliseconds: The value of the configured retention window.
  prunableBatchSize: The value of the configured prunable batch size.
  orphanBatchSize: The value of the configured orphan batch size.
STORAGEPRUNE000002 (associated with the process of finding prunable objects)
  resourceCount: The number of Solid resource metadata entries found to fall outside the retention window.
  contentCount: The number of Solid resource data entries found to belong to metadata entries that fall outside the retention window.
  durationMilliseconds: Time taken to find prunable resource metadata.
STORAGEPRUNE000005 (associated with finding prunable resource data in persistence)
  resultCount: The number of Solid resource data entries listed from S3.
  durationMilliseconds: Time taken to list resource data.
STORAGEPRUNE000007 (associated with finding prunable orphan data in persistence)
  resultCount: The number of orphan data entries listed from S3.
  durationMilliseconds: Time taken to list orphan data.
STORAGEPRUNE000009 (associated with the pruning/deletion process from persistence)
  resultCount: The number of Solid resource data entries deleted from S3.
  durationMilliseconds: Time taken to delete resource data.
STORAGEPRUNE000010 (associated with the pruning/deletion process from metadata)
  durationMilliseconds: Time taken to delete resource metadata.
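As a rough illustration of how these metrics might be consumed, the Python sketch below sums durationMilliseconds per messageId from a stream of log lines. It assumes that each log entry is a single JSON object on its own line and that the metric fields appear at the top level of that object. Neither assumption is confirmed here, so check the output of your deployment first. The function name `prune_durations` and the sample lines are hypothetical.

```python
import json


def prune_durations(lines):
    """Sum durationMilliseconds per STORAGEPRUNE messageId.

    Lines that are not JSON objects, or whose messageId lacks the
    STORAGEPRUNE prefix, are skipped.
    """
    totals = {}
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore non-JSON output mixed into the log stream
        if not isinstance(record, dict):
            continue
        message_id = record.get("messageId")
        if not isinstance(message_id, str) or not message_id.startswith("STORAGEPRUNE"):
            continue
        duration = record.get("durationMilliseconds")
        if isinstance(duration, (int, float)):
            totals[message_id] = totals.get(message_id, 0) + duration
    return totals


# Made-up sample lines; real entries carry the full set of fields shown earlier.
sample = [
    '{"level": "INFO", "messageId": "STORAGEPRUNE000002", "resourceCount": 12, "contentCount": 12, "durationMilliseconds": 40}',
    '{"level": "INFO", "messageId": "STORAGEPRUNE000009", "resultCount": 12, "durationMilliseconds": 150}',
    '{"level": "INFO", "messageId": "OTHER000001", "durationMilliseconds": 5}',
    "not a JSON line",
]

totals = prune_durations(sample)
print(totals)
```

Grouping by messageId keeps the time spent finding prunable objects separate from the time spent deleting them, which can help when choosing a batch size or a schedule.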