Pruning#

Starting in version 2.1, ESS provides a pruning feature to perform hard deletes of soft-deleted resources and orphan data.

Pruning (Hard Deletes)#

Prune consists of two Kubernetes CronJobs:

  • One to delete “prunable” resources. Prunable resources are resources that have been marked for deletion (i.e., soft-deleted) and are past their INRUPT_STORAGE_PRUNE_RETENTION_WINDOW.

  • One to delete orphan data. Orphan data are resource data without associated metadata.
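The "prunable" condition above can be sketched as a simple predicate. This is a minimal illustration, not the ESS implementation: the function name, the `deleted_at` argument, and the strict "greater than" comparison are assumptions, and the retention window is passed as a `timedelta` because the format of INRUPT_STORAGE_PRUNE_RETENTION_WINDOW is not described here.

```python
from datetime import datetime, timedelta, timezone


def is_prunable(deleted_at, retention_window, now=None):
    """Return True if a soft-deleted resource is past its retention window.

    deleted_at: timezone-aware datetime of the soft delete, or None if the
                resource has not been marked for deletion (hypothetical field).
    retention_window: timedelta standing in for the configured retention window.
    """
    if deleted_at is None:
        # Not soft-deleted, so never prunable.
        return False
    if now is None:
        now = datetime.now(timezone.utc)
    # Assumption: a resource becomes prunable once strictly more than the
    # retention window has elapsed since the soft delete.
    return now - deleted_at > retention_window


if __name__ == "__main__":
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    week = timedelta(days=7)
    print(is_prunable(datetime(2024, 1, 1, tzinfo=timezone.utc), week, now))
    print(is_prunable(datetime(2024, 1, 5, tzinfo=timezone.utc), week, now))
```

A resource that was never soft-deleted is excluded up front, which mirrors the two-part definition: marked for deletion, and past the retention window.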

Important

Pruning operations may negatively affect performance. If possible, schedule the CronJobs to run at times when their impact is minimal. To configure the Prune jobs, see Modify Prune Configuration.

Configuration#

Configuration to Prune Soft-Deleted Resources#

For soft-deleted resources, Prune has the following configurations:

To configure the Prune jobs, see Modify Prune Configuration.

Configuration to Prune Orphan Data#

For orphan data, Prune has the following configurations:

To configure the Prune jobs, see Modify Prune Configuration.

Observability#

As part of the ESS 2.2 Logging Enhancements, log messages for Pruning jobs share a consistent pattern in which the messageId has the prefix STORAGEPRUNE:

{
  "timestamp":<value>,
  "sequence":<value>,
  "loggerClassName":<value>,
  "loggerName":<value>,
  "level":<value>,
  "message": "STORAGEPRUNE<number>: <description>",
  "threadName":<value>,
  "threadId":<value>,
  "hostName":<value>,
  "processName":<value>,
  "processId":<value>,
  "messageId": "STORAGEPRUNE<number>"
  // additional relevant fields, if any
}
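Because every Pruning log message carries a messageId with the STORAGEPRUNE prefix, the messages can be filtered out of a mixed log stream. The following is a minimal sketch that assumes each log entry is emitted as one JSON object per line; the function name and the sample lines are hypothetical.

```python
import json

PRUNE_PREFIX = "STORAGEPRUNE"


def prune_records(lines):
    """Yield parsed log records whose messageId starts with STORAGEPRUNE.

    Assumes one JSON object per line; anything else is skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Not a JSON log line (for example, a startup banner).
            continue
        if not isinstance(record, dict):
            continue
        if str(record.get("messageId", "")).startswith(PRUNE_PREFIX):
            yield record


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        '{"level": "INFO", "message": "STORAGEPRUNE000001: start", '
        '"messageId": "STORAGEPRUNE000001"}',
        "plain text line",
        '{"level": "INFO", "message": "other", "messageId": "OTHER000001"}',
    ]
    for record in prune_records(sample):
        print(record["messageId"], record["message"])
```

Filtering on messageId rather than on the message text avoids depending on the wording of the description.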

For pruning jobs, the additional fields include:

  • an mdc (Mapped Diagnostic Context) field that can be used for correlation;

  • various pruning metrics.

The following pruning metrics appear in the INFO-level log messages, listed by messageId:

  • STORAGEPRUNE000001 (associated with the pruning start process)

    Field                         Description
    retentionWindowMilliseconds   The value of the configured retention window.
    prunableBatchSize             The value of the configured prunable batch size.
    orphanBatchSize               The value of the configured orphan batch size.

  • STORAGEPRUNE000002 (associated with the process of finding prunable objects)

    Field                         Description
    resourceCount                 The number of Solid resource metadata entries found to fall outside the retention window.
    contentCount                  The number of Solid resource data entries found to belong to metadata entries that fall outside the retention window.
    durationMilliseconds          Time taken to find prunable resource metadata.

  • STORAGEPRUNE000005 (associated with finding prunable resource data in persistence)

    Field                         Description
    resultCount                   The number of Solid resource data entries listed from S3.
    durationMilliseconds          Time taken to list resource data.

  • STORAGEPRUNE000007 (associated with finding prunable orphan data in persistence)

    Field                         Description
    resultCount                   The number of orphan data entries listed from S3.
    durationMilliseconds          Time taken to list orphan data.

  • STORAGEPRUNE000009 (associated with the pruning/deletion process from persistence)

    Field                         Description
    resultCount                   The number of Solid resource data entries deleted from S3.
    durationMilliseconds          Time taken to delete resource data.

  • STORAGEPRUNE000010 (associated with the pruning/deletion process from metadata)

    Field                         Description
    durationMilliseconds          Time taken to delete resource metadata.
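The metrics listed above can be aggregated per messageId to see where a pruning run spends its time. The sketch below is illustrative only: it assumes the metric fields appear as top-level keys of each parsed log record with numeric values, and the function name and sample values are hypothetical.

```python
from collections import defaultdict


def summarize_prune_metrics(records):
    """Sum durationMilliseconds and resultCount per STORAGEPRUNE messageId.

    records: iterable of dicts parsed from Pruning log messages.
    Assumption: metric fields are top-level keys; missing fields count as 0.
    """
    summary = defaultdict(
        lambda: {"messages": 0, "durationMilliseconds": 0, "resultCount": 0}
    )
    for record in records:
        message_id = str(record.get("messageId", ""))
        if not message_id.startswith("STORAGEPRUNE"):
            continue
        entry = summary[message_id]
        entry["messages"] += 1
        entry["durationMilliseconds"] += int(record.get("durationMilliseconds", 0))
        entry["resultCount"] += int(record.get("resultCount", 0))
    return dict(summary)


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    records = [
        {"messageId": "STORAGEPRUNE000005", "resultCount": 10, "durationMilliseconds": 120},
        {"messageId": "STORAGEPRUNE000005", "resultCount": 5, "durationMilliseconds": 80},
        {"messageId": "STORAGEPRUNE000010", "durationMilliseconds": 40},
    ]
    for message_id, totals in sorted(summarize_prune_metrics(records).items()):
        print(message_id, totals)
```

Messages that do not report a given field, such as STORAGEPRUNE000010 with no resultCount, simply contribute zero to that total.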