# Pruning

ESS provides a pruning feature to perform hard deletes of [soft-deleted resources](https://docs.inrupt.com/ess/pod-resources#resource-deletion) and [orphan data](https://docs.inrupt.com/ess/pod-resources#modification-of-resource-content).

## Pruning (Hard Deletes)

Prune consists of two [Kubernetes CronJobs](https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/):

* [One](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#pruning-soft-deleted-resources) to delete “prunable” resources. Prunable resources are resources that have been marked for deletion (i.e., soft-deleted) and are past their [**`INRUPT_STORAGE_PRUNE_RETENTION_WINDOW`**](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#inrupt_storage_prune_retention_window).
* [One](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#pruning-orphan-data) to delete orphan data. Orphan data are resource data without associated metadata.

{% hint style="warning" %}
**Important**\
Pruning operations may negatively affect performance. If possible, schedule the CronJobs to run at times when their impact is minimal. To configure the Prune jobs, see [Modify Prune Configuration](https://docs.inrupt.com/ess/2.5/installation/customize-configurations/customization-pod-maintenance/modify-prune).
{% endhint %}

## Configuration

### Configuration to Prune Soft-Deleted Resources

For [soft-deleted resources](https://docs.inrupt.com/ess/pod-resources#resource-deletion), Prune has the following configurations:

* [**`INRUPT_STORAGE_PRUNE_RETENTION_WINDOW`**](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#inrupt_storage_prune_retention_window)
  * Required.
  * Determines which soft-deleted resources are “prunable”.
  * Specify the value in a format supported by the Java [Duration.parse()](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/time/Duration.html#parse\(java.lang.CharSequence\)) method (for example, `P30D` for 30 days or `PT48H` for 48 hours).
* [**`INRUPT_STORAGE_PRUNE_PRUNABLE_BATCH_SIZE`**](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#inrupt_storage_prune_prunable_batch_size)
  * Required.
  * Limits the number of results returned when querying the metadata.
  * Set to an integer value.
* [**`INRUPT_STORAGE_PRUNE_ORPHAN_BATCH_SIZE`**](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#inrupt_storage_prune_orphan_batch_size)
  * Required.
  * Set to **`0`** when pruning soft-deleted resources.
* [**`COM_INRUPT_STORAGE_METADATA_JDBC_CONNECTIONLIMITER_OPENCONNECTION_TIMEOUT_VALUE`**](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#com_inrupt_storage_metadata_jdbc_connectionlimiter_openconnection_timeout_value)
  * Required.
  * Determines how long to keep the connection to the metadata database open.
  * Set to an integer value. Adjust the value to accommodate changes in:
    * [**`INRUPT_STORAGE_PRUNE_RETENTION_WINDOW`**](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#inrupt_storage_prune_retention_window)
    * [**`INRUPT_STORAGE_PRUNE_PRUNABLE_BATCH_SIZE`**](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#inrupt_storage_prune_prunable_batch_size)

To configure the Prune jobs, see [Modify Prune Configuration](https://docs.inrupt.com/ess/2.5/installation/customize-configurations/customization-pod-maintenance/modify-prune).
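
The retention window is given as an ISO-8601 duration string, while the logs report it in milliseconds. As a rough aid for sanity-checking a candidate value, the following Python sketch converts a simplified subset of those strings (`PnDTnHnMnS` with non-negative integers) to milliseconds. It is not the Java parser: `duration_to_millis` is a hypothetical helper, and signs and fractional seconds, which `Duration.parse()` accepts, are deliberately not handled.

```python
import re

# Simplified subset of the forms accepted by java.time.Duration.parse():
# days, hours, minutes, and whole seconds, all non-negative integers.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$",
    re.IGNORECASE,
)


def duration_to_millis(text: str) -> int:
    """Convert a duration such as 'P30D' or 'PT48H' to milliseconds."""
    match = _DURATION.match(text)
    # Reject strings with no component at all, such as 'P' or 'PT'.
    if not match or not any(match.groupdict().values()):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {name: int(value or 0) for name, value in match.groupdict().items()}
    total_seconds = (
        parts["days"] * 86_400
        + parts["hours"] * 3_600
        + parts["minutes"] * 60
        + parts["seconds"]
    )
    return total_seconds * 1_000
```

For example, `duration_to_millis("P30D")` returns `2592000000`, which is the figure to expect in the `retentionWindowMilliseconds` log field for a 30-day window.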

### Configuration to Prune Orphan Data

For [orphan data](https://docs.inrupt.com/ess/pod-resources#modification-of-resource-content), Prune has the following configurations:

* [**`INRUPT_STORAGE_PRUNE_RETENTION_WINDOW`**](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#inrupt_storage_prune_retention_window)
  * Optional.
  * No impact on pruning orphan data.
* [**`INRUPT_STORAGE_PRUNE_PRUNABLE_BATCH_SIZE`**](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#inrupt_storage_prune_prunable_batch_size)
  * Required.
  * Set to **`0`** when pruning orphan data.
* [**`INRUPT_STORAGE_PRUNE_ORPHAN_BATCH_SIZE`**](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#inrupt_storage_prune_orphan_batch_size)
  * Required.
  * Determines the maximum number of data identifiers selected by Prune during [orphan data](https://docs.inrupt.com/ess/pod-resources#modification-of-resource-content) cleanup.
  * Set to an integer value.
* [**`COM_INRUPT_STORAGE_METADATA_JDBC_CONNECTIONLIMITER_OPENCONNECTION_TIMEOUT_VALUE`**](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#com_inrupt_storage_metadata_jdbc_connectionlimiter_openconnection_timeout_value)
  * Required.
  * Determines how long to keep the connection to the metadata database open.
  * Adjust the value to accommodate changes to [**`INRUPT_STORAGE_PRUNE_ORPHAN_BATCH_SIZE`**](https://docs.inrupt.com/ess/services/service-pod-management/service-pod-storage#inrupt_storage_prune_orphan_batch_size).

To configure the Prune jobs, see [Modify Prune Configuration](https://docs.inrupt.com/ess/2.5/installation/customize-configurations/customization-pod-maintenance/modify-prune).
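
The two jobs use the same variables with mirrored batch-size settings, which is easy to get wrong when editing both CronJobs. The sketch below checks a set of environment values against the rules listed above. `check_prune_env` and the job labels `"soft-deleted"` and `"orphan"` are hypothetical names used only for illustration, and treating the active batch size as a positive integer is an assumption (the text only says "an integer value").

```python
RETENTION = "INRUPT_STORAGE_PRUNE_RETENTION_WINDOW"
PRUNABLE = "INRUPT_STORAGE_PRUNE_PRUNABLE_BATCH_SIZE"
ORPHAN = "INRUPT_STORAGE_PRUNE_ORPHAN_BATCH_SIZE"
TIMEOUT = (
    "COM_INRUPT_STORAGE_METADATA_JDBC_CONNECTIONLIMITER_OPENCONNECTION_TIMEOUT_VALUE"
)


def _as_int(value):
    """Return value as an int, or None if it is not an integer string."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def check_prune_env(env: dict, job: str) -> list:
    """Return a list of problems; an empty list means the values look consistent."""
    if job not in ("soft-deleted", "orphan"):
        raise ValueError(f"unknown job label: {job!r}")

    # The retention window is required only for the soft-deleted job.
    required = [PRUNABLE, ORPHAN, TIMEOUT]
    if job == "soft-deleted":
        required.append(RETENTION)
    problems = [f"{name} is required" for name in required if name not in env]

    # Each job uses one batch size and sets the other one to 0.
    active, inactive = (PRUNABLE, ORPHAN) if job == "soft-deleted" else (ORPHAN, PRUNABLE)
    if inactive in env and _as_int(env[inactive]) != 0:
        problems.append(f"{inactive} should be 0 for the {job} job")
    if active in env:
        size = _as_int(env[active])
        if size is None or size <= 0:  # assumption: a useful batch size is positive
            problems.append(f"{active} should be a positive integer")
    if TIMEOUT in env and _as_int(env[TIMEOUT]) is None:
        problems.append(f"{TIMEOUT} should be an integer")
    return problems
```

Running the check once per CronJob, with that job's own environment, catches the common mistake of copying one job's batch sizes into the other.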

## Observability

{% tabs %}
{% tab title="Default Logging" %}
{% hint style="warning" %}
Log messages for Pruning jobs share a consistent pattern: the **`messageId`** has the prefix **`STORAGEPRUNE`**.
{% endhint %}

```json
{
  "timestamp":<value>,
  "sequence":<value>,
  "loggerClassName":<value>,
  "loggerName":<value>,
  "level":<value>,
  "message": "STORAGEPRUNE<number>: <description>",
  "threadName":<value>,
  "threadId":<value>,
  "hostName":<value>,
  "processName":<value>,
  "processId":<value>,
  "messageId": "STORAGEPRUNE<number>"
  // additional relevant fields, if any
}
```

For pruning jobs, the additional fields include:

* an **`mdc`** (managed diagnostic context) field that can be used for correlation;
* various pruning metrics.
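
Because every Prune log record carries a **`messageId`** with the **`STORAGEPRUNE`** prefix, the records can be picked out of a mixed log stream with a simple filter. The following Python sketch assumes one JSON object per line and assumes that the metric fields sit at the top level of the record. `prune_events` is a hypothetical helper, not part of ESS.

```python
import json


def prune_events(lines):
    """Yield (messageId, record) for JSON log lines emitted by the Prune jobs."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(record, dict):
            continue  # skip JSON values that are not objects
        message_id = record.get("messageId", "")
        if isinstance(message_id, str) and message_id.startswith("STORAGEPRUNE"):
            yield message_id, record
```

A caller could then group the yielded records by `messageId` and read fields such as `durationMilliseconds` to track how long each pruning phase takes from run to run.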

The following lists the pruning metrics that appear in the **`INFO`** level log messages (listed by **`messageId`**):

* **`STORAGEPRUNE000001`** (associated with the pruning start process)

  <table><thead><tr><th width="273.3984375">Field</th><th>Description</th></tr></thead><tbody><tr><td><strong><code>retentionWindowMilliseconds</code></strong></td><td>The value of the <a href="../../services/service-pod-management/service-pod-storage#inrupt_storage_prune_retention_window"><strong><code>configured retention window</code></strong></a>.</td></tr><tr><td><strong><code>prunableBatchSize</code></strong></td><td>The value of the <a href="../../services/service-pod-management/service-pod-storage#inrupt_storage_prune_prunable_batch_size"><strong><code>configured prunable batch size</code></strong></a>.</td></tr><tr><td><strong><code>orphanBatchSize</code></strong></td><td>The value of the <a href="../../services/service-pod-management/service-pod-storage#inrupt_storage_prune_orphan_batch_size"><strong><code>configured orphan batch size</code></strong></a>.</td></tr></tbody></table>
* **`STORAGEPRUNE000002`** (associated with the process of finding prunable objects)

  <table data-header-hidden><thead><tr><th width="217.624267578125"></th><th></th></tr></thead><tbody><tr><td><strong><code>resourceCount</code></strong></td><td>The number of Solid resource metadata entries found to fall outside the retention window.</td></tr><tr><td><strong><code>contentCount</code></strong></td><td>The number of Solid resource data entries found to belong to metadata entries that fall outside the retention window.</td></tr><tr><td><strong><code>durationMilliseconds</code></strong></td><td>Time taken to find prunable resource metadata.</td></tr></tbody></table>
* **`STORAGEPRUNE000005`** (associated with finding prunable resource data in persistence)

  <table data-header-hidden><thead><tr><th width="205.4140625"></th><th></th></tr></thead><tbody><tr><td><strong><code>resultCount</code></strong></td><td>The number of Solid resource data entries listed from S3.</td></tr><tr><td><strong><code>durationMilliseconds</code></strong></td><td>Time taken to list resource data.</td></tr></tbody></table>
* **`STORAGEPRUNE000007`** (associated with finding prunable orphan data in persistence)

  <table data-header-hidden><thead><tr><th width="205.734375"></th><th></th></tr></thead><tbody><tr><td><strong><code>resultCount</code></strong></td><td>The number of orphan data entries listed from S3.</td></tr><tr><td><strong><code>durationMilliseconds</code></strong></td><td>Time taken to list orphan data.</td></tr></tbody></table>
* **`STORAGEPRUNE000009`** (associated with the pruning/deletion process from persistence)

  <table data-header-hidden><thead><tr><th width="207.3203125"></th><th></th></tr></thead><tbody><tr><td><strong><code>resultCount</code></strong></td><td>The number of Solid resource data entries deleted from S3.</td></tr><tr><td><strong><code>durationMilliseconds</code></strong></td><td>Time taken to delete resource data.</td></tr></tbody></table>
* **`STORAGEPRUNE000010`** (associated with the pruning/deletion process from metadata)

  <table data-header-hidden><thead><tr><th width="211.8359375"></th><th></th></tr></thead><tbody><tr><td><strong><code>durationMilliseconds</code></strong></td><td>Time taken to delete resource metadata.</td></tr></tbody></table>

{% endtab %}

{% tab title="Prometheus" %}
Prune emits Prometheus metrics with the following labeled names.

All of the following are prefixed with **`application_com_inrupt_storage_prune_`**.

<table data-header-hidden><thead><tr><th width="345.00390625"></th><th></th></tr></thead><tbody><tr><td><strong><code>{type="retentionWindow"}</code></strong></td><td>The value of the <a href="../../services/service-pod-management/service-pod-storage#inrupt_storage_prune_retention_window"><strong><code>configured retention window</code></strong></a>.</td></tr><tr><td><strong><code>{type="prunableBatchSize"}</code></strong></td><td>The value of the <a href="../../services/service-pod-management/service-pod-storage#inrupt_storage_prune_prunable_batch_size"><strong><code>configured prunable batch size</code></strong></a>.</td></tr><tr><td><strong><code>{type="orphanBatchSize"}</code></strong></td><td>The value of the <a href="../../services/service-pod-management/service-pod-storage#inrupt_storage_prune_orphan_batch_size"><strong><code>configured orphan batch size</code></strong></a>.</td></tr><tr><td><strong><code>{type="resource", status="prunable"}</code></strong></td><td>The number of Solid resource metadata entries found to fall outside the retention window.</td></tr><tr><td><strong><code>{type="data", status="prunable"}</code></strong></td><td>The number of Solid resource data entries found to belong to metadata entries that fall outside the retention window.</td></tr><tr><td><strong><code>{type="data", status="listed"}</code></strong></td><td>The number of Solid resource data entries listed from S3.</td></tr><tr><td><strong><code>{type="data", status="orphan"}</code></strong></td><td>The number of Solid resource data entries (out of the total listed) found to lack corresponding listed metadata entries.</td></tr><tr><td><strong><code>{type="data", status="deleted"}</code></strong></td><td>The number of Solid resource data entries deleted from S3.</td></tr><tr><td><strong><code>Pruner_findPrunable</code></strong></td><td>Time (in milliseconds) taken to find prunable resource metadata (i.e., metadata associated with soft-deleted 
resources).</td></tr><tr><td><strong><code>Pruner_listData</code></strong></td><td>Time (in milliseconds) taken to list resource data.</td></tr><tr><td><strong><code>Pruner_findOrphans</code></strong></td><td>Time (in milliseconds) taken to identify orphan resource data.</td></tr><tr><td><strong><code>Pruner_deleteData</code></strong></td><td>Time (in milliseconds) taken to delete resource data.</td></tr><tr><td><strong><code>Pruner_pruneMetadata</code></strong></td><td>Time (in milliseconds) taken to delete resource metadata.</td></tr></tbody></table>
{% endtab %}

{% tab title="OpenTelemetry" %}
When OpenTelemetry is configured, the application emits a single span named **`prune`** with the following attributes.

<table data-header-hidden><thead><tr><th width="258.303955078125"></th><th></th></tr></thead><tbody><tr><td><strong><code>retentionWindowMilliseconds</code></strong></td><td>The value of the <a href="../../services/service-pod-management/service-pod-storage#inrupt_storage_prune_retention_window"><strong><code>configured retention window</code></strong></a>.</td></tr><tr><td><strong><code>prunableBatchSize</code></strong></td><td>The value of the <a href="../../services/service-pod-management/service-pod-storage#inrupt_storage_prune_prunable_batch_size"><strong><code>configured prunable batch size</code></strong></a>.</td></tr><tr><td><strong><code>orphanBatchSize</code></strong></td><td>The value of the <a href="../../services/service-pod-management/service-pod-storage#inrupt_storage_prune_orphan_batch_size"><strong><code>configured orphan batch size</code></strong></a>.</td></tr><tr><td><strong><code>resourceCount</code></strong></td><td>The number of Solid resource metadata entries found to fall outside the retention window.</td></tr><tr><td><strong><code>dataCount</code></strong></td><td>The number of Solid resource data entries found to belong to metadata entries that fall outside the retention window.</td></tr></tbody></table>
{% endtab %}
{% endtabs %}
