Optimize performance in buckets with hierarchical namespace enabled

This page provides guidance on how you can optimize performance in buckets with hierarchical namespace enabled.

Listing objects

The following are the performance considerations for listing objects:

  • In buckets with hierarchical namespace enabled, listing all objects in the entire bucket or under a prefix is resource-intensive because the operation must traverse each folder and subfolder, similar to a recursive listing (ls -R) in a file system. Consequently, the more folders your bucket contains, the slower object listing becomes. A large number of empty folders also degrades object listing performance. To avoid this slowdown, we recommend that you maximize the number of objects in each folder and regularly delete empty folders.
  • Listing or retrieving the objects and subfolders within a specific folder by using a delimiter and a specific prefix is more efficient in buckets with hierarchical namespace enabled, because the objects are organized within a folder structure. To optimize listing performance when using a delimiter and a specific prefix, set the includeFoldersAsPrefixes parameter to true. Otherwise, Cloud Storage performs additional checks to exclude empty folders, which can slow down the operation. For more information about using the includeFoldersAsPrefixes parameter when listing objects, see Listing objects.
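
As a rough illustration of the second point, the following Python sketch builds a JSON API objects.list request URL that lists only the direct children of one folder, with a "/" delimiter, a folder prefix, and includeFoldersAsPrefixes enabled. The helper name build_folder_listing_url, the bucket name, and the prefix are hypothetical, and the API host is assumed to be the standard Cloud Storage endpoint, which might differ in your environment. The sketch only constructs the URL and doesn't send an authenticated request.

```python
from urllib.parse import quote, urlencode

# Assumption: the standard Cloud Storage JSON API root. Your environment
# may use a different host.
API_ROOT = "https://storage.googleapis.com/storage/v1"


def build_folder_listing_url(bucket, folder_prefix, api_root=API_ROOT):
    """Return an objects.list URL for the direct children of one folder.

    Hypothetical helper for illustration only.
    """
    # A folder prefix ends with "/" so that only that folder's contents match.
    if folder_prefix and not folder_prefix.endswith("/"):
        folder_prefix += "/"
    params = {
        "prefix": folder_prefix,
        "delimiter": "/",
        # Return folders (including empty ones) as prefixes.
        "includeFoldersAsPrefixes": "true",
    }
    return f"{api_root}/b/{quote(bucket, safe='')}/o?{urlencode(params)}"


# Hypothetical bucket and folder names.
url = build_folder_listing_url("my-bucket", "logs/2024")
print(url)
```

Listing one folder level at a time in this way avoids the recursive traversal that a bucket-wide or prefix-only listing requires.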

Folder management

For efficient folder management, we recommend the following:

  • Pre-create the folder structure: Instead of relying on automatic folder creation during object upload, rewrite, and compose operations, use the create folder operation to build your intended folder structure in advance. Pre-creating the folder structure makes performance more consistent and predictable.
  • Maximize the objects-per-folder ratio: Aim for a high object-to-folder ratio, because it reduces the overhead associated with folder creation and management.
  • Limit folder creation and deletion requests: Creating or deleting a folder is more resource-intensive than working with an individual object because of the folder's place in the hierarchy. To keep performance smooth, Cloud Storage supports up to 1000 folder creation or deletion requests per second for each bucket. Requests that exceed this rate are not explicitly rejected, but whether they succeed depends on resource availability.
  • Regularly delete empty folders: Empty folders can accumulate, especially when you use Object Lifecycle Management or delete objects without explicitly deleting their parent folders. The accumulated folders can degrade the performance of object listing and other folder-related operations. You can use the following methods to delete empty folders:
    • When you use Cloud Storage FUSE or the Cloud Storage connector to interact with a bucket that has hierarchical namespace enabled, deleting a directory deletes the corresponding folder in your bucket.
    • When you use the Trusted Cloud console or the Google Cloud CLI, you can use a recursive delete to delete folders automatically.
    • You can use the delete_empty_folders.py script to periodically delete empty folders using parallel processing. The script provides an option to target a specific folder path prefix, which allows the script to perform folder deletions on a subset of the bucket's directory structure. Additionally, the script deletes all empty folders (created implicitly or explicitly) including managed folders and their associated IAM policies. For details about how to use the script, see the README on GitHub.
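
The planning side of these recommendations can be sketched in plain Python. The following example is not the delete_empty_folders.py script and doesn't call Cloud Storage. It assumes that you already have lists of folder paths and object names, for example from earlier listing calls. The function names are hypothetical. The first helper derives the folders to pre-create from planned object names, parents first. The second identifies folders that contain no objects at any depth and orders them deepest first, so that child folders are handled before their parents.

```python
def ancestor_folders(name):
    """Yield every folder path that contains `name`, shallowest first."""
    parts = name.rstrip("/").split("/")[:-1]
    for depth in range(1, len(parts) + 1):
        yield "/".join(parts[:depth]) + "/"


def folders_to_precreate(object_names):
    """Return the folders implied by planned object names, parents first."""
    folders = set()
    for name in object_names:
        folders.update(ancestor_folders(name))
    # Sort by depth so that a parent is always created before its children.
    return sorted(folders, key=lambda f: (f.count("/"), f))


def empty_folders(folder_paths, object_names):
    """Return folders with no objects beneath them, deepest first."""
    occupied = set()
    for name in object_names:
        occupied.update(ancestor_folders(name))
    empty = [f for f in folder_paths if f not in occupied]
    # Deepest first, so that children come before their parents.
    return sorted(empty, key=lambda f: (-f.count("/"), f))


# Hypothetical object and folder names.
planned = ["data/2024/a.csv", "data/2024/b.csv", "data/2025/c.csv"]
to_create = folders_to_precreate(planned)

existing_folders = ["data/", "data/2024/", "data/tmp/", "data/tmp/old/"]
existing_objects = ["data/2024/a.csv"]
to_delete = empty_folders(existing_folders, existing_objects)

print(to_create)
print(to_delete)
```

If you act on these lists with create folder or delete folder requests, pace the requests so that they stay within the per-bucket rate described earlier.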

What's next