This page provides guidance on how you can optimize performance in buckets with hierarchical namespace enabled.
Listing objects
The following are the performance considerations for listing objects:
- In buckets with hierarchical namespace enabled, listing all objects in the entire bucket or under a prefix is resource-intensive because the operation must traverse each folder and subfolder, similar to the ls -R command in a file system. Consequently, the more folders your bucket contains, the slower object listing becomes. A large number of empty folders can also degrade object listing performance. To avoid this, we recommend that you maximize the number of objects in each folder and regularly delete empty folders.
- Listing or retrieving objects and subfolders within a specific folder using a delimiter and a specific prefix is more efficient in buckets with hierarchical namespace enabled because the objects are organized within a folder structure. To optimize listing performance when using a delimiter and a specific prefix, set the includeFoldersAsPrefixes parameter. Otherwise, Cloud Storage performs additional checks to exclude empty folders, which can slow down the operation. For more information about using the includeFoldersAsPrefixes parameter when listing objects, see Listing objects.
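As a rough illustration of the second point, the following sketch builds a JSON API objects.list request URL that lists a single folder level with a delimiter, a prefix, and includeFoldersAsPrefixes. The endpoint host, the helper name build_list_url, and the bucket and folder names are assumptions made for this example. Your environment might use a different endpoint, and the request still needs to be sent with valid credentials.

```python
from urllib.parse import quote, urlencode

# Assumption: the public Cloud Storage JSON API root. Adjust for your environment.
API_ROOT = "https://storage.googleapis.com/storage/v1"


def build_list_url(bucket, folder="", include_folders_as_prefixes=True):
    """Build an objects.list URL that lists one folder level (hypothetical helper)."""
    # The "/" delimiter limits the listing to a single level of the hierarchy.
    params = {"delimiter": "/"}
    if folder:
        # End the prefix with "/" so it matches the folder boundary exactly.
        params["prefix"] = folder.rstrip("/") + "/"
    if include_folders_as_prefixes:
        # Return folders as prefixes instead of filtering out empty ones.
        params["includeFoldersAsPrefixes"] = "true"
    return f"{API_ROOT}/b/{quote(bucket, safe='')}/o?{urlencode(params)}"


# Placeholder bucket and folder names.
url = build_list_url("my-bucket", "logs/2024")
print(url)
```

Passing an empty folder name lists the top level of the bucket, because no prefix is added to the request.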
Folder management
For efficient folder management, we recommend the following:
- Pre-create folder structure: Instead of relying on automatic folder creation during object upload, rewrite, and compose operations, use the create folder operation to build your intended folder structure in advance. Pre-creating the folder structure improves performance consistency and predictability.
- Maximize objects per folder ratio: Aim for a high objects-to-folder ratio, which reduces the overhead associated with folder creation and management.
- Limit folder creation and deletion requests: Because of the hierarchical structure, creating or deleting folders is more resource-intensive than working with individual objects. To ensure smooth performance, Cloud Storage limits these operations to 1000 requests per second for each bucket. Requests exceeding this limit are not explicitly restricted, but resource availability determines whether they can be processed successfully.
- Regularly delete empty folders: Empty folders can accumulate, especially when using Object Lifecycle Management or deleting objects without explicitly deleting their parent folders. The accumulated folders can degrade the performance of object listing operations and other folder-related operations. You can use the following methods to delete empty folders:
  - When you use Cloud Storage FUSE or the Cloud Storage connector to interact with a bucket with hierarchical namespace enabled, deleting a directory deletes the corresponding folder in your bucket.
  - You can use a recursive delete to delete folders automatically when using the Trusted Cloud console or Google Cloud CLI.
  - You can use the delete_empty_folders.py script to periodically delete empty folders using parallel processing. The script provides an option to target a specific folder path prefix, which allows the script to perform folder deletions on a subset of the bucket's directory structure. Additionally, the script deletes all empty folders (created implicitly or explicitly), including managed folders and their associated IAM policies. For details about how to use the script, see the README on GitHub.
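To make the empty-folder cleanup more concrete, the following minimal sketch shows one way to work out which folders are empty from a list of folder paths and a list of object names. It is not the delete_empty_folders.py script and makes no Cloud Storage calls. The function name find_empty_folders and the sample paths are hypothetical. The sketch assumes that a folder containing only empty subfolders also counts as empty.

```python
def find_empty_folders(folders, objects):
    """Return folders that hold no objects at any depth, deepest first."""
    # Normalize every folder path so that it ends with "/".
    normalized = {f if f.endswith("/") else f + "/" for f in folders}

    non_empty = set()
    for name in objects:
        # Every ancestor folder of an object is non-empty.
        parts = name.split("/")[:-1]
        for depth in range(1, len(parts) + 1):
            non_empty.add("/".join(parts[:depth]) + "/")

    empty = normalized - non_empty
    # Sort deepest first so that child folders come before their parents.
    return sorted(empty, key=lambda path: (-path.count("/"), path))


# Sample data standing in for the results of folder and object listings.
folders = ["a/", "a/b/", "a/b/c/", "d/", "d/e/"]
objects = ["a/file.txt", "d/e/data.csv"]
print(find_empty_folders(folders, objects))  # ['a/b/c/', 'a/b/']
```

Returning the deepest folders first means that each folder is already childless by the time it is deleted, which suits a deletion step that only removes empty folders.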