This blog is a Japanese translation of a post written by Mark Kalus (Principal Product Manager, Amazon S3) and Lee Kear (Storage Specialist Solutions Architect focusing on S3). Click here for the original.
If your Amazon S3 buckets are growing and spread across tens or hundreds of accounts, you may be looking for a tool to manage that growing storage and improve cost efficiency. S3 Storage Lens is an analytics feature built into the S3 console that helps you visualize object storage usage and activity trends across your organization and identify opportunities to cut costs. S3 Storage Lens is free to use for all S3 accounts, and you can upgrade to advanced metrics to enable additional metrics and insights and a longer data-retention period.
This blog should give you a basic understanding of how to use S3 Storage Lens to identify typical cost-savings opportunities and apply changes to realize those savings.
Identify large buckets you didn't know about
The first step in managing storage costs is to take a closer look at usage by bucket. S3 Storage Lens gives you a centralized view of all the buckets in your account, and you can also configure a dashboard at the AWS Organizations level to see the buckets across all of your accounts. Because S3 Storage Lens makes it easy to visualize every bucket, you may discover something unexpected, such as a bucket holding far more objects than you thought.
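If you would rather set up that organization-wide dashboard programmatically than in the console, the sketch below shows roughly what that looks like with boto3's S3 Control API. The account ID, organization ARN, and configuration name are placeholders, and an organization-level configuration like this has to be created from the management account or a delegated administrator account, so treat it as a starting point rather than a drop-in script.

```python
import boto3

s3control = boto3.client("s3control")

# Placeholder identifiers -- replace with your own values.
account_id = "111122223333"
org_arn = "arn:aws:organizations::111122223333:organization/o-exampleorgid"

s3control.put_storage_lens_configuration(
    ConfigId="org-wide-dashboard",
    AccountId=account_id,
    StorageLensConfiguration={
        "Id": "org-wide-dashboard",
        "IsEnabled": True,
        # Bucket-level metrics are part of the free tier; the ActivityMetrics
        # flags below assume you have opted in to the paid advanced metrics.
        "AccountLevel": {
            "ActivityMetrics": {"IsEnabled": True},
            "BucketLevel": {"ActivityMetrics": {"IsEnabled": True}},
        },
        # Aggregate metrics across every account in the AWS Organization.
        "AwsOrg": {"Arn": org_arn},
    },
)
```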
One way to discover unexpectedly large buckets is to scroll down to the Top buckets section of the Overview tab of the S3 Storage Lens dashboard, which ranks your highest-capacity buckets. The dashboard sorts buckets by the Total storage metric for the selected date, as shown in the screenshot below.
You can adjust the number of buckets displayed (up to 25), switch the sort order to surface the smallest buckets instead, or re-rank by any of the 29+ other metrics. The view also shows a small graph of the 14-day trend (30-day trend if you have upgraded to advanced metrics), as well as the change from the previous day or the previous week.
From here, you can move to the Buckets tab of the dashboard for more detail on each bucket. For example, in the following figure you can see that one bucket has grown far more than the others, from 10 GB to 15 GB in just 30 days.
If you find a bucket that is growing faster than the rest, you can use S3 Storage Lens to dig into it and gather more detail, such as the average object size and the largest prefixes. Finally, you can navigate to the bucket in the Amazon S3 console to understand the associated workload and identify the bucket owner from the account number. You can then ask the owner whether this growth is expected and make sure the bucket is under proper monitoring and control.
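S3 Storage Lens itself is a console dashboard (with optional exports), but if you want a quick, scriptable cross-check of which buckets in a single account are largest, you can rank them with the daily storage metrics that S3 publishes to CloudWatch. The sketch below only counts bytes in the S3 Standard storage class and only sees buckets in the Region your CloudWatch client points at, so treat it as a rough complement to the dashboard rather than a replacement.

```python
from datetime import datetime, timedelta

import boto3

s3 = boto3.client("s3")
cloudwatch = boto3.client("cloudwatch")


def standard_storage_bytes(bucket_name):
    """Return the most recent daily BucketSizeBytes datapoint (S3 Standard only)."""
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/S3",
        MetricName="BucketSizeBytes",
        Dimensions=[
            {"Name": "BucketName", "Value": bucket_name},
            {"Name": "StorageType", "Value": "StandardStorage"},
        ],
        StartTime=datetime.utcnow() - timedelta(days=2),
        EndTime=datetime.utcnow(),
        Period=86400,
        Statistics=["Average"],
    )
    datapoints = response["Datapoints"]
    return max(p["Average"] for p in datapoints) if datapoints else 0.0


# Rank every bucket in the account by its S3 Standard footprint.
buckets = s3.list_buckets()["Buckets"]
ranked = sorted(
    ((b["Name"], standard_storage_bytes(b["Name"])) for b in buckets),
    key=lambda item: item[1],
    reverse=True,
)
for name, size in ranked[:25]:  # mirror the dashboard's top-25 view
    print(f"{name}: {size / 1e9:.2f} GB")
```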
Find and eliminate incomplete multipart upload bytes
Multipart upload is useful when uploading very large objects (100 MB or more): uploading an object in multiple parts can improve throughput and make it easier to recover from network issues. However, if the multipart upload process is never completed, the incomplete parts remain in the bucket, are unusable, and continue to incur storage costs until you either complete the upload or take explicit action to delete them.
With S3 Storage Lens, you can see the number of incomplete multipart upload bytes in your account or across your entire AWS Organization. This metric appears (as a percentage of total bytes) under the Cost efficiency tab of the snapshot section at the top of the Overview tab (see the figure below).
You can also select incomplete multipart upload bytes as a metric in other charts on the S3 Storage Lens dashboard. By doing so, you can further assess the impact of incomplete multipart upload bytes on storage. For example, you can assess the impact on the overall growth trend, or identify a particular bucket where incomplete multipart upload bytes are accumulating.
You can then take action by creating a lifecycle rule that aborts incomplete multipart uploads, clearing those unused bytes from the bucket after a specified number of days; a sketch follows below.
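For example, a minimal boto3 sketch of such a rule might look like the following. The bucket name and the seven-day window are placeholders, and put_bucket_lifecycle_configuration replaces any lifecycle configuration already on the bucket, so merge in existing rules before running something like this.

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-example-bucket"  # placeholder; use the bucket identified in Storage Lens

# Caution: this call replaces the bucket's existing lifecycle configuration.
s3.put_bucket_lifecycle_configuration(
    Bucket=bucket,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "abort-incomplete-multipart-uploads",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix = the whole bucket
                # Clean up parts from uploads that were never completed.
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            }
        ]
    },
)
```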
Expand your use of S3 storage classes
One way to reduce storage costs is to use S3 storage classes to match storage cost to access frequency and required performance. Amazon S3 currently offers seven storage classes covering a wide range of access patterns at corresponding price points.
If you're not sure how you're currently using S3 storage classes, S3 Storage Lens makes it easy to find out. From the Overview tab, scroll down to the Storage class distribution chart, which looks like this:
If all or nearly all of your storage is in the S3 Standard storage class, you are not taking full advantage of S3 storage classes. In that case, two cost-optimization design patterns are available. First, you can automate cost optimization by choosing the S3 Intelligent-Tiering storage class, which is well suited to unknown or changing access patterns. Second, you can set an S3 lifecycle policy to reduce storage costs by transitioning infrequently accessed data to a more cost-effective storage class over time; a sketch of the lifecycle approach follows below. Check the S3 pricing page for the specific savings, and be aware of per-object transition charges and the additional charges that apply to S3 Glacier.
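As an illustration of the second pattern, here is a minimal boto3 lifecycle sketch that tiers data down over time. The bucket name, prefix, and day thresholds are placeholders to adapt to your own access patterns, and, as with the earlier example, this call replaces any lifecycle configuration already on the bucket.

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-example-bucket"  # placeholder

s3.put_bucket_lifecycle_configuration(
    Bucket=bucket,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-down-infrequently-accessed-data",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},  # placeholder prefix
                "Transitions": [
                    # Move objects to S3 Standard-IA once they are 30 days old...
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    # ...and on to S3 Glacier at 90 days.
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
            }
        ]
    },
)
```

For the first pattern, a single transition to the INTELLIGENT_TIERING storage class (or uploading new objects directly to it) plays the same role.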
You can then continue your analysis with S3 Storage Lens to take a deeper look at storage class usage and drill into the distribution of storage classes in specific Regions and buckets (and prefixes, if you have upgraded to the advanced tier). Often there is a subset of buckets that is not optimally configured for storage class utilization, and S3 Storage Lens is a useful tool for screening those buckets before moving on to the next action.
Reduce the number of noncurrent versions you retain
Amazon S3 versioning lets you keep multiple versions of the same object, which you can use to quickly recover data that is accidentally deleted or overwritten. Versioning can have a cost impact, however, if you accumulate a large number of noncurrent versions without the lifecycle policies needed to manage them.
To see whether you are accumulating noncurrent versions, open S3 Storage Lens and go to the Cost efficiency tab in the snapshot section of the Overview tab. There you'll find a metric called % noncurrent version bytes, which shows the percentage of total bytes (within the scope of the dashboard) for the selected date that is attributable to noncurrent versions.
As a rule of thumb, if % noncurrent version bytes exceeds 10% of your storage at the account level, it may indicate that you are storing too many versions. This metric is also available in the Top buckets section, which, as shown here, lets you identify specific buckets that are accumulating large numbers of noncurrent versions.
Once you have identified buckets that need attention, you can navigate to them in the S3 console and enable lifecycle policies that expire noncurrent versions after a specified number of days. You can also reduce costs while preserving the data by transitioning noncurrent versions to S3 Glacier; a sketch of both options follows below.
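A minimal boto3 sketch combining both options, with a placeholder bucket name and day counts, might look like this (again, the call replaces any existing lifecycle configuration on the bucket):

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-example-bucket"  # placeholder

s3.put_bucket_lifecycle_configuration(
    Bucket=bucket,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "manage-noncurrent-versions",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # whole bucket
                # Move noncurrent versions to S3 Glacier after 30 days...
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": 30, "StorageClass": "GLACIER"}
                ],
                # ...and permanently delete them after a year.
                "NoncurrentVersionExpiration": {"NoncurrentDays": 365},
            }
        ]
    },
)
```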
Unearth cold buckets
A bucket goes "cold" when the storage in it is no longer accessed (or is rarely accessed), which often indicates that the associated workload is no longer in use. If you have advanced metrics enabled in S3 Storage Lens, you can use activity metrics to see how hot (or cold) each bucket is. Metrics such as GET requests and download bytes indicate how often a bucket is accessed each day. By trending this data over several months, you can understand how consistent the access patterns are and discover buckets that are no longer accessed at all. The % retrieval rate, calculated as download bytes / total storage, is a useful indicator of the percentage of storage in a bucket that is accessed on a given day. Note that download bytes are counted for every download, so an object downloaded multiple times in a day is counted multiple times.
In S3 Storage Lens, the bubble analysis chart on the Buckets tab of the dashboard is particularly useful here. You can select % retrieval rate as one of the metrics and plot buckets along multiple dimensions, using any three metrics for the X-axis, Y-axis, and bubble size (see below).
If you select Total storage, % retrieval rate, and Avg. object size, and focus on buckets with a retrieval rate of zero (or close to zero) and a relatively large storage size, you can spot cold buckets whose storage may be costing you money. From there, you can identify the owner of the bucket in your organization, determine the purpose of the workload, and find out whether the storage is still needed. If it is not, you can save money by setting a lifecycle expiration policy or archiving the data to Amazon S3 Glacier. And to keep avoiding cold-bucket issues going forward, you can apply one of the design patterns recommended in this blog: automatically transition data using an S3 lifecycle policy, or enable automatic archiving with S3 Intelligent-Tiering, as sketched below.
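As a concrete illustration of the S3 Intelligent-Tiering option, the following boto3 sketch enables automatic archiving on a bucket. The bucket name, configuration ID, and day thresholds are placeholders, and the archive tiers only apply to objects that are already stored in the S3 Intelligent-Tiering storage class.

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-example-bucket"  # placeholder

s3.put_bucket_intelligent_tiering_configuration(
    Bucket=bucket,
    Id="archive-cold-objects",
    IntelligentTieringConfiguration={
        "Id": "archive-cold-objects",
        "Status": "Enabled",
        "Tierings": [
            # Objects not accessed for 90 days move to the Archive Access tier...
            {"Days": 90, "AccessTier": "ARCHIVE_ACCESS"},
            # ...and to the Deep Archive Access tier after 180 days without access.
            {"Days": 180, "AccessTier": "DEEP_ARCHIVE_ACCESS"},
        ],
    },
)
```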
Summary
In this blog, I introduced five ways to improve cost efficiency using the information in S3 Storage Lens. The techniques presented here can quickly lead to cost savings, and continued use of S3 Storage Lens provides the visibility needed to maintain, or further scale, those savings as your storage grows over the long term.
Beyond these five ideas, there are many other ways to apply S3 Storage Lens information to make storage more cost effective and to implement data-protection best practices, so stay tuned for future blog posts. You can also contact your account team or Amazon S3 Support for more information on how to use S3 Storage Lens.
Thank you for reading this blog on how to reduce storage costs with Amazon S3 Storage Lens.