Add warm performance tier #190

Merged
merged 4 commits into from
Sep 9, 2024
2 changes: 1 addition & 1 deletion Management-Utilities/README.md
@@ -6,7 +6,7 @@ This subfolder contains tools that can help you manage your FSx ONTAP file syste
| [auto_create_sm_relationships](/Management-Utilities/auto_create_sm_relationships) | This tool will automatically create SnapMirror relationships between two FSx ONTAP file systems. |
| [auto_set_fsxn_auto_grow](/Management-Utilities/auto_set_fsxn_auto_grow) | This tool will automatically set the auto size mode of an FSx for ONTAP volume to 'grow'. |
| [fsx-ontap-aws-cli-scripts](/Management-Utilities/fsx-ontap-aws-cli-scripts) | This repository contains a collection of AWS CLI scripts that can help you manage your FSx ONTAP file system. |
| [fsxn-rotate-secret](/Management-Utilities/fsxn-rotate-secret) | This is a Lambda function to be used with an AWS Secrets Manager secret to rotate the FSx for ONTAP admin password. |
| [fsxn-rotate-secret](/Management-Utilities/fsxn-rotate-secret) | This is a Lambda function that can be used with an AWS Secrets Manager secret to rotate the FSx for ONTAP admin password. |
| [iscsi-vol-create-and-mount](/Management-Utilities/iscsi-vol-create-and-mount) | This tool will create an iSCSI volume on an FSx ONTAP file system and mount it to an EC2 instance running Windows. |
| [warm_performance_tier](/Management-Utilities/warm_performance_tier) | This tool warms up the performance tier of an FSx ONTAP file system volume. |

72 changes: 53 additions & 19 deletions Management-Utilities/warm_performance_tier/README.md
@@ -2,29 +2,33 @@

## Introduction
This sample provides a script that can be used to warm a FSx for ONTAP
volume. In other words, it ensures that all the blocks for a volume are in
volume. In other words, it tries to ensure that all the blocks for a volume are in
the "performance tier" as opposed to the "capacity tier." It does that by
simply reading every byte of every file in the volume. Doing that
causes all blocks that are currently in the capacity tier to be pulled
into the performance tier before being returned to the reader. At that point,
assuming the tiering policy is not set to 'all', all the data should remain
assuming the tiering policy is not set to 'all' or 'snapshot-only', all the data should remain
in the performance tier until ONTAP tiers it back based on the volume's
tiering policy.

Note that Data ONTAP will not store data in the performance
tier from the capacity tier if it detects that the data is being read
sequentially. This is to keep things like backups and virus scans from
filling up the performance tier. Because of that, this script will
read files in "reverse" order. Meaning it will read the last block of
the file first, then the second to last block, and so on.
Note that, by default, Data ONTAP will not store data in the performance
tier from the capacity tier if it detects that the data is being read sequentially.
This is to keep things like backups and virus scans from filling up the performance tier.
You can, and should, override this behavior by setting
the cloud-retrieval-policy to "on-read" for the volume. Examples on
how to do that are shown below.

In an additional effort to try to get ONTAP to keep data in the performance tier
after reading it in, this script will read files in "reverse" order. Meaning
it will read the last block of the file first, then the second to last block, and so on.
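
The reverse-order read described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual implementation: `read_reverse` and its optional block size argument are hypothetical names, and it assumes GNU `stat` and `dd` as found on a typical Linux host.

```shell
# Sketch: read a file one block at a time, last block first.
# read_reverse is a hypothetical helper, not a function from this PR.
read_reverse() {
    local file=$1
    local block_size=${2:-$((2*1024*1024))}
    local file_size blocks index

    file_size=$(stat -c "%s" "$file") || return 1
    # Ceiling division: a trailing partial block counts as one more block.
    blocks=$(( (file_size + block_size - 1) / block_size ))

    index=$((blocks - 1))
    while [ "$index" -ge 0 ]; do
        # skip= is counted in units of bs, so this reads block number $index.
        dd if="$file" of=/dev/null bs="$block_size" count=1 skip="$index" 2>/dev/null || return 1
        # Print the block index so the read order is visible.
        echo "$index"
        index=$((index - 1))
    done
}
```

Calling `read_reverse /path/to/file` would read the file's blocks from the highest index down to 0 and discard the data, which is all that is needed to make ONTAP fetch the blocks.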

To speed up the process, the script will spawn multiple threads to process
the volume. It will spawn a separate thread for each directory
in the volume, and then a separate thread for each file in that directory.
The number of directory threads is controlled by the -t option. The number
of reader threads is controlled by the -x option. Note that the script
will spawn -x reader threads **per** directory thread. So for example, if you have 4
directory threads and 10 reader threads, you could have up to 40 reader
will spawn -x reader threads **per** directory thread. So, for example, if you have 2
directory threads and 5 reader threads, you could have up to 10 reader
threads running at one time.
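
As a quick check of the arithmetic above, the upper bound on concurrent readers is simply the product of the two options. The helper name `max_concurrent_readers` below is made up for illustration and is not part of the script.

```shell
# Sketch: upper bound on reader threads for given -t and -x values.
# max_concurrent_readers is a hypothetical helper name.
max_concurrent_readers() {
    local dir_threads=$1    # value passed with -t
    local read_threads=$2   # value passed with -x
    echo $((dir_threads * read_threads))
}

max_concurrent_readers 2 5
```

Keeping this product in mind helps when sizing the run against what the client host and the file system can comfortably serve.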

Since the goal of this script is to force all the data that is currently
@@ -34,20 +38,41 @@ You can use the `volume show-footprint` ONTAP command to see how much space
is currently in the capacity tier. You can then use `storage aggregate show`
to see how much space is available in the performance tier.

Note that it will not be uncommon for there to still be data in the
capacity tier after running this script. There can be several reasons
for that. For example:

* Space held by snapshots that aren't part of the live volume anymore.
* Space held by blocks that are part of an object in the object store, but aren't
part of the volume. This space will get consolidated eventually.
* Metadata that is always kept in the capacity tier.

Even so, we have found that running the script a second time typically
moves more data into the performance tier. If you are trying to get as
much data as possible into the performance tier, we recommend running
the script twice.

## Set Up
The script is meant to be run on a Linux based host that is able to NFS
The first step is to ensure the volume's tiering policy is set
to something other than "all" or "snapshot-only". You should also ensure
that the cloud-retrieval-policy is set to "on-read". You can make
both of these changes with the following commands:
```
set advanced -c off
volume modify -vserver <vserver> -volume <volume> -tiering-policy auto -cloud-retrieval-policy on-read
```
Where `<vserver>` is the name of the SVM and `<volume>` is the name of the volume.

The next step is to copy the script to a Linux based host that is able to NFS
mount the volume to be warmed. If the volume is already mounted, then
any user that has read access to the files in the volume can run it.
any user that has read access to all the files in the volume can run it.
Otherwise, the script needs to be run as 'root' so it can mount the
volume before reading the files.

If the 'root' user can't read the files in the volume, then you should use 'root' user just
If the 'root' user can't read all the files in the volume, then you should use the 'root' user just
to mount the volume and then run the script from a user ID that can read the contents
of all the files in the volume.
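
Before starting a long run, it may be worth confirming that the chosen user really can read every file. The following is one possible check, not something the script does itself; `list_unreadable` is a hypothetical helper, and `[ -r ]` only reflects what the client believes about permissions, so the NFS server could still deny a read.

```shell
# Sketch: print every regular file under a directory that the current
# user cannot read. list_unreadable is a hypothetical helper name.
list_unreadable() {
    find "$1" -type f -exec sh -c '
        for f in "$@"; do
            [ -r "$f" ] || printf "%s\n" "$f"
        done' sh {} +
}
```

Running `list_unreadable /path/to/mount/point` as the user that will run the script should print nothing if that user can read all the files.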

Make sure you have set the tiering policy on the volume set to something
other than "all" or "snapshot-only", otherwise the script will be ineffective.

## Running The Script
There are two main ways to run the script. The first is to just provide
the script with a directory to start from using the -d option. The script will then read
@@ -66,7 +91,7 @@ To run this script you just need to change the UNIX permissions on
the file to be executable, then run it as a command:
```
chmod +x warm_performance_tier
./warm_performance_tier -d /path/to/mount/point
./warm_performance_tier -d /path/to/mount/point
```
The above example will force the script to read every file in the /path/to/mount/point
directory and any directory under it.
@@ -88,8 +113,8 @@ Where:
-v volume_name - Is the name of the volume.
-n nfs_type - Is the NFS version to use. Default is nfs4.
-d directory - Is the root directory to start the process from.
-t max_directory_threads - Is the maximum number of threads to use to process directories. The default is 10.
-x max_read_threads - Is the maximum number of threads to use to read files. The default is 4.
-t max_directory_threads - Is the maximum number of threads to use to process directories. The default is 2.
-x max_read_threads - Is the maximum number of threads to use to read files. The default is 5.
-V - Enable verbose output. Displays the thread ID, date (in epoch seconds), then the directory or file being processed.
-h - Prints this help information.

@@ -101,6 +126,15 @@ Notes:
reading files.
```

## Finishing Step
After running the script, you should set the cloud-retrieval-policy back to "default" by running
the following commands:
```
set advanced -c off
volume modify -vserver <vserver> -volume <volume> -cloud-retrieval-policy default
```
Where `<vserver>` is the name of the SVM and `<volume>` is the name of the volume.

## Author Information

This repository is maintained by the contributors listed on [GitHub](https://github.com/NetApp/FSx-ONTAP-samples-scripts/graphs/contributors).
@@ -34,8 +34,8 @@ Where:
-v volume_name - Is the ID of the volume.
-n nfs_type - Is the NFS version to use. Default is nfs4.
-d directory - Is the root directory to start the process from.
-t max_directory_threads - Is the maximum number of threads to use to process directories. The default is 10.
-x max_read_threads - Is the maximum number of threads to use to read files. The default is 4.
-t max_directory_threads - Is the maximum number of threads to use to process directories. The default is 2.
-x max_read_threads - Is the maximum number of threads to use to read files. The default is 5.
-V - Enable verbose output. Displays the thread ID, date (in epoch seconds), then the directory or file being processed.
-h - Prints this help information.

@@ -98,10 +98,14 @@ isMounted () {
################################################################################
readFile () {
local file=$1
local blockSize=$((4*1024*1024))
local blockSize=$((2*1024*1024))

fileSize=$(stat -c "%s" "$file")
fileBlocks=$(($fileSize/$blockSize))
fileBlocks=$((fileSize/blockSize))
if [ $((fileSize % blockSize)) -ne 0 -o $fileSize -eq 0 ]; then
let fileBlocks+=1
fi

while [ $fileBlocks -ge 0 ]; do
if dd if="$file" of=/dev/null bs=$blockSize count=1 skip=$fileBlocks > /dev/null 2>&1; then
:
@@ -164,8 +168,8 @@ processDirectory () {
################################################################################
#
# Set some defaults.
maxDirThreads=4
maxFileThreads=10
maxDirThreads=2
maxFileThreads=5
nfsType=nfs4
verbose=false
#