Advanced: Available Scratch Space Options

Instead of running a job out of your home directory, you can use one of two scratch spaces. This is useful when you run many jobs or are close to the quota on your home directory. The two options are the local disk on the node and the Lustre parallel file system. Each has its pros and cons.

Local Disk

The local disk works well when your job runs on a single node and uses many small or temporary files. The grid engine creates a directory for each job, available as $TMPDIR, and destroys it when the job completes. The main limitation is space: the local disk in each node holds only a few tens of gigabytes.
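As an illustration, a single-node job that stages through the local disk might look like the sketch below. The file names input.dat and result.out are hypothetical, the sort command is only a placeholder for your real program, and the mktemp fallback exists only so the sketch can be tried outside a running job, where $TMPDIR may not be set.

```shell
#!/bin/bash
#$ -cwd    # start the job in the directory it was submitted from

# Stand-in input so the sketch runs anywhere; a real job already has its data.
INPUT="input.dat"                      # hypothetical file name
[ -f "$INPUT" ] || printf 'b\na\nc\n' > "$INPUT"

# Inside a job, TMPDIR is the per-job directory on the local disk.
# The mktemp fallback is an assumption for testing outside the grid engine.
SCRATCH="${TMPDIR:-$(mktemp -d)}"
SUBMIT_DIR="$PWD"

cp "$INPUT" "$SCRATCH/"                # stage the input onto the local disk
cd "$SCRATCH" || exit 1
sort "$INPUT" > result.out             # placeholder for the real program
cp result.out "$SUBMIT_DIR/"           # copy results out before TMPDIR is deleted
cd "$SUBMIT_DIR" || exit 1
```

The final copy is the important step: anything left in $TMPDIR when the job ends is lost.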

Lustre File System

This parallel network file system is a good choice when you have a distributed job running on many nodes. It has many terabytes available and no quotas on usage. However, this space is not backed up, so move any valuable data elsewhere as soon as you can. If total use on this file system gets too high, we will begin deleting files to make space. Lustre also handles small files poorly: many small files greatly increase the network overhead and slow the system down. It is best suited to large files.
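One common way to work around the small-file limitation, shown here as a sketch and not as a site requirement, is to bundle many small files into a single archive and keep only the archive on Lustre. The LUSTRE_DIR variable and the small_files directory are hypothetical names; the actual Lustre path on the cluster is not given here, so the sketch falls back to a temporary directory.

```shell
#!/bin/bash
# Hypothetical location: replace with your own directory on the Lustre file system.
LUSTRE_DIR="${LUSTRE_DIR:-$(mktemp -d)}"

# Stand-in for a directory holding many small files.
mkdir -p small_files
for i in 1 2 3; do
    echo "sample $i" > "small_files/part_$i.txt"
done

# Bundle the small files into one archive and store only that on Lustre.
tar -cf "$LUSTRE_DIR/small_files.tar" small_files

# Later, unpack the archive somewhere suited to small files,
# for example the local disk of the node running the job.
UNPACK_DIR="$(mktemp -d)"
tar -xf "$LUSTRE_DIR/small_files.tar" -C "$UNPACK_DIR"
```

Lustre then sees one large file instead of many small ones, and the small files are only read and written on the local disk.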


How you use these spaces depends on how you write your job scripts. File staging can be done manually before a job is submitted or programmatically within the job script. If a job script stages files programmatically, first submit simple, short jobs to test it for bugs.

The Lustre file system is available on all nodes, so files can easily be staged manually while logged onto a node via qlogin. The local disk space is only visible from within a running job script. Be sure to include commands to clean up after your program completes. In the case of the local disk space, make sure you copy out any generated result files, as $TMPDIR will be deleted upon job completion.
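A job script that stages files programmatically on Lustre and cleans up after itself could be sketched as follows. LUSTRE_BASE, the results directory, and the file names are hypothetical; the seq and awk commands are placeholders for staging real input and running a real program. JOB_ID is set by the grid engine inside a job, and the shell's process ID is used as a fallback so the sketch can be tested elsewhere.

```shell
#!/bin/bash
# Hypothetical base path: replace with your own directory on Lustre.
LUSTRE_BASE="${LUSTRE_BASE:-$(mktemp -d)}"
# One working directory per job, so concurrent jobs do not collide.
WORK_DIR="$LUSTRE_BASE/job_${JOB_ID:-$$}"
RESULTS_DIR="$PWD/results"

run_job() {
    # Remove the scratch directory when the surrounding subshell exits,
    # whether the program succeeded or failed.
    trap 'rm -rf "$WORK_DIR"' EXIT
    mkdir -p "$WORK_DIR" "$RESULTS_DIR" || return 1
    cd "$WORK_DIR" || return 1
    seq 1 5 > numbers.txt                                     # stand-in for staged input
    awk '{ s += $1 } END { print s }' numbers.txt > sum.out   # placeholder program
    cp sum.out "$RESULTS_DIR/"                                # stage results out
    echo "results copied to $RESULTS_DIR"
}

# Run in a subshell so the EXIT trap fires as soon as the job body finishes.
( run_job )
```

Putting the cleanup in a trap means the scratch directory is removed even when the program fails partway through, which matters on Lustre because nothing there is deleted automatically at the end of a job.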