Storage for each administrative, login, and compute node is provided by a scale-out NAS (network attached storage) Isilon storage array over the 10 Gb/s network backplane, and by a 1.5PB IBM GSS GPFS 26 Parallel File System (see figure). Each node mounts /home over NFS v3, and mounts /scratch (/gss_gpfs_scratch) and /shared over GPFS. Details about IBM GSS GPFS are here.
/home is 1TB in size and holds user home directories. Each user's home directory has a hard limit of 30GB.
/scratch (/gss_gpfs_scratch) is 1.1PB (1100TB) in size and is temporary space for user files during cluster runs. By convention, each user creates a temporary space in /scratch/<user_id> (/gss_gpfs_scratch/<user_id>) and uses sub-folders in the <user_id> folder for different runs. After cluster runs complete, the data in the <user_id> folder must be deleted.
Files should be staged before the job starts, and output data should be copied and intermediate files removed when the job ends, preferably in the same SLURM submit script. If the files are too large for staging inside the submit script, it may be more practical to stage the files first, then run your jobs, and remove the files after the jobs end. It is the responsibility of the user to ensure files are moved after a job completes.
The NFS mount /home should not be used for large-scale computational work. If your jobs involve heavy I/O, use the /tmp space on each compute node's local disk, which provides roughly 700 GB per compute node. Many parallel applications require storage shared across nodes, so local /tmp may not be an option. In that case use the GSS GPFS parallel file system mounted across the cluster at /scratch (/gss_gpfs_scratch).
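The stage-in, run, copy-out, and clean-up sequence described for /scratch can be sketched in Python as below. This is a minimal illustration, not a site-provided tool: the function name run_staged, its parameters, and the run_<id> sub-folder naming are assumptions made for this example. It relies only on the /scratch/<user_id> convention above and on the SLURM_JOB_ID environment variable that SLURM sets inside a job.

```python
import os
import shutil
import subprocess
from pathlib import Path


def run_staged(inputs, outputs, results_dir, command, scratch_root="/scratch"):
    """Stage inputs into a per-run scratch folder, run a command there,
    copy the named outputs back, and delete the run folder.

    All names here are hypothetical; adapt them to your own workflow.
    """
    user = os.environ.get("USER", "unknown_user")
    # SLURM sets SLURM_JOB_ID inside a job; fall back to the PID elsewhere.
    run_id = os.environ.get("SLURM_JOB_ID", str(os.getpid()))
    # One sub-folder per run under /scratch/<user_id>.
    run_dir = Path(scratch_root) / user / f"run_{run_id}"
    run_dir.mkdir(parents=True, exist_ok=False)
    results_dir = Path(results_dir)
    try:
        # Stage in: copy input files to the run folder before the work starts.
        for src in inputs:
            shutil.copy2(src, run_dir)
        # Run the job with the run folder as its working directory.
        subprocess.run(command, cwd=run_dir, check=True)
        # Stage out: copy only the named output files to permanent storage.
        results_dir.mkdir(parents=True, exist_ok=True)
        for name in outputs:
            shutil.copy2(run_dir / name, results_dir)
    finally:
        # Clean up: staged inputs and intermediate files go with the run folder.
        shutil.rmtree(run_dir, ignore_errors=True)
    return results_dir
```

A submit script could call, for example, run_staged(["input.dat"], ["result.dat"], "results", ["./my_solver", "input.dat"]), where the file names and my_solver are placeholders. Note that in this sketch the run folder is removed even when the command fails, so anything needed for debugging should be listed in outputs or written elsewhere.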
/shared has software for the Discovery cluster that is made available using modules. The current size is 122 TB. For more information on software on Discovery go here.