Best practices
Overview
Teaching: 15 min
Exercises: 15 min
Questions
Dos, don’ts and general advice
Objectives
Become familiar with best practices in SCW systems
Understand quota allocation
Understand the purpose of partitions
This section introduces best practices when working on SCW systems. You can find more information in the SCW portal.
Login nodes
Do NOT run long jobs on the login nodes. Remember that the login nodes (cl1, cl2, cla1) are shared nodes used by all users. Short tests, compilation, file transfer and light debugging are typically fine. Large MPI jobs are definitely not. If we detect heavy jobs running on the login nodes, the application will be terminated without notice.
Who is at home?
To help you realize the shared nature of the login nodes, try running this command:
$ who
What is this showing you?
Consequences of breaking the rules
By default, each user is limited to the concurrent use of at most 8 processor cores and 200GB of memory. However, should heavy usage (defined as more than 75% of this allocation for a continuous period of 3 minutes) occur, the user’s access will be reduced for a penalty period. There are three penalty levels, which are reached if the user maintains the offending behaviour. The user’s access is reduced by a fraction of the default quota for a defined period of time; both the reduction and its duration increase as the user moves through the penalty levels. You can find more details in the SCW portal.
The user will receive an email notification if a violation of the usage policy has occurred, providing details about the offending application and the current penalty level.
If you are not sure why you are receiving these notifications, or need further help, do not hesitate to contact us.
Your .bashrc
Linux allows you to configure your work environment in detail. This is done by reading text files on login. Typically a user would edit .bashrc to fine-tune module loading, add directories to PATH, etc. This is fine on personal systems, but on SCW .bashrc is managed centrally. This means that, while you can in principle edit .bashrc, your changes might be lost without warning if there is a need to reset this file. Instead, we encourage users to edit .myenv, which is also read on login and should have the same effect as editing .bashrc.
My application insists on editing .bashrc
Some applications include instructions on how to add changes to .bashrc. If this is your case, try making the changes in .myenv instead. If this doesn’t work, please get in contact with us and we will help you find the best solution.
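For example, if an application’s install instructions tell you to add a module load and a PATH entry to .bashrc, the equivalent lines can go in .myenv instead. A minimal sketch; the module name and install path below are placeholders, not real SCW software:

```bash
# ~/.myenv -- read on login, safe to edit (unlike the centrally managed .bashrc)
module load example-app/1.2.3              # placeholder module name
export PATH=$HOME/apps/example/bin:$PATH   # placeholder install location
```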
Running jobs
Whenever possible:
- Estimate the time and resources that your job needs. This reduces the time needed to allocate resources to your job.
- Try to avoid submitting a massive number of small jobs, since this creates an overhead in resource provision. Whenever possible, stack small jobs into single bigger jobs (see the sketch after this list).
- Avoid creating thousands of small files, as this has a negative impact on the global filesystem. It is better to have a smaller number of larger files.
- Run a small test case (dev partition) before submitting a large job, to make sure it works as expected.
- Use --exclusive only if you are certain that you require the resources (CPUs, memory) of a full node.
  - Disadvantages: it takes longer for your job to be allocated resources, and it is potentially inefficient if you use fewer CPUs or less memory than provided.
- Use checkpoints if your application allows it. This will let you restart your job from the last successful state rather than rerunning from the beginning.
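As an illustration of stacking several small tasks into one submission, here is a minimal sketch of a Slurm batch script; the partition, time limit and the my_analysis program are placeholder values, not recommendations:

```bash
#!/bin/bash
#SBATCH --job-name=stacked_tasks
#SBATCH --partition=compute     # choose the partition that matches your workload
#SBATCH --ntasks=4              # one core per small task
#SBATCH --time=01:00:00         # estimate the time the whole batch really needs

# Run several small, independent tasks inside a single job
# instead of submitting each one as its own job.
for input in input_01.dat input_02.dat input_03.dat input_04.dat; do
    ./my_analysis "$input" > "${input%.dat}.out" &   # placeholder application, run in background
done
wait   # wait for all background tasks to finish before the job ends
```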
Partitions
Partitions in the context of SCW systems refer to groups of nodes with certain capabilities/features in common. You can list the partitions available to you with the following command:
```
$ sinfo
PARTITION    AVAIL  TIMELIMIT   NODES  STATE  NODELIST
compute*     up     3-00:00:00    133  alloc  ccs[0001-0009,0011-0134]
compute*     up     3-00:00:00      1  idle   ccs0010
compute_amd  up     3-00:00:00     52  alloc  cca[0001-0002,0005-0016,0018-0019,0021-0025,0027-0031,0036,0039-0041,0043-0064]
compute_amd  up     3-00:00:00     12  idle   cca[0003-0004,0017,0020,0026,0032-0035,0037-0038,0042]
highmem      up     3-00:00:00      5  comp   ccs[1001-1003,1009,1014]
highmem      up     3-00:00:00      2  mix    ccs[1023,1025]
highmem      up     3-00:00:00     19  alloc  ccs[1004-1008,1010-1013,1015-1022,1024,1026]
gpu          up     2-00:00:00     10  mix    ccs[2002-2007,2009,2011-2013]
gpu          up     2-00:00:00      3  alloc  ccs[2001,2008,2010]
gpu_v100     up     2-00:00:00      9  mix    ccs[2101-2103,2105,2108-2110,2112,2115]
gpu_v100     up     2-00:00:00      6  alloc  ccs[2104,2106-2107,2111,2113-2114]
htc          up     3-00:00:00      1  down*  ccs3004
htc          up     3-00:00:00      2  comp   ccs[1009,1014]
htc          up     3-00:00:00     17  mix    ccs[1023,1025,2009,2011-2013,2103,2105,2108-2110,2112,2115,3010,3012,3019,3024]
htc          up     3-00:00:00     43  alloc  ccs[1010-1013,1015-1022,1024,1026,2008,2010,2104,2106-2107,2111,2113-2114,3001-3003,3005-3009,3011,3013-3018,3020-3023,3025-3026]
dev          up     1:00:00         2  idle   ccs[0135-0136]
```
In the output above the user has access to several partitions. Jobs should be submitted to the partition that matches the application’s requirements, since each partition is designed for a different kind of job.
Partition | Meant for | Avoid |
---|---|---|
compute | Parallel and MPI jobs | Serial (non-MPI) jobs |
compute_amd | Parallel and MPI jobs using AMD EPYC 7502 | Serial (non-MPI) jobs |
highmem | Large memory (384GB) jobs | Jobs with low or standard memory requirements |
gpu | GPU (CUDA) jobs - P100 | Non-GPU jobs |
gpu_v100 | GPU (CUDA) jobs - V100 | Non-GPU jobs |
htc | High Throughput Serial jobs | MPI/parallel jobs |
dev | Testing and development | Production jobs |
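To target a specific partition, request it in your job script with the --partition option. A minimal sketch, with placeholder resource values and application name:

```bash
#!/bin/bash
#SBATCH --partition=highmem   # e.g. request the large-memory nodes
#SBATCH --ntasks=1
#SBATCH --mem=300G            # placeholder memory request; adjust to your job
#SBATCH --time=12:00:00

./my_memory_hungry_app        # placeholder for your application
```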
Testing your script
Notice the dev entry in the sinfo output above? This is the development partition, meant for short application tests. Runtime is limited to 1 hour and you can use up to 2 nodes, which is typically enough to test most parallel applications. Please avoid submitting production jobs to this queue, since that has a negative impact on users looking to quickly test their applications.
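For instance, a short interactive test on the dev partition could look like this (the executable is a placeholder):

```bash
$ srun --partition=dev --ntasks=2 --time=00:10:00 ./my_test_app
```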
Scratch
If your compute jobs on the cluster produce intermediate results, using your scratch directory can be beneficial:
- The scratch filesystem has a faster I/O speed than home.
- It has a higher default quota (5 TB) so you can store bigger input files if necessary.
Remember to instruct your scripts to clean up after themselves by removing unnecessary data; this prevents filling up your quota. Also remember that unused files on scratch might be removed without prior notice.
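As an illustration, a job script can stage its work on scratch and remove intermediate data when it finishes. A minimal sketch, assuming your scratch directory is /scratch/$USER and using a placeholder my_simulation program:

```bash
#!/bin/bash
#SBATCH --partition=compute
#SBATCH --ntasks=1
#SBATCH --time=02:00:00

WORKDIR=/scratch/$USER/$SLURM_JOB_ID    # per-job working directory on scratch
mkdir -p "$WORKDIR"
cd "$WORKDIR"

"$SLURM_SUBMIT_DIR"/my_simulation > results.out   # placeholder application

cp results.out "$SLURM_SUBMIT_DIR"/     # keep only the final results
cd && rm -rf "$WORKDIR"                 # clean up intermediate data on scratch
```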
Your quota
On SCW systems there are two types of quotas: storage and files. Storage quota is measured in bytes (KB, MB, TB), while file quota is measured in the number of individual files (independent of their size). You can check your current quota with the command myquota:
```
[new_user@cl1 ~]$ myquota
HOME DIRECTORY c.medib
Filesystem                  space   quota    limit  grace  files  quota  limit  grace
chnfs-ib:/nfshome/store01     32K  51200M   53248M             8   100k   105k
SCRATCH DIRECTORY c.medib
Filesystem   used  quota  limit  grace  files  quota    limit  grace
/scratch       4k     0k     5T      -      1      0  3000000      -
```
On account approval all users are allocated a default home quota of 50 GB and 100k files. In the example above, the user currently has 32 KB of data in home. If the user were to go beyond 51200 MB, they would enter a grace period of 6 days to reduce their storage footprint. After this grace period, or upon hitting 53248 MB, the user would no longer be able to create any more files. Something similar applies to the file quota.
On scratch, new users are allocated a default quota of 5 TB and 3M files. The main difference from the home quota is that on scratch there is no grace period (that is what the 0 under quota signifies), but for all intents and purposes it behaves in the same manner as home.
My application used to work …
As you can imagine, several errors arise from programs not being able to create files, and the error messages are quite often unrelated to this fact. A common case is an application that used to work perfectly well and suddenly throws an error even when running the same job. As a first step in troubleshooting, it is worth checking that your quota hasn’t been reached.
Checking your usage
If you are worried about hitting your quota and want to do some clean-up, but are wondering which directories might be the problematic ones, Linux has a utility that can help you.
Try running this command in your home directory:
[new_user@cl1 ~]$ du -h --max-depth=1
What does it show you? Now try this command:
[new_user@cl1 ~]$ du --inodes --max-depth=1
What is it showing you now? Is it useful?
Tip: you can always find out more about du in its man page (man du).
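If you have many subdirectories, sorting the output makes the biggest consumers easier to spot, for example:

```bash
$ du -h --max-depth=1 | sort -h
```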
Backups
Please note that although we try to keep your data as safe as possible, at the moment we do not offer backups. Please make sure to keep a backup of critical files and transfer important data to more permanent storage.
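One common way to copy important results off the cluster is rsync over SSH; a sketch where the destination host and paths are placeholders for your own storage:

```bash
$ rsync -av ~/important_results/ user@my-storage-server:/backups/important_results/
```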
Did you know that Cardiff University offers a Research Data Store? Cardiff researchers can apply for 1 TB of storage (more can be provided depending on certain criteria). Find out more on the intranet.
Key Points
Do NOT run long jobs on the login nodes
Check your quota regularly
Submit jobs to appropriate partitions