You’re so very excited: you’re in graduate school, on your way to your MS or PhD degree, and to boot you’ve been granted access to a Linux cluster on which to run the calculations that will support said degree! But you’ve never used a Unix-like operating system before; like so many unfortunate souls, you’ve grown up thinking Windows is how the world works. Though some areas of computation do indeed subscribe to that belief, scientific computation is not necessarily one of them, and the Windows way is most often not the Unix way. But this article is not a condemnation of Windows, so let me, as Monty Python would say, “get on with it!!”
Think about the activities you perform on the computer on a daily basis, and in particular on a computer run by Linux. You sit down with your cup of coffee, ssh cluster.che.udel.edu and enter your password, and immediately enter a whole series of commands:

- qstat to list all jobs you are running on the cluster
- df to make sure you’re not filling up the /home filesystem with your results

Ugh, the repetition! Every day you take your seat and tap out this sequence of commands. But you’re not a machine; the computer is the machine here, so let it do all that work for you!
A shell script is basically a file containing the very same commands you would type on the command line, but stored for easy replay again and again. Once you learn how to use the shell, you can easily begin writing scripts for it. This is in contrast to scripting languages like Perl and Python, where you would need to learn a new programming language in order to write a script.
All scripts, be they shell/Perl/Python/et al., share a common format. The first line indicates what interpreter should be used to execute the statements therein. Since scripting languages remain in a textual form up until the very moment an instruction must be executed, such languages are termed interpreted languages (as opposed to compiled languages, where the instructions are turned into machine code before the program is ever executed). The interpreter is the program that decodes your textual instructions as the script is run. But enough, let us see a shell script:
    #!/bin/sh
    #
    # These lines that start with the '#' are comments. Anything after the
    # '#' character on each line is simply discarded.
    #
    echo "Queued Jobs Running Now:"
    qstat -s r | grep "frey"
    echo
    echo "Usage on /home:"
    df -k | grep "/home"
    echo
    echo "Who's Logged In?"
    who
The echo command is helpful: it writes text to the screen, which makes it clearer what the script is doing when I run it. I sit down, use vi to enter the lines above, and save them to a file named mroutine.sh (short for morning routine, with the .sh suffix to indicate it’s a shell script). But before I can call it a script, there’s one last change that must be applied to the file:
> chmod +x mroutine.sh
There! Now I’ve indicated to Linux that the file is executable. If I now type:
    > ./mroutine.sh
    Queued Jobs Running Now:
    7808 0.51889 au4_cytosi frey r 09/11/2006 14:22:40 all.q@node05 1

    Usage on /home:
    10.0.4.2:/home 558709536 261232416 269096288 50% /home

    Who's Logged In?
    frey pts/0 Sep 12 13:52 (turin.nss.udel.edu)
    joeblow pts/2 Sep 12 09:47 (dion2.che.udel.edu)
Doh! Joe Blow is logged in!
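The script above hard-codes the user name frey, so nobody else can reuse it unchanged. What follows is a minimal sketch of a more portable variant, not the author's script: it looks up the name of whoever runs it, and it prints a short message instead of failing when qstat is not installed on the machine. The helper names section, my_jobs, and home_usage are made up for illustration.

```sh
#!/bin/sh
#
# Variant of mroutine.sh. The function names below are illustrative and
# do not come from the original script.
#

# Use the login name of whoever runs the script instead of a fixed name.
me=${USER:-$(id -un)}

# Print a title, then the output of the given command, then a blank line.
section() {
    title=$1
    shift
    echo "$title"
    "$@"
    echo
}

# List my running jobs, or say so if qstat is not available here.
my_jobs() {
    if command -v qstat >/dev/null 2>&1; then
        qstat -s r | grep "$me"
    else
        echo "(qstat not found on this machine)"
    fi
}

# Show only the df line(s) that mention /home.
home_usage() {
    df -k | grep "/home"
}

section "Queued Jobs Running Now:" my_jobs
section "Usage on /home:" home_usage
section "Who's Logged In?" who
```

Wrapping each step in a small function keeps the three report sections uniform, and adding a fourth is a one-line change.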
For scripting languages like Perl and Python, the first line of your script file will not be quite the same as the example above. For Perl, you would use:
    #!/usr/bin/perl
    #
    :
The ‘!’ character is called a bang, and the two-character sequence ‘#!’ that opens the line is often called a shebang. The file path that follows it is the path to the interpreter. The interpreter is executed and the script file is handed to it, same as what happens for a shell script.
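To see what the interpreter line and the execute bit each contribute, here is a small sketch that writes a throwaway script, reports the interpreter named on its first line, and runs it only after chmod +x. The file name hello.sh, the temporary directory, and the helper interpreter_of are hypothetical names chosen for this illustration.

```sh
#!/bin/sh
#
# Sketch: the roles of the first line and of the execute bit.
# hello.sh and interpreter_of are illustrative names.
#

workdir=$(mktemp -d)
script="$workdir/hello.sh"

# Write a two-line script; its first line names the interpreter.
cat > "$script" <<'EOF'
#!/bin/sh
echo "hello from a script"
EOF

# Print the interpreter named on the first line of a file, if any.
interpreter_of() {
    first_line=$(head -n 1 "$1")
    case $first_line in
        '#!'*) echo "${first_line#??}" ;;   # drop the leading "#!"
        *)     echo "(no interpreter line)" ;;
    esac
}

echo "Interpreter: $(interpreter_of "$script")"

# A freshly written file normally lacks the execute bit.
if [ -x "$script" ]; then
    echo "already executable"
else
    echo "not executable yet"
fi

# Mark it executable, then run it by path.
chmod +x "$script"
"$script"

rm -rf "$workdir"
```

The same interpreter_of helper works on a Perl or Python script, since only the first line is inspected.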
When you submit a job to GridEngine for scheduling on a cluster, you do so by giving it a script. This script ends up being run on the cluster node to which GridEngine eventually assigns your job. The same shell script we wrote above could be submitted to GridEngine:
> qsub mroutine.sh
GridEngine will happily find a free processor, hand the script to the node with said processor, and run it. Go ahead, give it a try! But you’ll find out that the qstat command didn’t work for some reason. A good reason, in fact, since the job is running on a compute node and not the cluster’s head node, where GridEngine understands the qstat stuff!
As it turns out, there are special options you can pass to GridEngine to give it a better indication of the resource requirements of your job, to where the output from the job should be directed, whether or not to email you when the job finishes, etc. These options are specified at the top of the script on what would otherwise be considered comment lines:
    #!/bin/sh
    #$ -j y
    #$ -m eas
    #$ -M frey@udel.edu
    #
    :
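The combination of #$ is what indicates that options for GridEngine follow. In this case, I’ve asked that:

- stderr and stdout should go to one file (-j y)
- email should be sent when the job ends, is aborted, or is suspended (-m eas)
- that email should go to frey@udel.edu (-M)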
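Putting the pieces together, a single-processor job script might look something like the sketch below. It is an illustration under stated assumptions rather than a template from our clusters: the job name, the placeholder email address, and the body are invented, and the -N (job name) and -cwd (run in the submission directory) options are additions that the example above does not use. Because the #$ lines are ordinary comments to the shell, the script can also be run by hand to test it before submitting.

```sh
#!/bin/sh
#
# Sketch of a single-processor job script. The job name, email address,
# and body are placeholders; -N and -cwd are additions for illustration.
#
#$ -N example_job
#$ -cwd
#$ -j y
#$ -m eas
#$ -M you@example.edu

# GridEngine sets JOB_ID for a running job; fall back to a label when
# the script is run by hand for testing.
job_label=${JOB_ID:-not-under-GridEngine}

echo "Job $job_label started on $(uname -n) at $(date)"

# Replace this line with the real calculation.
echo "Running calculation..."

echo "Job $job_label finished at $(date)"
```

Printing the node name at the top of the output is a cheap way to see which compute node GridEngine picked for the job.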
That’s it. For programs that run on a single processor, there’s nothing else you need to do beyond writing a script (possibly with options embedded in it) and submitting it with qsub. Take a gander at the qsub manual page (man qsub for you newbies out there) for all of the options that are available.
Jobs that run on multiple processors in parallel are another tank of lobsters, though, and on our clusters I’ve made template scripts available in most cases. For some pieces of software (most notably, Gaussian) there’s a program called gqueue that will actually write and submit the scripts for you!