Supercomputing with SGE

 
May 17, 2015.

This website explains how to use the supercomputer TOMBO at OIST. You need an OIST ID and password to view some of the links on this website.

The Tombo Users Guide is here (ComputingResources > Available clusters > Tombo). The Tombo cluster runs an open-source batch job scheduler derived from Oracle's N1 Grid Engine. This batch system is known as SGE, the "Sun Grid Engine".


Google search

We do not have a formal manual for the SGE system at OIST. If you want to look up some option via Google (such as array jobs), search for:

sge task array "SGE_TASK_ID" "-t"

"SGE_TASK_ID" is an effective keyword to search about SGE system. We need to add quotation marks to use underbar and hypen as keyword.


Login

From your terminal, type,

ssh YOUR_ID@tombo.oist.jp


To log in to TOMBO from outside OIST, type,

ssh YOUR_ID@loginc01.oist.jp

Please set up the id_rsa.pub key according to "TIDA > IT Services > SSH".
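A minimal sketch of generating the key pair on your local machine with standard OpenSSH (where to register the public key is described on the TIDA page):

$ ssh-keygen -t rsa        # creates ~/.ssh/id_rsa (private) and ~/.ssh/id_rsa.pub (public)
$ cat ~/.ssh/id_rsa.pub    # register this public key as described on the TIDA page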


qlogin

We can also run programs on a compute node (not the login node) through an interactive session:

[jun-inoue:~]$ qlogin
local configuration tombo-login2.oist.jp not defined - using global configuration
Your job 419190 ("QLOGIN") has been submitted
waiting for interactive job to be scheduled ...
Your interactive job 419190 has been successfully scheduled.
Establishing builtin session to host tombo10705.oist.jp ...
[jun-inoue:~]$ hostname
tombo10705.oist.jp

In particular, we need to request memory explicitly when using Java (request somewhat more than you pass to -Xmx, since the JVM needs memory beyond the heap):

[jun-inoue:~]$ qlogin -l h_vmem=3gb
local configuration tombo-login2.oist.jp not defined - using global configuration
Your job 419192 ("QLOGIN") has been submitted
waiting for interactive job to be scheduled ...
Your interactive job 419192 has been successfully scheduled.
Establishing builtin session to host tombo10514.oist.jp ...
[jun-inoue:~]$ java -Xmx2g -version
java version "1.6.0_37"
Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)

Although more than 3 GB would be ideal, larger memory requests sometimes have a bad effect on other users [July 2013].



job script

Make a job file, sge.sh. Run the job with "qsub sge.sh".
The node will be chosen automatically.


Simple example

# This is a simple SGE batch script
#
# request Bash shell as shell for job
#$ -S /bin/bash
#$ -N nodename # [This is just the job name shown by "qstat"]
#
#$ -q short
#$ -M email@oist.jp
#$ -m abe
#$ -j yes
#$ -cwd # The output files will be saved in the current directory.
#
date
sleep 2
sleep 2
date
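To submit, run "qsub sge.sh" from the directory containing the script. A sketch of the expected flow (the job ID 419200 is illustrative):

$ qsub sge.sh
Your job 419200 ("nodename") has been submitted
$ qstat                  # check the job status
$ cat nodename.o419200   # combined stdout/stderr ("-j yes"), saved in the current directory ("-cwd")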




DynamicTrim

#$ -S /bin/bash
#$ -N DynamicTrim
#
#$ -q long
#$ -M EMAIL@oist.jp
#$ -m abe
#$ -j yes
#$ -cwd
#$ -l h_vmem=1g # Maximum memory to be expected
#$ -l virtual_free=256m # Available memory required before the job starts.
#
perl ~/bin/DynamicTrim.pl ../../Ostbic1-Eye_S2_L001_R1_001.fastq -h 20



Trinity

#$ -S /bin/bash
#$ -N Trinity
#
#$ -q genomics
#$ -M EMAIL@oist.jp
#$ -m abe
#$ -j yes
#$ -cwd
#$ -pe openmpi 4
#$ -l h_vmem=100g # Maximum memory to be expected
#$ -l virtual_free=10g # Available memory required before the job starts.
#
perl /home/j/jun-inoue/bin/trinityrnaseq_r2013-02-25/Trinity.pl --seqType fq --JM 2G --left reads.left.fq --right reads.right.fq --SS_lib_type RF --CPU 4 --no_cleanup --monitoring



Newbler

Newbler is a multi-threaded program (not MPI), so we need to set:

#$ -pe smp 2
...
runAssembly ... -cpu ${NSLOTS} ....

Newbler will run even with an "#$ -pe openmpi 2" line, but in that case only one CPU is used and the remaining CPUs sit idle (while still being reserved, so other Tombo users cannot use them). Such a run is equivalent to using the "-cpu 1" option without the "#$ -pe smp 2" line.

One-step assembly

#$ -S /bin/bash
#$ -q genomics
#$ -N NB3in1
#$ -M jun.inoue@oist.jp
#$ -m e
#$ -cwd
#$ -pe smp 2
#$ -l h_vmem=40G

runAssembly -cdna -cpu 2 -het -siod -qo -m OstBic1a.fastq OstBic2a.fastq

Incremental Assembly

#$ -S /bin/bash
#$ -q longP
#$ -N NewbTEST
#$ -M xx.xxx@oist.jp
#$ -m e
#$ -cwd
#$ -pe smp 12
#$ -l h_vmem=40G

#########
#### initiate assembly project
newAssembly -cdna testAssembl
#### sequence input
addRun -p -lib shortPE testAssembl \*.fq
#### assembly
runProject -cpu 12 -het -siod -qo -m testAssembl
### -siod for efficient memory usage
## -large for large genomes over 100 Mb
## -het for diploid genomes


RAxML

If you run RAxML with the -T 8 option, set "#$ -pe smp 8". If you run the single-threaded raxmlHPC, set "#$ -pe smp 1".
"-pe smp 4" is better for RAxML Pthreads analyses.
raxmlSgeTest.tar.gz is my Perl script for generating SGE batch scripts for RAxML analyses.

#$ -S /bin/bash
#$ -N JOBID
#
#$ -q shortP # Use longP if the seq file is large.
#$ -M xxxx@oist.jp
#$ -m abe
#$ -j yes
#$ -cwd
# #$ -o ./"outputFile directory name"
# #$ -e ./"errorFile directory name"
#$ -pe smp 4
#$ -l h_vmem=256m # Maximum memory to be expected. Use 1g if the seq file is large.
#$ -l virtual_free=256m # Available memory required before the job starts.
#
raxmlHPC-PTHREADS-SSE3 -f a -x 12345 -p 12345 -# 20 -m GTRGAMMA -s SEQUENCEFILE -q PARTITIONFILE -o OUTGROUP -n OUTFILE -T 4

With this setting, the Pthreads version of RAxML uses 4 slots (CPUs) on a single node, and 256 MB of memory is assigned per slot.


RAxML via array job

Set "#$ -t 1-100" and "#$ -tc 10". Change values (100, 10, etc) depending on the number of your jobs .
For array job, see GridWiki or Parallel jobs in Tombo (password is required).
raxmlArrayPerl.tar.gz is my perl script to make a job file.

#$ -S /bin/bash
#$ -t 1-100
#$ -tc 10 # Fewer than 50 concurrent tasks is good for the machine.
#$ -N JOBID
#$ -q shortP # Use longP if the seq file is large.
#$ -M xxxx@oist.jp
#$ -m abe
#$ -j yes
#$ -cwd
# #$ -o ./"outputFile directory name"
# #$ -e ./"errorFile directory name"
#$ -pe smp 4
#$ -l h_vmem=256m
#$ -l virtual_free=256m
#
# [Add one line per task as below, or write a loop script using $SGE_TASK_ID; see the sketch after this block]
./raxmlHPC-PTHREADS-SSE3 -f a -x 12345 -p 12345 -# 10 -m GTRGAMMA -s 010_sequenceFileDir/ENSP00000400626.txt -q 010_partitionFileDir/ENSP00000400626.txt -o Chicken_ENSGALP00000015683_NONE -n ENSP00000400626.txt -T 2
./raxmlHPC-PTHREADS-SSE3 -f a -x 12345 -p 12345 -# 10 -m GTRGAMMA -s 010_sequenceFileDir/ENSP00000231668.txt -q 010_partitionFileDir/ENSP00000231668.txt -o Drosophila_FBpp0078315_CG2023 -n ENSP00000231668.txt -T 2
./raxmlHPC-PTHREADS-SSE3 -f a -x 12345 -p 12345 -# 10 -m GTRGAMMA -s 010_sequenceFileDir/ENSP00000308315.txt -q 010_partitionFileDir/ENSP00000308315.txt -o SeaSquirt_ENSCINP00000030254_NONE -n ENSP00000308315.txt -T 2
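Alternatively, instead of listing one command per task, the body can be a loop-style script in which $SGE_TASK_ID picks the input. A minimal sketch, assuming the sequence file names are listed one per line in a hypothetical fileList.txt (the per-gene outgroup option -o is omitted here; it could be kept as a second column of the list file):

# Pick the file name on line number $SGE_TASK_ID of fileList.txt.
SEQFILE=$(sed -n "${SGE_TASK_ID}p" fileList.txt)
./raxmlHPC-PTHREADS-SSE3 -f a -x 12345 -p 12345 -# 10 -m GTRGAMMA -s 010_sequenceFileDir/${SEQFILE} -q 010_partitionFileDir/${SEQFILE} -n ${SEQFILE} -T 2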

Multi-thread or MPI

Before submitting your job script with qsub, please check whether your program is multi-threaded or MPI.
Tombo Users Guide is here
(Documentation > Tutorials > Running parallel jobs on the Tombo cluster).

MPI
The option is:

#$ -pe openmpi 4

Usually, an MPI program is executed using mpirun or mpiexec, as follows:

mpirun -np 8 mrbayes myfile.nxs
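Inside a job script, the slot count granted by "-pe" is available in the $NSLOTS variable, so the mpirun argument can be kept in sync with the request. A minimal sketch (assuming mrbayes is on the PATH):

#$ -S /bin/bash
#$ -q short
#$ -cwd
#$ -pe openmpi 8
# NSLOTS is set by SGE to the number of slots granted above.
mpirun -np ${NSLOTS} mrbayes myfile.nxs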


Multi-thread
The option is:

#$ -pe smp 4

A multi-threaded program is often given its thread count through an option such as -cpu (see the Newbler job script above).
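A minimal smp job-script sketch along the same lines (the program name and its -cpu option are placeholders; substitute your tool's actual thread option):

#$ -S /bin/bash
#$ -q short
#$ -cwd
#$ -pe smp 4
# Pass the granted slot count to the program's thread option.
yourMultithreadedTool -cpu ${NSLOTS} input.fastq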

queue availability for smp
If you want to know whether a queue (e.g., short) accepts smp runs, try "qconf -sq short" and check the pe_list line.

[jun-inoue:~]$ qconf -sq short
qname     short
....
pe_list   openmpi openmp matlab smp

Useful commands

qsub test.sh

Submit the job.

qsub -cwd test.sh

The output files will be saved in the current working directory (where you run qsub).

qstat

The job status will be shown.

qdel 1152367

Delete your job by its job ID.

qdel -u USERID

Delete all jobs belonging to USERID.

hostname

Check which node you are logged in to.

qstat -j [jobid]

Check the status of job [jobid].

qstat -u [username]

Check the status of jobs belonging to [username].


#$ -q longP

Declares a one-week job; shortP declares a one-day job.


ls | wc -l

Counting the number of files.

ls -l | grep ^d | wc -l

Counting the number of directories.

du -h -s

Checking your disk usage. This can take a long time.

$ df -h

Showing the total usage of the file systems (not per user), as follows:

Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_system-slash
                       16G  6.9G  7.7G  48% /
...
ddnsfa10ke-4:/genefs  599T  461T  138T  78% /genefs
tombo-mds1@tcp:tombo-mds2@tcp:/work
                      393T  222T  167T  58% /work
...
...


R

How to install R on Tombo.

Type the following commands.

$ wget http://cran.cnr.berkeley.edu/src/base/R-3/R-3.1.1.tar.gz
$ tar xvfz R-3.1.1.tar.gz
$ cd R-3.1.1
$ ./configure --prefix=/home/j/jun-inoue/bin && make && make install

Alternatively, you can write the following lines in a file, com.sh,

wget http://cran.cnr.berkeley.edu/src/base/R-3/R-3.1.1.tar.gz
tar xvfz R-3.1.1.tar.gz
cd R-3.1.1
./configure --prefix=/home/x/xxx-xxx/bin && make && make install

then type,

$ nohup bash com.sh &

Thank you, Y.L.[October 2014]


How to qsub your job script
R seems to need more than 1 GB of memory. In addition, R seems not to run in the shortP or longP queues, so I use the short or long queues for R calculations.


0. R script
Make an R script, test.R.

hel <- "hello"
write(hel, file="out.txt")


Method 1. Batch-A
Save the following lines in a file, job1.sh. Assume we are working in the "/home/j/jun-inoue/R_dir/" directory.

cd /home/j/jun-inoue/R_dir/
R CMD BATCH test.R

then qsub the job script.

$ qsub -q short -l h_vmem=1g job1.sh

The output will be saved as the out.txt file; test.Rout is the R log file.
Log files from the SGE system (job1.sh.e346xxx) will be saved in your home directory (they are usually empty).


Method 2. Batch-B
Save the following lines in a file, job2.sh

#$ -S /bin/bash
#$ -N job2 # [This is just the job name shown by "qstat"]
#$ -q short
#$ -M xxx-xxxx@oist.jp
#$ -m abe
#$ -j yes
#$ -l h_vmem=1g
#$ -cwd # The output files will be saved in the current directory.
R CMD BATCH /home/j/jun-inoue/R_dir/test.R

then qsub the job script.

$ qsub job2.sh


Method 3. Rscript
Save the following lines in a file and qsub it.

#$ -S /bin/bash
#$ -N job2
#$ -q short
#$ -M xxx-xxxx@oist.jp
#$ -m abe
#$ -j yes
#$ -l h_vmem=1g
#$ -cwd
Rscript /home/j/jun-inoue/R_dir/test.R
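then qsub the job script (assuming it was saved as job3.sh; the file name is arbitrary).

$ qsub job3.sh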


Link

Setup of HPC:
https://sites.google.com/site/yijyunluo/Bioinformatics/HPC-settings

Recommended readings:
https://sites.google.com/site/yijyunluo/Bioinformatics

Installers for stand-alone machines:
http://mirror.oist.jp/pub/
I downloaded the Live CD (CentOS-6.5-x86_64-LiveCD.iso) from Linux/CentOS/6.5/isos/x86_64/.