
Data and Computing Facility Operations

Facility Operations: Batch System

Batch Systems:

The batch system available to users of the UAF is condor, which allows the user to submit jobs into the lpc batch farm or the production farm. On this page we describe how to use this batch system.

The preferred way to access the most CPU, at the LPC and beyond, is through CRAB:

  1. A system that makes it easy to submit many slightly different jobs to the batch system (untested 2017) (click here)
  2. A complete, detailed example of how to submit a simple job as described below:
    1. How do I use CRAB to submit batch jobs?
    2. How do I use Condor to submit to the lpc batch farm?
    3. How do I manage file input and output in Condor?
    4. How do I request more memory for my batch jobs?
    5. How do I report worker nodes that have problems?
    6. How do I make sure my non-CRAB jobs are NOT accessing shared/NFS mounted disk on the job node?
      1. Example with simple scripting
      2. Example with tarring CMSSW and file transfer to EOS
      3. Example with making a new CMSSW during your batch job
    7. How do I troubleshoot my condor job problems?
    8. System status: condor monitor

For any information not covered below, visit the condor user's manual. Find the version of condor running on lpc with condor_q -version

One important note: compilation of code is not supported on the remote worker node. This includes ROOT's ACLiC (i.e. the plus signs at the end of root -b -q foo.C+). You can compile on the LPC interactive nodes and transfer the executables and shared libraries for use on the worker.
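
For example (a rough sketch, with foo.C standing in for your own macro), ACLiC can build the shared library once on an interactive node, and the job then ships and loads that library instead of compiling:

# On a cmslpc interactive node: compile the macro once with ACLiC
root -b -q -e 'gROOT->ProcessLine(".L foo.C+")'
# This produces foo_C.so (and, with ROOT 6, a matching _rdict.pcm dictionary file).
# List those files in Transfer_Input_Files and load the library on the worker with
# gSystem->Load("foo_C.so") rather than using the plus sign there.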

1. How do I use CRAB to submit batch jobs?

Note that this is the preferred method to access the most CPU

Guides:

2. How do I use Condor to submit to the lpc batch farm?

The first step to using the condor system is writing the condor submit description file. This file tells the system what you want it to do and how. Below is an example, which runs a system program that sleeps for one minute, then quits. If you want to try this out, copy the entire block below and paste it into your terminal to create the file named "sleep_condor". Each line of the file is explained in the annotated sections further down this page.

cat > sleep_condor << +EOF

universe = vanilla
Executable = /bin/sleep
Should_Transfer_Files = YES
WhenToTransferOutput = ON_EXIT
### Transfer_Input_Files = file1, file2

Output = sleep_\$(Cluster)_\$(Process).stdout
Error = sleep_\$(Cluster)_\$(Process).stderr
Log = sleep_\$(Cluster)_\$(Process).log
notify_user = ${LOGNAME}@FNAL.GOV
x509userproxy = ${X509_USER_PROXY}
Arguments = 60
Queue 5

+EOF

Condor jobs are executed on their own 40 GB partition on the worker node. You must use this space for your job output. Use the variable $_CONDOR_SCRATCH_DIR for your output path in your job script or .py configuration files so that your job output is copied back to the submit directory on job completion.
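
A minimal sketch of how a job script uses it (the executable name and option here are hypothetical):

#!/bin/bash
# Run from the local condor scratch area on the worker so that files written here
# are transferred back to the submit directory when the job exits.
cd ${_CONDOR_SCRATCH_DIR}
./myAnalysis --output ${_CONDOR_SCRATCH_DIR}/output.root  # hypothetical executable and option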


Condor has options available to automatically transfer input files to the worker where the job runs and then copy any output files back to the directory you submit from. Jobs should use these instead of doing file reads or writes directly from the NFS mounted file systems. Direct reads/writes to NFS disk can cause severe performance problems, and if this happens the CMS T1 Facility team may end up having to kill your jobs.

The options for telling condor to copy files into and out of your job are:

Should_Transfer_Files = YES
### Transfer_Input_Files = file1, file2
WhenToTransferOutput = ON_EXIT

See the section below for details on their use, with additional examples for good and bad NFS usage at this link.

After you've created the file, and authenticated your grid certificate, you can submit it to the condor system using the command condor_submit followed by the name of your submit description file, in this example's case "sleep_condor":

condor_submit sleep_condor

Your output should look something like this:

[langley@cmslpc39 ~]$ condor_submit sleep_condor
Submitting job(s).....
Logging submit event(s).....
5 job(s) submitted to cluster 154.

You can see the status of all jobs submitted from the node you are logged on to by using the following command:

condor_q

Your queue ought to show the processes you just submitted; they may be idle for up to a minute or so, or longer if the system is very busy:

[langley@cmslpc39 ~]$ condor_q

-- Submitter: cmslpc39.fnal.gov : <131.225.207.241:37285> : cmslpc39.fnal.gov
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
154.0 langley 7/27 10:33 0+00:00:00 I 0 0.0 sleep 60
154.1 langley 7/27 10:33 0+00:00:00 I 0 0.0 sleep 60
154.2 langley 7/27 10:33 0+00:00:00 I 0 0.0 sleep 60
154.3 langley 7/27 10:33 0+00:00:00 I 0 0.0 sleep 60
154.4 langley 7/27 10:33 0+00:00:00 I 0 0.0 sleep 60
5 jobs; 5 idle, 0 running, 0 held

In condor, each computer has a separate list of requests (its own queue into condor), so sometimes the job (cluster) number is not sufficient to uniquely identify a job. In other words, every computer in the condor system has its own list of jobs and job numbers. If you are still logged into the computer you submitted your job from, then just using the job number will work, but if you are on a different computer you must specify which computer you submitted the job from.
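
If you prefer not to ssh around, condor_q also has a -name option to query a specific schedd remotely, for example (assuming the jobs were submitted from cmslpc36):

condor_q -name cmslpc36.fnal.gov        # the full queue on that schedd
condor_q -name cmslpc36.fnal.gov 154    # just cluster 154 on that schedd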

You can get a list of all the jobs and their status for a specific user username from any machine using this command:

condor_q -submitter username

If you want to view the entire queue for a machine that you are not logged onto, you can use the following command. This gives you the same information as condor_q (albeit in a different format) without needing to be logged into that particular machine. Say you submitted the job from cmslpc36.fnal.gov:

condor_status -submitters cmslpc36

This gives all the jobs from all users on the machine in question:

[langley@cmslpc39 ~]$ condor_status -submitters cmslpc36
Name Machine Running IdleJobs HeldJobs
langley@fnal.gov cmslpc36@fnal.gov 5 0 0

  RunningJobs IdleJobs HeldJobs
langley@fnal.gov 5 0 0
Total 5 0 0

You can view information about all requests and their submitters across the whole system with this command:

condor_status -submitters

To cancel a job type condor_rm followed by the job number, for this example, 154:

condor_rm 154

Again if you are now logged into a different node you must ssh to the node the job was submitted from and do the remove:
ssh cmslpc36 condor_rm 154

If you don't remember which machine you submitted the job from, use the condor_q -submitter username command from above; it will tell you which machine holds your requests.


universe = vanilla
The universe variable defines an execution environment for your job. In this example we use the vanilla universe, which has the smallest set of built-in services but also the fewest restrictions. For a complete list of universes and what they do, see the condor user's manual under 2. Users' Manual > 2.4 Road-map for Running Jobs > 2.4.1 Choosing a Condor Universe.

Executable = /bin/sleep
This is the program you want to run. If the program is in the same directory as your batch file, just the name will work, example: yourscript.csh. If it is in a different directory than your batch file then you must give the pathname, example: myscripts/yourscript.csh runs the script yourscript.csh located in the directory myscripts. Be sure to make your script executable (chmod +x yourscript.csh).

Should_Transfer_Files = YES
WhenToTransferOutput = ON_EXIT
### Transfer_Input_Files = file1, file2

These options tell condor to take input files from your computer and send output files back. If these options are not activated then you must provide input through some other means and extract the output yourself. Users should avoid doing direct reads or writes from the NFS mounted file systems within batch jobs, as this can cause severe performance issues. The Transfer_Input_Files option cannot take a directory as input, but it can take files with an absolute path in case your input files are not in the submit directory.
Note that the ### means this line is commented out so that the example works without any input files; remove the comment and replace file1, file2 with your file information. Some more expanded examples of good and bad NFS usage can be found below and at this link.
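
For example (a sketch with hypothetical file names), since Transfer_Input_Files cannot take a directory, a directory of inputs has to be bundled into a single file first:

tar -zcf myinputs.tgz myinputs/    # bundle the directory into one file
# then in the submit description file:
#   Transfer_Input_Files = myinputs.tgz, /uscms_data/d2/username/extra_input.root
# and unpack it at the start of your job script with: tar -zxf myinputs.tgz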

Output = sleep_$(Cluster)_$(Process).stdout
This directs the standard output of the program to a file, in other words, everything that would normally be displayed on the screen, so that you can read it after it is finished running. Where you see $(Cluster) condor will substitute the job number, and $(Process) will become the process number, in this case, 0-4. Make sure you are NOT using a full path on this setting so the condor scratch area is utilized.

Error = sleep_$(Cluster)_$(Process).stderr
This is the same as the Output line, except it applies to standard error; this is extremely useful for debugging or figuring out what is going wrong (almost always something). Where you see $(Cluster) condor will substitute the job number, and $(Process) will become the process number, in this case, 0-4. Make sure you are NOT using a full path on this setting so the condor scratch area is utilized.

Log = sleep_$(Cluster)_$(Process).log
The log file contains information about the job in the condor system: the IP address of the computer that is processing the job, the time it starts and finishes, how many attempts were made to start the job, and other such data. It is recommended to use a log file. Where you see $(Cluster) condor will substitute the job number, and $(Process) will become the process number, in this case, 0-4. Make sure you are NOT using a full path on this setting so the condor scratch area is utilized.

notify_user = ${LOGNAME}@FNAL.GOV
Specifies to whom the system will automatically send email when the job finishes (your email); in the example, the computer should have put your email address here. You will receive a separate email for every process in your job that completes.
The use in the sleep example is done in the context of the cat command, which evaluates the content of the system environment variable ${LOGNAME} and places it in the condor job description file sleep_condor. If you are making your own condor job description file with a text editor, using the system environment variable as above will not work; copy the line from the generated sleep_condor file (which has your address written out) instead.


x509userproxy = ${X509_USER_PROXY}
This line tells your job to bring along your grid proxy, which is needed for accessing files over xrootd (root://). You will have needed to authenticate your grid certificate for this line to work.
The use in the sleep example is done in the context of the cat command, which evaluates the content of the system environment variable ${X509_USER_PROXY} and places it in the condor job description file sleep_condor. If you are making your own condor job description file with a text editor, using the system environment variable as above will not work; use x509userproxy = $ENV(X509_USER_PROXY) instead, which condor evaluates when it reads the jdl.
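
A sketch of the usual sequence when writing your own jdl (the jdl name here is hypothetical): authenticate first, then let condor pick the proxy up via $ENV():

voms-proxy-init -voms cms --valid 192:00    # authenticate your grid certificate
# in the hand-written jdl:
#   x509userproxy = $ENV(X509_USER_PROXY)
condor_submit my_job.jdl                    # hypothetical jdl name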

Arguments = 60
Here you put any command line arguments for your program; if you have none, exclude this line. In this example the program needs one argument, the number of seconds to wait, and the argument of 60 tells it to wait for one minute.

Queue 5
This is how many times you want to run the program; without this line it runs only once. The processes will be numbered starting at zero, so in this example they will be: 0, 1, 2, 3, and 4.

3. How do I manage file input/output from within Condor?

Condor has options available to automatically transfer input files to the worker where the job runs and then copy any output files back to the directory you submit from. Jobs should use these, or read/write data from dcache, instead of doing file reads or writes directly from the NFS mounted file systems. Direct reads/writes to NFS can cause severe performance problems, and if this happens the CMS T1 Facility team may end up having to kill your jobs.

The options for telling condor to copy files into and out of your job are:
Should_Transfer_Files = YES
Transfer_Input_Files = file1, file2
WhenToTransferOutput = ON_EXIT

In addition you need to make sure you use the correct pathname for files. To avoid stressing the NFS mounted disk, follow these instructions: How do I make sure my non-CRAB jobs are NOT accessing shared/NFS mounted disk on the job node?
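
A minimal sketch pulling these options together (the script and file names are hypothetical): the input file is transferred onto the worker, and anything the script writes to its scratch working directory is copied back to the submit directory on exit.

universe = vanilla
Executable = run_analysis.sh
Should_Transfer_Files = YES
WhenToTransferOutput = ON_EXIT
Transfer_Input_Files = /uscms_data/d2/username/input_file.root
Output = run_$(Cluster)_$(Process).stdout
Error = run_$(Cluster)_$(Process).stderr
Log = run_$(Cluster)_$(Process).log
Queue 1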


4. How do I request more memory for my batch jobs?

Any additional job requirements will restrict which nodes your job can run on. Consider carefully any Requirements in a condor jdl given to you by other lpc users.

request_memory = 2100
Any requirements for the machine chosen to run your program can be specified here. The requirements here specify a machine with at least 2100 megabytes of memory available. Use the memory requirement only if you do need it, keeping in mind that you may wait a significant amount of time for slots to be free with sufficient memory available.
A setting documented here in the past was request_disk = 1000000, which restricts the job to nodes with 1 GB or more of disk. As of March 3, 2017, adding that 1 GB disk requirement would exclude 1767 CPUs from running your job. Increasing these numbers will restrict the number of machines you can run on (completely, if set too high).
For a complete and long list of possible requirement settings, see the condor user's manual.
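
As a sketch, the request goes directly into your submit description file alongside the other settings; only include it if the job genuinely needs it:

# ask for at least 2100 MB of memory per job slot; larger values match fewer machines
request_memory = 2100
# disk is requested in KB, so 1000000 corresponds to roughly 1 GB; leave it out unless needed
### request_disk = 1000000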

5. How do I report worker nodes that have problems?

Use the following web page with your grid-certificate loaded to report problem nodes:
https://cmsjobmon.fnal.gov/cgi-bin/heldNode.cgi-bin


6. How do I make sure my non-CRAB jobs are not accessing shared/NFS disk?

The LPC uses shared filesystems, /uscms/home, /uscms_data/d# and /uscmst1b_scratch/lpc1, between the worker nodes and the interactive nodes. Unfortunately, direct access to these shared filesystems from the workers can negatively affect performance for running jobs and more importantly for interactive users on the cmslpc nodes. The T1 team monitors for this activity and an automated system suspends ("holds") jobs when performance is slow and user jobs have too many open file descriptors to these shared filesystems. This short write-up is intended to help you avoid the NFS shared filesystems in your batch jobs.

Find bad examples of condor batch usage at this link.

Example A: A good example

In this case, we avoid the shared filesystem as much as possible. Note that there are no full path names anywhere except for the transfer_input_files. The inputs are transferred to a temporary area on the worker node's local disk, pointed to by the variable _CONDOR_SCRATCH_DIR. More complicated examples can be found in Example B, and Example C, below.
good1.jdl:

universe = vanilla
Executable = good1.sh
Output = good1.out
Error = good1.err
Log = good1.log
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = /uscms_data/d2/username/input_file.root, /uscms_data/d2/username/good1.C


good1.sh:

#!/bin/bash
# cms software setup not included here for brevity
cd ${_CONDOR_SCRATCH_DIR}
root -b -q good1.C

good1.C:

{
// the input file was transferred to the worker's local scratch area, so open it by bare name
TFile f("input_file.root");
// ... do some calculation on the contents of f, filling output histograms/ntuples ...
TFile g("output_file.root", "RECREATE");
// ... write the output objects ...
g.Write();
g.Close();
}
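
Assuming all three files above sit in the submit directory, the job is submitted in the usual way; when it exits, output_file.root is copied back to the submit directory, and good1.out, good1.err, and good1.log appear there as well:

condor_submit good1.jdl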

Example B: A good example with tarring - tcsh script


Here is a system that makes it easy to submit many slightly different jobs to the batch system which includes a technique for tarring your files: click here (untested 2017).

In the following case, we avoid the shared filesystem as much as possible. Inputs are tarred and transferred to a temporary area on the worker node's local disk (40GB limit), pointed to by the variable _CONDOR_SCRATCH_DIR. There are two ways to use this technique which will be referred to below:
  1. The tarred CMSSW and large (root) input files are transferred to personal EOS storage. The tarred CMSSW is transferred to the local worker node disk, and large input files are opened in root following the EOS instructions for file reference inside a script. Note that the EOS transfer at the end of the job is best for large root files.
  2. The tarred CMSSW and any input files are transferred to the local worker node disk.
First tar your CMSSW working area. This presumes that your CMSSW_8_0_25/src contains the following files: cmsRun.csh and ExampleConfig.py
tar -zcvf CMSSW8025.tgz CMSSW_8_0_25
Note that you can exclude large files for instance from your tar with the following argument:
--exclude="Filename*.root"
Note that you can exclude CMSSW caches for instance with a command like this one:
tar --exclude-caches-all --exclude-vcs -zcf CMSSW_8_0_25.tar.gz -C CMSSW_8_0_25/.. CMSSW_8_0_25 --exclude=src --exclude=tmp
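
Before shipping the tarball it can be worth a quick sanity check that it contains what you expect and is not unreasonably large, for example:

tar -ztf CMSSW8025.tgz | head    # list the first few files in the archive
ls -lh CMSSW8025.tgz             # check the archive size before transferring it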

  1. For case 1 (EOS), transfer the tarball to your personal EOS area (for your CERN username):
     xrdcp CMSSW8025.tgz root://cmseos.fnal.gov//store/user/username/CMSSW8025.tgz
  2. For case 2 (local transfer), instead refer to the tarball in the condor.jdl file which you submit: Transfer_Input_Files = cmsRun.csh, ExampleConfig.py, CMSSW8025.tgz

Any large root files should be transferred to your personal EOS area, and referred to following the EOS instructions for file reference inside a script:
xrdcp Filename1.root root://cmseos.fnal.gov//store/user/username/Filename1.root
xrdcp Filename2.root root://cmseos.fnal.gov//store/user/username/Filename2.root

Your executable file cmsRun.csh will take arguments ${1} (the name of the python configuration file ExampleConfig.py), and ${2}, some other variable you are passing to the configuration file, like number of events. As always, be sure to test these scripts and python configuration files interactively for a single test before submitting many condor jobs.

Example cmsRun.csh:


#!/bin/tcsh
echo "Starting job on " `date` #Date/time of start of job
echo "Running on: `uname -a`" #Condor job is running on this node
echo "System software: `cat /etc/redhat-release`" #Operating System on that node
source /cvmfs/cms.cern.ch/cmsset_default.csh  ## if a bash script, use .sh instead of .csh
### for case 1 (EOS), keep the following xrdcp line; for case 2 (local transfer), remove it
xrdcp -s root://cmseos.fnal.gov//store/user/username/CMSSW8025.tgz .
tar -xf CMSSW8025.tgz
rm CMSSW8025.tgz
setenv SCRAM_ARCH slc6_amd64_gcc530
cd CMSSW_8_0_25/src/
scramv1 b ProjectRename
eval `scramv1 runtime -csh` # cmsenv is an alias not on the workers
echo "Arguments passed to this script are: for 1: $1, and for 2: $2"
cmsRun ${1} ${2}
xrdcp nameOfOutputFile.root root://cmseos.fnal.gov//store/user/username/outputFile.root
cd ${_CONDOR_SCRATCH_DIR}
rm -rf CMSSW_8_0_25

Be sure to make your cmsRun.csh executable: chmod +x cmsRun.csh

Your condor.jdl will look something like this (default for case 1. EOS):


universe = vanilla
Executable = cmsRun.csh
Should_Transfer_Files = YES
WhenToTransferOutput = ON_EXIT
Transfer_Input_Files = cmsRun.csh, ExampleConfig.py
Output = sleep_$(Cluster)_$(Process).stdout
Error = sleep_$(Cluster)_$(Process).stderr
Log = sleep_$(Cluster)_$(Process).log
x509userproxy = $ENV(X509_USER_PROXY)
Arguments = ExampleConfig.py 100
Queue 5

Case 2 (transferring the tarball with the job) needs this line modified:


Transfer_Input_Files = cmsRun.csh, ExampleConfig.py, CMSSW8025.tgz

Here, the X509_USER_PROXY location is read from the environment using $ENV(), a different method than the shell expansion used in the cat command above.


Example C: A good example with making a new CMSSW during the batch job - bash script


Example cmsRun.sh:


#!/bin/bash
echo "Starting job on " `date` #Date/time of start of job
echo "Running on: `uname -a`" #Condor job is running on this node
echo "System software: `cat /etc/redhat-release`" #Operating System on that node
source /cvmfs/cms.cern.ch/cmsset_default.sh  ## if a tcsh script, use .csh instead of .sh
export SCRAM_ARCH=slc6_amd64_gcc530
scramv1 project CMSSW CMSSW_8_0_25 # set up a fresh CMSSW release area on the worker
cd CMSSW_8_0_25/src/
eval `scramv1 runtime -sh` # cmsenv is an alias not on the workers
echo "CMSSW: "$CMSSW_BASE
echo "Arguments passed to this script are: for 1: $1, and for 2: $2"
cmsRun ${1} ${2}

Be sure to make your cmsRun.sh executable: chmod +x cmsRun.sh

Your condor.jdl will look something like this (default for case 1. EOS):


universe = vanilla
Executable = cmsRun.sh
Should_Transfer_Files = YES
WhenToTransferOutput = ON_EXIT
Transfer_Input_Files = cmsRun.sh, ExampleConfig.py
Output = sleep_$(Cluster)_$(Process).stdout
Error = sleep_$(Cluster)_$(Process).stderr
Log = sleep_$(Cluster)_$(Process).log
x509userproxy = $ENV(X509_USER_PROXY)
Arguments = ExampleConfig.py 100
Queue 5

Here, the X509_USER_PROXY location is read from the environment using $ENV(), a different method than the shell expansion used in the cat command above.


7. How do I troubleshoot my condor job problems?

In addition to the condor status commands, the following techniques may be useful to troubleshoot condor job problems. The condor user's manual guide to managing jobs goes into more depth for troubleshooting.

a. condor_status -submitters tells you the status (running, idle, held) of all condor jobs, grouped by submitter. If your jobs are Idle, they are not yet running and are waiting for an empty slot that matches your job requirements. If your jobs are Held, they are stopped for some reason.

b. condor_userprio tells you the condor user priority of all the users on the cmslpc cluster. For more about condor user priority, see the condor user's manual guide to user priority.

c. You can find information in your job logs: the jobname_Cluster_Process.log file contains information about where the job is running/ran, while the jobname_Cluster_Process.stdout and jobname_Cluster_Process.stderr files contain the actual executable stdout and stderr.
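
For example (file names here follow the sleep example above), the .log file records which worker ran each attempt, and the .stderr file usually holds the actual failure message:

grep "Job executing on host" sleep_154_0.log    # which worker node ran the job
tail -n 20 sleep_154_0.stderr                   # last lines of the executable's error output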

d. Be sure to know which machine you have submitted the condor jobs from and perform these commands from that machine. You can query all the machines for your jobs with condor_q -submitter username. For instance if you submitted your condor jobs from cmslpc36 and you are not logged into that node, then you will need to:
ssh cmslpc36 from the cmslpc node you are logged into.

e. Find the jobID of the job you are concerned about with condor_q; you can see why it has the status it has with:
condor_q -analyze jobID, for example:
[username@cmslpc36 condor]$ condor_q | grep username
1033322.0   username          2/7  08:53   0+00:00:00 I  0   0.0  condor_test.sh
[username@cmslpc36 condor]$ condor_q -analyze 1033322
-- Schedd: cmslpc36.fnal.gov : <131.225.189.165:48228?...
---
1033322.000:  Request is held.

Hold reason: Error from slot1@cmswn1898.fnal.gov: STARTER at 131.225.191.192 failed 
to send file(s) to <131.225.189.165:48526>; SHADOW at 131.225.189.165 failed to write to file 
/uscms_data/d2/username/filename.root: (errno 122) Disk quota exceeded
In the above case, the user's quota on /uscms_data, the cmsnfs disk, is exceeded.

f. In another case, the job requirements didn't allow the job to run on any machine. It is important to note that due to the way the lpc cluster is partitioned, individual job slots can have more (or less) memory and disk than a standard amount. Therefore, in a busy cmslpc farm, job requirements for memory and disk will needlessly restrict a job. It is best not to add Requirements unless you need them for certain (and are willing to wait for matching resources to become available).
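
To check what an idle job is actually requesting before deciding whether a Requirement is needed, you can dump its ClassAd (a sketch, using the jobID shown by condor_q):

condor_q -l 1033322 | egrep -i 'Request(Memory|Disk|Cpus)|^Requirements'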

g. Here is an example output from condor_q -analyze where an old condor jdl was used:

-- Schedd: cmslpc28.fnal.gov : <131.225.190.108:6137?...
	Last successful match: Tue Feb  7 14:25:07 2017
	Last failed match: Wed Feb  8 14:32:46 2017

	Reason for last match failure: no match found 

The Requirements expression for your job is:

    ( ( OpSys == "LINUX" ) && ( Arch != "DUMMY" ) ) &&
    ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory ) &&
    ( TARGET.HasFileTransfer )


Suggestions:

    Condition                         Machines Matched    Suggestion
    ---------                         ----------------    ----------
1   ( OpSys == "LINUX" )              0                   REMOVE
2   ( Arch != "DUMMY" )               0                   REMOVE
3   ( TARGET.Memory >= 2500 )         1302                 
4   ( TARGET.Disk >= 1000000 )        3736                 
5   ( TARGET.HasFileTransfer )        6504 

In the above case, condor_q -better-analyze is a better tool. However, condor_q -better-analyze would not print the error about being over quota. It turned out the user above had extra commands in the condor job file to hold the job if it went over the requested memory. Additionally, some no-longer-supported requirements were being used, which analyze suggested removing.

h. Here is an example with condor_q -better-analyze, where the user requested a machine with more memory available than any in the cluster.

[username@cmslpc36 testJobRestrict]$ condor_q -better-analyze 1545569


-- Schedd: cmslpc36.fnal.gov : <131.225.189.165:48228?...
User priority for username@fnal.gov is not available, attempting to analyze without it.
---
1545569.000:  Run analysis summary.  Of 6076 machines,
   6076 are rejected by your job's requirements 
      0 reject your job because of their own requirements 
      0 match and are already running your jobs 
      0 match but are serving other users 
      0 are available to run your job
	No successful match recorded.
	Last failed match: Thu Feb 16 17:02:40 2017

	Reason for last match failure: no match found 

WARNING:  Be advised:
   No resources matched request's constraints

The Requirements expression for your job is:

    ( TARGET.Arch == "x86_64" ) && ( TARGET.OpSys == "LINUX" ) &&
    ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory ) &&
    ( TARGET.HasFileTransfer )

Your job defines the following attributes:

    RequestDisk = 10000000
    RequestMemory = 210000

The Requirements expression for your job reduces to these conditions:

         Slots
Step    Matched  Condition
-----  --------  ---------
[0]        6076  TARGET.Arch == "x86_64"
[1]        6076  TARGET.OpSys == "LINUX"
[3]        1718  TARGET.Disk >= RequestDisk
[5]           0  TARGET.Memory >= RequestMemory

Suggestions:

    Condition                         Machines Matched    Suggestion
    ---------                         ----------------    ----------
1   ( TARGET.Memory >= 210000 )       0                   MODIFY TO 24028
2   ( TARGET.Disk >= 10000000 )       1718                 
3   ( TARGET.Arch == "x86_64" )       6076                 
4   ( TARGET.OpSys == "LINUX" )       6076                 
5   ( TARGET.HasFileTransfer )        6076 


Condor monitors can be found in the System Status: condor section.



Last modified: Monday, 10-Apr-2017 10:47:11 CDT