Running a Grid Job using GANGA
Brief description of GANGA components
- Job-registry component allows for storage and recovery of job information, and allows for job objects to be serialized
- Script-generation component translates a job's work flow into the set of instructions to be executed when the job is running
- Job-submission component submits the work flow script to the target (e.g. a batch system), creates the JDL file and translates the resource request
- File-transfer component handles transfer between sites of input & output files, adds commands to work flow script on submission
- Job-monitoring component performs queries of job status
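The job-registry idea (store a job, serialize it, recover it later) can be illustrated with a small toy model. This is only a sketch to make the concept concrete: ToyJob and ToyRegistry are hypothetical names, JSON is an assumed storage format, and none of this reflects how GANGA actually implements its registry.

```python
import json


class ToyJob:
    """Hypothetical stand-in for a job object (not a GANGA class)."""

    def __init__(self, job_id, status="new"):
        self.id = job_id
        self.status = status


class ToyRegistry:
    """Hypothetical registry that keeps jobs as serialized JSON records."""

    def __init__(self):
        self._records = {}

    def store(self, job):
        # Serialize the job's fields so they could be written to disk.
        self._records[job.id] = json.dumps({"id": job.id, "status": job.status})

    def recover(self, job_id):
        # Rebuild a job object from its serialized record.
        data = json.loads(self._records[job_id])
        return ToyJob(data["id"], data["status"])
```

Storing a job and recovering it by its id returns a new object with the same id and status, which is the behaviour the registry description above refers to.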
Detailed information about GANGA can be found at http://documentation.hepcg.org/res/ap3/w_301106.pdf.
How to install, setup and use GANGA
(a) Install GANGA
To install GANGA, follow the instructions at https://twiki.cern.ch/twiki/bin/view/Atlas/WorkBookGanga
Optionally, modify the GANGA start-up parameters as described in Section 1.4 of https://twiki.cern.ch/twiki/bin/view/Atlas/GangaTutorial427
(b) Setup Grid environment and GANGA
(c) Setup Athena
For the ttH analysis: myAthenaSetup.sh
(d) Run GANGA
Start GANGA from the cmt or run directory of the Athena working area that was set up in the previous step by typing: ganga
To execute a script that submits a job, type at the GANGA command line (not the GUI version):
execfile('/home/bernius/testarea/11.0.42/PhysicsAnalysis/AnalysisCommon/ttHHbb/run/mygangajob.py')
(This script can be found here: mygangajob.py)
The job can also be submitted by just typing: ganga mygangajob.py
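The contents of mygangajob.py are not reproduced on this page. As a rough sketch of what such a submission script might do, the helper below attaches an application, a backend and an output sandbox selection to a job object. The function name configure_job and the default file patterns are assumptions made for illustration; the Athena and LCG plugin names in the comments are likewise assumed to be available in the GANGA session.

```python
def configure_job(job, application, backend, patterns=("*.root", "*.txt")):
    """Attach application, backend and output sandbox patterns to a job.

    Hypothetical helper for illustration; 'job' is expected to be a GANGA
    Job object, but any object that accepts attributes will work.
    """
    job.application = application
    job.backend = backend
    # Files matching these patterns are returned in the output sandbox.
    job.outputsandbox = list(patterns)
    return job


# Inside a GANGA session (assumed plugin names), this could be used as:
#   j = configure_job(Job(), Athena(), LCG())
#   j.submit()
```

Keeping the configuration in a function makes it possible to check the settings with j (job summary) before calling j.submit().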
Some GANGA commands and things to know
- exit GANGA: ctrl-D
- get online help: help
- exit help: ctrl-D
- repository for input/output files for every job is located by default at: $HOME/gangadir/workspace/Local
- view job repository: jobs
- view subjobs with: subjobs
- to get info about specific jobs: j (if defined) or jobs[jobid]
- get jobid number: print j.id
- summary of a job: j
- retrieve job objects from a registry: print jobs[jobid].status
- get output info: j.outputdata
- kill running or queued job: jobs[jobid].kill()
- remove job with status new: jobs[jobid].remove()
- delete job with status failed/completed: del jobs[jobid]
- view job output directory of finished jobs that is retrieved back to the job repository: jobs[jobid].peek()
- view stdout log file of finished job: jobs[jobid].peek('stdout', 'cat')
- export job configuration to a file: export(jobs[jobid], '~/jobconf.py')
- load job configuration from file: load('~/jobconf.py')
- status of job: j.status
- IPython prompt tab expansion (methods with underscores are methods related to the implementation and should not be used directly): j.
- get list of backends: backends
- get list of applications: applications
- Input Sandbox:
- GANGA keeps the input sandbox for all jobs in $HOME/gangadir/workspace so there might be quota problems
- The size limit is 10 MB by default. If it is exceeded, submission fails with "JobSizeException: Job Size exceeds limits."; check how big the tarfile in /gangadir/workspace/Local/jobid is
- Output Sandbox:
- the output can be found by default in /gangadir/workspace/Local/jobid/output (setting j.outputdata.local_location='/home/bernius/outputGanga' is not working for me)
- to specify which files you want to receive: j.outputsandbox=['*.dat','*.txt','*.root'] or j.outputsandbox=['*'] (to receive all)
- there are more options for the Input and Output Sandboxes, see https://twiki.cern.ch/twiki/bin/view/Atlas/GangaUpdates420
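To avoid running into the JobSizeException, the size of a job's workspace directory can be checked before submission. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, the 10 MB default is taken from the note above and assumed here to mean 10 * 1024 * 1024 bytes, and the check is not part of GANGA itself.

```python
import os

# Default limit quoted above; interpreting "10MB" as binary megabytes is an assumption.
SANDBOX_LIMIT_BYTES = 10 * 1024 * 1024


def directory_size(path):
    """Return the total size in bytes of all regular files below 'path'."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):  # skips broken symbolic links
                total += os.path.getsize(full)
    return total


def exceeds_sandbox_limit(path, limit=SANDBOX_LIMIT_BYTES):
    """Return True if the files below 'path' are larger than 'limit' bytes."""
    return directory_size(path) > limit
```

For example, exceeds_sandbox_limit(os.path.expanduser('~/gangadir/workspace/Local/12')) would check the workspace of a job with the (hypothetical) id 12.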
More information about GANGA can be found in Links 5-8.
To search for datasets with the ATLAS Metadata Interface (AMI): http://lpsc1168x.in2p3.fr:8080/opencms/opencms/AMI/www/index.html