Difference: AtlasGanga (3 vs. 4)

Revision 4 - 2009-11-13 - JamesRobinson

Line: 1 to 1
 
META TOPICPARENT name="HEPGroup.AtlasStuff"

Running a Grid Job using GANGA

Changed:
<
<
Ganga is a frontend tool for job definition and management with access to all grid infrastructure supported by ATLAS. Detailed information about GANGA can be found at http://documentation.hepcg.org/res/ap3/w_301106.pdf.
>
>
Ganga is a frontend tool for job definition and management with access to all grid infrastructure supported by ATLAS. Detailed information about GANGA can be found at http://documentation.hepcg.org/res/ap3/w_301106.pdf.
 
Changed:
<
<

How to setup and use GANGA

>
>

How to setup GANGA

 

(a) Setup Grid environment and GANGA

The following two lines set up the Grid interface and GANGA using the newest version available on AFS:
Deleted:
<
<
 
  • source /afs/cern.ch/project/gd/LCG-share/current/etc/profile.d/grid_env.sh
  • source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh
Deleted:
<
<
 

(b) Setup Athena

Deleted:
<
<
Set up your Athena environment as usual, for example under 12.0.6:

* source ~/cmthome/setup.sh -tag=12.0.6

 
Added:
>
>
Set up your Athena environment as usual, for example under 15.6.0:
  • source ~/cmthome/setup.sh -tag=15.6.0
 

(c) Run GANGA

Deleted:
<
<
Start GANGA from the cmt or run directory of the Athena working area that has been set up before by typing:
  • ganga
To execute a script to submit a job in GANGA, type in the GANGA command line (not the GUI version):
  • execfile('/home/bernius/testarea/11.0.42/PhysicsAnalysis/AnalysisCommon/ttHHbb/run/mygangajob.py')
(This script can be found here: mygangajob.py) The job can also be submitted by typing:
  • ganga mygangajob.py
For more information about submitting your own jobs see the GANGA tutorial: https://twiki.cern.ch/twiki/bin/view/Atlas/GangaTutorial427
 
Added:
>
>
Change directory to the run directory of whichever package you are working on and then start ganga with:
  • ganga

Using GANGA

The ganga command line is a python shell which can be used to submit jobs. A sample job script is shown here:

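# Define the job and the Athena application it runs
j = Job()
j.application = Athena()
j.name = 'PTResolution.LowPT.SmallEta.J5'
# Athena jobOptions to run
j.application.option_file = [ '/home/robinson/athena/15.6.0/PhysicsAnalysis/ForwardJets/run/jobOptions.PTResolution.LowPT.SmallEta.py' ]
# Compile the user code remotely, against this release
j.application.athena_compile = True
j.application.atlas_release = '15.6.0'
# Pack up the user area to send with the job
j.application.prepare()
# Input dataset container, accessed through DQ2
j.inputdata = DQ2Dataset()
j.inputdata.dataset = [ 'mc08.105014.J5_pythia_jetjet.merge.AOD.e344_s479_s520_r809_r838/' ]
# Output files written by the jobOptions
j.outputdata = DQ2OutputDataset()
j.outputdata.outputdata = ['PTResolution.root']
# Split the job into subjobs
j.splitter = DQ2JobSplitter()
j.splitter.numsubjobs = 500
# Submit to the LCG, restricted to the UK cloud
j.backend = LCG()
j.backend.requirements.cloud = 'UK'
j.submit()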

The most important options here are

  • j.application.option_file which contains your Athena jobOptions
  • j.outputdata.outputdata which contains the output specified by your Athena jobOptions

To execute this script (which should be in the run directory from which you ran ganga), simply type

  • execfile('scriptname')

Alternatively, the job can be submitted from outside the ganga shell by typing

  • ganga scriptname

For more information about submitting your own jobs see the GANGA tutorial: https://twiki.cern.ch/twiki/bin/view/Atlas/FullGangaAtlasTutorial
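
When the same analysis has to run over several datasets, one way to avoid editing the job script by hand is to generate it. The following is only a sketch under stated assumptions: make_job_script is a hypothetical helper, not part of Ganga, and it simply fills in the attribute names used in the sample script above. The default release, number of subjobs and cloud are copied from that sample.

```python
# Hypothetical helper (not part of Ganga): builds the text of a job script
# like the sample above. The result can be saved to a file and run with
# "ganga scriptname" or execfile('scriptname').

TEMPLATE = """\
j = Job()
j.application = Athena()
j.name = {name!r}
j.application.option_file = [{option_file!r}]
j.application.athena_compile = True
j.application.atlas_release = {release!r}
j.application.prepare()
j.inputdata = DQ2Dataset()
j.inputdata.dataset = [{dataset!r}]
j.outputdata = DQ2OutputDataset()
j.outputdata.outputdata = {outputs!r}
j.splitter = DQ2JobSplitter()
j.splitter.numsubjobs = {numsubjobs}
j.backend = LCG()
j.backend.requirements.cloud = {cloud!r}
j.submit()
"""


def make_job_script(name, option_file, dataset, outputs,
                    release='15.6.0', numsubjobs=500, cloud='UK'):
    """Return the text of a Ganga job script for one input dataset."""
    return TEMPLATE.format(name=name, option_file=option_file,
                           release=release, dataset=dataset,
                           outputs=list(outputs),
                           numsubjobs=int(numsubjobs), cloud=cloud)


if __name__ == '__main__':
    # Print the generated script; redirect it to a file to use it with ganga.
    print(make_job_script(
        name='PTResolution.LowPT.SmallEta.J5',
        option_file='jobOptions.PTResolution.LowPT.SmallEta.py',
        dataset='mc08.105014.J5_pythia_jetjet.merge.AOD.e344_s479_s520_r809_r838/',
        outputs=['PTResolution.root'],
    ))
```

The generator runs in an ordinary python interpreter; only the generated script needs the ganga shell, since Job, Athena and the other names are defined there.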

Useful GANGA python shell commands

  • exit GANGA: ctrl-D
  • get online help: help (exit help: ctrl-D)
  • view job repository: jobs
  • view subjobs with: jobs(jobid).subjobs
  • to get info about specific jobs: jobs(jobid)
  • to get the job status: jobs(jobid).status
  • remove job: jobs(jobid).remove()
  • view job output directory of finished jobs that is retrieved back to the job repository: jobs(jobid).peek()
  • view stdout or stderr for debugging failed jobs: jobs(jobid).peek('stdout.gz','emacs')
  • export job configuration to a file: export(jobs[jobid], '~/jobconf.py')
  • force a job into a particular status: jobs(jobid).force_status("failed")
The repository for input/output files for every job is located by default at: $HOME/gangadir/workspace/username/LocalAMGA
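
With several hundred subjobs, reading the subjob listing by eye gets tedious. The sketch below tallies subjobs by status. count_statuses and ids_with_status are hypothetical helpers, not GANGA commands; they assume only that each subjob has the status attribute listed above and an id attribute. FakeSubjob is a stand-in so that the helpers can be tried outside ganga.

```python
def count_statuses(subjobs):
    """Return a dict mapping each status string to the number of subjobs in it."""
    counts = {}
    for subjob in subjobs:
        counts[subjob.status] = counts.get(subjob.status, 0) + 1
    return counts


def ids_with_status(subjobs, status):
    """Return the ids of the subjobs whose status equals the given string."""
    return [subjob.id for subjob in subjobs if subjob.status == status]


class FakeSubjob(object):
    """Stand-in for a subjob, used only to try the helpers outside ganga."""

    def __init__(self, id, status):
        self.id = id
        self.status = status


if __name__ == '__main__':
    demo = [FakeSubjob(0, 'completed'), FakeSubjob(1, 'failed'),
            FakeSubjob(2, 'completed')]
    print(count_statuses(demo))
    print(ids_with_status(demo, 'failed'))
```

Inside the ganga shell the same functions would be called as count_statuses(jobs(jobid).subjobs), after defining them there or loading them with execfile.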

Common GANGA Problems

The datasets belonging to the container that you want to run on must all be present on the same cloud (although not necessarily at the same site). You can check where datasets are available by running:

  • dq2-ls -r "datasetname" (outside ganga)

The Athena version that you request must be present at all sites that your job is sent to. You can check which versions are available at which sites by running:

  • lcg-infosites --vo atlas ce tag
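
The same-cloud requirement can be checked mechanically once the replica locations are known. The sketch below is an illustration only: common_clouds is a hypothetical helper, and both dictionaries are placeholders that would have to be filled in by hand from the dq2-ls -r output and from your own knowledge of which site belongs to which cloud. The dataset and site names shown are made up.

```python
def common_clouds(dataset_sites, site_cloud):
    """Return the set of clouds that hold a replica of every dataset.

    dataset_sites maps a dataset name to the list of sites holding it.
    site_cloud maps a site name to the cloud it belongs to.
    """
    common = None
    for sites in dataset_sites.values():
        clouds = set(site_cloud[site] for site in sites)
        # Keep only the clouds shared with all datasets seen so far
        common = clouds if common is None else common & clouds
    return common or set()


if __name__ == '__main__':
    # Placeholder values, not real replica information
    dataset_sites = {
        'dataset_1': ['SITE_A', 'SITE_B'],
        'dataset_2': ['SITE_C'],
    }
    site_cloud = {'SITE_A': 'UK', 'SITE_B': 'DE', 'SITE_C': 'UK'}
    print(common_clouds(dataset_sites, site_cloud))
```

An empty result means that no single cloud holds all the datasets, which is the situation described above.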

Using the Panda backend

The Panda backend has to be used for jobs sent to US sites. It requires a slightly different form of job submission script. A sample job script is shown here:

# Define the job and the Athena application it runs
j = Job()
j.application = Athena()
j.name = 'PTResolution.LowPT.SmallEta.J5'
# Athena jobOptions to run
j.application.option_file = [ '/home/robinson/athena/15.6.0/PhysicsAnalysis/ForwardJets/run/jobOptions.PTResolution.LowPT.SmallEta.py' ]
j.application.athena_compile = True
j.application.atlas_release = '15.6.0'
j.application.prepare()
# Input dataset container, accessed through DQ2
j.inputdata = DQ2Dataset()
j.inputdata.dataset = [ 'mc08.105014.J5_pythia_jetjet.merge.AOD.e344_s479_s520_r809_r838/' ]
j.outputdata = DQ2OutputDataset()
j.splitter = DQ2JobSplitter()
j.splitter.numsubjobs = 500
# Use the Panda backend instead of LCG()
j.backend = Panda()
j.submit()

-- JamesRobinson - 13 Nov 2009

OLD BUT MAY STILL BE RELEVANT
 

GANGA on NorduGrid

By default GANGA submits your jobs to the LCG. Since GANGA version 4.3.0, you can also submit your jobs to NorduGrid using the new NG backend. More information on how to change your jobs from LCG to NG can be found here:

https://twiki.cern.ch/twiki/bin/view/Atlas/GangaNGTutorial430

Deleted:
<
<

Some GANGA commands and things to know

  • exit GANGA: ctrl-D
  • get online help: help (exit help: ctrl-D)
  • repository for input/output files for every job is located by default at: $HOME/gangadir/workspace/Local
  • view job repository: jobs
  • view subjobs with: subjobs
  • to get info about specific jobs: jobs(jobid)
  • to get the job status: jobs(jobid).status
  • remove job: jobs(jobid).remove()
  • view job output directory of finished jobs that is retrieved back to the job repository: jobs(jobid).peek()
  • export job configuration to a file: export(jobs[jobid], '~/jobconf.py')

 

Sandbox fun

  • Input Sandbox:
Added:
>
>
 
    • GANGA keeps the input sandbox for all jobs in $HOME/gangadir/workspace so there might be quota problems
    • The size limit is 10MB by default. If it is exceeded, submission fails with "JobSizeException: Job Size exceeds limits."; check how big the tarfile in /gangadir/workspace/Local/jobid is
Changed:
<
<
  • Output Sandbox:
    • the output can be found by default in /gangadir/workspace/Local/jobid/output (j.outputdata.local_location='/home/bernius/outputGanga' is not working for me)
    • to specify which files you want to receive: j.outputsandbox=['*.dat','*.txt','*.root'] or j.outputsandbox=['*'] (to receive all)
>
>
  • Output Sandbox:
    • the output can be found by default in /gangadir/workspace/Local/jobid/output (j.outputdata.local_location='/home/bernius/outputGanga' is not working for me)
    • to specify which files you want to receive: j.outputsandbox=['*.dat','*.txt','*.root'] or j.outputsandbox=['*'] (to receive all)
 

When you submit a job, GANGA will try to tar up your whole testarea to send with the job, which will almost always be much larger than the 10MB limit at most sites. If it is only slightly over the limit you can try deleting some files, but a more useful strategy is to create a separate testarea just for GANGA. The only things you need to run your job successfully are the job options and your testarea/InstallArea folder, so if you copy just those into the fake testarea, your job should still run fine and fit under the size limit.
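
The separate-testarea strategy can be scripted. The following is a minimal sketch, not a GANGA feature: make_slim_testarea and directory_size are hypothetical helpers, and placing the job options in a run subdirectory is an assumption about the layout you want. The size reported is the uncompressed size of the copy, which is only a rough guide to the size of the tarball that GANGA builds.

```python
import os
import shutil

SIZE_LIMIT = 10 * 1024 * 1024  # the default 10MB limit mentioned above


def directory_size(path):
    """Return the total size in bytes of the regular files below path."""
    total = 0
    for root, dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def make_slim_testarea(testarea, slim_area, job_options):
    """Copy InstallArea and the given job option files into slim_area.

    Symbolic links inside InstallArea are followed, so the copy contains
    real files. Returns the size of the new area in bytes.
    """
    shutil.copytree(os.path.join(testarea, 'InstallArea'),
                    os.path.join(slim_area, 'InstallArea'))
    run_dir = os.path.join(slim_area, 'run')  # assumed layout
    os.makedirs(run_dir)
    for path in job_options:
        shutil.copy(path, run_dir)
    return directory_size(slim_area)
```

A call such as make_slim_testarea(testarea, slim_area, [job_options_file]) returns a size that can be compared with SIZE_LIMIT before starting ganga from the new area. The slim_area directory must not already contain InstallArea or run.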

 