Distributed Analysis on Panda

This page describes how to submit user analysis jobs from LCG/OSG/NG to the OSG production system (Panda) when working at UCL. Detailed information about distributed analysis (DA) on Panda can be found here: https://twiki.cern.ch/twiki/bin/view/Atlas/DAonPanda

In order to set up pathena at UCL, follow the instructions below:

First, make sure you have a grid certificate. See Starting on the Grid. You should have usercert.pem and userkey.pem under ~/.globus. Then set up Athena, because pathena works in the Athena runtime environment.
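A quick way to confirm the certificate files are where pathena expects them is sketched below. The function name check_grid_cert is made up for this page and is not part of pathena; it only checks that the two files exist, not that the certificate is valid.

```shell
# Hypothetical helper (not part of pathena): report which certificate
# files are missing from a directory, defaulting to ~/.globus.
check_grid_cert() {
    cert_dir="${1:-$HOME/.globus}"
    cert_missing=0
    for cert_file in usercert.pem userkey.pem; do
        if [ ! -f "$cert_dir/$cert_file" ]; then
            echo "missing: $cert_dir/$cert_file"
            cert_missing=1
        fi
    done
    return "$cert_missing"
}

# Print a reminder instead of failing, so this is safe to paste anywhere.
check_grid_cert || echo "Get a grid certificate before continuing."
```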

Then check out PandaTools, which contains pathena (for 13.0.X or 12.0.X; instructions for 11.0.X can be found here: https://twiki.cern.ch/twiki/bin/view/Atlas/DAonPanda):

cd /somewhere/workarea      # this is your work area, where your analysis code lives, e.g. testarea1206/12.0.6/
export CMTPATH=`pwd`:${CMTPATH}
export PATHENA_GRID_SETUP_SH=/usr/local/glite/etc/profile.d/grid_env.sh
cmt co PhysicsAnalysis/DistributedAnalysis/PandaTools
cd PhysicsAnalysis/DistributedAnalysis/PandaTools/cmt
source setup.sh
make
cd /somewhere/workarea/.../somedirectory              # go to the directory in your analysis code where you run jobs
mkdir -p run       # create a run directory if you don't have one
cd run
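
If the setup worked, your work area should be listed in CMTPATH and the pathena command should be found on your PATH. A minimal sanity check under those assumptions is sketched here; cmtpath_has and WORKAREA are names invented for this example, and /somewhere/workarea is the same placeholder as above.

```shell
# Hypothetical helper: succeed if directory $1 is one of the entries in the
# colon-separated list $2 (defaults to the current $CMTPATH).
cmtpath_has() {
    case ":${2-${CMTPATH-}}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Placeholder: replace with the work area you used in the setup above.
WORKAREA=/somewhere/workarea

if cmtpath_has "$WORKAREA" && command -v pathena >/dev/null 2>&1; then
    echo "work area is in CMTPATH and pathena is on PATH"
else
    echo "check CMTPATH and re-source PandaTools/cmt/setup.sh"
fi
```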

When you run Athena with:

athena jobO_1.py jobO_2.py jobO_3.py

all you need is

pathena jobO_1.py jobO_2.py jobO_3.py [--inDS inputDataset] --outDS outputDataset

where inputDataset is a dataset which contains the input files, and outputDataset is a dataset which will contain the output files. The square brackets mean that --inDS is optional. For details about the options, see pathena. More options can be found here: https://twiki.cern.ch/twiki/bin/view/Atlas/DAonPanda#pathena
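
To see how the athena command line maps onto the pathena one, here is a small dry-run sketch that only prints the command and submits nothing. build_pathena_cmd is a hypothetical helper written for this page; it uses only the --inDS and --outDS options described above, and treats an empty input dataset as "no --inDS".

```shell
# Hypothetical dry-run helper: print the pathena command line that would be
# used, without submitting anything.
# Usage: build_pathena_cmd OUTDS INDS_OR_EMPTY jobO.py [jobO.py ...]
build_pathena_cmd() {
    if [ "$#" -lt 3 ] || [ -z "$1" ]; then
        echo "usage: build_pathena_cmd OUTDS INDS_OR_EMPTY jobO.py ..." >&2
        return 1
    fi
    out_ds="$1"
    in_ds="$2"
    shift 2
    # Job options files come first, exactly as they would for athena.
    cmd="pathena $*"
    if [ -n "$in_ds" ]; then
        cmd="$cmd --inDS $in_ds"
    fi
    echo "$cmd --outDS $out_ds"
}

# With an input dataset, then without one.
build_pathena_cmd outputDataset inputDataset jobO_1.py jobO_2.py jobO_3.py
build_pathena_cmd outputDataset "" jobO_1.py
```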

Update your pathena version

If you get the error message "52 ERROR : could not access DQ2 server", you have to update your pathena version. Here are two links for updating dq2 and pathena:

https://twiki.cern.ch/twiki/bin/view/Atlas/UsingDQ2#How_to_migrate_to_DQ2_0_3

https://twiki.cern.ch/twiki/bin/view/Atlas/DAonPanda#How_to_migrate_to_DQ2_0_3

The following describes how to update your existing version of pathena:

(1) First, remove the existing PandaTools:

cd ~/somedirectory/12.0.6/PhysicsAnalysis/DistributedAnalysis/PandaTools/cmt
gmake clean
cd ../../../
rm -r DistributedAnalysis

(2) Now check it out again:

cd ~/somedirectory/12.0.6
cmt co PhysicsAnalysis/DistributedAnalysis/PandaTools
cd PhysicsAnalysis/DistributedAnalysis/PandaTools/cmt
source setup.sh
make

(3) Do a CVS update:

cd ~/somedirectory/12.0.6/PhysicsAnalysis/DistributedAnalysis/PandaTools
cvs update

(4) Check that you have the latest version

cvs status ChangeLog

You should see Working Revision: 1.7

(5) Last step:

cd cmt
source setup.sh
make
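
The revision check in step (4) can be scripted if you prefer not to read the cvs output by eye. This is only a sketch: working_revision is a made-up helper, and it assumes the status output has a line containing "Working revision:" followed by the number. The sample line piped in at the end is illustrative input, not captured cvs output.

```shell
# Hypothetical helper: read `cvs status` output on stdin and print the
# working revision number (the third field of the matching line).
# The match ignores case, since the exact capitalisation is an assumption.
working_revision() {
    awk 'tolower($0) ~ /working revision:/ { print $3; exit }'
}

# Inside the PandaTools directory you would run:
#   cvs status ChangeLog | working_revision
# Illustrative input only, to show what the helper extracts:
printf '   Working revision:\t1.7\n' | working_revision
```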

Monitoring

The Panda interface for monitoring your jobs is pretty good. You can visit a web page with a URL like this:

http://gridui02.usatlas.bnl.gov:25880/server/pandamon/query?ui=user&name=adamdavison

Replace adamdavison with your own username. Your username appears to be the name field of your grid certificate.
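
If you check your jobs often, a one-line helper can print the address for you. panda_monitor_url is a hypothetical name; the host, port and query string are simply copied from the example URL, and the name is inserted as-is with no URL encoding, so this assumes a username without spaces or special characters.

```shell
# Hypothetical helper: print the monitoring URL for a given user name.
# Host, port and query string are copied from the example URL on this page.
panda_monitor_url() {
    echo "http://gridui02.usatlas.bnl.gov:25880/server/pandamon/query?ui=user&name=$1"
}

panda_monitor_url adamdavison
```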

Compiling takes all day

So it's good that you can avoid it.

Once you've got a job running with:

pathena mycooljoboptions.py --inDS inputDataset --outDS outputDataset

You can go to the panda monitoring page and find the name of the library dataset produced by the build step.

As long as you're happy with this set of binaries and you only want to change your top-level job options for your next job, you can do:

pathena mycoolerjoboptions.py --inDS inputDataset --outDS outputDataset --libDS libraryDataset

And go straight to running.

-- AdamD - 23 Jul 2007 -- CatrinBernius - 19 Jun 2007

Topic revision: r5 - 2007-11-14 - AdamD