Minutes of the T2UK face-to-face meeting
RHUL, Wednesday 11th Oct., 2006
Present: John Baines, Gordon Crone, Dmitry Emeliyanov, Simon George,
Barry Green, Richard Hughes-Jones, Nikos Konstantinidis, Andrzej
Misiejuk, Mark Sutton, Pedro Teixeira-Dias, Thorsten Wengler, Fred
Wickens.
Apologies: Bill Scott, Antonella De Santo.
1. Minutes from last meeting & actions
The minutes from the previous face-to-face meeting were approved.
Actions: Fred reported that he has been in contact with Supermicro and
Compusys re procurement of new machines; it seems that dual cores with
8GB RAM will be available at ~ £2.5k each (ongoing).
2. News
Fred summarised the ATLAS UK CB meeting (10/10/06):
- The requests for HLT LTA (Tania McMahon and Ricardo Goncalo) were approved. The CB will consider requests for LTA twice a year.
- The ATLAS UK tracker upgrade project request is being finalised and will be submitted to the PPRP next week (Oct. 19).
- FP420: representatives of the UK group (Brian Cox and Jon Butterworth) had had positive initial discussions with the ATLAS management (Peter Jenni) and the forward detectors coordinator (Per Grafstrom). The UK FP420 group will have to be integrated within the ATLAS UK structure somehow; details are still being worked out.
- Very soon the CB will be formally requesting permission from PPARC to use savings (~£500k) in the approved ATLAS UK funds to meet the UK's contribution to the shortfall in the ATLAS cost-to-completion (CtC). PPARC's initial reaction to this had been very positive.
- The next meeting of the ATLAS UK Oversight Committee (OsC) will be at CERN, 2-3 November. The OsC will be visiting the experimental area.
3. Work packages - progress over the last 6 months and forward look
WP1: ROS hardware and software
ROBIN installation / commissioning / support
Andrzej and Barry reported: the UK production of ROBins (350 units) at
CEMgraft was finished earlier this summer. The cards were configured &
tested and are now at CERN. The ROBins have been installed in ROS PCs
and commissioned, and are already in racks in USA15. Half of the ROS
are already integrated with detector RODs. Gordon participated in
initial integration tests of a ROS PC for the endcap SCT. The aim is
to have all ROSes fully commissioned and integrated with RODs by the
end of this year.
Six cards in the UK production (and about 10 in the German production)
had faults and will have to be replaced at no extra cost. A final
production run of about 50 cards will be done in Germany/Gebauer to
replace the faulty cards and supply additional cards for the ATLAS
forward detectors and for testbeds dedicated to several ATLAS
sub-detectors.
A new system for providing ROBin card support for the data-taking
phase has been devised. ROBin cards can be plugged into the CERN test
system and can be tested and diagnosed remotely by an expert at
NIKHEF, Mannheim or RHUL (a rota of support shifts has been agreed by
the three groups). The information about the card faults, fixes, etc.
will be kept in an expert system so that the knowledge is not lost.
ROS software
Gordon is interfacing the existing software to the ERS error reporting
system, in preparation for release tdaq-1.7. Changes to accommodate
the new event format are also needed, but these are likely to be
ready only for a later release.
WP2: HLT farms and networks
- Latest news on CPUs - Needs & Limitations - Purchasing plans
Fred reported that all the racks are now in the upper level of
SDX. Cooling, local switches, fibres and copper cables are being
installed. The tendered local file servers are being evaluated; a
decision will be made in the next 2 weeks. It is likely that a single
file server per rack will be enough. However, this will hinge on the
yet unclear requirements of the HLT configuration database and, to a
lesser extent, on the amount of data flow required by the muon
calibration system.
The HLT machines (CPUs, disk, switches) need to be bought in several
steps, up to April 2008. Quad-cores from AMD and Intel will become
available for testing in the first half of 2007. The information we
have about the performance and cost of these machines is quite
encouraging vis-à-vis our requirements.
One HLT rack (out of a total of 75), complete with about 30 CPUs,
switches, local file server, cooling, cable management and power
distribution, will be paid for by the UK, at an estimated cost of
£170k.
- Update on network installation - Further purchasing plans
Fred reported that three large chassis and some blades were purchased,
for switching between the SFI, EF farms and the SFO.
Richard gave a detailed presentation (see slides) on the Remote
Real-Time Computing Farms for event processing for T/DAQ. Extensive
tests were carried out on the 10Gb Polish network and on the CERN <->
Krakow link data throughput.
WP3: PESA software
Core software
[The following report on Releases - Automated tests - Steering -
Region Selector - Persistency, was provided by Simon in advance of the
meeting.]
Simon, John and Dmitry have coordinated the AtlasTrigger project of the
offline release within the 12.0.X series. The latest, 12.0.3, is currently
being tested for production by Monika. Dmitry has done a lot of work to
understand memory-related problems.
The special release 12.0.3-LST has been set up; the first step is to integrate the
new database-based configuration into the 12.0.X-branch version of the
steering. After that, tags from 12.0.4 will be copied across in batches. Simon
has trained and is assisting Joerg Stelzer (CERN) in coordinating 12.0.3-LST.
12.0.3 is the first offline release to have several of the online trigger
slices already working in the main release. Muon (L2 only), e/g and jet slices
are all working, and their configuration is included in the release.
The approach which has been taken to achieve HLT tests on offline software nightly
builds is to integrate the HLT release into the offline infrastructure. Then
the tests will be run following the offline nightly builds. They can be viewed
via the familiar offline interface, and automatic notification of failures can
be set up.
Jiri Masik (CERN/Prague and about to become RAL consultant) has set up the
following:
- HLT project in offline tag collector
- Automatic environment set up for the HLT project into the AtlasLogin settings
- Offline release building scripts work with HLT project
- Offline kit-building scripts work
- HLT nightlies are built with the offline nightlies using NICOS and ATN for
nightly tests.
Meanwhile, it is understood from Andre and Haimo that the ability to automate
sophisticated tests has advanced to the point where they are pretty confident
that they can automatically set up and run simple partitions from a nightly
build. So it should not take long to get a good suite of tests working in the
HLT release.
The plan now is to set up the online slice joboptions as tests in
TriggerRelease (running in athena) and in the HLT project (running in
AthenaMT/PT).
12.0.3 works to some extent to write ESD: several problems and crashes were solved
in the last few days of 12.0.3. A few remain and are being worked on. Several
algorithm/slice developers are providing useful feedback. Tomasz Bold has
taken over coordination of the steering working group now that Andreas Hoecker
has been appointed Deputy Data Preparation Coordinator.
The new steering (12.3.0) has advanced a lot; see the parallel session
presentations at TDAQ week. The plan is to upgrade algorithms in Nov/Dec so that
everything is working in time for rel 13 (Jan 07). This could be a
milestone for UK algorithms.
No news on RegionSelector except that some small bug fixes were
organised earlier in the year.
Thorsten reported (see slides) on the rapid progress in the area of
trigger configuration. The UK has clear and important responsibilities
in this area, namely full responsibility for the development and
implementation of the TriggerTool GUI to configure the L1 (TW) and HLT
(Tania McMahon). This software will be used, in particular, at Point 1
for accessing and populating the Trigger configuration database during
data-taking.
Work on the L1 started earlier and is quite advanced, and includes the
full functionality; most of the work at the moment is dealing with
specific requests from the various sub-systems. For the HLT part Tania
has implemented the basic functionality and is now working to provide
a more comprehensive set of HLT tools. The TriggerTool is currently
undergoing a Review by external users; this has already generated some
useful feedback.
Thorsten reported (see slides) on the continued work on the medium
scale tests of the HLT farm infrastructure and selection software. The
aim of these medium scale tests is to run on a few hundred nodes using
tdaq release 1.4 (includes algorithms) and release 1.6 (algorithms not
yet available). (Later on in the year it is planned to carry out tests
on a larger scale -- LST.) The tests are currently being run using the
GRID cluster in Manchester, with help from A Forti and Richard
Hughes-Jones. Using release tdaq-1.4, Thorsten reported a successful
run of an ~80-node L2 cluster for 48 hours, and of a 20-node EF cluster
for several hours. The tests are being carried out in close
collaboration with the CERN team that will be carrying out the LST
later on.
Ricardo was at CERN for another meeting, and therefore there was no
report on the status of the work to persistify "TriggerDecision" and
other Trigger objects to be included in the AODs and used by the
Physics community for trigger-aware studies for the CSC notes.
L2 tracking
- z-finder tuning & other components
Nikos reported (see slides) on the recent progress in the IDscan
z-finder, on behalf of Erkcan Ozcan. Work has concentrated on trying
to improve the accuracy of the z-finder for low-pT tracks, as this is
the crucial link for the overall good efficiency of the IDscan
algorithm as a whole. Significant improvements in the accuracy have
been achieved, with some small improvements for the high-pT
tracks as well. The overall resolution of the z-finder has been
improved by ~10%. This has been achieved at a cost in both CPU
and memory usage, which is now being looked into.
Dmitry reported on his work on other IDscan components (see notes and
plots attached to agenda). He identified and fixed problems in the
code that were causing additional CPU time overhead, as had been
reported in the last PESA InDet meeting. With the problems fixed, he
demonstrated that the previous performance timings were restored. A
bug in the TRT TrigOfflineDriftCircleTool which seriously affected the
TRT track extension was also found and fixed. Finally, Dmitry reported
that he had written a new tool for a realistic extrapolation of tracks
to the face of the calorimeter, taking into account the magnetic field
inhomogeneities.
4. Oversight Committee (OsC)
As the meeting was already running late it was decided to move on to
the preparation of the next OsC meeting (2-3 November). In particular,
two issues were discussed.
- Updating the risk register: the risk register was reviewed. In
particular, after discussion, it was agreed that the previously-listed
risk of "lack of enough CPU power" should be downgraded, in light of
the recent developments from Intel/AMD.
- Definition of the HLT project completion: the OsC has requested that we specify the metrics that should be used to define the HLT project construction phase as "completed". The definition of the completion of the hardware parts of the project is essentially straightforward. The completion of the software parts is trickier to define and the discussion concentrated on this. After discussion, it was agreed that e.g. the L2-tracking software would be considered as completed once it was used in the online event selection, for the various trigger slices (e/gamma, muons, taus, b-jets), meeting targets such as a very low level of memory leaks and a sufficient length of continuous running without crashes (in terms of time or events processed). This would preferably be tested on real data (if available in time for the project completion review in early 2008); alternatively, simulated data could be used.
5. Finances, bids
Fred reported the status of the various budget lines. The finances are
sound.
None.
Next meetings
It was agreed that the next meetings would be
- phone meeting on Wed 6 Dec at 11h (organised by RHUL)
- face-to-face meeting on Wed 24 Jan at UCL, starting 1.30 pm
--
PedroTeixeiraDias - 7 Nov 2006