From rc@hep.ucl.ac.uk Wed Jun 12 14:45:28 1996
Date: Thu, 6 Jun 1996 17:10:09 +0100 (BST)
From: Robert Cranfield <rc@hep.ucl.ac.uk>
To: Gordon CRONE <gjc@hep.ucl.ac.uk>, Owen BOYLE <boyle@na48-1.cern.ch>,
    Robert MCLAREN <mclaren@cn.msm.cern.ch>
Subject: Notes on ROB-IN mtg 3/4-Jun-1996

Notes on ROB-IN meeting held at RHUL on 3/4-Jun-1996
====================================================
(RC 05-Jun-1996)

PRESENT:

Gary Boorman
Owen Boyle
Bob Cranfield
Gordon Crone
Barry Green
Robert McLaren
John Strong

1) CURRENT BUFFER:

1.1) Hardware:

Barry described the current buffer hardware. The buffer consists of
a hardware-controlled paged memory (T2B) attached via multiplexed data
bus and FIFOs to a DSP processor (C40). Physically the T2B is a 3U-size board
and the C40 processor board is a 6U VME board. The T2B plugs on to the
C40 board and the two operate together as a ROB.

Barry drew diagrams of existing relevant RHUL hardware:
   The RHUL "embedded C40" board (with JTAG, 6 C40 comms ports, optional
      (un-debugged) VME interface, and C40 bus connection via either P2
      or daughter-board socket).
   The new-style T2B (3U-size with front-end flat cable data input).
   The old-style T2B (6U-size)
   An S-LINK board (6U-size with 3 S-LINK sites, for cable attachment
      to 3 T2Bs -- 32-bit wide data with end-of-event mark,
      plus clock and data-valid.)

1.2) Buffer Manager Software:

Gordon described the buffer-manager software which runs in the C40 processor.
This is essentially a large polling loop which services free and used paged
FIFOs, RoI requests, and LVL2 decisions, outputting RoI data to LVL2
and sending LVL3 data to a sink. Apart from the FIFOs, all I/O is on C40
comms links (<= 12 MByte/s approx), using C40 on-chip DMA facilities where
appropriate.
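
A minimal sketch of such a polling loop is given below, purely to illustrate
the structure described above. It is not the actual C40 code: the class name
BufferManager, its attributes, and the use of Python queues in place of the
hardware FIFOs and comms links are all assumptions made for the example, and
each event is assumed to occupy a single page.

```python
from collections import deque

class BufferManager:
    """Illustrative model only; all names here are hypothetical."""

    def __init__(self, n_pages):
        self.free_fifo = deque(range(n_pages))  # pages available for new data
        self.used_fifo = deque()       # (event_id, page) entries of filled pages
        self.roi_requests = deque()    # event IDs requested by LVL2
        self.lvl2_decisions = deque()  # (event_id, accepted) pairs
        self.index = {}                # event_id -> page
        self.roi_out = []              # stands in for RoI data sent to LVL2
        self.lvl3_out = []             # stands in for data sent to the LVL3 sink

    def poll_once(self):
        """One pass round the loop; returns True if any queue was serviced."""
        busy = False
        if self.used_fifo:
            event_id, page = self.used_fifo.popleft()
            self.index[event_id] = page
            busy = True
        if self.roi_requests:
            event_id = self.roi_requests.popleft()
            if event_id in self.index:  # unknown events are ignored in this sketch
                self.roi_out.append((event_id, self.index[event_id]))
            busy = True
        if self.lvl2_decisions:
            event_id, accepted = self.lvl2_decisions.popleft()
            page = self.index.pop(event_id, None)
            if page is not None:
                if accepted:
                    self.lvl3_out.append((event_id, page))
                self.free_fifo.append(page)  # page is freed once output is done
            busy = True
        return busy
```

Calling poll_once() repeatedly until it returns False drains all pending work
in this model; the real loop of course never terminates.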

     Sidetrack re targeted RoIRs for Demo_ROBs:
        RM: Farthouat has estimated that VME is ok
        at 5x (2 longword messages)/event. Should be enough messages
        for a crate, but are 2 longwords enough? Should be: all you
        need is:
           event id
           local destination
           global destination
        However, VME not likely to be a strategic solution.
        There was some discussion of local RoIR distribution options
        and Rudi Bock's ideas on broadcasting a mask-word of hit ROBs
        followed by an RoIR broadcast, possibly over S-LINK.

        We also talked about LVI v. shared VME-addressable memory for
        communication of RoIRs over VME.
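
        To illustrate that two longwords can carry the three items listed
        above, here is a sketch of one possible packing. The layout (32-bit
        event id in the first longword, 16-bit local and global destinations
        sharing the second) and the function names are assumptions made for
        the example; no field widths were agreed at the meeting.

```python
import struct

def pack_roir(event_id, local_dest, global_dest):
    """Pack an RoIR into two big-endian 32-bit longwords (assumed layout)."""
    if not 0 <= event_id < 2**32:
        raise ValueError("event_id must fit in 32 bits")
    if not 0 <= local_dest < 2**16 or not 0 <= global_dest < 2**16:
        raise ValueError("destinations must fit in 16 bits each")
    word1 = (local_dest << 16) | global_dest
    return struct.pack(">II", event_id, word1)

def unpack_roir(message):
    """Inverse of pack_roir: returns (event_id, local_dest, global_dest)."""
    event_id, word1 = struct.unpack(">II", message)
    return event_id, word1 >> 16, word1 & 0xFFFF
```

        With this layout the message is exactly 8 bytes, i.e. 2 longwords.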

1.3) Discussion Points:

   - events deleted as soon as output DMA has finished
   - decision records grouped for maximum performance
   - 1 page = 1 kByte
   - sequential event IDs --> easier indexing algorithm
   - checking TTC info against ROD info?
     can then resync every orbit (BCID is re-sync'd every orbit), but price
     paid is another channel to every ROB (necessary for RODs(?), but is it
     necessary for ROBs?)
     RM: problem is that clocks are being sent instead of numbers.
   - TTC ID should arrive BEFORE BCID from ROD.
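
The point about sequential event IDs can be sketched as follows: if IDs
increment by one, the index slot can be taken directly from the low part of
the ID, with no searching. This is only an illustration; the class
EventIndex, the slot count and the single-page-per-event simplification are
assumptions, not a description of the existing software.

```python
class EventIndex:
    """Direct-mapped index keyed on event ID (illustrative names only)."""

    def __init__(self, n_slots=256):
        self.n_slots = n_slots
        self.slots = [None] * n_slots  # each entry: (event_id, first_page)

    def insert(self, event_id, first_page):
        """Store an entry; returns whatever previously occupied the slot."""
        slot = event_id % self.n_slots
        evicted = self.slots[slot]
        self.slots[slot] = (event_id, first_page)
        return evicted

    def lookup(self, event_id):
        """Return the first page for event_id, or None if it is not indexed."""
        entry = self.slots[event_id % self.n_slots]
        if entry is not None and entry[0] == event_id:
            return entry[1]
        return None

    def delete(self, event_id):
        """Clear the slot if it still belongs to event_id."""
        slot = event_id % self.n_slots
        entry = self.slots[slot]
        if entry is not None and entry[0] == event_id:
            self.slots[slot] = None
```

With sequential IDs, consecutive events land in consecutive slots, so a
collision only occurs once n_slots events are outstanding.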


2) USE OF CURRENT BUFFER IN TESTLAB:

Gordon briefly described the current testlab setup (note: descriptions are
available on WWW under: http://www.hep.ucl.ac.uk/atlas/t2)


3) DEMONSTRATOR ROBs:

The original plan had been to add a PMC interface to the current C40-based
ROBs in order to connect to the LVL2 network (e.g. to DS links via a PCI-DS
converter from Zeuthen, or to ATM). Barry is designing a 3 PMC site board
to sit alongside the current ROB, connected to the C40 bus via the VME P2
connector. One of the PMC sites would be optionally an S-LINK site for use,
for example, with Architecture-A.

However, Architecture-C are doing their own thing, Architecture-A could always
connect directly to SLATE via S-LINK, and there are problems with using the
Zeuthen DS interface. (Currently this does not do T9000-style 32-byte
packetisation automatically, so it would have to be driven appropriately by
the ROB. Brian Martin is negotiating an upgrade with Rome which will overcome
this, but details have not yet been agreed.)

John suggested extrapolating the idea, discussed at the recent CERN Demo-B
meeting, of simply using the C40 ROBs as they stand together with the
Copenhagen C40/DS interfaces. The extrapolation would be to use all
available C40 boards to provide a mixture of SLATE-driven and emulated
C40 ROBs. This removes the need for PPC-based ROB emulators, though Bob
raised the problem of the software required to feed the C40 emulators.
It was suggested that this option be proposed to the Demo-B group.
John thought it important to avoid wasting effort on dead-end development
work.

The question remains of how to distribute RoIRs to the C40 ROBs i.e.
LVI vs shared VME memory. Perhaps the LVI performance should be measured
and the C40 VME option tested in anger to assess the practicalities of
these options.


4) ROB REQUIREMENTS

Robert outlined what he thought was the consensus picture:

    |       |       |
    v       v       v  1 GBit/s data channels
    |       |       |
   ---     ---     ---
  |   |   |   |   |   | ROB-INs
  |   |   |   |   |   |
   ---     ---     ---
    |       |       |
 ------------------------- PCI bus
        |               |
        |               |
       ---             ---
      |   | uP        |   | ROB-OUT
      |   |           |   |
       ---             ---
                        |

John and Bob questioned the reasons for concentrating data from several
ROB-INs into a single ROB-OUT. Is this purely because of estimated costs
and is it technology dependent? The drawback with a simple bus-style
concentration is that you can't always concentrate all the data for an
RoI. Switches with "sideways" interconnections could ensure that RoI
data can always be sent to a single uP (which would then be an Architecture-B
FEX). Alternatively, with the above diagram, you might hope to "usually" be
able to process an RoI with the in-crate uP, only occasionally leaving this
job to processors downstream.

    |       |       |
    v       v       v  1 GBit/s data channels
    |       |       |
   ---     ---     ---
  |   |   |   |   |   | ROB-INs
  |   |   |   |   |   |
   ---     ---     ---
    |       |       |
   -----------------------
 -|                       |--> to next C104
  |      C104 switch      |
   -----------------------
        |               |
        |               |
       ---             ---
      |   | uP        |   | ROB-OUT
      |   |           |   |
       ---             ---
                        |

ATM interfaces may be expensive, but C104s might be as cheap as PCI.
What about alternatives to PMC? Could consider PC cards for PCI bus!
or Compact PCI (Robert noted this is available in both 3U and 6U forms,
but he had some reservations).

Problem is that PMC is v.tight for board space. Current proposal
is to have one ROB-IN and one ROB-OUT on a baseboard containing the
uP (e.g. RIO2):

    -----------------------
   |                       |
   |    base-board         |
   |                       |
   |                       |
   |                       |
   | --------------------- |
   ||                     ||
   ||   ROB-IN PMC    ||  ||
   ||                 ||  ||
   ||                     ||
   | --------------------- |
   | --------------------- |
   ||                     ||
   ||   ROB-OUT PMC   ||  ||
   ||                 ||  ||
   ||                     ||
   | --------------------- |
    -----------------------

We looked at Barry's ROB-IN design options (A,B,C,D,E) and Bob's FIFO
options (F,G). FIFO option F is attractive in principle, but you can't fit
FIFOs onto the ROB-IN PMC board. FIFO option G uses a 2nd PMC site on the
baseboard, leaving no room for a ROB-OUT in the case of the RIO2.
Robert suggested emulating the FIFOs in memory.

In considering processor choices it was thought dangerous to rely on
any features that are too special e.g. transputer links.
It was felt that DSP DMA and dual busses were probably ok, though a
single-bus design would be more versatile.


5) ROB-IN URD

We decided to work through the ROB-IN URD. Bob distributed the UCL comments
arising from discussion between Bob, Gordon and John Lane. There follow
notes on new or key points (Owen and Robert made detailed annotations on
the URD itself).

UR DI-RATE
   Data input WILL fluctuate --> mean of 10 us & 1 kByte

UR DI-EFS
   ROB fragment sizes should be approximately equal (1 kByte), but could
   tabulate anticipated fluctuations (of size and arrival rate).

UR DI-IDR
   100 MBytes/sec data rate is safer spec than 1 Gbit/s.
   Current ROB can do.
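
   The relation between the two figures can be checked with a little
   arithmetic, sketched below. The assumptions are that "1 kByte" means
   1000 bytes and that the 1 Gbit/s figure is a raw line rate with no
   allowance for encoding or protocol overhead; the variable names are
   only for the example.

```python
# Integer arithmetic throughout, to keep the figures exact.
FRAGMENT_BYTES = 1000             # "1 kByte" taken as 1000 bytes (assumption)
MEAN_INTERVAL_US = 10             # mean fragment spacing from UR DI-RATE
LINK_BITS_PER_S = 1_000_000_000   # raw 1 Gbit/s, overheads ignored (assumption)

fragments_per_s = 1_000_000 // MEAN_INTERVAL_US
mean_rate = FRAGMENT_BYTES * fragments_per_s   # mean input in bytes/s
link_rate = LINK_BITS_PER_S // 8               # raw link capacity in bytes/s

print(mean_rate // 1_000_000, "MByte/s mean input")   # prints 100 ...
print(link_rate // 1_000_000, "MByte/s raw link")     # prints 125 ...
```

   So the mean input from UR DI-RATE already corresponds to 100 MBytes/sec,
   and a raw 1 Gbit/s link leaves at most 25% headroom above that.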

UR DI-FC
   Does ROL flow-control double cost? RM: hard to estimate.
   It's probably decided to use XOFF.
   Could implement L1-inhibit as a writable bit and lemo for ROB-IN.

UR DI-EXP
   ROB-IN may be required to extract e.g. event ID from compressed data.
   --> "will be able to perform limited data expansion"
   Can be done with DSP.

UR L1-ALL
   Remove "Level-1 Trigger Requirements" as a title -- this should be
   part of DI requirements list.
   Change description to "The ROB must buffer data for each L1-accept".
   ROB-IN requires end-of-event marker (bit) --> should be a data input
   requirement above.
   Event nos will INCREMENT (by one).

UR RO-BS
   need UR for ROB-IN -> ROB-OUT -> LVL2
           and ROB-IN -> ROB-OUT -> LVL3
   also UR for max latency that must be accommodated.


6) ROB URD

We then decided it was really necessary to look at the ROB URD, since the
ROB-IN was planned to function largely as a ROB talking to the ROB-controller
and ROB-OUT instead of the outside world. So we picked up the trail in the
ROB URD:

(Note: the following also contains notes from the SECOND pass through the
ROB URD, when we compared the current ROB with the URD.)

UR DI-TRIG
   Should be UR DI-RATE
   Current ROB can do.

UR DI-EFS
   Current ROB can do.

UR DI-IDR
   Current ROB can do.

UR DI-FC
   Barry suggested programming free-FIFO threshold and using this to generate
   XOFF to ROD.
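
   Barry's suggestion might look something like the sketch below: a
   programmable threshold is compared with the number of entries in the
   free-page FIFO, and XOFF is asserted when too few free pages remain.
   The class and attribute names are hypothetical, and a real
   implementation would be in hardware and might well add hysteresis.

```python
class FreeFifoXoff:
    """Model of XOFF generation from a free-FIFO threshold (names assumed)."""

    def __init__(self, threshold):
        self.threshold = threshold  # programmable: minimum comfortable free pages

    def xoff(self, free_pages):
        """True means 'ROD, stop sending': free pages are below the threshold."""
        return free_pages < self.threshold
```

   Changing the threshold at run time changes the effective buffer size seen
   by the ROD, which is convenient for tests.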

UR DI-EXP
   Can be done with DSP.
   Maybe it should be an assumption that event header and trailer are NOT
   compressed.

UR L1-ALL
   Done by current ROB.

UR L2-BS
   Should not be an obvious limitation i.e. ensure FIFOs and address busses
   are wide enough. (How easy would it be to make page-width adjustable?).

UR L2-ROI
   For ROB-IN should refer to ROB-IN-ROIR handling rather than ROIR.
   Need to add PMC to current ROB, and be able to talk to memory on RIO2
   side of PMC.

UR L2-REF  : handled by ROB-controller or ROB-OUT
UR L2-PRE  : handled by ROB-controller or ROB-OUT
UR L2-COMP : handled by ROB-controller or ROB-OUT
UR L2-REJ  : ROB-IN requirement -- include grouped rejects.
UR L2-ACC  : not essential

UR L3-REF  : handled by ROB-controller
UR L3-PRE  : handled by ROB-controller
UR L3-COMP : handled by ROB-controller
UR L3-DAH  : handled by ROB-controller
UR L3-OUT  : ROBIN-LVL3-REQUEST. Remove ref to timeout.

UR CTL-SC
   Yes.
   How does C40 boot? -> need some sort of boot ROM to enable DSP to then
   boot over PMC.

UR CTL-CONF : load progs and parameters

UR TTC-ID   : negotiable? Remove from ROB-IN spec.

(Note: error handling must be as light as possible):

UR ERR-SYNC  : discuss with others! (Ellis, Farthouat)

UR ERR-TRANS
   Detect and respond (to be defined)
   Up to DSP to read last word.

UR ERR-FBIG  : detect and respond if fragment > preset max. Send error packet

UR ERR-FERR
   Detect and respond? (last place err is known down to ROB level).
   ROD responsibility.
   (ROD error flag is in data e.g. last word, ROL transmission error flag
   is in word after that, which could then become the new last data word).

UR ERR-MON
   Error counters incremented.
   Error counters maintained by DSP and reported (to shared memory) on request.
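
   As an illustration of ERR-FBIG and ERR-MON together, the sketch below
   keeps per-type error counters and checks a fragment length against a
   preset maximum. The class name, the counter keys and the report() call
   standing in for "reported to shared memory on request" are assumptions
   for the example; sending the error packet is not modelled.

```python
from collections import Counter

class ErrorMonitor:
    """Error counters kept by the processor (illustrative names only)."""

    def __init__(self, max_fragment_bytes):
        self.max_fragment_bytes = max_fragment_bytes  # preset maximum
        self.counters = Counter()

    def check_fragment(self, n_bytes):
        """Return True if the fragment is acceptable, else count ERR-FBIG."""
        if n_bytes > self.max_fragment_bytes:
            self.counters["ERR-FBIG"] += 1
            return False
        return True

    def report(self):
        """Snapshot of all counters, produced only when requested."""
        return dict(self.counters)
```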

UR ERR-DATA
   Owen constructed a table of possible error types:
   2 types of event mismatch with TTC-ID : lots of discussion. John not
   convinced that TTC connection gains you anything. Robert to ask Farthouat
   to produce a note on this.
   There were 5 other error types corresponding to the rest of the list in
   the URD.
   - forgotten-fragment: remove ref to age
     Gordon suggested we could move pages whose index slot is needed to a
     separate list of "forgotten fragments" (i.e. move their indices, not
     their data). Could do the same with LVL2-accepts -- this list would be
     short and could be searched as a linked list. --> keep LVL2-accept in
     for the moment e.g. for use with ROB-IN when no LVL3 is present.
   - LVL3-done-not-received: treat as forgotten fragment.
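
   Gordon's forgotten-fragment idea can be sketched as below: when a new
   event needs an index slot that is still occupied, the old index entry
   (not the data) is moved to a short list which is searched linearly. The
   class name, the modulo slot choice and the use of a Python list in
   place of a linked list are assumptions made for the example.

```python
class IndexWithForgotten:
    """Event index with a 'forgotten fragments' list (hypothetical names)."""

    def __init__(self, n_slots):
        self.slots = [None] * n_slots  # each entry: (event_id, first_page)
        self.forgotten = []            # short list, searched linearly

    def insert(self, event_id, first_page):
        slot = event_id % len(self.slots)
        old = self.slots[slot]
        if old is not None:
            self.forgotten.append(old)  # move the index entry, not the data
        self.slots[slot] = (event_id, first_page)

    def find(self, event_id):
        """Look in the main index first, then in the forgotten list."""
        entry = self.slots[event_id % len(self.slots)]
        if entry is not None and entry[0] == event_id:
            return entry[1]
        for old_id, page in self.forgotten:
            if old_id == event_id:
                return page
        return None
```

   The data pages of a forgotten fragment stay where they are, so nothing is
   deleted until an explicit request arrives.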

UR ERR-REC
   looks ok (at 1st glance!)
   Add assumption of start-of-event-marker to enable e.g. ROB to restart
   cleanly at event break.

UR GBL-MON
   Special kind of accept/reject sets data aside? (as for forgotten fragments).
   Requires extra data-out request and acknowledge (i.e. a monitoring request
   analogous to LVL3-request).

UR GBL-DTST
   Let's assume there's always a baseboard processor (e.g. RIO2).
   What about a built-in data source? (to test full ROB-IN hardware).
   -> not enough room to put a SLATE on board! -> could have a memory or a
   pattern generator.
   However UR is phrased as a general test requirement so these are
   implementation details.
   Last point in test chain = RIO2.
   First point in test chain = RIO2.
   Intermediate point in test chain = normal behaviour.

UR GBL-STST
   Owen and Robert to look again at GBL-DTST and GBL-STST and clarify.

UR GBL-HIST
   Done by software on ROB-IN processor in debug mode + writable register
   for access by e.g. logic analyzer (or LEDs?!).
   Sliding history window could be automatically provided with ROB-IN comms
   via software FIFOs in shared PCI memory.
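
   A sliding history window of this kind can be modelled very simply as a
   fixed-length software FIFO that discards its oldest entry when full. The
   sketch below is only an illustration: the class name, the window length
   and the form of the records are assumptions, and the shared PCI memory
   itself is not modelled.

```python
from collections import deque

class HistoryWindow:
    """Keeps only the most recent records (illustrative names only)."""

    def __init__(self, depth=16):
        self.records = deque(maxlen=depth)  # oldest entry dropped when full

    def log(self, record):
        self.records.append(record)

    def snapshot(self):
        """Oldest-first copy of the current window."""
        return list(self.records)
```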

UR GBL-PERF
   Buffer occupancy -> derivable from counter of difference between numbers
   of free and used pages. How important is this? It's something people might
   naturally ask for. However it may be expensive to provide to the resolution
   of one page -- could easily, however, monitor e.g. up to 5 preset FIFO
   levels. For test purposes, since one of these presets could be programmed
   and is what is planned to control XOFF to the ROD, we could easily
   experiment with different effective buffer sizes. Gordon noted that the
   software is only missing one piece of info: i.e. ONE of the two FIFO
   levels.
   Several ways to do this (including LED displays!) -> think about.
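
   One way of looking at the cheap option is sketched below: instead of
   reporting occupancy to the page, report how many of a few preset levels
   have been reached. Taking occupancy as total pages minus free pages is
   an assumption made for the example, as are the function names and the
   preset values used in the test.

```python
import bisect

def occupancy(total_pages, free_pages):
    """Pages in use, assuming every page is either free or in use."""
    return total_pages - free_pages

def levels_reached(occupied_pages, presets):
    """Number of preset levels that are less than or equal to the occupancy."""
    return bisect.bisect_right(sorted(presets), occupied_pages)
```

   With five presets the result is a number from 0 to 5, which would also
   suit a small LED display.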

UR GBL-ACC

UR GBL-AUTO
   On request!
   Incorporate simple data generator in PLD? e.g. for the one used for
   double-buffering input.

UR CON-SIZE
   yes, we'll work to PMC constraint (because it's what people will expect).

UR CON-POW
   PMC constrains this to be v. low -- could be major constraint on processor
   choice (PPC604 on Motorola board is a small chip with an ENORMOUS heatsink!)

UR CON-COST: ignore (!)
UR CON-RC
UR CON-MON
UR CON-EB
UR CON-L2

UR INT-RO


7) DESIGN

What could/should change on current T2B?

Input FIFO could be removed, but need some buffering since input clock is
separate from memory clock -> it could be a v.short FIFO (or a double-buffer?),
and could possibly be implemented in PLD.

Could use only one bus on DSP (to make it more universal).
Processor-settable XOFF (+ XOFF should be power-on default).

Pinging ROD from ROB? According to Robert: ROD->ROB should be ONE-WAY
(bidirectional is too expensive, though a single XOFF is probably ok).


8) TASK LIST

Timetables produced by Bob, Gordon and Barry were circulated.

For large project following ESA we should have URD->SRD->AD... (iteratively)
and a cross-reference matrix of URD vs AD, but maybe this is not appropriate
for prototype ROB-IN --> shrink SRD/AD into draft/full descriptions?

Defined short-term tasks:
-------------------------

Minutes [RC: asap]

Block diagram [BG: 20/06/96]
Draft description (incl paper model) [RC: 20/06/96]

Updated ROB & ROB-IN URDs [OB/RM: 15/07/96]

Comparison with URD [Everyone: 31/07/96]

Full description [25/12/96?]

...............................................................................

ADDENDUM FROM ROBERT:

The URL for the PowerPC evaluation is:
http://www.cern.ch/ECP-ESS/OS/OS_write-ups/PPC_Evaluation/Draft_1.1.3/Title.html

...............................................................................

APPENDIX 1:
===========

Comments on ROB-IN URD
======================
(31-May-1996 (GJC, RC, JBL))

1   INTRODUCTION
1.1 PURPOSE OF THE DOCUMENT

  >> ROB-IN is component of PROTOTYPE ROB
   
1.2 SCOPE

  >> ROB Controller should be in user list (and what about booting?)
  >> Why are users listed here?
  >> What about prototype builders as users (clients)?

1.3 DEFINITIONS, ACRONYMS & ABBREVIATIONS
1.3.1 Definitions
1.3.2 Acronyms

  >> EB, TTC are abbreviations, not acronyms. What about ROL?

1.3.3 Abbreviations
1.4 REFERENCES
1.5 OVERVIEW OF DOCUMENT
2   GENERAL DESCRIPTION
2.1 Product perspective
2.2 General capabilities
2.3 General constraints
2.4 User characteristics

  >> Users include ROB Controller (and booter?)
  >> Where are the user CHARACTERISTICS?

2.5 Operational environment
2.6 Assumptions and dependencies

  >> 3) is important!
  >> 4) EVEN rate cannot be true! Related 2.6 6) in ROB URD is WRONG!!! --
  >>    "front-end is required to have buffering" is what is meant.


3   SPECIFIC REQUIREMENTS
3.1 CAPABILITY REQUIREMENTS

3.1.1 Data Input Requirements

[UR DI-RATE]

  >> See 2.6 4) above
  >> ROB-IN won't be ready on timescale of demo prog!

[UR DI-FC]

  >> L1 inhibit would not stop ROL

[UR DI-EXP]

  >> Can we not decide to remove this from ROB-IN now? Doesn't the
  >> defn of preprocessing cover this?

3.1.2 Level-1 Trigger Requirements

[UR L1-ALL]

  >> Should this say "the ROB-IN must buffer the L1 data"? -- or
  >> does this UR refer to buffering TTC info?
  >> Requires sequential event IDs!
  >> TTC info requires buffer (what happens if it fills?)

3.1.3 ROB-OUT Requirements

[UR RO-BS]

  >> Formula wrong! assumes zero transfer time to ROB-OUT (If ROB-OUT buffers
  >> for LVL3 -- but handshaking is essential).
  >> THERE IS NO UR FOR DATA TO ROB-OUT!!! (see UR L2-ACC in ROB URD)
  >> THERE IS NO UR FOR DELETING DATA FROM ROB-IN!!! (see UR L2-REJ in ROB URD)
  >> Formula is misleading! -- tails are important!

[UR RO-REF]
[UR RO-PRE]
[UR RO-COMP]

  >> ROB-OUT not in ROB-IN

3.1.4 Timing, Trigger and Control Requirements

[TTC-ID]

  >> Requires buffering and handshake?

3.1.5 Error Handling

[UR ERR-TRANS]

  >> Barry should note! How can errors be detected? (S-LINK adds error word?)

[UR ERR-FBIG]

  >> This error should be avoided as far as possible e.g. by sending fragments
  >> in blocks and using handshaking. We want to allow large fragments to be
  >> sent if necessary.

[UR ERR-FERR]

[UR ERR-DATA]

  >> Require sequential event ID
  >> No fragment/TTC match --> send error, DON'T delete!
  >> Check bunch-crossing ID: ROD should send TTC info too?
  >> Don't DELETE data! esp aged fragments -- only do on demand.
  >> Extra error type: the ROB-IN must deal with "unknown L2 reject" (see ROB
  >> URD)

[UR ERR-REC]

[UR ERR-MON]

  >> Monitor and keep statistics?
  >> Statistics kept by ROB Controller?
  >> What kind of monitoring path? (assume it'll be like RoID, but to different
  >> destination.)

[UR ERR-SYNC]

  >> Should this be part of ERR-DATA?
  >> Need is essential! JBL doesn't believe "error may be eliminated by
  >> appropriate dataflow strategies".

  >> From ROB URD [UR CTL-CONF]: required also by ROB-IN
     (configured on power-up...)

3.1.6 Global Requirements

[UR GBL-ACC]

  >> Access - ROB Controller? also booting

[UR GBL-AUTO]

  >> Test - ROB Controller

[UR GBL-HIST]

  >> History - ROB Controller, not ROB-IN? Can this really mean ALL?

[UR GBL-DTST]

  >> Test - to verify log ROB-OUT?
  >> Separate progs in buffer manager processor? Stored nearby e.g. ROM?

[UR GBL-STST]

  >> Test - ROD and ROB Controller too

[UR GBL-PERF]

  >> Performance - send to ROB Controller too

3.2 CONSTRAINT REQUIREMENTS

3.2.1 Physical constraints

[UR CON-SIZE]

  >>  Aren't these statements inconsistent?

[UR CON-POW]
[UR CON-COST]

3.2.2 Interface constraints

[UR INT-RO]

  >> [UR INT-RC (typo?)]
  >>      typo: ROB-IB --> ROB-IN


...............................................................................

Additional comments from JBL:
-----------------------------

Clarify commands, test, errors.

Data in buffers needs protecting:
   better to lose data in Front-End than ROB?
   (because ROB data is being processed in LVL2).
   L1-inhibit would not stop data from ROD.
   Front-End needs L1-inhibit,
   Next-buffer-full (instead of BUSY) to ROD means:
      ROD doesn't have to wait for handshake,
      but requires fragments sent in block to be robust.
   System should behave well if LVL3 goes slow.

Buffer size and no deadtime make requirements on:
   LVL2 latency, EVB latency, LVL2 accept rate (see UR RO-BS)
   Tails of the distributions are important.

Play ping-pong:
   e.g. if ROB has TTC data but no data is coming from the ROD, then ping
   the ROD with "are you alive and well?". The pong fragment can be discarded.
   Apply throughout system (see UR GBL-STST).
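
   The ping-pong idea might be sketched as follows, purely as an
   illustration. The decision rule (ping when TTC information is waiting
   but no fragments are arriving from the ROD), the "pong" flag carried in
   a fragment and both function names are assumptions made for the
   example; how long to wait before pinging is left open.

```python
def should_ping(ttc_ids_waiting, rod_fragments_waiting):
    """Ping the ROD if TTC info is queued but no fragments have turned up."""
    return ttc_ids_waiting > 0 and rod_fragments_waiting == 0

def accept_fragment(fragment, event_buffer):
    """Store a normal fragment; a pong fragment is simply discarded."""
    if fragment.get("pong", False):
        return False
    event_buffer.append(fragment)
    return True
```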

Latency of ROB sending data - e.g. if a queue builds up affects LVL2 latency,
   EVB latency --> buffer size and deadtime.

...............................................................................

JBL's ROB URD notes:
====================

2.1 Perspective  point 1  If L1 trigger data lost is event "dead"?
                 fig 1    L1-inhibit from ROD (or Front-End)

2.6 Assumptions  point 5  there may be errors
                 point 6  wrong -- should be ROB-IN point 4?

3.1.1
    DI-TRIG      Wrong -- should read ROB-IN DI-RATE?

3.1.3
    L2-REJ       Required by ROB-IN in context of ROB Controller
    L2-ACC       Required by ROB-IN in context of ROB Controller

3.1.4
    L3-WAIT      Time-outs are dodgy! do on command?

3.1.5
    CTL-SC       Define states more (stability comment drafting?)
    CTL-CONF     Required by ROB-IN in context of ROB Controller

3.1.7
    ERR-SYNC     Need is not negotiable. Stability - don't believe!
    ERR-DATA     Unknown-L2-reject - required by ROB-IN in some form.

3.2.1
                 ROD too?

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Bob Cranfield

/----------------------------------------------------------------------------\
|     telephone:  +44-(0)171-380-7223 |  High Energy Particle Physics Group, |
|           FAX:  +44-(0)171-380-7145 |  Department of Physics & Astronomy,  |
|  email(TCPIP):  rc@hep.ucl.ac.uk    |  University College London,          |
| email(DECnet):  UCLVA::RC           |  Gower Street, London, WC1E 6BT      |
\----------------------------------------------------------------------------/