Dear Friends,

I tried to summarize the discussion we had on the 10th. Chris provided me
with a helpful summary of his notes. Please comment; I have certainly left
something out.

Best regards, Roberto

 ============================================================================

 Minutes of the meeting on MVD electronics
 =========================================

 10-7-97
 
  
 OPTICAL LINKS:
 ==============
 
 Leo showed a page of requirements, which is available on the WEB (from
 his own MVD page, which is linked from the MVD-electronics page).
 
 DECISION: 
 
 [ A copper link (buffer amplifiers on the cryo-tower) can be considered
 a fall-back solution for the optical links. A decision will be taken
 at the end of the year. ]
    
   + We will have to test the performance of the chip with a copper link
     as soon as HELIX is available
   + We will ask HERA-B to provide us with a prototype of their link for
     evaluation
   + In any case, Nikhef is willing to take responsibility for the analogue
     link
   
   Qiang pointed out that BPC sends its analogue data up to the rucksack
   through a coax cable, with a rise-time degradation of only a few ns.

 Other discussions:

 - Optical link usage would require a test system to measure the linearity
   of the system. This would have to feed signals into the link input for
   readout with the ADC system, followed by evaluation (plots, fits, etc.).
   It was pointed out that the dynamic range of the HELIX test pulses is
   not large enough for this.
   This requires interaction with the C+C system in the cryo tower.
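
   As an illustration of the evaluation step, a minimal Python sketch of a
   straight-line fit and the resulting maximum non-linearity is given here.
   The scan values and the function names (fit_line, max_nonlinearity) are
   hypothetical; no measured data exist yet.

```python
def fit_line(x, y):
    """Least-squares straight line y = gain * x + offset."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    gain = sxy / sxx
    return gain, mean_y - gain * mean_x


def max_nonlinearity(x, y):
    """Largest deviation from the fitted line, as a fraction of the fitted output span."""
    gain, offset = fit_line(x, y)
    span = abs(gain) * (max(x) - min(x))
    return max(abs(yi - (gain * xi + offset)) for xi, yi in zip(x, y)) / span


# Hypothetical scan: injected amplitude (arbitrary units) vs. mean ADC counts.
injected = [0, 1, 2, 3, 4, 5]
adc_counts = [10, 110, 209, 306, 400, 490]
print(f"max non-linearity: {100 * max_nonlinearity(injected, adc_counts):.1f} % of range")
```

   The same figure of merit could be compared directly with the "one or few
   percent" quoted for existing optical links.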
 
 - Nikhef suggested putting the ADC on the cryo-tower. This was felt to be
   risky in terms of reliability, and painful in terms of setting up.
   Nevertheless, a solution in which:
   
   + Just the FADC is in the cryo-tower.
   + Digital data is clocked through an 80 (100) MHz serial optical link
     (per channel)
   + The (xilinx) data processor (pedestals, clustering etc) is in the
     rucksack
     
   has to be considered in more detail, particularly if the analogue
   transmission is shown to be noisy or unreliable.
  
 - Katsuo proposed that the analogue link (either opto or copper) should be
   gated by the DataValid signal (of the module producing the data, not the
   OR of all of them). In the absence of DataValid, the line should go to
   some clearly identifiable analogue level. In this way, the ADC will be
   able to detect errors (missing or too-long DataValid). See also the
   error handling discussion below.
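
   A minimal Python sketch of how the ADC side might classify a readout
   window under this proposal. It assumes that real data never falls inside
   the idle band; IDLE_LEVEL, IDLE_TOLERANCE and check_gate are hypothetical
   names and values, not part of any agreed design.

```python
IDLE_LEVEL = 0       # hypothetical ADC code the gated line rests at without DataValid
IDLE_TOLERANCE = 3   # hypothetical noise margin around the idle level, in ADC counts


def check_gate(samples, expected_length):
    """Classify one readout window as 'ok', 'missing', 'too_short' or 'too_long'.

    The gate is taken to span from the first to the last sample that lies
    outside the idle band.
    """
    active = [i for i, s in enumerate(samples) if abs(s - IDLE_LEVEL) > IDLE_TOLERANCE]
    if not active:
        return "missing"
    length = active[-1] - active[0] + 1
    if length > expected_length:
        return "too_long"
    if length < expected_length:
        return "too_short"
    return "ok"


print(check_gate([0, 0, 50, 60, 55, 0, 0], expected_length=3))
```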
   
 - We discussed the requirements in terms of linearity and uniformity of
   gains. The uniformity of the gains is not so important to first order,
   because clusters will be formed from strips handled by the same analogue
   chain (optical link, ADC etc.) => with the same gain. A severe
   non-linearity will instead affect the resolution. Both H1 and HERA-B
   indicate (only verbal information from the latter) that the
   non-linearities of the optical links will be limited to one or a few
   percent.
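
   The argument can be checked numerically. The Python sketch below uses a
   hypothetical two-strip cluster and a hypothetical compressive response
   (compress); it only illustrates that a common gain factor cancels in the
   charge-weighted centroid while a non-linearity does not.

```python
def centroid(strips, amplitudes):
    """Charge-weighted mean strip position of a cluster."""
    return sum(s * a for s, a in zip(strips, amplitudes)) / sum(amplitudes)


def compress(amplitude, full_scale=200.0, strength=0.2):
    """Hypothetical response that loses gain progressively towards full scale."""
    return amplitude * (1.0 - strength * amplitude / full_scale)


strips = [10, 11]
charge = [120.0, 40.0]   # hypothetical two-strip cluster, arbitrary units

true_pos = centroid(strips, charge)
scaled_pos = centroid(strips, [0.8 * q for q in charge])     # common gain change
bent_pos = centroid(strips, [compress(q) for q in charge])   # non-linear chain

print(f"gain change shifts centroid by   {scaled_pos - true_pos:+.4f} strips")
print(f"non-linearity shifts centroid by {bent_pos - true_pos:+.4f} strips")
```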
   
 - The modularity of the optical link cards should be the same as (or a
   simple multiple of) that of the ADC, to ease the cabling.

 ADC:
 ====
 
 DECISION: 
 
 [ The readout clock will be the HERA clock. ]
 
   + Reading out 8 chips (1024 strips) will introduce an extra latency
     of only 50 us.
   + The optical link (if any) and the ADC will have an easier life:
     for instance, the flat-top of the analogue data on the ADC will be
     longer => no need for fine tuning of delays or cable lengths.
   + No need to synchronize two different clocks (readout, sampling).
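
   A short arithmetic sketch of the latency figure, in Python. The minutes
   do not say what the 50 us is measured against; the sketch assumes a HERA
   clock period of 96 ns and, purely as an illustration, a readout at twice
   that frequency as the alternative. Both numbers are assumptions.

```python
HERA_CLOCK_PERIOD_NS = 96.0   # assumed HERA clock period; not quoted in the minutes
STRIPS_PER_CHAIN = 1024       # 8 chips, 1024 strips, as in the minutes


def readout_time_us(n_strips, period_ns):
    """Time to clock n_strips analogue values out of one chain, in microseconds."""
    return n_strips * period_ns / 1000.0


at_hera_clock = readout_time_us(STRIPS_PER_CHAIN, HERA_CLOCK_PERIOD_NS)
# Hypothetical alternative: a readout clock at twice the HERA frequency.
at_double_clock = readout_time_us(STRIPS_PER_CHAIN, HERA_CLOCK_PERIOD_NS / 2)
extra = at_hera_clock - at_double_clock

print(f"readout at HERA clock: {at_hera_clock:.1f} us, extra latency: {extra:.1f} us")
```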
   
 [ The handling of the ABORT (fast-clear) will be done after the ADC
   conversion (this was indeed clear before, but now is written down). ]
   
 [ The pedestal should be taken during the run (using test events which are
   always on empty bunches) and analyzed by some local (before shipping to
   the EVB) processor. ]
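
 One possible way for the local processor to follow the pedestals during
 the run is a per-strip running average over the test events. The Python
 sketch below assumes this approach; the class name and the smoothing
 weight are hypothetical, and the actual algorithm was not discussed.

```python
class PedestalTracker:
    """Per-strip running pedestal, updated from test events on empty bunches."""

    def __init__(self, n_strips, weight=1.0 / 64):
        self.weight = weight              # hypothetical smoothing constant
        self.pedestal = [0.0] * n_strips
        self.initialized = False

    def update(self, raw_event):
        """Fold one raw (unsuppressed) test event into the pedestal estimate."""
        if not self.initialized:
            # The first test event seeds the estimate.
            self.pedestal = [float(v) for v in raw_event]
            self.initialized = True
            return
        w = self.weight
        self.pedestal = [(1 - w) * p + w * v for p, v in zip(self.pedestal, raw_event)]


tracker = PedestalTracker(n_strips=3, weight=0.5)
tracker.update([100, 102, 98])
tracker.update([102, 102, 100])
print(tracker.pedestal)
```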
   
 [ The trailer information, containing the HELIX pipeline column index,
   is valuable and should be kept. Only one trailer per token ring
   is needed. ]
   
 Other discussions:
 
 We discussed the format of the output buffer. It can be either a paged
 memory (every event is put in one page of fixed length and address, indexed
 by the FLT number), or just a circular buffer. Each has its advantages:
 in the paged memory the event is in a fixed position, and the initialization
 of the readout from VME may be faster. On the other hand, the page has to
 be large enough to contain the largest possible event. It was noted that
 we will have very large events when we deliver raw data (e.g. for the
 pedestal runs).
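
 The trade-off can be made concrete with a small Python sketch of the two
 layouts. Class names, sizes and the modulo mapping of FLT number to page
 are hypothetical; overwrite protection is deliberately left out.

```python
class PagedBuffer:
    """Fixed-size pages; the page is chosen by the FLT number."""

    def __init__(self, n_pages, page_words):
        self.n_pages = n_pages
        self.page_words = page_words
        self.pages = [None] * n_pages

    def write(self, flt_number, words):
        # The page must be able to hold the largest possible event.
        if len(words) > self.page_words:
            raise ValueError("event larger than one page")
        self.pages[flt_number % self.n_pages] = list(words)

    def read(self, flt_number):
        # Fixed address: no lookup is needed before the VME read starts.
        return self.pages[flt_number % self.n_pages]


class CircularBuffer:
    """Events stored back to back; the reader needs each event's start and length."""

    def __init__(self, size_words):
        self.mem = [0] * size_words
        self.head = 0
        self.index = {}   # flt_number -> (start, length)

    def write(self, flt_number, words):
        if len(words) > len(self.mem):
            raise ValueError("event larger than the whole buffer")
        start = self.head
        for i, w in enumerate(words):
            self.mem[(start + i) % len(self.mem)] = w
        self.head = (start + len(words)) % len(self.mem)
        self.index[flt_number] = (start, len(words))

    def read(self, flt_number):
        start, length = self.index[flt_number]
        return [self.mem[(start + i) % len(self.mem)] for i in range(length)]
```

 In the paged version a raw-data event either fits the page or is rejected;
 in the circular version it simply occupies more of the buffer.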
 
 The quantities needed by the ADC (downloaded at setup) will be
 - pedestals,
 - dead channels (coded as special values of pedestals),
 - threshold for each strip,
 - cluster threshold for each ADC (different, to be able to accommodate
   possible gain non-uniformities in the analogue chains),
 - run configuration (e.g. whether to send clustered data or just data after
   the low threshold cut).
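
 To show how these quantities could work together, a minimal Python sketch
 of pedestal subtraction, strip threshold and cluster threshold follows.
 The DEAD code, the function name and the cluster definition (contiguous
 strips above threshold) are assumptions; the run configuration switch is
 omitted.

```python
DEAD = -1   # hypothetical pedestal code that marks a dead channel


def find_clusters(raw, pedestals, strip_thresholds, cluster_threshold):
    """Return (first_strip, last_strip, summed_signal) for each accepted cluster."""
    clusters = []
    current = []   # (strip, signal) pairs of the cluster being built

    def close():
        # Keep the open cluster only if its summed signal passes the cluster threshold.
        if current:
            total = sum(sig for _, sig in current)
            if total >= cluster_threshold:
                clusters.append((current[0][0], current[-1][0], total))
            current.clear()

    for strip, value in enumerate(raw):
        if pedestals[strip] == DEAD:
            close()   # a dead channel ends any open cluster
            continue
        signal = value - pedestals[strip]
        if signal > strip_thresholds[strip]:
            current.append((strip, signal))
        else:
            close()
    close()
    return clusters


raw = [10, 10, 30, 45, 12, 10, 99, 28, 10]
pedestals = [10, 10, 10, 10, 10, 10, DEAD, 10, 10]
print(find_clusters(raw, pedestals, [5] * 9, cluster_threshold=20))
```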
   
 Japan will build a prototype module by the end of the year. It will be a
 normal VME card, with no attempt to fit in the largest possible number of
 channels.
 
 CLOCK and CONTROL, and ERROR HANDLING:
 ======================================
 
 Q: does the connection between the cryo-tower and the rucksack have to be
    copper or opto?
 Postponed; will follow the decision on the analogue link.
       
 We realized that the number of chips in the token ring for the barrel (8)
 and the wheels (10) is not the same in the present design. The present
 design is quite preliminary for the wheels. This has nasty consequences for
 the synchronicity of the readout. Possible solutions are:
 
 - Ask Heidelberg for a TokenDelay counter long enough to accommodate
   more than the one extra chip it covers now. Ideally it should be a 10-bit
   counter. This seems to be by far the easiest solution (for us).
 - Split the clock and control system in two parts, one serving the wheels,
   the other the barrel. This would mean doubling the number of cables between
   the HelixInterface (cryo-tower) and the rucksack, and doubling the logic
   of the event bookkeeping in the rucksack. We would in effect have two
   almost independent systems.
 - Build a completely asynchronous system, with the bookkeeping being
   done at the single ADC level.
 - Use the TransmitEnable system to enforce a "pull" type readout architecture.
   This was not really discussed this time but in the previous meeting, and it
   is not guaranteed to work. In any case, this would need 256 more cables to
   the detector modules.
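
 For the TokenDelay option, a rough Python estimate of the counter size
 needed. It assumes that the delay is counted in readout clock cycles, that
 one chip occupies 128 cycles (1024 strips / 8 chips) and that trailer
 cycles can be ignored; the real TokenDelay semantics must be checked with
 Heidelberg. Function names are hypothetical.

```python
CHANNELS_PER_CHIP = 128   # 1024 strips / 8 chips, from the ADC section


def equalizing_delay(chips_in_ring, longest_ring):
    """Clock cycles a shorter ring must wait so that all rings finish together."""
    return (longest_ring - chips_in_ring) * CHANNELS_PER_CHIP


def counter_bits(delay):
    """Smallest counter width (in bits) able to hold the given delay."""
    return max(1, delay.bit_length())


barrel_delay = equalizing_delay(chips_in_ring=8, longest_ring=10)
print(f"barrel rings need {barrel_delay} cycles of delay, "
      f"i.e. a {counter_bits(barrel_delay)}-bit counter")
```

 Under these assumptions a 10-bit counter would leave margin beyond the
 two-chip difference of the present design.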
   
 We discussed the problems of the fail-safe token ring, which are very
 similar to what was described above for rings of different lengths.
 The possible solutions are:

 - The chip which sends the token to the alternative line (to skip the dead
   chip) should also insert a proper number of dummy data to simulate the
   dead chip (preferred).
 - Use the TokenDelay to keep at least the DataValid equalized.
 - Implement an asynchronous readout.
 - Use TransmitEnable.
 
 In all solutions apart from the first:
 + The ADC will have to know which chip is dead (it will not receive data
   from it). This affects the cluster processing: clusters must not be
   created between the two views of one module, or across a dead chip.
 + The different lengths of DataValid may introduce a difference in the
   pipeline columns in which the data is stored for different token chains
   (the buffers get freed at DataValid end).
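
 The clustering constraint can be sketched in Python as a list of strip
 ranges inside which clusters are allowed. Strips are numbered here by
 position on the module, not by arrival order in the data stream; the
 split of a ring into views of chips_per_view chips is a hypothetical
 layout, as is the function name.

```python
CHANNELS_PER_CHIP = 128   # 1024 strips / 8 chips, from the ADC section


def clustering_segments(n_chips, chips_per_view, dead_chips):
    """Return (first_strip, last_strip) ranges within which clusters may form."""
    segments = []
    start = None   # first strip of the segment currently open, if any
    for chip in range(n_chips):
        new_view = chip % chips_per_view == 0
        if chip in dead_chips or new_view:
            # A view boundary or a dead chip closes the open segment.
            if start is not None:
                segments.append((start, chip * CHANNELS_PER_CHIP - 1))
                start = None
        if chip not in dead_chips and start is None:
            start = chip * CHANNELS_PER_CHIP
    if start is not None:
        segments.append((start, n_chips * CHANNELS_PER_CHIP - 1))
    return segments


print(clustering_segments(n_chips=8, chips_per_view=4, dead_chips={5}))
```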
   
 We discussed the problem of cabling (how many cables, how connected).
   
 DECISION:
   
 [ While the output cables (analogue, data-valid, error) have to be
   independent for every module (token ring), the input signals can serve
   an entire ladder (1/2 wheel). ]
   
   + The affected signals will be: Rclk, Sclk, TrigIn, SerLoad, notReset, FcsTp
   + Advantage: fewer cables
   + Disadvantage: reliability; if something breaks, a larger region will
     be unavailable. Normally, broken chips do not affect their input signals
   + Verify the maximum length of the SerialData chain. From the manual it
     seems to be 64 chips, more than enough to address one ladder
   The power supplies (low voltage and bias) will keep their cell modularity.
   We will study how to connect cables to hybrids: soldering is the solution
   with the least material, but it may be difficult in the detector assembly
   phase.
   
 We started a discussion on errors and error recovery
   
 DECISION:   
 [ C+C must implement an error recovery scheme (to be specified) which
 minimizes any expert interaction, especially at night and at weekends. ]

 We tried to identify the possible types of errors and the error recovery
 procedures needed. Note that what is described below does not apply if
 we decide to implement an async readout scheme.
   
 Types of errors:
   
 1) Error line asserted by one chain (means some synchronicity failure
    detected by the monitoring circuitry of HELIX).
      
      This system requires an implementation similar to that of the BMUON,
      i.e. hold BUSY, flush all pipelines, and mark events as empty. Needs
      specification.
      
      Q: How to mark the data of the failing chain as bad, while performing
         the recovery ? Possibility: the C+C will mask off the DataValid for
         the same chain, so that the ADC will recognize wrong data.
      
      Q: Does the C+C need to provide the local event builder with the
         pattern of Error bits on a per-event basis, or only for the event
         where the error was found (to be read out during the error recovery
         procedure, which means no need for a FIFO etc.)?
         
      Q: What happens if the errors are too frequent in a chain? Are we
         allowed to disable one module on the fly, or shall an expert be
         called? To disable one chain, it should be enough to mask off both
         Error and DataValid, if the analogue data is gated by the DataValid.
         
 2) Chip failures, which may result in DataValid being either stuck high
    or missing.
      
    - DataValid stuck high:
     
      + A watchdog is needed in the C+C. The only reaction possible is to
        mask off the offending chain.
        Q: Again, are we allowed to do this automatically? (Clearly, a limit
           on the number of disabled chains before declaring a fatal failure
           has to be set.)
        Q: What will the C+C have to provide to the data stream? Same
           as for the Error. Does it make sense to send the DataValid sampled
           per event (and if so, sampled when, at the beginning or at the end,
           and what happens if they are not all equally long)? Or shall we
           provide only the offending pattern once an error is detected?
      
    - DataValid missing:
    
      + This may be identified by the ADC, instead.  
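
 A minimal Python sketch of the watchdog idea for the stuck-high case,
 assuming that automatic masking is allowed (still an open question above).
 The class name, the cycle limit and the limit on masked chains are all
 hypothetical.

```python
class DataValidWatchdog:
    """Masks off chains whose DataValid stays high for too many clock cycles."""

    def __init__(self, n_chains, max_cycles, max_masked):
        self.max_cycles = max_cycles   # hypothetical: longest legal DataValid, in cycles
        self.max_masked = max_masked   # hypothetical: masked chains tolerated before giving up
        self.high_cycles = [0] * n_chains
        self.masked = set()

    def tick(self, data_valid):
        """Process one clock cycle; data_valid holds one boolean per chain."""
        for chain, level in enumerate(data_valid):
            if chain in self.masked:
                continue
            # Count consecutive cycles with DataValid high; reset when it drops.
            self.high_cycles[chain] = self.high_cycles[chain] + 1 if level else 0
            if self.high_cycles[chain] > self.max_cycles:
                self.masked.add(chain)
        return "fatal" if len(self.masked) > self.max_masked else "ok"


watchdog = DataValidWatchdog(n_chains=3, max_cycles=4, max_masked=1)
for cycle in range(6):
    status = watchdog.tick([True, False, cycle % 2 == 0])   # chain 0 is stuck high
print(status, sorted(watchdog.masked))
```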
      
      
 OTHER ITEMS
 ===========
 
 - Setup:
 
 Discussion of SETUP activity. The target for the SETUP transition is <30 s.
 Two possibilities exist for downloading: either the CPU associated with the
 C+C does all the work word by word, or the CPU puts a block of data into an
 accessible location in the C+C and says go. Needs some thought. The question
 of whether a meaningful test of the entire system could be made before
 activate, as a check of correctness, was discussed. No decision was made, but
 it was pointed out that no other components do this (?).

 - (other) modifications of HELIX
   
 + U.Koetz expressed worries about unbalanced CMOS signals. Conclusion: R.Kluit
   should ensure that balanced (differential) signals are provided.
 + We have to ensure that the DataValid and Error output buffers of HELIX can
   drive a long enough cable (question already forwarded to HD)
 + Can the shaping time of the analogue output be changed? We may gain
   something in signal/noise with a longer shaping time

   
 SLOW CONTROL: 
 =============
 
 Chris described the current state. The intention is to buy the
 following equipment in early August:

    1 VME board - current choice MVME2600 PowerPC 200MHz/32MB
                  wait until the end of July for the HERAB VME CPU review
                  they want to buy 50 boards!

    1 LynxOS development system (diskless, Unix-like, IP(NFS,), real time,
                multiuser, on-board development).
                Only negative feature is hardware manufacturer preference
                for drivers on VxWorks, OS-9, i.e. not LynxOS. Use HERAB
                software initially.
                
     1 Janz VMOD-10 + IMOD2 board for CAN interface. Same board used
                at HERAB and NIKHEF.

    Need a cheap CAN test board for initial tests.

 This system would boot from an existing LINUX PC. Some disk space should
 be foreseen.

 Leo gave a quick summary of on-detector slow control requirements: ca. 40
 temperature measurements, humidity/leak (?) etc.
 
 ==============================================================================


-------------------------------------------------------------------------------
*   Roberto Carlin             ZEUS experiment            DESY-Hamburg        *
*                                                                             *
*   Phone +49-40-8998-3202 (DESY)               +39-49-827-7075 (Padova)      *
*   http://www-zeus.desy.de/~carlin/TOP.html                                  *
-------------------------------------------------------------------------------