Dear Friends,

I tried to summarize the discussion we had on the 10th. Chris provided me
with a helpful summary of his notes. Please comment, I have certainly left
something out.

Best regards,
Roberto

============================================================================

Minutes of the meeting on MVD electronics
=========================================
                10-7-97

OPTICAL LINKS:
==============

Leo showed a page of requirements, which is available on the WEB (from his
own MVD page, which has a pointer in the MVD-electronics page).

DECISION:
[ A copper link (buffer amplifiers on the cryo-tower) can be considered
  a fall-back solution for the optical links. A decision will be taken at
  the end of the year. ]
  + We will have to test the performance of the chip with a copper link
    as soon as HELIX is available.
  + We will ask HERA-B to provide us with a prototype of their link for
    evaluation.
  + Nikhef is in any case willing to be responsible for the analogue link.

Qiang pointed out that BPC is sending its analogue data up to the rucksack
through a coax cable, with a rise time degradation of only a few ns.

Other discussions:

- Optical link usage would require a test system to measure the linearity
  of the system. This would have to feed signals into the link input for
  readout with the ADC system, followed by evaluation (plots, fits etc.).
  It was pointed out that the dynamic range of the HELIX test pulses is
  not large enough. This needs interaction with the C+C system in the
  cryo-tower.

- Nikhef suggested having the ADC on the cryo-tower. This was felt to be
  risky in terms of reliability, and painful in terms of setting-up.
  Nevertheless, a solution in which:
    + just the FADC is in the cryo-tower,
    + digital data is clocked through an 80 (100) MHz serial optical link
      (per channel),
    + the (Xilinx) data processor (pedestals, clustering etc.) is in the
      rucksack,
  has to be considered in more detail, particularly if the analogue
  transmission is shown to be noisy or unreliable.
- Katsuo proposed that the analogue link (either opto or copper) should be
  gated by the DataValid signal (of the module producing the data, not the
  OR of all of them). In the absence of DataValid, the line should go to
  some clearly identifiable analogue level. In this way, the ADC will be
  able to detect errors (missing or too-long DataValid). See also the
  error handling discussion below.

- We discussed the requirements in terms of linearity and uniformity of
  gains. The uniformity of the gains is not so important to first order,
  because clusters will be formed from strips handled by the same analogue
  chain (optical link, ADC etc.) => with the same gain. A severe
  non-linearity will instead affect the resolution. Both H1 and HERA-B
  indicate (only verbal information from the latter) that the
  non-linearities of the optical links will be limited to one or a few
  percent.

- The modularity of the optical link cards should be the same as (or some
  easy multiple of) that of the ADC, to ease the cabling.

ADC:
====

DECISION:
[ The readout clock will be the HERA clock. ]
  + Reading out 8 chips (1024 strips) will introduce an extra latency of
    only 50 us.
  + The optical link (if any) and the ADC will have an easier life: for
    instance, the flat-top of the analogue data on the ADC will be longer
    => no need for fine tuning of delays or cable lengths.
  + No need to synchronize two different clocks (readout, sampling).

[ The handling of the ABORT (fast-clear) will be done after the ADC
  conversion (this was indeed clear before, but now it is written down). ]

[ The pedestals should be taken during the run (using test events, which
  are always on empty bunches) and analyzed by some local processor
  (before shipping to the EVB). ]

[ The trailer information, containing the HELIX pipeline column index, is
  valuable and should be kept. Only one trailer per token ring is
  needed. ]

Other discussions:

We discussed the format of the output buffer.
It can be either a paged memory (every event is put in one page of fixed
length and address, indexed by the FLT number), or just a circular buffer.
There are advantages in both: in the paged memory the event is in a fixed
position, and the initialization of the readout from VME may be faster.
On the other hand, the page has to be large enough to contain the largest
possible event. It was noted that we will have very large events when we
deliver raw data (e.g. for the pedestal runs).

The quantities needed by the ADC (downloaded at setup) will be:
  - pedestals,
  - dead channels (coded as special values of pedestals),
  - threshold for each strip,
  - cluster threshold for each ADC (different, to be able to accommodate
    possible gain non-uniformities in the analogue chains),
  - run configuration (e.g. whether to send clustered data or just data
    after the low threshold cut).

Japan will build a prototype module by the end of the year. It will be a
normal VME card, with no attempt to fit in the largest possible number of
channels.

CLOCK and CONTROL, and ERROR HANDLING:
======================================

Q: Does the connection between the cryo-tower and the rucksack have to be
   copper or opto?
   Postponed, will follow the decision on the analogue link.

We realized that the numbers of chips in the token ring for the barrel (8)
and the wheels (10) are not the same in the present design. The present
design is quite preliminary for the wheels. This has nasty consequences
for the synchronicity of the readout. Possible solutions are:

- Ask Heidelberg to have a TokenDelay counter long enough to accommodate
  more than the one extra chip it handles now. Ideally it should be a
  10-bit counter. This seems to be by far the easiest solution (for us).

- Split the clock and control system in two parts, one serving the wheels,
  the other the barrel.
  This would mean doubling the number of cables between the HelixInterface
  (cryo-tower) and the rucksack, and doubling the logic of the event
  bookkeeping in the rucksack. We would indeed have two almost
  independent systems.

- Build a completely non-synchronous system, with the bookkeeping being
  done at the single ADC level.

- Use the TransmitEnable system to enforce a "pull" type readout
  architecture. This was not really discussed this time but in the
  previous meeting, and it is not guaranteed to work. In any case, this
  would need 256 more cables to the detector modules.

We discussed the problems of the fail-safe token ring, which are very
similar to what was described above for rings of different lengths. The
possible solutions are:

- The chip which sends the token to the alternative line (to skip the dead
  chip) should also introduce a proper number of dummy data to simulate
  the dead chip (preferred).
- Use the TokenDelay to keep at least the DataValid equalized.
- Implement an asynchronous readout.
- Use TransmitEnable.

In all solutions apart from the first:
  + The ADC will have to know which chip is dead (it will not receive data
    from it). This has an impact on the cluster processing: clusters must
    not be created between the two views of one module, or across a dead
    chip.
  + The different length of DataValid may introduce a difference in the
    pipeline columns in which the data is stored for different token
    chains (the buffers get freed at DataValid end).

We discussed the problem of cabling (how many cables, how connected).

DECISION:
[ While the output cables (analogue, data-valid, error) have to be
  independent for every module (token ring), the input signals can serve
  an entire ladder (1/2 wheel). ]
  + The affected signals will be: Rclk, Sclk, TrigIn, SerLoad, notReset,
    FcsTp.
  + Advantage: fewer cables.
  + Disadvantage: reliability; if something breaks, a larger region will
    be unavailable.
    Normally, broken chips do not affect their input signals.
  + Verify the maximum length of the SerialData chain. From the manual it
    seems to be 64 chips, more than enough to address one ladder.

The power supplies (low voltage and bias) will keep their cell modularity.

We will study how to connect cables to hybrids: soldering is the solution
with the least material, but it may be difficult in the detector assembly
phase.

We started a discussion on errors and error recovery.

DECISION:
[ C+C must implement a to-be-specified error recovery scheme which
  minimizes any expert interaction, especially at nights and weekends. ]

We tried to identify the possible types of errors and the error recovery
procedures needed. Note that what is described below does not apply if
we decide to implement an asynchronous readout scheme.

Types of errors:

1) Error line asserted by one chain (means some synchronicity failure
   detected by the monitoring circuitry of HELIX).
   This requires an implementation similar to that of the BMUON, i.e.
   hold BUSY, flush all pipelines, and mark events as empty. Needs
   specification.

   Q: How to mark the data of the failing chain as bad while performing
      the recovery?
      Possibility: the C+C will mask off the DataValid for the same
      chain, so that the ADC will recognize wrong data.
   Q: Does the C+C need to provide the local event builder with the
      pattern of Error bits on a per-event basis, or only for the event
      where the error was found (i.e. to be read out during the error
      recovery procedure, which means no need for a FIFO etc.)?
   Q: What happens if the errors are too frequent in a chain? Are we
      allowed to disable one module on the fly, or shall an expert be
      called? To disable one chain, it should be enough to mask off both
      Error and DataValid, if the analogue data is gated by the DataValid.

2) Chip failures, which may result in DataValid being either stuck high
   or missing.
   - DataValid stuck high:
     + A watchdog is needed in the C+C.
       The only possible reaction is to mask off the offending chain.
       Q: Again, are we allowed to do this automatically? (Clearly, a
          limit on the number of disabled chains before declaring a fatal
          failure has to be set.)
       Q: What will the C+C have to provide to the data stream? Same as
          for the Error. Does it make sense to send the DataValid sampled
          per event (and if so, sampled when, at the beginning or at the
          end, and what happens if they are not all equally long?) Or
          shall we provide only the offending pattern once an error is
          detected?
   - DataValid missing:
     + This may be identified by the ADC, instead.

OTHER ITEMS
===========

- Setup: Discussion of SETUP activity. The target for the SETUP transition
  is <30 sec. Two possibilities exist for downloading: either the CPU
  associated with the C+C does all the work word-by-word, or the CPU puts
  a block of data into an accessible location in the C+C and says go.
  Needs some thought.
  We discussed whether a meaningful test of the entire system could be
  made before activate, as a check of correctness. No decision was made,
  but it was pointed out that no other components do this (?).

- (Other) modifications of HELIX:
  + U.Koetz expressed worries about imbalanced CMOS signals.
    Conclusion: R.Kluit should ensure that balanced (differential) signals
    are provided.
  + We have to ensure that the DataValid and Error output buffers of HELIX
    can drive a long enough cable (question already forwarded to HD).
  + Can the shaping time of the analogue output be changed? We may gain
    something in signal/noise with a longer shaping time.

SLOW CONTROL:
=============

Chris described the current state. The intention is to buy the following
equipment in early August:

1 VME board - current choice MVME2600 PowerPC 200MHz/32MB.
  Wait until the end of July for the HERAB VME CPU review;
  they want to buy 50 boards!
1 LynxOS development system (diskless, unix like, IP(NFS,), real time,
  multiuser, on-board development).
  Only negative feature is the hardware manufacturers' preference for
  drivers on VxWorks, OS-9, i.e. not LynxOS.
  Use HERAB software initially.
1 Janz VMOD-10 + IMOD2 board for CAN interface.
  Same board used at HERAB and NIKHEF.
  Need a cheap CAN test board for initial tests.

This system would boot from an existing LINUX PC. Some disk space should
be foreseen.

Leo gave a quick summary of on-detector slow control requirements:
ca. 40 temp measurements, humidity/leak (?) etc.

==============================================================================

-------------------------------------------------------------------------------
* Roberto Carlin            ZEUS experiment            DESY-Hamburg           *
*                                                                             *
* Phone  +49-40-8998-3202 (DESY)      +39-49-827-7075 (Padova)                *
* http://www-zeus.desy.de/~carlin/TOP.html                                    *
-------------------------------------------------------------------------------