JRA4 Face-to-Face meeting
==================================

14/10/2004 Afternoon Session
==================================

Kostas: stuff for AOB: Den Haag attendance, Handover of knowledge

Introductions:
Pete, me
Ratner: is an applications programmer, without any particular
Gloria: GARR. }
John?: GARR.  } together with Mauro they make 1 post on JRA4
Kostas: is taking over from Javier as project manager.
Edoardo Martelli: CERN. IPv6 engineer
Yee-Ting Li: working at UCL until December.
Robert Stoy: DFN, network engineer.
Nicholas Simar: Dante; coordinating an activity on performance monitoring.
Anand: Dante; interested in BAR in JRA4
Maarten: Charakt: EPCC. software development
Andrea di Donato: UCL. Telecommunications engineer.
Benjamin: CNRS. JRA4.
Boutelle: CNRS. SA2.
???: CNRS. 50/50 JRA4 and ????

Nicholas: Monitoring
====================

No presentation, so extensive use of the board!

What has come out of the monitoring phase so far:
A document has come out on Requirements, from Operational people (GOC, SA1), Middleware (JRA1), generic users (NA4).

This morning we have realised that the requirements are very high level, and we need to introduce some more detailed requirements. Also some are quite unrealistic. For example full-mesh iperf measurements...

In general the most important metric is available bandwidth. Nearly everybody has requested this. Next comes reachability of sites, then RTT. Also packet loss and where the packet loss is occurring.

For most requirements, the most important thing is measurements between end-sites.

A O----O----O----O B

For iperf tests for example, this is not very good. With iperf you are also measuring the host. It might be very interesting to provide middleware and operations information from the network directly.

There does not seem to be a very good solution: the end-to-end solution is the one that everyone wants but, for e.g. iperf, there are many host-tuning parameters.
For concatenating measurements, this is also not very viable, because it requires every node on every path to deploy a compliant monitoring infrastructure.

Another point about the requirements: they are good high-level requirements, but they are high level. We need to refine those requirements, then re-submit the requirements back to the users to ensure that they are still ok. We also need to decide which requirements are the most useful, and which we should implement first.

Also forgot to mention: there are two types of measurement which can be made:
  regularly scheduled measurements, which show you general trends and so on
  on-demand tests, which allow you to concentrate on a particular problem when it arises.

Types of measurement that should be made: RTT-type, Iperf-type, reachability type

Pete: whilst I understand from the pov of operators that they might wish to run ping, most other clients do not know that they want to run, for example, ping; they just want round-trip time.

Gloria: in the requirements, it does say things like "we need ping every 5 minutes". It shouldn't. It should say that they need a particular metric, so that the best way of measuring those things can be chosen.

Second thing Nicolas will present: the EGEE Big Picture document
==============================================

Tried to start from what we knew:
1) Requirements expressed by the Users, the ROCs and the Middleware
2) That the backbones will provide network monitoring data. It is their business to do this.
3) That the end-sites (for whom the network is extra) need something light to deploy.

Is it part of JRA4 to provide tools to make measurements, or do we just say that they have to provide particular metrics?

Pete: probably we should provide an implementation of a measurement infrastructure.
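
One way to read Pete's and Gloria's point above is that a client names the metric it needs and the infrastructure picks the tool. A minimal sketch of that idea follows. Everything in it is an assumption made for illustration: `Metric`, `TOOLS_BY_METRIC` and `choose_tool` are hypothetical names, not part of any JRA4 or NMWG interface, and the tool lists are examples only.

```python
from enum import Enum


class Metric(Enum):
    """Metrics named in the requirements discussion."""
    RTT = "round-trip time"
    AVAILABLE_BANDWIDTH = "available bandwidth"
    REACHABILITY = "reachability"
    PACKET_LOSS = "packet loss"


# Hypothetical mapping from metric to candidate tools, in order of preference.
# Which tools really back each metric is a deployment decision.
TOOLS_BY_METRIC = {
    Metric.RTT: ["ping"],
    Metric.AVAILABLE_BANDWIDTH: ["iperf"],
    Metric.REACHABILITY: ["ping"],
    Metric.PACKET_LOSS: ["ping"],
}


def choose_tool(metric, installed):
    """Return the first preferred tool for `metric` that is installed, else None."""
    for tool in TOOLS_BY_METRIC.get(metric, []):
        if tool in installed:
            return tool
    return None


# The client asks for round-trip time; it never mentions ping.
print(choose_tool(Metric.RTT, installed={"ping", "iperf"}))
```

A requirement written this way ("RTT between sites A and B") stays valid if the tool behind it is later replaced.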
The clients need some things provided by JRA4:
Users: some visualisation
ROCs: also visualisation, and on-demand tests
Middleware: some access to information; with requests between 10 and 1000 times a second.

For the middleware it is not clear whether the intelligence will be in the middleware, or whether we provide the intelligence, or just provide a load of data for them to do what they want with.

There is a big bridge between what the backbones and the end-sites can provide, and what the clients need. This is the GLUE.

The Glue will talk to the monitoring systems with NMWG. What the Glue talks to the top parts with is unknown.

The backbone things will all provide an NMWG interface.
For the end-sites:
  EDG-WP7:
  UK Gridmon:
  piPES end-site nodes:
  IEPM BW:
There are plenty of monitoring infrastructures already.

This is a brief overview of the document. The goal of the document is to provide everyone with a common overview of the problem, and a common understanding of where to go. At a later stage it might become an architecture document.

Third document: ideas for a prototype Glue.
===========================================

What can we do to quickly get something working for December?

A list of the currently available tools and infrastructures.
A list of all the parts that we think we will need soon enough:
  communications
  discovery
A trial.

Description of the diagram in the document.

For December, on the top we would like to put something a little sexier, like some visualisation.
Add on-demand tests

In the wrapper, we might also want to add some caching in the middle for performance issues.
Another item in the wrapper might allow the discovery of monitoring points, and another might allow the making of measurements.

On the top: how does this talk to other things?

For NMWG: This morning, we discussed having a wrapper for particular regions, then a hierarchy of them. The idea being that it might be more scalable.
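
The caching idea for the wrapper could look roughly like the sketch below, given that the middleware may ask between 10 and 1000 times a second while the underlying measurements change far more slowly. This is an illustration under stated assumptions, not a design: `CachingWrapper`, `fake_backend`, the 30-second lifetime and the returned value are all made up here.

```python
import time


class CachingWrapper:
    """Serve repeated metric queries from a short-lived cache.

    `backend` is any callable (src, dst, metric) -> value. In a real
    deployment it would query a monitoring system; that part is assumed.
    """

    def __init__(self, backend, ttl_seconds=30.0, clock=time.monotonic):
        self._backend = backend
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # (src, dst, metric) -> (expiry_time, value)
        self.backend_calls = 0

    def query(self, src, dst, metric):
        key = (src, dst, metric)
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no backend call
        value = self._backend(src, dst, metric)
        self.backend_calls += 1
        self._cache[key] = (now + self._ttl, value)
        return value


def fake_backend(src, dst, metric):
    """Stand-in for a slow monitoring query; returns a fixed made-up value."""
    return 42.0


wrapper = CachingWrapper(fake_backend, ttl_seconds=30.0)
for _ in range(1000):
    wrapper.query("siteA", "siteB", "rtt")
print(wrapper.backend_calls)  # 1: only the first query reached the backend
```

The cache lifetime trades freshness for load: a longer lifetime protects the monitoring systems better but hands older values to the middleware.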
For December: we work in three areas:
1) refinement of requirements
2) looking at the big picture
3) generation of the prototype

Pete: Assuming that there is a pre-existing defined information structure (there was in Datagrid, say). Not sure how much EGEE has taken on R-GMA, or indeed any pre-existing information infrastructure.
*** If there is already something like this, then is it worth, as an early demonstration, putting the data into the JRA1 information system that already exists?
WP7 (specifically Franck) wrote much visualisation stuff taking data from R-GMA. If we publish into R-GMA, then we can use this visualisation stuff for free. Blimeh.

Security? aaaaaaaaaargh
Pete: this needs to be done early: if anyone is expert, or wants to become expert... then feel free. But one point is: access to the data should not be EGEE specific. It must be general.

Pete's comments on the presentation of the first paper:
Not read it. If the presentation said what was in the paper, then excellent. That is exactly perfect.
Just to add a few points:
The biggest step forward is that we should have the NMWG interface everywhere. This needs to be done by Christmas, just because it hasn't been done before. JRA4 shouldn't be driven on examining fine detail, but on the general philosophy of getting people talking in the same interface.
For packaging tools, there is nothing new. These things need to be reused.
We need an implementation soon, because it's too useful to do that.

Deliverable for MPM is due in December:
The deliverable is: Definition of interfaces with simple authorisation. The latter two words... are interesting.

Anand: Bandwidth allocation and reservation
===========================================

Fortunately, he has a presentation...

Second session
==============

Edoardo Martelli: IPv6
======================

Pete: why does JRA4 have anything to do with IPv6?
It's political. The Commission put a lot of funds into rolling out IPv6 on Geant...
so they would like to see it used.

Two things we would think about doing:
1) General awareness raising. How it's useful, how to write code which is IP version agnostic.
2) Testing of IPv6

Formal milestone and deliverable review
========================================

Pete: What's the purpose: I think we need to review the project plan anyway
Also, Kostas needs to write a more detailed plan for the rest of the project.

AOB (we'll do the ones we've thought of today)
==============================================

1) EGEE conference in Den Haag
There's a joint JRA1,3,4 meeting.

2) PTF Project Technical Forum

Next face-to-face
Probably ought to go to Paris, as it's easy for everyone to get there... though expensive.
Run parallel sessions for at least half of it

3) Handover: I should hand over to... probably whoever it is at CNRS that takes over the WP7 stuff. The decision should be made tomorrow, when the others from CNRS are here.

4) Timesheets: Meh. Apparently Andrea did his on Brian's account. So I better had too.

5) Questionnaire from Dante to EGEE. The subactivity needs to look at it as part of its work.

Then 15/10/2004
=========================================

Critique of activities: NPM
=========================================

Things that JRA4 should write software to do:
Answer the questions that the Grid Resource Manager will ask
Put data into Grid information services
Diagnostic tool for GOCs (this would be the first thing to provide)

Proposals on the board:
What we want to achieve and by when (in Pete's opinion, and not deliverables):
* Standardised NPI publication
* Early release deployment at end sites, providing facilities >= WP7
* Diagnostic client GOC (& NOC)
* Provision of Information to HLM (once in GIS?)