(I appear to have got confused near the middle of this list, which is why it doesn't seem to match reality)

Victor Reijs - Multi domain monitoring infrastructure across DANTE
Matt Zekauskas - Internet 2. Does some work for Abilene
Eric Boyd - Internet 2, e2e pi. Before that did various monitoring tools
Paul Bright-Thomas - Helping Clarke ---? - Ukerna. Talking to researchers to see what they would do
Duncan Rogerson - Ukerna. Looking at architecting. Looking at getting more into monitoring than just packet scanning
Henry Hughes - Helping ^^
Mark Godfrey - Ukerna. Looking at end to end performance monitoring
Warren Matthews - SLAC. Working on ESNET. Working on Babar, and transferring lots of data around to computing resources.
Yee-Ting Li - PhD at UCL. Monitoring. Also helping piPES. Also looking at general monitoring
Mark Leese - Looking after UK Gridmon.
Nicolas Simar - Dante. Looking at performance monitoring. Trying to look at setting up a system to transfer monitoring information across different domains.
Paul Mealor

David - Ukerna
==============
Looking at including Mark Leese's monitoring, then using Duncan, Mark's for publishing with OGSA perhaps.
Ukerna are monitoring service level stuff: capability, uptime.
Clarke: from apps perspective. For UK e-science. Mostly from iperf type stuff: flow.

Eric Boyd - piPES / AMI/ OWAMP
==============================
AMI: Abilene measurement infrastructure
Internet 2 - non-profit setup. Stitches together some universities and corporations. Allows researchers to collaborate. Also some developers.
e2epi seeks to bring all the efforts together.
Trying to build a tool which will tell you:
    Where the problem is
    The type of problem
    Who to call
4 things:
    e2e pi: seeking to solve the end to end performance problem
    blends into Internet 2 apps group and BNI
    Build a framework to look at link status along an end-to-end path
    Contribute this to the Abilene infrastructure.

BNI AMI
-------
Goal: Instrument next generation network with extensive performance measurement capabilities.
They have four measurement boxes next to each router. They have a powerful PMP next to each POP.
Start with Abilene in the center. Then try to build out to piPES.
Collaboration with BNI, Engineering and E2E.

E2E piPES
---------
Goal is to do all the bits that others don't want to do, to get a complete e2e pi system.
If you're going to instrument Abilene, you may as well do the campuses as well, and so may as well do the hosts too.
The hope is that if a particular GigaPOP or whatever gets *correct* warnings from the system, then they can trust that the signal to noise ratio is good. Therefore they may be more willing to fix problems for people for whom they have no responsibility.
Shibboleth is being developed in house, which is why they use it. They are looking at other ways of doing it.
Public access to monitoring results: maybe access to read results should be restricted like write access is.
Would like to make it as modular as possible so that people can bung bits in if they do something better, say. Also would like to make it as Open Source as possible.
Ukerna have had internal talks about this security. If you anonymize the information for lower access people then this is acceptable.
Thinking:
    Flow data anonymized.
    Looking at route information. Like Looking Glass.
    Looking at SNMP data from the routers.

Testing/Analysis Engine
-----------------------
Problem: Encode Matt Zekauskas' brain :)
Questions:
    What measurement results are acceptable for a given application (app family)?
    What tools generate those results?
    How do you handle incomplete data?
    How do you rank multiple result-generators?
    What is the iterative decision tree to understand e2e problem?

Measurement schema
------------------
NMWG. Working on schemas. Also next stage document.
Measurement types?
    BW, latency, loss, jitter
Measurement Units?
    seconds vs microseconds
Map tools to measurements
    OWAMP -> 1 way latency
Measurement metadata?
Database table design?
MRTG works by losing resolution over time. Want DB to scale and to work across different domains.
Don't particularly want one super-scheduler. Want loose scheduling where you trust that there is a lot of free space so the PMPs can do local scheduling.

Access, Auth, Auth
------------------
Roles for access:
    Standard end user
    Near neighbour (test buddy)
    NOC staff/network engineer
Shibboleth for implementation
    Each campus decides who is in each role
    Others trust campus designation
We absolutely do not want piPES to be used to perform a DoS attack or some such.

Architecture v1
---------------
Starting with scheduler, admin interface, pmp, db
Starting with: Testing between AMI PMPs with OWAMP, and storing it in a database. Database is published with a web service.
Need this extended out to iperf, traceroute, snmp as well.
Four machines on each router on AMI.
Internal resources:
    Eric Boyd - piPES development project
    Jeff Boote - OWAMP, piPES development
    Prasad Calyam - piPES development
    Chris Heermann - API
    Matt Zekauskas - AMI
    Susan Evett - Documentation
    Russ Hobby - Campus deployment
    George Brett - Schema, "grid service"
    Warren Matthews outside
    Us, too.
Problem: existing platforms are not interoperable (SURVEYOR, RIPE...)
Solution is that we make the standards.

OWAMP
-----
Current draft: draft-ietf-ippm-owdp-05.txt
Sample implementation: http://owamp.internet2.edu

Abilene OWAMP deployment
------------------------
2 overlapping full meshes (IPv4, IPv6)
11 nodes

===============================================

Dante already has UK e-science certificate authorities.
Within piPES there is not particularly any authentication/authorisation stuff. Within Internet 2 they are developing Shibboleth.
VOMS: Virtual organisation membership? service. Perhaps this should be looked at because it seems to match our requirements. But there are other complicated security systems.

Dante are building the monitoring points first, plus the domain tool. The driver interface will allow the domain tool to talk to any one-way measurement tool. E.g. the OWAMP driver will generate settings, SCP them to a RIPE box, then extract the data later.
For the trial, data is stored on the RIPE boxes, but they will develop a new interface to allow the domain tool to extract data from wherever.
If there was a common interface between piPES and the Dante stuff for extracting results, then this would be a useful common point.

Warren Matthews
===============
There is a whole world of difference between bandwidth measurements from iperf, bbcpmem, bbcpdisk, bbftp (decreasing in that order)
Guthrie, Arena
    A web service interface to extract particular NMWG properties, via SOAP.
Arena
    Is a very e2e piPES type thing: a culprit database and a list of people you should talk to about problems.
MonaLisa
    FrontEnd visualisation
    http://monalisa.cern.ch/MONALISA
    SLAC have plugged in their measurements to this.

Diurnal changes
---------------
Either performance varies during the day, or it doesn't (which is a special case of variation=0)
Either performance varies during an hourly bin, or it doesn't (which is a special case of variation=0)
Therefore parameterise each bin, and changes in the parameters indicate a problem
-> Calculate median and standard deviation of last five measurements in bin (i.e. the bin Monday 7-8pm, for the last five weeks)
    If it's out by more than 1 s.d., worry
    If it's out by more than 2 s.d., panic

Victor Reijs
============
HEAnet: Ireland's national education & research network
Cooperation with I2, TF-NGN, NIMI, GGF/OGSA, E2E piPES, Monalisa

===========

So Eric Boyd's idea is: PMCs (performance measurement controllers), which say they can do whatever measurements of a particular characteristic. And *they* control the PMPs. This has an interface which gives the *option* of on-demand measurements or schedule changing, and provides results.
So for a RIPE backend, you can just expose the RIPE database. If you really want a particular tool, then you can annotate the request. And the database interface would annotate the results with the name of the tool used or whatever. It would be nice if you could annotate results with exactly how the measurements were done... say how many concurrent iperf tests were done.
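
A rough sketch of the annotation idea above, to make it concrete. Everything here is hypothetical: the class names (MeasurementRequest, MeasurementResult, ToyPMC), the field names and the fallback rule are illustrative assumptions, not anything piPES or Dante defined. The request optionally names a tool, and the result is annotated with the tool actually used plus a free-form record of how the measurement was done.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class MeasurementRequest:
    characteristic: str                    # e.g. "bandwidth", "one-way-delay"
    src: str
    dst: str
    preferred_tool: Optional[str] = None   # optional annotation on the request


@dataclass
class MeasurementResult:
    characteristic: str
    value: float
    unit: str
    tool: str                              # annotation: tool that produced the value
    details: Dict[str, object] = field(default_factory=dict)  # how it was measured


# A backend takes (tool, request) and returns (value, unit, details).
# In practice this might wrap a PMP or an existing database; here it is a stub.
Backend = Callable[[str, MeasurementRequest], Tuple[float, str, Dict[str, object]]]


class ToyPMC:
    """Hypothetical controller: advertises tools per characteristic."""

    def __init__(self, tools: Dict[str, List[str]], backend: Backend):
        self.tools = tools        # characteristic -> tool names, first is default
        self.backend = backend

    def measure(self, req: MeasurementRequest) -> MeasurementResult:
        available = self.tools[req.characteristic]
        # Honour the requested tool if we have it, otherwise use the default.
        tool = req.preferred_tool if req.preferred_tool in available else available[0]
        value, unit, details = self.backend(tool, req)
        return MeasurementResult(req.characteristic, value, unit, tool, details)


def stub_backend(tool, req):
    # Placeholder numbers only, NOT real measurements.
    return 0.0, "Mbit/s", {"concurrent_tests": 1}


pmc = ToyPMC({"bandwidth": ["iperf", "bbftp"]}, stub_backend)
result = pmc.measure(MeasurementRequest("bandwidth", "hostA", "hostB", "bbftp"))
print(result.tool, result.details)
```

The point of keeping "details" free-form is that each tool can record whatever matters for interpreting its numbers (the concurrent iperf count being one example) without the interface needing to know about every tool.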