A collaborative project to demonstrate end-to-end traffic management and high performance data transport applications required for Grid operations

Investigators:
C. Cooper, D. Salmon (CLRC, RAL)
R. Tasker (CLRC, Daresbury Laboratory)
P. Clarke* (University College London - Physics)
S. Bhatti, J. Crowcroft (University College London - Computer Science)
R. Hughes-Jones (Manchester University - Physics)
J. Sharp, R. Samani (UKERNA)

Introduction

There are at present many initiatives in the research community to develop "computing Grids" for scientific data processing across a wide variety of applications. Grids are based upon middleware layers that interconnect data and applications in a seamless way across a widely distributed environment. Core to the Grid fabric is a high capacity network offering the services needed to manage the different classes of traffic flow required for Grid operations. Excluding the planning and provision of the infrastructure itself, Grid networking issues can broadly be broken down into the following areas:
· Network services based upon traffic engineering, such as managed bandwidth and Quality-of-Service (QoS) provision.
· High rate, high volume, robust data transport applications.
· Network information services.
· Middleware access to data replication services.
All of these areas are manifestly crucial to PPARC science goals, and therefore feature prominently in the consortium proposal for a DataGrid for Particle Physics (PP-DataGrid), where support is sought for work in all of the application specific areas. This proposal is submitted in concert with the PP-DataGrid proposal but is complementary: it focuses upon the first two items, which embody generic core e-science aspects of networking (i.e. those areas common to all e-science operations).
Traffic management & QoS

Requirements from the PP-DataGrid application perspective include:
· Short-term data set replication: to be able to replicate data sets (~1-100 TBytes) between Tier-N sites in ~1 day (0.1-10 Gbit/s). This leads to the requirement for short-term end-to-end managed bandwidth reservation on demand.
· Pseudo-continuous data replication: Tier-N sites need to replicate processed data to Tier-M sites and vice versa, at a continuous equivalent rate of typically 1-500 Mbit/s. This leads to the need for managed bandwidth services.
· QoS: in the longer term more sophisticated services are required to differentiate classes of traffic, ranging from high quality interactive, control and video applications through to less time critical applications. This leads to the need for production Quality of Service provision.
· Traffic classification: to be able to configure packet classification at ingress to a domain based upon client requirements.
· Class based routing and forwarding behaviour: the ability to handle packets within routers differently according to the associated traffic class.
· Inter-domain: development of "Service Level Agreement" mechanisms to operate at inter-domain interfaces.
· Scalability: solutions which scale easily with the complexity of the network and the number of clients.
Many techniques are potentially available for implementing traffic management:
- Packets can be examined at ingress to a network and classified using various pieces of information from the IP header.
- Diffserv allows different next-hop behaviour for each traffic class.
- RSVP can be used to reserve resources within a network.
- Congestion control can be addressed using intelligent packet dropping algorithms, such as weighted random early detection (WRED), to signal back to applications.
- Explicit congestion notification (ECN) is designed to interact with suitably aware clients and servers.
- Multi Protocol Label Switching (MPLS) is one of the important technologies that is likely to be used for, or as part of, the underlying traffic engineering.
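The first and fourth techniques above can be sketched in a few lines of code. The following Python fragment is purely illustrative: the traffic classes, port mappings and thresholds are invented for this sketch and are not taken from any router configuration or from this proposal.

```python
"""Sketch of DiffServ-style ingress classification and a WRED drop decision.
All class names, ports and thresholds here are illustrative assumptions."""
import random

# Per-class WRED profiles: (min_threshold, max_threshold, max_drop_probability),
# thresholds in packets of average queue depth.
WRED_PROFILES = {
    "EF":  (60, 80, 0.02),   # expedited forwarding: drop late and rarely
    "AF1": (30, 60, 0.10),   # assured forwarding: intermediate treatment
    "BE":  (10, 40, 0.30),   # best effort: drop early and aggressively
}

def classify(src_port: int, dst_port: int) -> str:
    """Toy ingress classifier using IP/transport header fields (here, ports)."""
    if dst_port == 5004:          # e.g. an RTP media flow (assumed port)
        return "EF"
    if dst_port in (20, 21):      # bulk FTP-style transfer
        return "AF1"
    return "BE"

def wred_drop(avg_queue: float, traffic_class: str) -> bool:
    """WRED: below the min threshold never drop; above the max always drop;
    in between, drop probability rises linearly up to max_drop_probability."""
    lo, hi, p_max = WRED_PROFILES[traffic_class]
    if avg_queue < lo:
        return False
    if avg_queue >= hi:
        return True
    p = p_max * (avg_queue - lo) / (hi - lo)
    return random.random() < p
```

The point of the per-class profiles is exactly the "class based forwarding behaviour" requirement above: the same queue depth that guarantees delivery for the EF class is already dropping best-effort traffic.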
Project aims

The mechanisms mentioned above are themselves not new, and have already been demonstrated within limited environments and upon homogeneous domains (the same supplier's routers within a single administrative domain). However, their use for end-to-end services across a heterogeneous WAN consisting of multiple domains is not straightforward, and at present there is no "production" method with which these services can be provided for live end-to-end applications. The thrust of this project is therefore to meld existing knowledge with a focused application (the PP-DataGrid) in order to demonstrate the end-to-end services needed for Grid operations across the WAN, both within the UK and to both the US and Europe. The PP-DataGrid will be the specific driving application, although the results will be relevant to all other Grid applications. We believe that having such a clear and high profile application is crucial to provide the focus for concrete deliverable targets. The project will make extensive use of the SuperJANET4 Development Network (SJDN), which is intimately connected to the SuperJANET academic network (ac.uk) upon which live Grid applications will depend. This is a substantial infrastructure which will be available to us through the involvement of UKERNA directly in this project. In particular, UKERNA have procured routing equipment which is MPLS capable specifically with this work in mind. In more detail, the project aims are:
· To understand the applicability and limitations of various traffic engineering tools for implementing the traffic management services required by the Grid.
· To specifically understand the use of MPLS in this context.
· To address the inter-domain interface problems.
· To demonstrate end-to-end network services over several domains within the UK.
· Where possible, to demonstrate end-to-end network services to the USA in collaboration with leading US Grid groups.
· Where possible, to demonstrate end-to-end network services to CERN as part of our EU-DataGrid commitments.
The detailed breakdown of tasks and deliverables is given in the appendices. The infrastructure and routing equipment which will be required for this project are described in detail in the appendices, along with a detailed costing.
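The "managed bandwidth" services referred to in these aims ultimately come down to admitting traffic at an agreed rate at an ingress point. A token bucket is the standard way to express such a contract; the sketch below is illustrative only (the class, rates and packet sizes are invented for this example, not part of any deliverable).

```python
"""Minimal token-bucket sketch of a managed-bandwidth contract: traffic is
admitted at up to `rate` bytes/s, with bursts of up to `burst` bytes.
Caller supplies a monotonically increasing clock via `now` (seconds)."""

class TokenBucket:
    def __init__(self, rate: float, burst: float):
        self.rate = rate      # sustained rate, bytes per second
        self.burst = burst    # bucket depth (burst allowance), bytes
        self.tokens = burst   # bucket starts full
        self.last = 0.0       # time of previous decision

    def allow(self, nbytes: int, now: float) -> bool:
        """Admit a packet of nbytes at time `now` if enough tokens remain."""
        # Refill tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

# Example: a flow contracted at 1 Mbyte/s with a 3000-byte burst allowance
# admits two back-to-back 1500-byte packets, then must wait for refill.
tb = TokenBucket(rate=1_000_000, burst=3000)
print(tb.allow(1500, now=0.0))   # True  (burst credit)
print(tb.allow(1500, now=0.0))   # True  (burst exhausted)
print(tb.allow(1500, now=0.0))   # False (must wait for tokens)
```

The same shape of contract, expressed per traffic class, is what an inter-domain Service Level Agreement would have to pin down at each domain boundary.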
Transport Applications
Grid data rates will exceed 100s of Mbit/s over long latency routes. Such transfers will rely upon the availability of high rate, high volume, reliable data transport applications. All three elements are key: the combination of rate and volume is clear, but it is also essential that such applications can recover from faults and complete the data transfer. The transport protocols will also have to deal with efficient replication and update to multiple sites. It is already apparent that protocols such as "standard" TCP based FTP are unlikely to be adequate, and therefore today we do not know how to satisfy the DataGrid demands. This is an area where significant work has been done in the CS community based upon modelling and controlled measurements; however, it is vital to understand these effects in the context of real Grid traffic patterns running on the WAN. The issues of transport applications are directly related to those of traffic management, since applications need to interact with QoS services.

Project aims

The aims of this part of the project are:
· To investigate a variety of high performance data transport mechanisms, including advanced TCP and non-TCP applications.
· To demonstrate high performance data transport in a live Grid context, aiming for > 1 Gbit/s.
· To understand how to integrate suitable applications with the traffic management services previously described and the higher Grid middleware layer.

Industrial Collaboration

UKERNA

UKERNA provision and manage the SuperJANET4 academic network in the UK. UKERNA are already highly involved with e-science activities and have made clear their wish to support Grid operations upon the SJ4 backbone. In the wider sense UKERNA have identified the need to develop QoS services and, for example, have constituted a working group to bring together expertise from different organisations in this area. As part of this strategy UKERNA have great interest in the development of traffic engineering based upon MPLS. The routing equipment which is now deployed for SJ4 was procured to be MPLS capable for just this purpose. UKERNA fully support the opportunity to develop this in collaboration with the PP-DataGrid project. UKERNA will contribute the following to this project:
- The routing equipment required and the associated maintenance support.
- The annual costs of the testbed fibre infrastructure.
- Engineering effort from within the JANET Network Operations and Service Centre (NOSC).
- Project management effort from within the UKERNA Strategic Technologies Group, specifically Jeremy Sharp and Rina Samani.

CISCO

<Awaiting words from Jane Butler>

Links and external collaborations

Request for support from PPARC

Staff posts:
- We request 2.0 FTE staff posts to work on traffic management and QoS for two years. 1.5 FTE will be used specifically to underpin the development of end-to-end services using MPLS, in line with the specific objectives of TM1 and TM2, which embody the immediate strategic interests of UKERNA and CISCO. A further 0.5 FTE will be used to support other QoS work, such as Diffserv, and the international collaborative work.
- We request a 1.0 FTE staff post to work on data transport applications.
Equipment:
- RAL Nortel Edge router (fractional cost to this project): £3,000
- PC ancillary equipment and internet access, £2,000/site: £8,000
- Specialised equipment, £3,100/site: £12,000
- Test equipment at three C-PoPs: £17,000
- Edge and backbone domain routers.
- Router support at sites and software updates: £25,000
- Travel and subsistence: £4,000
Total request to PPARC: £319K

Infrastructure for local loops:

We do not request significant resources for local loops (connections between SJDN and sites) in this proposal. This is for two reasons:
- The work proposed here is aimed at developing end-to-end mechanisms. Whilst this work does rely upon suitable routing equipment being available, it does not inherently rely upon local loops in order to meet its objectives, although having such loops makes demonstrations both easier and more convincing. Nevertheless, we do not believe that it is reasonable for local loop costs to dominate the equipment costs in this proposal.
- It is likely that such local loops will in any case be needed in the wider context of the PP-DataGrid testbed deliverables. We therefore assume that some infrastructure will be available from that project.
The total cost of local loops is £800,000 at full price, although we expect significant discounts. We will seek the majority of this elsewhere. If difficulties still remain then (i) the test domains can be centred at the SJDN C-PoPs, hence avoiding much of the cost, and/or (ii) local loops may be rented for limited periods corresponding to the end-to-end tests. Therefore in this proposal we request only minimal costs of £50,000 over two years.
Summary of resources requested in this proposal. Relation to other proposals.

The resources sought here are targeted at the development and demonstration of core networking services. As such, the work described here can form the basis for a programme to be submitted to the OST/DTI generic e-science support lines. We believe that such a submission would be significantly enhanced by the demonstration of PPARC support for both the generic and collaborative aspects of this proposal. The resources sought here are complementary to those sought through the PP-DataGrid proposal, which would support the application specific areas of networking (testbed application traffic, support at Grid sites, use of services through middleware).

Appendix: Expertise and resources within the proposing collaboration
UCL

Peter Clarke is a Reader at UCL and leads the LHC experimental group, working primarily on computing requirements of the LHC programme. His group interests are OO software design, Grid data transport applications, Grid traffic management and network information services. He chairs a PPARC networking committee (PPNCG). Within the EU-DataGrid project he is a member of the Project Technical Board and represents the UK in the network work-package (WP7). Within the UK PP-DataGrid project he is responsible for networking coordination, and is a member of the interim project management board. He will dedicate at least 50% of his research time to this project, and possibly more if circumstances allow.

Saleem Bhatti is a Lecturer in the Department of Computer Science at UCL. His areas of research include QoS (applications and networks), network management, network security and mobile systems. He was on the programme committee of the 7th International Workshop on Quality of Service (IWQoS99) and is on the programme committee of Networked Group Communication 2001 (NGC2001).

J. Crowcroft is a Professor in the Department of Computer Science at UCL. His research is in multimedia communications. He is a member of the ACM, a member of the British Computer Society, a fellow of the IEE and of the Royal Academy of Engineering, and a Senior Member of the IEEE, as well as a member of the editorial teams for Computer Networks, Transactions on Networking, IEEE Networks, Monet and Cluster.

We anticipate an additional Grid networking post through PP-DataGrid direct support. Approximately 50% FTE of this will be directly connected to the application side of this work. We have applied for at least one studentship to be provided through the PPARC industrial collaboration scheme (CASE), and/or the e-science studentship scheme.

Manchester University
Richard Hughes-Jones is a researcher in the Physics department. His interests are in areas of computing and networking within the context of the LHC programme, including the performance, network management and modelling of Gigabit Ethernet switches. He is secretary of the PPNCG. Within the UK PP-DataGrid project he is a member of the management board and a member of the networking workpackage (WP7). He is currently investigating the performance of LANs, MANs and SuperJANET4 using UDP and TCP flows. He will dedicate approximately 50% FTE to this project.

We anticipate an additional Grid networking post through PP-DataGrid direct support. At least 50% FTE of this will be directly connected to investigating the performance of the network under different QoS conditions in relation to the high performance data transport mechanisms. We have applied for at least one studentship to be provided through the PPARC e-science studentship scheme, and would expect the student to contribute to this work.

Central Laboratory of the Research Councils (CLRC)
Chris Cooper is currently network strategist at the Rutherford Appleton Laboratory. He is a consultant to UKERNA and chairs a 'think tank' on the introduction of QoS into SuperJANET. He holds a visiting professorship at Oxford Brookes University, where he teaches masters courses in networking. His research interests are in all aspects of multiservice networking, recently extended to multiservice middleware, in which context he is a co-investigator in the EPSRC project 'Visual Beans'.

David Salmon works in the Scientific Computing Support group of the Information Technology Department of the Rutherford Appleton Laboratory. Prior to this he worked for UKERNA, where latterly he was Operations Manager for the European academic and research backbone network TEN-34, a post contracted by DANTE. He is currently involved in network related activities for the DataGrid project and is investigating MPLS in a small lab testbed based on Linux systems. He attends the TERENA TF-NGN meetings as a representative of UKERNA. He is a member of the PPNCG and a member of the EU-DataGrid networking workpackage (WP7).

Robin Tasker is Head of Network Development at the Daresbury Laboratory. His research and development interests include wide area QoS across the Internet, the lower-layer (switching) environment and network monitoring. He has been a voting member of the IEEE 802.1 (Internetworking) working group since the late 1980s, and since 1997 has led the UK representation to ISO/IEC JTC1 SC6 (Data Communications) meetings. He is a member of the PPNCG. Within the EU-DataGrid he is a member of the networking workpackage (WP7), where he is managing the monitoring activity.

C. Cooper and R. Tasker will provide expertise on QoS and traffic engineering in both IP and lower-layer network environments. D. Salmon is already active on a small-scale MPLS pilot project using Linux systems as routers to emulate a backbone network with both core and edge routing elements. The initial aim is to understand label based routing and to combine this with existing QoS/CoS techniques for traffic classification, prioritisation and rate control to implement a protected bandwidth path across the network. Bulk data-transfer applications will be tested over the protected path, and experience gained here will be extended to the wider area tests once the SuperJANET development network has been suitably equipped and commissioned.

UKERNA

Jeremy Sharp is manager of the UKERNA Strategic Technologies Group, which is part of the Network Development Division. The role of the Strategic Technologies Group is to provide a view of how network technologies and applications will shape the future of JANET and SuperJANET. In particular it is responsible for the development and implementation of initiatives (such as the present SuperJANET4 Development Strategy and its implementation) that lead to the development of specific new services to the community. Prior to joining UKERNA in 1992, Jeremy worked in the Telecommunications section of the Rutherford Appleton Laboratory.

Rina Samani is the Technology Development Manager within the Strategic Technologies Group in the Network Development Division. She is responsible for managing specific strategic programmes and tracking emerging application areas and technologies, such as Internet2 developments, and for managing development programmes to ensure that the underlying JANET network service is able to meet any novel demands of new applications.

Appendix: Tasks & Deliverables

Traffic management
Task TM1: To understand the use of MPLS as a traffic engineering tool within the core SJDN.
Description: In this phase we will understand the use of MPLS running on a variety of CISCO routers. We will develop suitable IP-to-label mapping at ingress and use MPLS as a basis to configure traffic management for "guaranteed" bandwidth and QoS. The usefulness of MPLS in this context will be assessed. Test equipment is to be situated at participating sites and connected to the C-PoPs by local loops.
Deliverables:
- Month 3: Procurement and installation of equipment.
- Month 6: Initial demonstration of sustained throughput for different traffic classes.
- Month 9: Completion of work.
- Month 12: Final report.
Risks: Timely procurement of SJDN routing equipment. Procurement of local loops and routers to connect to C-PoPs.

Task TM2: To demonstrate end-to-end traffic management across multiple domains using live Grid traffic.
Description: This is the main phase of the project within the UK, necessitating the solution of inter-domain issues and leading to a demonstration of end-to-end services between Grid user sites. In this phase we extend the configuration to three or more additional independent test-domains peered with SJDN. These test-domains will connect to Grid site end points, and hence act as the entry point for Grid application traffic. Traffic management will be configured on all domains. Inter-domain traffic management issues will then be addressed in order to configure end-to-end services. Demonstrations will use live Grid traffic where possible. If and where possible we will also negotiate to include suitable MANs between SJDN and sites. The investigation will focus upon the use of MPLS early on, in accordance with the strategic aims of the industrial collaborators. The work will be widened to address further QoS issues in the latter stages, including Diffserv, WRED and ECN.
Deliverables:
- Month 12: Initial demonstration of end-to-end guaranteed bandwidth and QoS. Interim report. Presentation of results at network venues.
- Month 18: Advanced demonstration including use of other QoS techniques.
- Month 24: Final report. Presentation of results at networking venues.
Risks: Timely procurement of routing equipment for "test" domains at end user sites. Agreement by Grid applications to route traffic over the test system. Multi-vendor issues. Capabilities of MANs.

Task TM3: To demonstrate end-to-end traffic management to the USA in collaboration with leading US Grid groups.
Description: In this phase we will be collaborating with leading Grid development groups in the USA. We will seek to configure the same types of traffic management as per TM2 between end points in the different regions. The scope of the work will explicitly include QoS issues. The main issue will be the availability of suitable transatlantic connections upon which such development can take place.
Deliverables: Deliverable dates must necessarily be less concrete at present, until the availability of a suitable transatlantic connection is established.
- Month 12: Interim report on progress and tests made to date.
- Month 24: Final report.
Risks: Transatlantic interconnection between SJDN and ESnet (or Abilene) upon which such routing development work can take place.

Task TM4: To demonstrate end-to-end traffic management to CERN and other European sites.
Description: We will seek to configure the same types of traffic management as per TM3 to CERN or other European sites. The main issue here is the availability of suitable infrastructure within the Geant network and then into the CERN site. This task is within the scope of the EU-DataGrid project.
Deliverables: Deliverable dates must necessarily be less concrete at present, until the availability of a suitable connection is established.
- Month 12: Interim report on progress and tests made to date.
- Month 24: Final report.
Risks: Availability of suitable infrastructure within Europe.

Transport Applications:
Task TP1: Demonstrate high performance transport applications across the WAN in a live Grid context, with a target of 1 Gbit/s.
Description: Characterise the performance of standard FTP and multiple TCP stream FTP applications over the WAN in the context of Grid traffic. Deploy the TCP modifications needed for high bandwidth-delay routes (much work already exists); the focus will be on utilisation in a Grid context. Investigate strategies for reliable transport. Investigate non-TCP based transport applications. Provide an interface for such new applications to the Grid middleware layer.
Deliverables:
- Month 9: Demonstration of reliable transport at > 100 Mbit/s over the WAN.
- Month 18: Demonstration of reliable transport at > 1 Gbit/s.
- Month 24: Final report.
Risks: None known.

Appendix: The SuperJANET4 development network

The diagram shows the configuration of the SuperJANET4 development network (SJDN) and the proposed links to the sites involved in this proposal. The C-PoP sites refer to the SuperJANET Core Point-of-Presence sites within the MCI-Worldcom backbone.

Appendix: Technical layout and costing

Equipment Configuration and Costing

This section describes the routers and test equipment required and then presents the current costs for each item. The topology of the external MPLS domains at the test sites and the SuperJANET Development Network core MPLS routers located at the Leeds, London, Reading and Warrington PoPs is shown in Figure A.1. Some of the tests proposed in the investigation require the injection of test and background load traffic into the core MPLS routers at the PoPs, in order to suitably load the core network. In addition, all the test equipment at the PoPs requires IP access from the production SuperJANET4 network. Figure A.2 gives a more detailed diagram of the equipment and interfaces required for this.
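To put Task TP1's 1 Gbit/s target in perspective, a back-of-envelope bandwidth-delay-product calculation shows why "standard" TCP FTP struggles over long latency routes and why enlarged windows or multiple parallel streams are needed. The RTT figure below is an assumed, illustrative transatlantic value, not a measurement from this proposal.

```python
"""Back-of-envelope figures behind Task TP1: the TCP window a single stream
needs to sustain 1 Gbit/s over a long-latency path. Numbers are illustrative
assumptions, not measurements."""

def bdp_bytes(rate_bits_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to keep the pipe full."""
    return rate_bits_per_s * rtt_s / 8

rtt = 0.120                      # ~120 ms UK-US round trip (assumed)
target = 1e9                     # 1 Gbit/s target from TP1
window = bdp_bytes(target, rtt)  # TCP window / buffering needed, in bytes
print(f"window needed: {window/1e6:.0f} MB")          # 15 MB

# With a classic 64 KB TCP window, one stream is throughput-limited to:
one_stream = 64 * 1024 * 8 / rtt                      # ~4.4 Mbit/s
streams = target / one_stream                         # streams to reach 1 Gbit/s
print(f"parallel 64 KB streams needed: {streams:.0f}")
```

A 15 MB window is far beyond unscaled TCP's 64 KB limit, which is why the task investigates both advanced TCP (large, scaled windows) and multiple-stream FTP as routes to the target rate.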
© 2001-2003, Yee-Ting Li, email: ytl@hep.ucl.ac.uk, Tel: +44 (0) 20 7679 1376, Fax: +44 (0) 20 7679 7145, Room D14, High Energy Particle Physics, Dept. of Physics & Astronomy, UCL, Gower St, London, WC1E 6BT