Man's mind, stretched to a new idea, never goes back to its original dimensions.-Oliver Wendell Holmes
This chapter describes how services are provided using MPLS. Included in this description is a high-level overview of the pieces used to provide these services, including specific protocol support functions. Specifics of the protocols and components of MPLS used in providing these services are discussed in greater detail in earlier sections of this book.
Basic services in MPLS are effectively enabled using hop-by-hop LSPs established using LDP in the intranet (IGP) case and MPLS-BGP in the Internet (EGP) case.
Note that this process assumes that label distribution is downstream unsolicited. In the event that the label distribution mode is actually downstream on-demand, labels are distributed as described except that the local LSR requests labels from the next-hop LSR peer for each route table entry.
If conservative label retention is used, the LSR retains those labels that it will use and releases those that it will not use. Otherwise, any number of labels may be retained, up to and including all labels received. Using the labels retained specifically for each given FEC's next hop, an LSR constructs an NHLFE and an FEC-to-NHLFE map (FTN) if the LSR is an ingress for the LSP, an Incoming Label Map (ILM) if the LSR is not an ingress for the LSP and will receive only labeled packets for the corresponding FEC, or both if the LSR expects to receive both labeled and unlabeled packets from upstream routers that will be merged onto the same LSP.
If the LSR has constructed an FTN, it may act as the ingress for any unlabeled packets it receives by using a matching NHLFE (if a valid one exists). An LSR may act as an egress for labeled packets it receives having an ILM and no matching NHLFE. An LSR is expected to perform some label operation (push, pop, or swap) if it receives a labeled packet for which it has both an ILM and a matching NHLFE. An LSR may be expected to forward unlabeled packets as unlabeled packets if it has no FTN or no matching NHLFE.
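The forwarding roles above can be sketched in a few lines of code. This is a hypothetical illustration only: the table names (`ftn`, `ilm`), the `Nhlfe` structure, and the addresses are invented, and real implementations hold far more state (outgoing interface, label stack depth, and so on).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Nhlfe:
    next_hop: str             # illustrative next-hop identifier
    operation: str            # "push", "pop", or "swap"
    out_label: Optional[int]  # None when the operation is "pop"

# FTN: maps a FEC (here, a destination prefix) to an NHLFE (ingress role).
ftn = {"10.1.0.0/16": Nhlfe("lsr-b", "push", 100)}

# ILM: maps an incoming label to an NHLFE (transit role);
# an ILM entry with no matching NHLFE (None) marks the egress role.
ilm = {100: Nhlfe("lsr-c", "swap", 200), 200: None}

def forward(label: Optional[int], fec: str) -> str:
    if label is None:
        # Unlabeled packet: act as ingress if a matching FTN entry exists;
        # otherwise forward the packet unlabeled.
        entry = ftn.get(fec)
        if entry is None:
            return f"forward unlabeled toward {fec}"
        return f"{entry.operation} {entry.out_label} -> {entry.next_hop}"
    # Labeled packet: consult the ILM.
    if label not in ilm:
        return "drop: unknown label"
    entry = ilm[label]
    if entry is None:
        return "pop label, act as egress"
    return f"{entry.operation} {entry.out_label} -> {entry.next_hop}"
```

For example, an unlabeled packet matching the FEC is pushed onto the LSP, a packet arriving with label 100 is swapped to 200, and one arriving with label 200 is popped at the egress.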
As route table entries are added, removed, or changed, the LSR takes corresponding label distribution actions (advertising, requesting a new label or withdrawing the request, releasing the existing invalid label). If an LSR loses a peer (because the peer session is terminated, perhaps because the adjacency is lost), it invalidates all corresponding labels.
The Path message (in RSVP-TE) or the Label Request message (in CR-LDP) carries the resource requirements for an LSP intended to support the Integrated Services QoS model. In each case, when this information arrives at an egress, a corresponding response is generated (Resv for RSVP-TE, and Label Mapping for CR-LDP) and LSR resources are committed during the process of propagating the response back to the originator. Clearly, in both cases, the downstream on-demand label distribution and ordered control modes are used. Also, in either case, implementations are free to commit resources during the request phase (the portion of the signaling process during which label requests are being propagated downstream).
An E-LSP is distinguished from an L-LSP by the former's use of the EXP bits in the generic label format. The Cell Loss Priority (CLP) field in ATM and the Discard Eligibility (DE) field in Frame Relay are used similarly, though with less effect, for E-LSPs in their respective technologies. With L-LSPs, the label itself implies the behavior that applies to packets within the given LSP. Either RSVP-TE or LDP may be used to establish LSP support for the Differentiated Services model. In the RSVP-TE case, new Class of Service (COS) objects are provided to allow for setup of an LSP for this purpose, effectively establishing an LSP for classes of best-effort service.
The key goals in TE are to maximize network efficiency and total data "goodput." Priorities within the "goodput" category are as follows (approximately in order):
Specific goals for network efficiency revolve around ensuring that the average utilization of network resources is as close to 100% as possible while minimum and maximum utilization of individual resources are as close to the average as possible.
Both goals are affected by congestion; thus, avoidance of congestion is of paramount importance. Problems associated with congestion are made worse by inefficient use of network resources. These problems are directly addressable using TE.
The TE model consists of a connected network, performance monitoring feedback, and a management and control system. The traffic engineer determines the current state of the network (via performance monitoring), analyzes the traffic characteristics and trends, and attempts to control the network in such a way as to alter the current state to one that maximizes the desired characteristics of the network and accommodates existing traffic characteristics and trends. This process is a continuously ongoing effort.
To minimize the amount of operator involvement in the TE model, it is desirable to minimize the operator's involvement in modifying traffic management and routing parameters and in modifying the way in which the system's use of resources is artificially constrained. A desirable solution is one that is both scalable and resilient.
Interior gateway routing protocol capabilities are not up to the task. In fact, prevalent IGPs contribute to congestion because they are effectively designed to develop a consistent view of the topology that results in traffic being forwarded dominantly along shortest paths. As a result, shortest-path routes are likely to be highly congested while alternative routes are likely to be underutilized.
An important factor in the long-term effectiveness of a TE solution is system responsiveness to changes in traffic conditions and corresponding measurement of resource utilization.
One important aspect of TE is the introduction of simple load-balancing techniques. However, traffic engineering also needs to take into account other factors affecting total income production from use of the service provider's network. This requires mechanisms for supporting more complicated policies than a simple load-balancing scheme. MPLS provides a means for effecting more complex TE solutions at a potentially lower cost than alternative technologies.
MPLS offers dynamic mechanisms for establishing explicitly routed LSPs that can operate independently of IGP-determined routes. This reduces the impact that limitations in routing protocol behavior have on congestion in the network. Because MPLS mechanisms are dynamic, LSPs can be established with desirable resiliency and can be reoptimized as needed.
Specific attractive features of MPLS are as follows:
Using virtual circuits thus allows many TE functions to be accomplished with today's networks.
Equipping MPLS with a similar virtual circuit capability is important for future network TE needs. MPLS offers the ability to provide an integrated overlay model at a lower cost than existing ATM and Frame Relay equipment. MPLS also offers the opportunity to automate some of the traffic engineering functions.
The difficulty in realizing a TE solution using MPLS is the hierarchical nature of the mapping of traffic onto LSPs in a TE model. Ultimately, the traffic engineer wants to create traffic trunks to shunt traffic in the network in such a way as to produce efficient utilization. These traffic trunks would be realized using explicitly routed LSPs in order to achieve independence from the underlying routing infrastructure. However, traffic is mapped onto LSPs using forwarding equivalence classes (FECs). Thus, it is necessary to determine the FEC-to-traffic-trunk mapping that will produce the most efficient mapping of traffic to traffic trunks and then onto the overlay of explicitly routed LSPs.
To do this, TE over MPLS requires
LSP-based traffic trunks are inherently unidirectional; however, bidirectional traffic trunks may exist as well. A traffic trunk is considered bidirectional if the LSPs used to create the traffic trunk include the same ingress and egress LSRs (obviously, in reversed roles) and are created, maintained, and destroyed together. Bidirectional traffic trunks may be symmetrical or asymmetrical in the sense that they are not required to use the same set of LSRs (in reverse order) as long as they have the same termination points.
RFC 2702 (Awduche et al. 1999) defines actions with respect to traffic trunks, which are shown in Table 7.1. In addition to billing, capacity planning, and related functions, measurement of traffic trunk statistics is important in determining more immediate traffic characteristics and trends for use in optimizing network performance. From a TE perspective, the ability to collect this information is essential.
Traffic engineering can turn this around somewhat by using the observed traffic parameters of existing flows to determine the size of traffic trunks needed to carry these flows. Alternatively, the sizing of traffic trunks may be determined from measurements of congestion at various points in the network. An admission control function is then used to select specific flows to apply to each traffic trunk.
Some type of policing action must occur somewhere in a traffic trunk unless all traffic in the trunk is best-effort traffic (implying that no compliance agreement exists). Otherwise, all traffic is treated the same, regardless of its in-compliance status. However, it is not generally desirable to perform policing at every node in the network. Policing for an LSP is generally done only at the ingress for that LSP.
From a TE perspective, however, policing for a traffic trunk is either done or not done. A traffic trunk may start at some point within a service provider's network or may otherwise have been subject to policing (or traffic shaping) already. In this case, it is necessary to be able to disable policing for the traffic trunk.
Priority is used to determine the order of setup for TE traffic trunks when more than one traffic trunk is pending (for example, during system initialization or fault recovery). A TE solution may need to recompute paths after each successful traffic trunk establishment, particularly when a traffic trunk consumes resources that affect the path selection process for subsequent traffic trunks. Because available resources are consumed with each traffic trunk established, it is likely that each successive traffic trunk will be more constrained than similar traffic trunks established previously.
Priority should take into account the resources each traffic trunk will consume. This is analogous to the problem of fitting as many rocks into a bottle as possible, given a fixed set of rocks of various sizes. Putting larger rocks in first can be the best strategy for getting the largest volume of rock into the bottle. In the TE case, setting up those traffic trunks that consume the most resources later in the setup process increases the likelihood that trunk establishment will fail, even if all of the existing trunks would have succeeded using a different order.
Priority should also take into account the preemption levels of various traffic trunks. Each traffic trunk that is preempted may need to be reestablished. In this case, the system will take longer to establish the full set of traffic trunks if trunks that will be subsequently preempted are established prior to those that might preempt them. As defined in RFC 2702, this occurs automatically because priority and preemption levels are dependent.
RFC 2702 defines preemption as binary along these two dimensions. That is, a trunk either can or cannot preempt another trunk, and a trunk either can or cannot be preempted by another trunk. If a trunk being established can preempt other trunks and cannot otherwise be established, it will preempt another trunk (that may be preempted) if that other traffic trunk is of a lower priority. In general, a network element processing the setup in this case will preempt existing LSPs-starting with the LSP having the lowest priority-until either there are sufficient resources to satisfy the requirements of the new LSP setup or there are no remaining lower-priority LSPs. Note that LSPs should not actually be preempted if there will not be sufficient resources to establish the new LSP when all lower-priority LSPs have been preempted.
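The preemption rule just described can be sketched as follows. This is an illustrative simplification, not an implementation of RFC 2702: the tuple layout, the single-resource bandwidth model, and the numeric priority convention (lower value = more important, as in RSVP-TE) are all assumptions made for the example.

```python
def select_preemptions(existing, new_priority, needed, free):
    """Choose which existing LSPs to preempt for a new LSP setup.

    existing: list of (priority, bandwidth, preemptable) tuples.
    Returns the list of victims, or None if the setup cannot succeed
    even after preempting every eligible lower-priority LSP.
    """
    # Only LSPs that are preemptable AND lower priority than the new
    # LSP (larger numeric value) are candidates.
    candidates = [lsp for lsp in existing
                  if lsp[2] and lsp[0] > new_priority]
    # Feasibility check first: per the text, nothing should actually be
    # preempted if the setup would still fail afterward.
    if free + sum(bw for _, bw, _ in candidates) < needed:
        return None
    victims = []
    candidates.sort(key=lambda lsp: -lsp[0])  # least important first
    for lsp in candidates:
        if free >= needed:
            break
        victims.append(lsp)
        free += lsp[1]
    return victims
```

For instance, with existing LSPs at priorities 7, 5, and 3 (the last one non-preemptable), a new priority-4 setup needing more bandwidth than is free would preempt the priority-7 LSP first, then the priority-5 LSP, and would preempt nothing at all if even both together could not free enough capacity.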
Many implementations handle preemption using a two-level priority: a setup priority, which applies while the circuit is being established, and a holding priority, which applies once the circuit is in place.
Because a circuit that has been preempted may be reestablished, it is essential that the holding priority never be lower than the setup priority. Distinct setup and hold priorities may be useful when it is desirable to set up a low-priority circuit that, once successfully established, must survive at high priority. This might be the case for large numbers of short-duration circuits. It would also be the case if disruption of services is intended to be implemented as a breadth-first search for lower-priority circuits to preempt. It is relatively simple to implement the behavior defined in RFC 2702 by always setting setup and hold priorities to the same value.
The traffic engineer may be either a TE automaton or an operator administratively configuring LSPs for use with traffic trunks. In a generalized TE solution, it is possible for variant traffic engineers to each determine and attempt to establish traffic trunks for the same purpose. For instance, a TE automaton may determine one path while an operator has configured another. In general, it is necessary to provide a means to resolve which path will be used to establish a traffic trunk in this case. Specifically, it should be possible to force the system to accept the traffic trunk configured by the network operator. Ideally, the system will report inconsistencies of this type, especially in the event that the configured path is not feasible (or is suboptimal by some threshold value). Alternatively, the path selected by one method (for example, manual configuration) can be treated as the preferred path and will be used as long as this path is not infeasible or seriously suboptimal.
RFC 2702 defines the behavior of arbitrating between a manually configured path and a dynamically computed path by describing manually configured paths as either mandatory or nonmandatory. A mandatory configured path is used regardless of the computed path.
Path maintenance criteria affect whether or not a traffic trunk will be moved in response to specific changes in network topology. In general, a traffic trunk may be established such that the path will not change unless the current path optimization is exceeded by an alternative path optimization by some threshold. In the event that the threshold is exceeded, however, the LSP for the traffic trunk will be reoptimized. If it is the intent that the LSP not be reoptimized, the threshold value would be effectively infinity. Path maintenance criteria may also include other values, such as a delay value (to avoid transient reoptimization). Adaptivity and resilience are subattributes, or aspects, of the path attribute and are discussed in detail in a subsequent section.
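The reoptimization threshold described above reduces to a simple comparison. The sketch below is illustrative; the scalar "cost" stands in for whatever path optimization measure is actually in use, and the function name is invented.

```python
import math

def should_reoptimize(current_cost: float, alt_cost: float,
                      threshold: float) -> bool:
    """Reoptimize only when the alternative path improves on the current
    path by more than the configured threshold."""
    return (current_cost - alt_cost) > threshold

# An improvement of 4 exceeds a threshold of 3, so the LSP is moved;
# an improvement of 2 does not. A threshold of infinity effectively
# pins the trunk, since no finite improvement can ever exceed it.
```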
In summary, a path attribute includes the strictness of the explicit route, arbitration (mandatory or nonmandatory in RFC 2702), adaptivity, and resilience.
Where a very high degree of traffic delivery assurance is desired, undersubscription of network resources may be used. When this is done, a subscription (or allocation) factor is applied to the bandwidth determination for the applicable traffic trunk.
Because network resources typically do not natively support the concept of oversubscribing and undersubscribing their resources, the traffic engineer applies a subscription/allocation factor prior to establishing the traffic trunk. For example, if oversubscribing by 25% corresponds to an allocation factor of 1.25, the traffic engineer would multiply the bandwidth requirement otherwise determined for the traffic trunk by 1.25 prior to requesting the corresponding LSP setup. Note that this is, in effect, an effort to fine-tune an effective-bandwidth calculation as might have been required to determine the bandwidth requirements in the first place.
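The allocation-factor arithmetic is trivial but worth making concrete; the bandwidth figure and factor below are illustrative.

```python
def requested_bandwidth(measured_bw: float, allocation_factor: float) -> float:
    """Bandwidth to signal in the LSP setup after applying the
    subscription/allocation factor to the measured requirement."""
    return measured_bw * allocation_factor

# A trunk whose traffic is measured at 80 Mbit/s, with an allocation
# factor of 1.25, would be signaled as a 100 Mbit/s reservation.
```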
Resource classes may also be used simply to identify resources.
An ingress LSR does constraint-based route computations in order to automatically compute explicit routes used for traffic trunks originating at that LSR. For TE, the traffic engineer would initiate this process.
The traffic engineer specifies the ingress and egress of a traffic trunk and assigns a set of characteristics for a desirable route. These characteristics define constraints in terms of performance needs and other aspects. Constraint-based routing then finds an explicit route that will satisfy these constraints among the set of available routes. Note that selecting an optimal route requires determining all possible routes for N + 1 TE trunks (assuming N existing TE trunks) and selecting the optimal set of routes in reestablishing the full set of TE trunks plus the new one requested. This is a task that is easily recognizable as NP-complete.
An example of use of constraint-based routing to satisfy a traffic engineering need is the attempt to move a portion of the traffic on a congested link to another link. Assigning the congested link to a resource class that would be treated as an undesirable characteristic of the desired route is a simple and direct way to represent the desired constraint. The traffic engineer defines a portion of the traffic that would normally traverse the congested link (possibly in terms of a set of destination addresses) and initiates the constraint-based routing process. The traffic engineer causes a set of ingress LSRs to each seek a new path that satisfies the constraint that it not use any link that is in the resource class associated with the undesirable (congested) link.
Although finding the optimal route using constraint-based routing is known to be computationally difficult for almost any realistic constraint-limited routing problem, a simple heuristic can be used to find a route satisfying a set of constraints-if one exists. The traffic engineer may simply prune resources that do not match the traffic trunk attributes and run a shortest-path route computation on the residual graph. Other approaches may be used as well.
Continuing the previous example, ingress LSRs prune the set of available links known to them (for example, as a result of using a link-state routing protocol) of all links belonging to the resource class of the congested link (possibly a "congested" resource class is defined for all such links). These ingress LSRs can then run a route computation (using the pruned link-state information) and establish explicit routes on the basis of their results. The ingress LSRs then use this explicit route solely for routing the portion of traffic defined. Because the ingress LSRs no longer route this traffic via the congested link, the congestion on that link would be reduced by an amount that may be as much as the amount of traffic associated with that defined portion of the traffic now being forwarded on the new explicit route.
These procedures, being heuristic in nature, will not necessarily find the optimal solution. In addition, successive applications of these approaches may lead to failure to find a route for one or more traffic trunks when all such traffic trunks could have been accommodated with an optimal solution. This implies that it will be necessary to tear down TE trunks at some point to avoid increasingly suboptimal constraint-based route determinations.
To perform the automated constraint-based routing computation in the example, the information provided by the link-state routing protocol must include information about the links that would allow ingress LSRs to determine what links satisfy which constraints. For example, when the congested link was assigned to a resource class, this assignment would have to be advertised in the link-state routing protocol in common use by LSRs in the TE domain. Support for constraint-based routing computations is currently being developed in the IGP routing protocols IS-IS (Intermediate System to Intermediate System) and OSPF (Open Shortest Path First).
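The prune-then-shortest-path heuristic described above can be sketched directly: remove every link belonging to the excluded resource class from the link-state graph, then run an ordinary shortest-path computation on the residual graph. The topology, costs, and resource-class names below are invented for illustration.

```python
import heapq

def constrained_shortest_path(links, src, dst, excluded_class):
    """links: iterable of (node_a, node_b, cost, resource_class) tuples,
    treated as bidirectional. Returns (cost, path) or None if no route
    satisfies the constraint."""
    graph = {}
    for a, b, cost, rclass in links:
        if rclass == excluded_class:
            continue  # prune links that violate the constraint
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    # Dijkstra's algorithm on the residual (pruned) graph.
    heap, seen = [(0, src, [src])], set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nbr, c in graph.get(node, []):
            if nbr not in seen:
                heapq.heappush(heap, (cost + c, nbr, path + [nbr]))
    return None

links = [
    ("A", "B", 1, "congested"),  # the direct link, but congested
    ("A", "C", 2, "normal"),
    ("C", "B", 2, "normal"),
]
print(constrained_shortest_path(links, "A", "B", "congested"))
# -> (4, ['A', 'C', 'B'])
```

With the "congested" class excluded, the ingress routes around the direct A-B link via C, exactly the behavior the example describes; without the constraint, the direct link would win.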
Adaptivity can be prevented in signaling if (in CR-LDP, for example) it is possible to pin a route explicitly. An explicit route may also be pinned by being strictly routed at all hops. As described in the Piggyback Label Distribution Using RSVP section in Chapter 6, it is possible to use the Record Route object to determine the exact route currently being used by an LSP and then use this information to pin the LSP. Maintenance of a pinned explicit route is simpler because it is unnecessary to retain information required to reroute the LSP at every network element that might otherwise be required to do so.
The resilience aspect can be broken into two parts: basic and extended resiliency. Basic resiliency determines whether a traffic trunk is subject to automatic rerouting as a result of a partial path failure (one or more segments fail). Extended resiliency determines the specific actions taken in response to a fault-for example, the order in which specified alternative paths are considered. Support for resilient behavior depends on interactions with underlying routing technology, both in detecting a fault and in selecting a new path.
Resilience at the local level is only possible if the original path was a loosely specified portion of an explicit route or if the fault is part of a segment where there is more than one strictly specified explicit route provided for this purpose.
This distribution can be done using MPLS by establishing multiple LSPs (effectively as a single combined traffic trunk), each of which will carry a portion of the traffic for the combined traffic. To do this, however, the ingress LSR must be capable of assigning packets to each of the multiple LSPs in an intelligent fashion.
For example, assume two LSPs are established to carry the traffic from an ingress LSR to an egress LSR for the same aggregate traffic. One is expected to carry two-thirds of the traffic, whereas the other carries one-third. In this scenario, the ingress LSR must map corresponding portions of the incoming traffic aggregate to each LSP. It is desirable that this mapping be done in such a way as to ensure that packets that are part of the same source- destination flow follow the same LSP as a safeguard against out-of-order delivery.
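One possible way to realize the flow-preserving 2:1 split in the example is to hash each packet's source/destination pair into a fixed number of buckets and assign buckets to LSPs in proportion to the desired weights. The field choice, bucket count, and LSP names below are assumptions for illustration; a deterministic hash (CRC32 here, rather than Python's per-process salted `hash`) ensures every packet of a flow gets the same answer.

```python
import zlib

def pick_lsp(src_ip: str, dst_ip: str) -> str:
    """Map a source-destination flow to one of two LSPs, weighted 2:1."""
    flow_key = f"{src_ip}-{dst_ip}".encode()
    bucket = zlib.crc32(flow_key) % 3       # three roughly equal buckets
    return "lsp-1" if bucket < 2 else "lsp-2"  # ~2/3 vs ~1/3 of flows
```

Because the mapping depends only on the flow key, packets of the same flow always follow the same LSP, guarding against the out-of-order delivery mentioned above, while the aggregate traffic divides approximately in the desired ratio.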
These functions are not necessarily performed in the order listed. For example, notification may need to occur before isolation can begin, and restoration may have begun before a fault was detected (for example, establishing a redundant circuit in anticipation of failure) and may in any case begin before notification takes place. In some technologies (for example, IP routing), detection and isolation are not separable functions.
Because TE uses explicitly routed LSPs, mechanisms intrinsic to the underlying routing infrastructure will not necessarily be sufficient for recovering from a fault, particularly in strictly routed (or pinned) portions of the LSP. Because (by default) routing is blind to the paths taken by an explicitly routed LSP, MPLS needs to provide separate mechanisms for detecting a fault in an LSP, notifying the ingress (especially if the fault is not locally repairable), and initiating restoration of service.
Because it is possible that MPLS is using technology that may provide some alternative fault recovery mechanisms, fault recovery mechanisms defined specifically for MPLS must be able to be disabled.
Fault recovery must also take into account the priority and precedence attributes of the traffic trunk.
This process may be extended to include additional LSPs in tandem. In this case, either the egress LSR is also the ingress to one or more further LSPs, the ingress LSR is egress to one or more LSPs, or both LSRs are both ingress and egress to LSPs. The LSPs in this discussion are LSPs for which there is a similar mapping of TE forwarding classes corresponding to a traffic trunk that uses two or more LSPs in tandem.
Because it is not possible to pin an LSP routed from an ingress to an egress LSR using LDP alone, a traffic trunk established using this approach is both adaptive and resilient by nature. Therefore, this approach cannot be used to establish traffic trunks for which either of these properties is undesirable.
In addition, it is not possible to explicitly assign resources from the path used for this approach via the LDP signaling protocol. If it is necessary to assign resources to a traffic trunk explicitly via the signaling protocol, some other approach must be used.
Because CR-LDP has the Explicit Route object (and procedures to support its use), a traffic trunk LSP can be fully specified as a set of strict explicit hops. CR-LDP supports explicit pinning of an explicit route as well. CR-LDP also includes extensions to provide RSVP-like resource allocation in setting up explicitly routed LSPs.
An explicit-route LSP is constructed using procedures defined in RSVP-TE and including an Explicit Route object in a Path message. Support for route pinning is provided by including the Record Route object in both Path and Resv messages and then including the Explicit Route object with a fully specified strict explicit route in all subsequent Path messages.
To effectively provide the illusion of a private network using shared resources, it is necessary to support private address spaces and to provide for separation of traffic by preventing leakage of traffic from one VPN to either another VPN or to the Internet and by providing some level of isolation of VPN traffic from the effects of traffic in other networks sharing the same resources.
Methods of isolating traffic from sharing effects (among VPN alternatives discussed to date) fall into one of four categories:
The essence of the procedure is that BGP is used to propagate VPN-specific routes to populate separate forwarding tables in the VPN service provider's network. RFC 2547 defines a provider edge (PE) router that determines which forwarding table to use based on the customer edge (CE) router from which a packet or route was received.
Some measure of scalability is achieved in this approach by limiting the distribution of VPN-specific routes to those PE routers that attach CE routers within the given VPN. In this way, each PE router needs to maintain routes only for the VPNs of the CE routers to which it is directly attached.
Route distribution for VPN support using BGP is accomplished using BGP multiprotocol extensions (defined in RFC 2283 [Bates et al. 1998]) and a new Address Family Identifier (AFI) and Subsequent Address Family Identifier (SAFI)-1 and 128, respectively-that identify the VPN-IPv4 address family. Addresses from this address family are 12 bytes long and include an 8-byte route distinguisher (RD) and an IPv4 address (prefix). The mapping between RDs and specific VPNs is not guaranteed because an RD need only be unique to the PE set participating in a VPN and will vary across service provider domains. A PE determines which routes to distribute for a given VPN based on target VPN attributes that are associated with per-site VPN-specific forwarding tables. Association of target VPN attributes with specific sites is determined by configuration.
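The 12-byte VPN-IPv4 address structure described above (an 8-byte route distinguisher followed by a 4-byte IPv4 prefix) can be shown concretely. The sketch assumes the type-0 RD layout from RFC 2547 (2-byte type, 2-byte AS number, 4-byte assigned number); the ASN and assigned-number values are invented.

```python
import socket
import struct

def vpn_ipv4_address(asn: int, assigned: int, prefix: str) -> bytes:
    """Build a 12-byte VPN-IPv4 address: type-0 RD + IPv4 prefix."""
    rd = struct.pack("!HHI", 0, asn, assigned)  # 8-byte route distinguisher
    return rd + socket.inet_aton(prefix)        # 8 + 4 = 12 bytes

addr = vpn_ipv4_address(65000, 17, "10.1.0.0")
assert len(addr) == 12
```

Because the RD makes otherwise identical private IPv4 prefixes distinct, two customers can both use 10.1.0.0/16 without their routes colliding in BGP.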
BGP-MPLS VPN routes are distributed using peer-to-peer BGP direct connections or connections via a route reflector. The BGP Update messages used to distribute these routes include MPLS labels corresponding to each route (using appropriate AFI/SAFI and address length values). Procedures and formats for carrying labels in a BGP Update message are defined in Rekhter and Rosen (w.i.p.) and described in Piggyback Label Distribution Using BGP, in Chapter 6. Setup and maintenance of an LSP between two PE routers that are not directly connected is accomplished using LDP, CR-LDP, or RSVP-TE, with or without explicit routes.
Although most proposals are currently either entirely proprietary or based on proprietary extensions to a TE-based solution, there are several common distinctions between this general approach and BGP-MPLS VPNs. Some of the ways in which explicitly routed VPNs may differ from BGP-MPLS VPNs include the following: