
Keeping Pace with the T1.231 Evolution

Bill Matern, NComm

4/20/2004 10:00 AM EDT

Not long ago, entire companies focused on providing physical layer (PHY) access equipment, with interoperability and reliability as core objectives. Communication circuits were expensive, so finding ways to increase payload provided real added value.

As focus within the telecom market shifted to new Layer 2/3 technologies, PHY solutions received less attention, and new engineers rarely had any interest in, or experience with, Layer 1 technologies. As PHY access was added into gateways and routers, the quality and completeness of the Layer 1 code degraded to the point where carriers became concerned about the substandard quality of the interfaces.

ANSI's T1.231 was initiated in the 1990s to address this problem. Before T1.231, the need for basic network equipment standards was met by proprietary AT&T standards. But critical functions like alarm protocols and performance monitoring remained undefined until T1.231 was completed. Thus, this spec has become one of the most important standards for WAN equipment manufacturers targeting the North American telecom and datacom markets.

All T1, T3, and Sonet equipment designs connecting to a "public" network are now expected to comply with the standard. Without T1.231, it was nearly impossible to troubleshoot and restore a lost T1 link between the customer site and the central office (CO) without costly truck rolls. Basic questions couldn't be easily answered: Have the wires been cut? Has the provider accidentally reconfigured the interface? Are SLA quality guarantees being achieved? With T1.231-compliant equipment, answers are readily obtained without dispatching technicians to troubleshoot onsite.

Every five years, ANSI standards are revised to ensure that they remain current. The T1.231 revision committee began work in 2001 to resolve outstanding issues and extend the scope of T1.231 to include newer DSL and optical transport network (OTN) technologies. With this revision, all WAN equipment for North America will fall under and follow the same standard. Below we'll take a closer look at the T1.231 spec and then examine some of the key revisions that have been added through the spec update started in 2001.

T1.231 in a Nutshell
Although T1.231 is probably the most important standard for WAN interfaces, engineers and marketers often ignore it. Market requirements documents specify in detail what's required in the routing or VoIP components of a product, yet the entire specification for WAN access is often simply "T1." Does it handle alarms? Collect performance data? Support loop-backs? Handle telephony? Just what do T1 interfaces need to do? Should it be left to individual engineers to answer these questions, given the quality and field service implications?

T1.231 provides a foundation for both data and telephony interfaces, specifying what information the network and interface must provide for each other and how this information is handled. The objectives are to ensure that (1) line status is known at both ends, (2) data is transmitted only when appropriate, and (3) tools are available to identify and resolve problems whether they are hard failures or soft (gradual or intermittent) degradations in performance.

T1.231 also defines the WAN interface data needed in increasingly complex environments of multiple technologies (T1, T3, SONET and DSL). It also provides information for higher-layer services/protocols and enables efficient operation in an environment of multiple network provider/owners.

Surveillance, Testing, and Restoration
The standard addresses the needs of "surveillance, testing and restoration" required in any maintenance strategy. Surveillance entails non-intrusive monitoring at critical points, pinpointing the source of hard failure, collecting data to avoid hard failures and enabling minimally disruptive or non-disruptive problem resolution.

Traditional testing is intrusive, using the entire bit stream to verify failures or service level declines and identify their root cause. Loop-back tests are frequently used, comparing a known input with resultant output. T1.231 provides a foundation for non-intrusive testing, employing overhead bits instead of consuming the entire bit stream. This enables diagnosis and correction of soft degradations with no service interruption.

Restoration involves processes that reroute traffic to resume service. It addresses facility or equipment failures. With SONET, for example, there are standards that specify the restoration process in the case of facility issues. Usually, restoration is dictated by the equipment manufacturer or is designed into the carrier network.

T1.231's Network Surveillance Focus
Two important network surveillance functions are needed to accurately convey what is happening in the network: alarm/status monitoring and performance monitoring.

Alarm and status monitoring triggers on certain negative transmission events ("defects") that are integrated over time into failures (or alarms). A defect may generate an immediate declaration of an indication (integration time = 0) or may need to persist for a specific time period before an indication is raised (integration time > 0). The specification for integration times greater than 0 recognizes that small "glitches" are normal and, within certain tolerances, should not generate alarms.

Figure 1 shows T1.231's specification of how detection of an out-of-frame (OOF) or loss-of-signal (LOS) defect is processed and when it is integrated into an alarm indication. The system in Figure 1 is configured to recognize a 2.5-s "soak" or integration time. If the OOF or LOS lasts less than 2.5 s, no action is taken. Should the defect persist, an OOF or LOS failure is declared, and a remote alarm indication (RAI) is transmitted to notify the far end not to transmit until the condition clears. Once the defect stays resolved for 10 to 20 s, the failure is removed.


Figure 1: Diagram showing a T1.231 alarm example in a data application.
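The flow in Figure 1 can be sketched as a small timer-driven state machine. This is only an illustration, not text from the standard: the class and method names are our own, the 100-ms tick is arbitrary, the 10-s clearing time is one assumed choice within the 10-to-20-s window, and restarting the soak on any clean tick is a simplification.

```python
class AlarmIntegrator:
    """Integrates one defect (OOF or LOS) into a failure, following Figure 1."""

    def __init__(self, declare_after_ms=2500, clear_after_ms=10000):
        self.declare_after_ms = declare_after_ms  # 2.5 s soak, from the article
        self.clear_after_ms = clear_after_ms      # assumed; the article allows 10 to 20 s
        self.failure = False                      # True while the failure is declared
        self.defect_ms = 0                        # how long the defect has persisted
        self.clean_ms = 0                         # how long the line has been defect-free

    def tick(self, defect_present, dt_ms=100):
        """Advance time by dt_ms; return True while RAI should be transmitted."""
        if defect_present:
            self.clean_ms = 0
            self.defect_ms += dt_ms
            if self.defect_ms >= self.declare_after_ms:
                self.failure = True
        else:
            self.defect_ms = 0   # simplification: any clean tick restarts the soak
            if self.failure:
                self.clean_ms += dt_ms
                if self.clean_ms >= self.clear_after_ms:
                    self.failure = False
                    self.clean_ms = 0
        return self.failure
```

Calling tick() once per 100 ms with the current defect state is enough to drive it; a defect that lasts less than 2.5 s never changes the return value.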

Telephony applications are more complex. During momentary glitches on the phone line, calls in progress must be maintained. This is accomplished by freezing the signaling bits in their last known good state. If the signaling bits are not "frozen", an OOF or LOS will garble the signaling bits, causing the system to interpret this glitch as a new state, which could cause the call to change to an incorrect state.

In telephony applications, as shown in Figure 2, when OOF or LOS is detected, not only does the 2.5-second timer need to be started, but signaling bits also need to be frozen while the system determines whether there is a persistent problem on the line. If the OOF/LOS never integrates into an OOF/LOS failure, all call states remain as they were, and no calls are lost. If an OOF/LOS failure is indicated, all calls are taken down and the RAI is transmitted.


Figure 2: Diagram showing a T1.231 alarm example in a telephony application.
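The signaling freeze can be sketched in the same style. Again this is a hedged illustration with hypothetical names: signaling bits are modeled as one string per channel, the 100-ms tick is arbitrary, and clearing of the failure is left out to keep the example short.

```python
class TelephonyTrunk:
    """Freezes signaling during a defect; drops calls only on a declared failure."""

    def __init__(self, declare_after_ms=2500):
        self.declare_after_ms = declare_after_ms  # 2.5 s soak, from the article
        self.defect_ms = 0
        self.failure = False
        self.calls_up = True
        self.signaling = {}   # channel -> last known good signaling bits

    def tick(self, defect_present, received_bits, dt_ms=100):
        """Return the signaling bits that call processing is allowed to see."""
        if defect_present:
            # Received bits are untrustworthy now, so keep the frozen copy.
            self.defect_ms += dt_ms
            if self.defect_ms >= self.declare_after_ms and not self.failure:
                self.failure = True
                self.calls_up = False   # take calls down; RAI would be sent here
        else:
            self.defect_ms = 0
            if not self.failure:
                self.signaling = dict(received_bits)   # accept fresh bits
        return self.signaling
```

A one-second glitch leaves `calls_up` untouched and the frozen bits unchanged, which is the behavior the article describes for calls in progress.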

The examples shown in this article are simple. In real-life scenarios, more complex alarm integration cases are likely to occur, based on multiple, simultaneous defects and indications. Determination of the highest priority failure needs to be made, and exactly one failure is declared. Complying with the T1.231 standard ensures achievement of optimal service levels, clear data in case of failures and accurate data collection within the performance monitoring function.
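Choosing exactly one failure from several integrated defects amounts to a priority lookup. In the sketch below, only the rule that AIS outranks OOF comes from this article; the rest of the ordering and the function name are assumptions made for illustration and should be checked against the standard.

```python
from typing import Optional, Set

# Highest priority first. Only "AIS outranks OOF" is stated in the article;
# the positions of LOS and RAI are assumed for this example.
ASSUMED_PRIORITY = ["LOS", "AIS", "OOF", "RAI"]

def failure_to_declare(integrated: Set[str]) -> Optional[str]:
    """Return the single failure to declare, or None if nothing has integrated."""
    for name in ASSUMED_PRIORITY:
        if name in integrated:
            return name
    return None
```

With both AIS and OOF integrated, the function reports AIS alone, so the operator sees one failure rather than two.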

Performance Monitoring
As the T1.231 spec defines it, performance monitoring is the process of continuous collection, analysis and reporting of performance data associated with a transmission entity. While alarms are binary point-in-time events, data collected within T1.231 performance monitoring has a time period associated with it and a sense of time of day.

Performance monitoring provides sufficient detail to support remote troubleshooting, allowing carriers to reduce field service expenses and respond more quickly. Problems arising from misconfigured systems and poorly provisioned services are easily isolated using PM data.

T1, T3, Sonet, and DSL all have their own distinct performance monitoring parameters. For example, T1 parameters include: bipolar violations (BPV), excessive zeros (EXZ), cyclic redundancy check (CRC-6) errors, frame bit errors (FE), controlled slips (CS), loss of signal (LOS), out of frame (OOF), severely errored frame (SEF), alarm indication signal (AIS) and alarm indication signal-customer installation (AIS-CI).

Data accumulation occurs in 15-minute buckets over a 48-hour period, collecting both near-end and far-end data. Using an ESF-formatted T1 example, the far-end performance data includes the RAI failure and the T1.403 PRM (performance report message). By knowing the performance in either direction, problems can be quickly isolated and corrected.
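One simple way to hold 48 hours of 15-minute buckets is a fixed-length queue that discards the oldest bucket as each new one completes. The sketch below is an assumption about storage layout, not something the standard prescribes, and the class and parameter names are hypothetical. Near-end and far-end data would each get their own instance.

```python
from collections import deque

BUCKET_SECONDS = 15 * 60
BUCKETS_KEPT = 48 * 60 * 60 // BUCKET_SECONDS   # 192 buckets cover 48 hours

class PmHistory:
    """Accumulates counters into 15-minute buckets and keeps the last 48 hours."""

    def __init__(self):
        self.current = {}                           # counters for the bucket in progress
        self.history = deque(maxlen=BUCKETS_KEPT)   # completed buckets, oldest dropped first
        self.elapsed = 0                            # seconds into the current bucket

    def count(self, parameter, amount=1):
        self.current[parameter] = self.current.get(parameter, 0) + amount

    def second_elapsed(self):
        """Call once per second; closes the bucket at the 15-minute boundary."""
        self.elapsed += 1
        if self.elapsed == BUCKET_SECONDS:
            self.history.append(self.current)
            self.current = {}
            self.elapsed = 0

near_end = PmHistory()
far_end = PmHistory()
```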

There are several types of data collected in the performance monitoring function, depending on the WAN technology being used and how the line is configured. For example, T3 performance monitoring collects different data depending on whether it's configured for M23 or C-bit parity.

Figure 3, a simple T1 example, illustrates the need for accurate accumulation and categorization of performance monitoring data. If BPV plus EXZ = 1 and there is no LOS for the second being monitored, the data is added into bucket A. If the BPV plus EXZ is greater than 1 but less than x and there is no LOS, bucket B gets incremented. If the BPV plus EXZ is equal to or greater than x, or LOS is present, SEF is incremented.


Figure 3: Diagram showing the T1 performance monitoring collection decision algorithm.
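The per-second decision described above maps onto a short function. This is a sketch only: the function name is ours, the return labels follow the wording of the text, and the threshold x is passed in because the article does not give its value.

```python
from typing import Optional

def classify_second(bpv: int, exz: int, los: bool, x: int) -> Optional[str]:
    """Classify one monitored second; x is the threshold the article leaves unspecified."""
    total = bpv + exz
    if los or total >= x:
        return "SEF"   # label as used in the text above
    if total > 1:
        return "B"     # more than one event, but below the threshold
    if total == 1:
        return "A"     # exactly one event
    return None        # clean second, nothing to count
```

Checking the LOS and threshold case first matters: a second with LOS present must never fall through into bucket A or B, whatever its BPV and EXZ counts are.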

Also included under performance monitoring are threshold crossing alerts and gauges. Threshold crossing alerts provide notification when a monitored parameter reaches or exceeds a preset level. Once the threshold has been crossed, an out-of-range alert is generated for further deterioration. Activity gauges, on the other hand, provide high- and low-water marks within the specified period: typically the current and prior 15-minute buckets, the current and prior day, plus a selectable period of "n" prior 15-minute buckets.
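Both mechanisms are small enough to sketch. The class names are hypothetical, and raising a single alert per accumulation period is an assumption on our part; the exact re-alerting behavior should be taken from the standard.

```python
class ThresholdAlert:
    """Signals once per period when a counter reaches or exceeds its threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.count = 0
        self.alerted = False

    def add(self, amount=1):
        """Add to the counter; return True only at the moment the alert fires."""
        self.count += amount
        if self.count >= self.threshold and not self.alerted:
            self.alerted = True
            return True
        return False

    def new_period(self):
        """Reset at the start of the next 15-minute bucket or day."""
        self.count = 0
        self.alerted = False


class Gauge:
    """Tracks high- and low-water marks of a sampled value over one period."""

    def __init__(self):
        self.high = None
        self.low = None

    def sample(self, value):
        self.high = value if self.high is None else max(self.high, value)
        self.low = value if self.low is None else min(self.low, value)
```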

Together, alarm and performance monitoring data ensure quick service restoration and can even prevent hard failures before they occur. Unfortunately, many of today's engineers and marketers remain unaware of the benefits of implementing T1.231. As a result, they produce non-standard Layer 1 interface code that frustrates carriers and users alike.

2003 Revisions
For those familiar with T1.231, the following changes and extensions should be noted. These updates are part of the 2003 release of T1.231.

T1.231 Multiple Document Format
The current version of ANSI T1.231-1997 has been split into multiple documents as described below. All are currently approved, except as otherwise indicated:

  • T1.231-CORE describes performance monitoring implementation. Individual technology documents describe what must be tracked for particular interfaces.
  • T1.231.01—The DSL technology document covers all DSL technologies, although details have been identified only for ADSL and HDSL-2.
  • T1.231.02—The T1 document.
  • T1.231.03—The T3 document.
  • T1.231.04—The Sonet document.
  • T1.231.05—The OTN document. The scope of the OTN document is currently under consideration by the standards body, and contributions have been requested. Status: Not Approved; Contributions Phase.

Performance Monitoring Changes
There have been several key performance monitoring changes provided in the T1.231 specification. For example, gauges are a new addition. Let's look at some of the key performance monitoring changes provided for T1, T3, and Sonet.

T1 Changes
1. Alarm processing during chattering conditions was clarified. Within 2 to 3 seconds, either an AIS or OOF failure will be declared, depending on which is present at declaration time. Since an AIS defect is also an OOF defect, AIS has priority over OOF. LOS failures are attended to and AIS failures are not, so unattended alarms could result. This should be taken into account during product installation.

2. The time duration for integration was clarified as a time between 2 and 3 seconds, 2.5 seconds being nominal, instead of 2.5 +/- 0.5 s.

3. The time required to de-integrate a T1 failure was changed. After the defect clears, the time interval before removal of the failure indication is now specified to be a number between 10 and 20 seconds.

4. The handling of performance report message (PRM) reports during RAI-CI conditions was clarified.

5. Details of implementation of performance monitoring during RAI-CI and loop-back retention are clarified.

6. The details of declaration of RAI failure during RAI-CI conditions or other failures are clarified.

7. The handling of interrupted PRM messages was clarified. Associated performance parameters are inhibited, not set to zero.

T3 Changes
1. The performance parameter SASCP-PFE was redefined as a far-end parameter instead of a near-end parameter.

2. AIS-CI and RAI-CI were added to the set of DS3 failures monitored.

3. NPRM-based performance monitoring requirements were added to T3.

4. AIS-CI and RAI-CI defect contentions were added to numerous performance measurements.

5. Network Performance Report Message (NPRM) was added to the PMDL link.

6. CI/Network performance parameters (ES-NP, SES-NP, UAS-NP, ES-NPFE, SES-NPFE, UAS-NPFE) were added.

7. The time duration on integration time was clarified as a time between 2 and 3 seconds, 2.5 seconds being nominal, instead of 2.5 +/- 0.5 s.

Sonet Changes
1. The number of frames required to detect AIS-L was changed from 5 frames to x frames, where 3 <= x <= 5, to align with the ITU specifications.

2. The SES threshold for OC-768 was defined as 22778 errors in the B1 byte for the section and 39339 errors in the B2 byte for the Line portions of the optical interface. This results in an error rate of 10^-6.

3. The time duration on failure integration time was clarified as a time between 2 and 3 seconds, 2.5 seconds being nominal, instead of 2.5 +/- 0.5 s.

4. The time duration for failure de-integration was clarified as being between 9.5 seconds and 10.5 seconds instead of 10 +/- 0.5 s.

5. The result of a failure of the tandem PRM message was clarified. This condition explicitly results in the performance parameters being inhibited, not set to zero.
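The first Sonet change, the configurable AIS-L frame count, can be illustrated with a simple run counter. This is a sketch under stated assumptions: the class name is ours, the frames are assumed to be consecutive, the per-frame check for the AIS-L pattern is left to the caller, and clearing of the defect is not modeled.

```python
class AisLDetector:
    """Declares the AIS-L defect after x consecutive frames carrying the AIS-L pattern."""

    def __init__(self, frames_required=5):
        # The revised text allows any x with 3 <= x <= 5.
        if not 3 <= frames_required <= 5:
            raise ValueError("frames_required must be between 3 and 5")
        self.frames_required = frames_required
        self.run = 0   # length of the current run of AIS-L frames

    def frame(self, ais_pattern_seen):
        """Feed one frame; return True once the run is long enough."""
        self.run = self.run + 1 if ais_pattern_seen else 0
        return self.run >= self.frames_required
```

Making x a constructor argument lets the same code serve equipment built to the old five-frame rule and equipment aligned with the ITU value.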

Moving Forward
ANSI T1.231 was formed to address reliable Layer 1 WAN interfaces and remains one of the most important North American standards governing those interfaces. The standard allows for verification of service levels and segmentation of responsibilities, necessary to meet the increasingly high levels of performance demanded by today's complex voice, video and data applications.

Critical Layer 1 access technologies need to achieve the same high quality as the protocols and technologies they service. To accomplish this and avoid the pitfalls inherent in WAN interface development, all that's needed is planning and adherence to T1.231.

Knowing and understanding T1.231 enables engineers to improve their own Layer 1 interface code or to better evaluate third-party solutions that they may opt to use instead of undertaking an internal development effort. In either case, it's by adhering to T1.231 that vendors will meet carrier and user expectations of interoperability, problem identification and rapid network restoration.

About the Author
Bill Matern is the CEO of NComm, Inc. He is also the T1.231 Revision Ad Hoc Group Chair of the T1M1 technical sub-committee responsible for producing the new standard. Bill can be reached at wtm@ncomm.com.




