March 2023, Vol. 250, No. 3

Tech Notes: Getting Pipeline Integrity Data Assessment Under Control

By Steven Barnett, Principal Consultant, and Wesley Allanbrook, Vice President, Technology and Product Management, Dynamic Risk 

(P&GJ) — Effective integrity management and compliance with newly released regulatory requirements rely on accurate and reliable data. Many operators have data management systems containing voluminous datasets representing their pipeline systems.

These data management systems may include geographic information systems (GIS), maintenance management systems, other corporate data systems, and publicly available information. In many cases, these datasets have evolved over decades originating with the conversion of hardcopy drawings and records.  

These datasets may have been augmented through additional record analysis (particularly identifying which data are supported by traceable, verifiable and complete (TVC) records), the addition of inspection results (e.g., in-line inspection (ILI), direct assessment), repair data, and aerial and satellite imagery (used to update pipeline alignment).

Along with the data management system content, the processes and systems used to create, modify, establish TVC status, perform QA/QC, and organize transmission pipeline data for use have evolved. Some pipeline systems, and their associated data, may have been acquired through mergers and acquisitions and include uncertainties related to the provenance of the data and its associated quality and reliability.

Understanding the current quality and reliability of these evolving datasets, and how the processes and analyses that rely on them may be affected, presents a challenge. Depending on the pipeline integrity or regulatory compliance analysis being conducted, the quality and reliability requirements for specific data elements can vary.

As an example, confirming maximum allowable operating pressure (MAOP) as required in § 192.624 requires TVC records for certain material properties and attributes. While operators have spent time and effort evaluating records and determining material properties that are supported by TVC records, questions related to the efficacy of the TVC process and reliability of specific attributes or groups of attributes may remain. 

Other integrity analyses, such as external corrosion direct assessment (ECDA), internal corrosion direct assessment (ICDA), dent strain analysis, and pressure cycle fatigue analysis (PCFA), require material properties and other data fields to complete the analysis. Due to the complexity of these datasets, it may be difficult to determine how to efficiently and effectively deploy resources focused on enhancing data quality.

Data Challenge 

Dynamic Risk, through its Insight Hub platform, is offering an approach to assessing the quality of an operator’s pipeline data and the suitability of the data as a basis for performing specific pipeline integrity or regulatory compliance analysis.

Data quality and reliability are measured for every field in an operator’s pipeline material and integrity data. Automated processes are used to dissect each data field, identify values that conform with industry norms, and detect outliers. Each field is scored for data quality based on the consistency of the data.

As an example, yield strength may be expressed in a variety of formats:  

Yield Strength (minimum) ksi 

CSA Z245.1 Grade 

API 5L Grade
API Specification 5L, Section 10.3.5, is specific on the format used to express pipe grade: “the symbol shall be X followed by the first two digits of the specified minimum yield strength in U.S. Customary units.”

It is common for datasets to express, for example, grade X42 in any number of formats, such as X42, X-42, API 5L X-42, or API X-42. While these examples are readily understood as equivalent by a human reader, automated processes or systems that require data from this field to perform analysis may falter when confronted with such inconsistencies.

The consistency, or lack thereof, in the use of a specific value is an indicator of the quality control performed during data loading and maintenance processes.
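The idea of normalizing equivalent grade spellings and scoring a field's consistency can be sketched as follows. This is an illustrative sketch, not Dynamic Risk's actual implementation; the regex and scoring rule are assumptions.

```python
import re
from collections import Counter

def normalize_grade(raw: str) -> str:
    """Reduce free-text grade entries (e.g., 'API 5L X-42') to the
    API 5L canonical form 'X42'; pass through unrecognized values."""
    match = re.search(r"X[\s-]?(\d{2,3})", raw.upper())
    return f"X{match.group(1)}" if match else raw.strip().upper()

def consistency_score(values: list[str]) -> float:
    """Fraction of entries already in canonical form -- a rough proxy
    for the quality control applied during data loading."""
    return sum(v == normalize_grade(v) for v in values) / len(values)

entries = ["X42", "X-42", "API 5L X-42", "API X-42", "X52"]
print([normalize_grade(v) for v in entries])  # ['X42', 'X42', 'X42', 'X42', 'X52']
print(consistency_score(entries))             # 0.4 -- only 2 of 5 are canonical
```

A low score on a field signals that upstream loading processes accepted many variant spellings, exactly the kind of inconsistency that trips up automated analysis.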

Insight Hub currently contains over 89,477 miles (144,000 km) of transmission pipeline data from more than 20 operators.  

Data quality is measured for every data field through an automated process of comparing the unique values within an operator’s data to industry norms. While pipeline systems may have unique characteristics, knowing where material and attribute data fall outside industry norms provides actionable indications of potential data inconsistencies or errors.
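The norm-comparison step described above amounts to a set-difference check. The sketch below uses a hypothetical norm set for illustration; it is not the actual reference data behind Insight Hub.

```python
# Hypothetical set of industry-norm API 5L grades (illustrative only).
INDUSTRY_GRADES = {"X42", "X46", "X52", "X56", "X60", "X65", "X70", "X80"}

def flag_outliers(values: list[str], norms: set[str]) -> list[str]:
    """Return unique field values not found among industry norms --
    candidates for review, not automatically errors."""
    return sorted(set(values) - norms)

field = ["X52", "X42", "X99", "GR B?", "X60"]
print(flag_outliers(field, INDUSTRY_GRADES))  # ['GR B?', 'X99']
```

Flagged values are review candidates rather than confirmed errors, since a system can legitimately contain uncommon materials.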

Data Validation

Data are further validated by comparing combinations of material properties and attributes to trusted sources such as The History of Line Pipe Manufacturing in North America, among others.

Comparing pipeline populations as defined under § 192.607(e)(1) or other combinations of material and attribute data with trusted sources and identifying unlikely or impossible combinations provides additional indicators of the quality and reliability of material and integrity data. 
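A combination check of this kind can be sketched as a lookup against a reference table. The earliest-production years below are hypothetical placeholders; a real implementation would draw on trusted sources such as line-pipe manufacturing histories.

```python
# Hypothetical reference table: earliest plausible production year by grade.
# These years are illustrative placeholders, not data from any trusted source.
EARLIEST_PRODUCTION_YEAR = {"X42": 1948, "X52": 1952, "X60": 1964, "X70": 1972}

def check_population(grade: str, install_year: int) -> list[str]:
    """Flag grade/vintage combinations the reference table says are
    unlikely or impossible; an empty list means no issues found."""
    earliest = EARLIEST_PRODUCTION_YEAR.get(grade)
    if earliest is None:
        return [f"grade {grade} not in reference table"]
    if install_year < earliest:
        return [f"{grade} installed {install_year} predates first production ({earliest})"]
    return []

print(check_population("X70", 1965))
# ['X70 installed 1965 predates first production (1972)']
```

Each flagged combination points to either a data error or a genuinely unusual segment, and either way directs records review effort where it matters.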


Combining strong data with rigorous risk models supports actionable risk mitigation strategies. 

Synthesizing the results of data quality and reliability assessments with the validation of data against industry norms identifies the key strengths and uncertainties associated with an operator’s pipeline system data.

Quantifying these results provides operators a basis for understanding the reliability of analyses based on these data and where results may contain uncertainty. This is particularly important when performing engineering critical assessments (ECA) to reconfirm MAOP under § 192.624(c)(3), among other pipeline integrity analyses.

Characterizing uncertainties provides direction for the development of action plans targeting enhancement of data quality for specific subsets of data. This leads to measurable changes in the reliability of results.  

Uncertainties within material and attribute data may also indicate that certain integrity analyses are not possible (e.g., if coating data are not reliable or not available, the results of predictive analyses requiring coating data as input will contain uncertainties, or the analysis may not be possible at all).

This knowledge, along with an understanding of the threats associated with a pipeline, leads to the identification and prioritization of data fields that will support additional types of predictive integrity analysis and lead to improvements in pipeline safety.

Addressing the data challenge by quantifying an operator’s current level of data quality and reliability, and its impact on the processes and analyses that rely on these data, yields key insights. These insights provide a basis for planning enhancements that will measurably improve data quality and increase the reliability of integrity analysis processes.
