Why data hierarchy matters in standard gas datasets
When NAESB first created standardized datasets for data exchange, all of them were designed for EDI implementations. An EDI implementation has a natural data hierarchy that drives data placement. Times have changed.
As development of additional datasets has progressed, several gas-related datasets have been created without a corresponding EDI implementation. Each of these amounts to a data dictionary of data elements with no instruction to the user on how the data should be placed. Nothing states that the data is intended as a flat-file presentation, so it is up to the implementer to determine where the hierarchy belongs.
In case this isn’t your specialty – let’s go into a little more detail. If you look at most NAESB datasets, there is a 3-tiered approach to the data (there are a few with 4, but the same rules apply). The first tier, referred to as the Header, informs the two parties of who the information is from, who is to receive the information, and a date or date ranges. The second tier, referred to as Detail, may contain contract information, location information and/or a range of dates relevant to the Header. There may be multiple Details inside of one Header in the file. The third tier, referred to as the Sub-Detail, usually contains the line-item information. This may be location data, rate data, volumes, nomination line items or other specifics dependent on the dataset. There are usually multiple Sub-Details per Detail. As mentioned, there may be an additional level of detail, referred to as the Sub-Sub-Detail in some datasets. All of this means that for one dataset, there is one Header with one or more Details where each Detail contains one or more Sub-Details.
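To make the tiers concrete, here is a minimal sketch of the Header > Detail > Sub-Detail nesting as Python dataclasses. The field names (sender, contract, location, quantity) and the sample values are hypothetical and chosen only for illustration; they are not taken from any NAESB data dictionary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SubDetail:
    # Third tier: line-item information (hypothetical fields).
    location: str
    quantity: int


@dataclass
class Detail:
    # Second tier: contract-level information; holds one or more Sub-Details.
    contract: str
    sub_details: List[SubDetail] = field(default_factory=list)


@dataclass
class Header:
    # First tier: who the data is from, who receives it, and the date range.
    sender: str
    receiver: str
    begin_date: str
    end_date: str
    details: List[Detail] = field(default_factory=list)


def count_line_items(header: Header) -> int:
    """Count every Sub-Detail under every Detail of one Header."""
    return sum(len(detail.sub_details) for detail in header.details)


# One Header, two Details, three Sub-Details in total (made-up sample data).
example = Header(
    sender="PIPELINE-A",
    receiver="SHIPPER-1",
    begin_date="2024-01-01",
    end_date="2024-01-31",
    details=[
        Detail("K1001", [SubDetail("LOC-1", 500), SubDetail("LOC-2", 250)]),
        Detail("K1002", [SubDetail("LOC-1", 100)]),
    ],
)
```

Because each tier owns the tier below it, a reader of the data always knows where a given element lives: a quantity belongs to a location line, which belongs to a contract, which belongs to one sender, receiver, and date range.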
Okay, back to the original point. If a NAESB dataset does not have EDI instructions, then that dataset does not have a corresponding hierarchy. Let’s suppose that the dataset contains a contract and contract rates. Without the hierarchy, I might decide to list each rate with all of the contracts that carry it, or I might decide to list each contract with all of the rates that apply to it. Either choice might seem fine because it fits the apparent business need.
Now suppose I am a marketer who does business on 60+ different pipelines and I need to record this data in my own data management system. If one pipeline presents the data as Date > Contract > Rate, another presents it as Rate > Contract > Date, and a third presents all of the data elements on each row, then it might be impossible for me to figure out how to record this information in my system.
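The sketch below illustrates the marketer's burden under stated assumptions: three made-up pipelines publish the same two facts in the three layouts described above, and the marketer needs a separate adapter for each one to reach a single internal record shape. The function names, the dictionary keys, and the sample values are all hypothetical, and the sketch assumes the marketer already knows which layout each pipeline uses, which is exactly the knowledge a missing hierarchy fails to provide.

```python
from typing import Dict, List, Set, Tuple

# The marketer's internal record: (date, contract, rate).
Record = Tuple[str, str, str]


def from_date_contract_rate(data: Dict[str, Dict[str, List[str]]]) -> Set[Record]:
    """Adapter for a pipeline that nests data as Date > Contract > Rate."""
    return {
        (date, contract, rate)
        for date, contracts in data.items()
        for contract, rates in contracts.items()
        for rate in rates
    }


def from_rate_contract_date(data: Dict[str, Dict[str, List[str]]]) -> Set[Record]:
    """Adapter for a pipeline that nests data as Rate > Contract > Date."""
    return {
        (date, contract, rate)
        for rate, contracts in data.items()
        for contract, dates in contracts.items()
        for date in dates
    }


def from_flat_rows(rows: List[Dict[str, str]]) -> Set[Record]:
    """Adapter for a pipeline that repeats every data element on each row."""
    return {(row["date"], row["contract"], row["rate"]) for row in rows}


# The same two facts, published three different ways (made-up sample data).
pipeline_a = {"2024-01-01": {"K1001": ["0.25", "0.30"]}}
pipeline_b = {
    "0.25": {"K1001": ["2024-01-01"]},
    "0.30": {"K1001": ["2024-01-01"]},
}
pipeline_c = [
    {"date": "2024-01-01", "contract": "K1001", "rate": "0.25"},
    {"date": "2024-01-01", "contract": "K1001", "rate": "0.30"},
]

records_a = from_date_contract_rate(pipeline_a)
records_b = from_rate_contract_date(pipeline_b)
records_c = from_flat_rows(pipeline_c)
```

All three adapters yield the same set of records here, but only because each was written for a known layout. With a common hierarchy, one adapter could serve every pipeline instead of one per pipeline.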
Designating dataset hierarchies does not need to go to the level of detail of NAESB’s previous ‘data groupings.’ It will, however, create a level of clarity for users viewing the screens and downloading data. With hierarchies in place, users will know what data to expect and, generally, where it will appear.
The problem is that pipelines have implemented these datasets without this level of direction, so instituting hierarchies will require changes for some pipelines. The benefit is that the users of this data, the shippers, marketers, and agents on the pipeline systems, will find the data more useful.
This effort will not be an overnight task. It will require time and, probably, negotiation to reach common ground.