Global Fossil Infrastructure Tracker Methodology

From Global Energy Monitor
This article is part of the Global Fossil Infrastructure Tracker, a project of Global Energy Monitor.

The Global Fossil Infrastructure Tracker (GFIT) is an information resource on fossil fuel infrastructure projects and their development. It is designed for activists, researchers, and the media. Currently, the GFIT includes all global LNG import terminals and export terminals, and all global oil and gas transmission pipelines over a predetermined size threshold. More information on inclusion criteria and research methodology can be found below.

The Tracker organizes information in both map and table format. The interactive map format allows users to geographically visualize pipeline routes and terminal locations, while the table format allows users to access additional data points on each project. Both the map and table provide menu-based data filtering options as well as links to further information in project-specific wiki pages housed on GEM.wiki. GEM.wiki is a repository of energy infrastructure information managed by Global Energy Monitor and the Center for Media and Democracy. The sources used in research and data collection are cited in each project's wiki page. The GFIT data are updated, published, and distributed twice a year. Following every data update, descriptive statistics are run and published as data summary tables. Activists, researchers, and the media can request the data in tabular format. Requests can be sent to ted.nace@globalenergymonitor.org.

Both the Tracker and data summary tables are available through the Global Gas and Oil Network.

Methodology

Inclusion Criteria

The Global Fossil Infrastructure Tracker (GFIT) includes liquefied natural gas (LNG) import and export terminals, natural gas transmission pipelines, oil transmission pipelines, and proposed natural gas liquid (NGL) transmission pipelines. Distribution and gathering pipelines are not included. Gas pipelines with a capacity under 0.25 bcm/year (equivalent to about 24 MMcf/d) and oil pipelines with a capacity under 6,000 bpd are not included. If a pipeline's capacity isn't available, then length is used to determine inclusion, and pipelines under 100 kilometers are not included. Coverage of pipelines that fall below these inclusion criteria is not globally comprehensive, but coverage of pipelines that meet them is. There is no capacity threshold for terminal inclusion.
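The pipeline thresholds can be sketched as a small check. This is only an illustrative sketch, not the Tracker's actual code: the function and constant names are hypothetical, the treatment of a pipeline with neither capacity nor length is an assumption, and no NGL threshold is encoded because none is stated above. The conversion helper shows why 0.25 bcm/year and 24 MMcf/d describe roughly the same gas threshold.

```python
from typing import Optional

# Thresholds quoted in the inclusion criteria.
MIN_GAS_BCM_PER_YEAR = 0.25
MIN_OIL_BPD = 6_000
MIN_LENGTH_KM = 100

CUBIC_FEET_PER_CUBIC_METER = 35.3147  # standard unit conversion


def bcm_per_year_to_mmcf_per_day(bcm_per_year: float) -> float:
    """Convert billion cubic meters per year to million cubic feet per day."""
    cubic_feet_per_year = bcm_per_year * 1e9 * CUBIC_FEET_PER_CUBIC_METER
    return cubic_feet_per_year / 365 / 1e6


def meets_inclusion_criteria(
    fuel: str,
    capacity: Optional[float] = None,   # bcm/year for gas, bpd for oil
    length_km: Optional[float] = None,
) -> bool:
    """Apply the capacity threshold, falling back to length if capacity is unknown."""
    if capacity is not None:
        if fuel == "gas":
            return capacity >= MIN_GAS_BCM_PER_YEAR
        if fuel == "oil":
            return capacity >= MIN_OIL_BPD
        raise ValueError(f"no capacity threshold stated for fuel {fuel!r}")
    if length_km is not None:
        return length_km >= MIN_LENGTH_KM
    # Assumption: with neither value known, the pipeline cannot be confirmed.
    return False


print(round(bcm_per_year_to_mmcf_per_day(0.25), 1))   # roughly 24 MMcf/d
print(meets_inclusion_criteria("oil", capacity=5_000, length_km=500))
```

In this sketch a known capacity always takes precedence over length, which mirrors the order in which the criteria are described.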

Research Process

The Global Fossil Infrastructure Tracker's (GFIT) data collection method involves professional research analysts searching government, industry, NGO, and general news sources for missing projects and project updates. Data points are recorded in tabular format for aggregate presentation and analysis. In-depth qualitative information is recorded in project-specific wiki pages and may include categories such as financing, environmental impacts, extraction sources, public opposition, aerial photographs, videos, links to permits, coordinates, maps, and others.

Under standard wiki convention, each piece of information is linked to a published reference, such as a news article, company report, or regulatory permit. Alternate names for projects are also recorded. For each project in China, the corresponding Chinese name was identified. Once wiki pages are created and data sets are compiled, they are circulated for review to researchers familiar with local conditions and languages. In order to ensure data integrity in the open-access wiki environment, professional research analysts also review all edits made to project wiki pages.

Following every data update, results are validated against government and industry sources at both the individual and the aggregate levels.

Data Updates

The GFIT is updated twice a year, approximately every six months. Each data update includes four components:

  • verifying operational status, which involves actively researching whether construction has begun on proposed projects, or whether construction has been completed and operations have begun
  • adding project updates, which involves updating projects in the Tracker with any recent qualitative news or data point changes that have occurred since the last data update
  • adding new projects, which involves adding projects that were proposed or announced since the previous GFIT update
  • adding missing projects, which involves adding existing projects that were missing from previous GFIT versions

Verifying Operational Status

During the biannual update, every project with an operational status of "proposed", "construction", or "shelved" is researched to determine if the operational status has changed since the last update.

  • if construction has begun on a project listed as "proposed", the status is changed to "construction"
  • if a project listed as "proposed" has had no development updates in two years, the status is changed to "shelved"
  • if service has begun on a project listed as "construction", the status is changed to "operational"
  • if a project listed as "shelved" has had no development updates in four years, the status is changed to "cancelled"
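The four rules above amount to a small state transition table. The sketch below is illustrative only: the function name and parameters are hypothetical, and two details are assumptions not stated in the text, namely that a confirmed construction start takes precedence over inactivity and that the two-year and four-year limits are inclusive.

```python
def next_status(
    status: str,
    construction_begun: bool = False,
    service_begun: bool = False,
    years_without_updates: float = 0.0,
) -> str:
    """Return the status a project should carry after the biannual review."""
    if status == "proposed":
        if construction_begun:             # assumed to win over inactivity
            return "construction"
        if years_without_updates >= 2:     # inclusive limit is an assumption
            return "shelved"
    elif status == "construction":
        if service_begun:
            return "operational"
    elif status == "shelved":
        if years_without_updates >= 4:     # inclusive limit is an assumption
            return "cancelled"
    # No rule applies, so the status is unchanged.
    return status


print(next_status("proposed", years_without_updates=2.5))   # shelved
print(next_status("construction", service_begun=True))      # operational
```

Statuses outside the three reviewed categories pass through unchanged, since the rules above only cover "proposed", "construction", and "shelved" projects.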

Adding Project Updates

News articles and industry reports are used to update information about projects regardless of operational status. Examples include accidents and explosions, expansion projects, and project retirements, among others.

If an existing pipeline has a proposed expansion, either in its capacity or length, the expansion is added as a new project in the GFIT and to the "Expansion Projects" section of its mainline's wiki page.

Adding New Projects

News articles, government websites, and industry sources are used to find projects that have been newly proposed since the last GFIT update.

Adding Missing Projects

A preliminary list of pipelines and LNG terminals was first gathered from public data sources, including industry, news, and government websites. At that stage, all projects were included. The list was then researched further to acquire additional data points and qualitative information about each project.

Currently, some missing pipeline projects are given research priority over others. Research priorities were established due to the extensive number of transmission pipelines in the world. Priority is defined by three factors:

  • in-development projects, defined as projects that are either proposed or under construction and are either new mainlines or expansions to existing mainlines, are always given research priority
  • significant projects, defined as projects that have a capacity or length over the inclusion criteria defined above, are always given research priority
  • recent projects, defined as projects that have an in-service start date in 2015 or later, are always given research priority
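Because each factor on its own is enough to grant priority, the three factors combine with a logical "or". A minimal sketch, with a hypothetical function name and hypothetical parameters:

```python
from typing import Optional


def has_research_priority(
    status: str,
    meets_size_criteria: bool,
    start_year: Optional[int] = None,
) -> bool:
    """Return True if a missing pipeline satisfies any of the three factors."""
    in_development = status in {"proposed", "construction"}
    recent = start_year is not None and start_year >= 2015
    return in_development or meets_size_criteria or recent


# A small operating pipeline that entered service in 2017 still qualifies as recent.
print(has_research_priority("operating", meets_size_criteria=False, start_year=2017))
```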

Currently, all LNG terminals with either import or export capacity are included in the GFIT.

Mapping

Pipeline routes were determined using publicly available information and maps, usually from industry sources or news publications. Pipeline routes were plotted either by visual approximation or by geo-referencing route images using QGIS 3.4.12. If no map is publicly available, the pipeline's route is approximated based on the written information available.

Terminal locations were approximated based on the written information available.

The Tracker's public interface, including the map view, table view, and filtering mechanism, was developed by GreenInfo Network using the open-source Leaflet JavaScript library.

Data Summary Tables

At the end of every biannual data update, descriptive statistics are generated using Google Sheets, R 4.0.0, and Python 3.8.2 based on the updated data set.

Pipeline Data Summary Tables

All data summary tables are broken down by fuel type (oil, gas, or NGL) and project status, and by either country, region, owner, or start year. Within these groups, the tables present either the total number of pipelines, the total pipeline kilometers, or the total pipeline capacity.

Pipelines that have multiple owners were prorated and assigned equal shares unless specific ownership shares were available. Kilometer tables were calculated using the pipeline's publicly available length, when possible. If no length value is publicly available, pipeline length estimates are calculated using pipeline route coordinates in the GeoPandas 0.7.0 package in Python 3.8.2. Country and region tables were calculated by estimating the percentage of pipeline that falls into different countries. Country percentages were estimated using the GeoPandas 0.7.0 package in Python 3.8.2. Regions are defined by the International Energy Agency's (IEA) World Energy Outlook report.
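The idea of estimating length from route coordinates can be illustrated without any geospatial dependency. The sketch below is not the GeoPandas computation described above: it is a plain-Python stand-in that sums great-circle (haversine) distances between consecutive route points, and the function names and sample route are hypothetical.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def route_length_km(route: List[Tuple[float, float]]) -> float:
    """Sum segment lengths along a route of (latitude, longitude) points."""
    return sum(haversine_km(*start, *end) for start, end in zip(route, route[1:]))


# Hypothetical three-point route running north along one meridian.
sample_route = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(round(route_length_km(sample_route), 1))
```

The estimate is only as good as the plotted route: a route drawn by visual approximation with few vertices will tend to understate the true length.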

LNG Terminal Data Tables

The LNG terminal data tables are organized by LNG import terminals and LNG export trains. All data tables are broken down by facility type (import terminal or export train) and project status, and by either country, region, owner, or start year. Within these groups, the tables present either the total number of projects or the total capacity.

Projects with multiple owners were prorated and assigned equal shares unless specific ownership shares were available. Regional definitions are based on the definitions used by the International Energy Agency's (IEA) World Energy Outlook report.
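The proration rule used for both pipelines and terminals can be sketched as follows. The function names and the example owners are hypothetical, and the case where shares are reported for only some owners is not covered because the text does not describe it.

```python
from typing import Dict, List, Optional


def ownership_shares(
    owners: List[str],
    reported_shares: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Use reported shares when available; otherwise split equally among owners."""
    if reported_shares:
        return dict(reported_shares)
    if not owners:
        return {}
    equal_share = 1 / len(owners)
    return {owner: equal_share for owner in owners}


def prorate(value: float, shares: Dict[str, float]) -> Dict[str, float]:
    """Attribute a project total (kilometers or capacity) to each owner."""
    return {owner: value * share for owner, share in shares.items()}


# A 300 km pipeline with two owners and no published ownership split.
print(prorate(300, ownership_shares(["Owner A", "Owner B"])))
# The same pipeline when a 75/25 split has been reported.
print(prorate(300, ownership_shares(["Owner A", "Owner B"],
                                    {"Owner A": 0.75, "Owner B": 0.25})))
```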

Data Dictionary

The following table provides data field definitions for the variables included in the Tracker’s table view and the data summary tables. These and additional variables are maintained in the GFIT's data set.

Global Fossil Infrastructure Tracker Data Dictionary
Variable Name | Definition
Project | The primary name of the pipeline or terminal.
Countries | The country that the terminal or domestic pipeline is in, or the countries that an international pipeline passes through.
Type | The type of infrastructure project. The current options are: liquefied natural gas terminals, gas pipelines, oil pipelines, or NGL pipelines.
Status | For each project, one of the following operational status categories is assigned and reviewed every six months:

1) Proposed: The item has been proposed but construction has not yet commenced.

2) Construction: Site preparation and other development and construction activities are underway.

3) Shelved: Either the sponsor has publicly announced that it is putting its plans on hold, or there have been no reports of development activity over a period of two to four years, in which case the project is presumed to be "shelved".

4) Cancelled: In some cases a sponsor publicly announces that it has cancelled a project. More often a project fails to advance and then quietly disappears from company documents. A project that was previously in an active category is moved to “Cancelled” if it disappears from company documents, even if no announcement is made. A project is also considered “cancelled” if there have been no indications of development over a period of four years.

5) Operating: The project has been formally commissioned or has entered commercial operation.

6) Idle: Construction has been completed, but the project has not entered commercial operation and there is no indication that it will. Or, the project was at one time operational and now sits unused, but has not been formally mothballed.

7) Mothballed: The project has been formally taken offline, but not yet permanently retired.

8) Retired: The project has been permanently taken offline.

Start Year | The initial in-service year for operating, idle, mothballed, or retired projects. Or, the expected initial in-service year for proposed, construction, shelved, or cancelled projects.
Owner | The organization(s) at the highest level of project ownership and their respective ownership percentage(s). If projects are owned by subsidiaries, the subsidiaries may be listed in the project's wiki page, but only the subsidiaries' highest-level owners are included in the Tracker.
Length | The pipeline's length in kilometers. When a pipeline's length is publicly available, that value is used. If no length value is publicly available, then the pipeline's length is estimated based on its route coordinates using the GeoPandas 0.7.0 package in Python 3.8.2.
Capacity | The amount of product able to pass through the terminal or pipeline during a specified time interval. The unit of measurement varies based on the type of product and infrastructure, so all terminal capacity values have been standardized to million tons per annum (mtpa). All pipeline capacity values have been standardized to barrels of oil equivalent per day (boe/d). When sources report a range, the maximum capacity value is used.

Data Quality

The amount and quality of available information varies widely between projects. Some variables are almost always available, such as the approximate location, the operational status, and the product type. Start year is the most frequently missing value for both pipelines and LNG terminals. For LNG terminals, however, the missing Start Year values belong almost exclusively to in-development projects. Capacity, owner, and pipeline length are also frequently missing. More data is missing from pipeline projects than from LNG terminal projects.

The number and percentage of missing data points are presented in the tables below by variable and project type. "Known Length" refers to the pipeline's length as reported by sources, whereas "Length" refers to a pipeline's length when estimated based on its route's coordinates using Python's geographic statistics packages.
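Each cell in the data quality tables pairs a count of missing values with that count as a share of all projects of the given type. A minimal sketch of that calculation, assuming records are held as dictionaries with `None` or an empty string marking a missing value; the function names and sample records are hypothetical.

```python
from typing import Any, Dict, List


def count_missing(records: List[Dict[str, Any]], variable: str) -> int:
    """Count records where the variable is absent, None, or an empty string."""
    return sum(1 for record in records if record.get(variable) in (None, ""))


def format_missing(missing: int, total: int) -> str:
    """Format a cell as 'count (percent%)' with at most one decimal place."""
    percent = round(100 * missing / total, 1)
    return f"{missing} ({percent:g}%)"


# Hypothetical records: one of four pipelines has no start year.
sample = [
    {"Project": "Pipeline A", "Start Year": 2016},
    {"Project": "Pipeline B", "Start Year": None},
    {"Project": "Pipeline C", "Start Year": 1998},
    {"Project": "Pipeline D", "Start Year": 2021},
]
print(format_missing(count_missing(sample, "Start Year"), len(sample)))
```

Applied to the Owner row for all pipelines, 344 missing values out of 1,744 pipelines gives 19.7%.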

Pipeline Data Quality

The March 2020 release of the Global Fossil Infrastructure Tracker includes 1,744 pipelines: 1,288 gas pipelines and 456 oil pipelines. The following table shows the number and percentage of values missing from key variables by product type.

Missing Data Values by Variable and Pipeline Type
Variable Name | All Pipelines | Gas Pipelines | Oil Pipelines
Country | 0 (0%) | 0 (0%) | 0 (0%)
Fuel | 0 (0%) | 0 (0%) | 0 (0%)
Project Status | 0 (0%) | 0 (0%) | 0 (0%)
Region | 0 (0%) | 0 (0%) | 0 (0%)
Length | 35 (2%) | 24 (1.9%) | 11 (2.4%)
Known Length | 224 (12.8%) | 182 (14.1%) | 44 (9.7%)
Owner | 344 (19.7%) | 316 (24.5%) | 28 (6.1%)
Capacity | 694 (39.8%) | 591 (45.9%) | 103 (22.6%)
Start Year | 826 (47.4%) | 584 (45.3%) | 243 (53.3%)

LNG Terminal Data Quality

The June 2021 release of the Global Fossil Infrastructure Tracker includes 1,090 LNG terminal projects, comprising 559 import terminal projects and 531 export trains. The following table shows the number and percentage of values missing from key variables by infrastructure type.

Missing Data Values by Variable and Facility Type
Variable Name | Import Terminals | Export Trains
Country | 0 (0%) | 0 (0%)
Facility Type | 0 (0%) | 0 (0%)
Project Status | 0 (0%) | 0 (0%)
Region | 0 (0%) | 0 (0%)
Owner | 20 (3.6%) | 5 (0.9%)
Capacity | 83 (14.8%) | 19 (3.6%)
Start Year | 189 (33.8%) | 135 (25.4%)