Estimating coal mine workforce size

From Global Energy Monitor
This article is part of the Global Coal Mine Tracker, a project of Global Energy Monitor.

Global Energy Monitor's Global Coal Mine Tracker uses machine learning to estimate the size of the workforce, or employment, at specific coal mines when that information is otherwise unavailable.

Global Energy Monitor first published its coal mine employment estimates in the April 2023 version of the dataset.

Coal mine workforces

The Global Coal Mine Tracker relies on corporate, government, and reliable media sources to gather data on employment and coal mine workforces. This information on coal and jobs is crucial for planning coal phase-outs and just transitions.

But employment information is not always accessible or transparent. To account for this, we built a machine learning tool in 2021-2022 to estimate the size of coal workforces when that data was otherwise unavailable.

Methodology

Our machine learning tool uses "supervised training" to estimate workforce size at coal mines. In this case, the supervised training consisted of a routine in which a mine's known workforce size was withheld from the model, and the model's prediction was then compared with the withheld value to assess the accuracy of the estimate.
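The withhold-and-compare routine described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mine records are invented, the single input (annual production) is a simplification, and ordinary least squares is a stand-in model, since this article does not specify the algorithm the tracker actually uses. The names KNOWN_MINES, fit_line, and holdout_errors are hypothetical.

```python
# Hypothetical records: (annual production in Mt, known workforce size).
# These numbers are invented for illustration, not taken from the tracker.
KNOWN_MINES = [
    (1.0, 210), (2.0, 400), (3.0, 610), (4.0, 790),
    (5.0, 1000), (6.0, 1210), (7.0, 1390), (8.0, 1600),
]


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def holdout_errors(points, holdout_every=4):
    """Withhold every holdout_every-th label, fit on the rest, then
    return (actual, predicted) pairs for the withheld mines."""
    train = [p for i, p in enumerate(points) if i % holdout_every != 0]
    held = [p for i, p in enumerate(points) if i % holdout_every == 0]
    slope, intercept = fit_line(train)
    return [(y, slope * x + intercept) for x, y in held]


results = holdout_errors(KNOWN_MINES)
mean_abs_error = sum(abs(actual - pred) for actual, pred in results) / len(results)
for actual, pred in results:
    print(f"actual={actual} predicted={pred:.0f}")
print(f"mean absolute error: {mean_abs_error:.1f}")
```

The mean absolute error over the withheld mines is one possible accuracy measure; the article does not say which metric was used in practice.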

Input and label data

The data we used for predicting coal mine workforce size consisted of two categories: the label data and the input data.

The label data is the quantity that our model needs to predict: in this case, the size of a coal mine workforce. The input data consists of the variables that the label depends on, such as coal production, mine size, etc.

In our model, some input features have numerical values, while others have categorical values.
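As a rough illustration of how numerical and categorical inputs can be combined, the sketch below passes a numerical feature through unchanged and turns a categorical feature into a one-hot vector. The field names production_mt and mine_type, the example values, and the choice of one-hot encoding are all assumptions made for this example; the article does not state which features are categorical or how the tracker encodes them.

```python
# Hypothetical mine records; field names and values are illustrative only.
mines = [
    {"production_mt": 2.5, "mine_type": "underground"},
    {"production_mt": 8.0, "mine_type": "surface"},
    {"production_mt": 4.1, "mine_type": "underground"},
]


def build_vocabulary(records, field):
    """Collect the distinct categories of a field in sorted order."""
    return sorted({r[field] for r in records})


def encode(record, vocabulary):
    """Keep the numerical value as is and one-hot encode the category."""
    one_hot = [1.0 if record["mine_type"] == c else 0.0 for c in vocabulary]
    return [record["production_mt"]] + one_hot


vocab = build_vocabulary(mines, "mine_type")
feature_rows = [encode(m, vocab) for m in mines]
for row in feature_rows:
    print(row)
```

Each row then has the same length regardless of which category a mine belongs to, which is what most numerical models expect as input.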