Forecast bilateral trade series using importer and exporter node network statistics and XGBoost. The goal here is to find which geometric information is most useful for forecasting. Is any geometric information of use? Does it motivate using something like a GNN?

XGBoost Regression Network Statistics

In previous work I've calculated some basic network statistics on IMF Direction of Trade Statistics (DOTS) export data.

I looked for relevant features using univariate linear regression in a previous notebook.

In this notebook I'll use XGBoost.

Table of Contents:

  1. Load and clean data
  2. For each trade series, fit an XGBoost regression of the export series on the exporter's network statistics
    • This could be re-run on the importer's statistics
    • Recalculate the network with edges as nodes: example
  3. Sort by mean absolute error.
  4. Collapse the network statistics with PCA, then repeat steps 2 and 3 on the PCA series

1. Load and clean data
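
As a rough sketch of what this step might look like, assume the DOTS export data has already been reshaped into a long table with the hypothetical columns `date`, `exporter`, `importer`, and `value`. These names are placeholders for illustration, and the actual IMF file layout may differ. The toy frame below stands in for the real file.

```python
import pandas as pd


def clean_trade_panel(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a wide panel: one row per date, one column per (exporter, importer) pair.

    Column names are assumptions, not the real DOTS schema.
    """
    df = raw.copy()
    df["date"] = pd.to_datetime(df["date"])
    # Non-numeric entries (e.g. "n/a") become NaN and are dropped.
    df["value"] = pd.to_numeric(df["value"], errors="coerce")
    df = df.dropna(subset=["value"])
    # Drop self-loops: a country does not export to itself.
    df = df[df["exporter"] != df["importer"]]
    panel = df.pivot_table(
        index="date", columns=["exporter", "importer"], values="value", aggfunc="sum"
    )
    return panel.sort_index()


# Toy data standing in for the real file.
raw = pd.DataFrame(
    {
        "date": ["2020-01-01", "2020-01-01", "2020-02-01", "2020-02-01", "2020-02-01"],
        "exporter": ["USA", "CHN", "USA", "CHN", "USA"],
        "importer": ["CHN", "USA", "CHN", "USA", "USA"],
        "value": ["10.5", "20.0", "11.0", "n/a", "3.0"],
    }
)
panel = clean_trade_panel(raw)
print(panel)
```

Each column of `panel` is then one bilateral trade series, which is the unit the later steps loop over.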

2. Loop and XGBoost

Series with a lower standard deviation are easier to forecast out of sample.
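
Part of this pattern may be mechanical: MAE is measured in the units of the series, so a series with a larger spread will tend to show a larger MAE even when it is no harder to predict in relative terms. The sketch below illustrates this on synthetic data, with a naive last-value forecast standing in for the fitted model. Dividing the MAE by the training-sample standard deviation is one assumed way to compare series on an equal footing. It is not something the original analysis is known to do.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n, n_test = 400, 100
scales = {"A": 1.0, "B": 3.0, "C": 10.0, "D": 30.0}  # same process, different units

rows = {}
for name, scale in scales.items():
    y = pd.Series(scale * rng.normal(size=n))
    naive = y.shift(1)                  # last-value forecast (stand-in for a model)
    err = (y - naive).iloc[-n_test:]    # out-of-sample errors
    mae = err.abs().mean()
    std = y.iloc[:-n_test].std()        # spread measured on the training part only
    rows[name] = {"std": std, "mae": mae, "scaled_mae": mae / std}

summary = pd.DataFrame.from_dict(rows, orient="index")
# Rank correlation between spread and raw error (Pearson on ranks).
rank_corr = summary["std"].rank().corr(summary["mae"].rank())
print(summary)
print("rank correlation of std and MAE:", rank_corr)
```

In this toy setup the raw MAE rises with the standard deviation while the scaled MAE stays in a similar range across series, so a ranking by raw MAE partly reflects scale rather than forecastability.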

4. PCA on Network Statistics to Reduce Dimensionality
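
A minimal sketch of the reduction step using scikit-learn, assuming the statistics are standardized first and that enough components are kept to explain 95% of the variance. Both choices are illustrative, as are the synthetic statistics `stat_0` to `stat_5`. The scaler and PCA are fit on the training rows only, so no information from the test period leaks into the components.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic network statistics: six correlated columns driven by two latent factors.
rng = np.random.default_rng(2)
n = 120
factors = rng.normal(size=(n, 2))
loadings = np.array([[1.0, 0.8, 0.5, 0.2, 0.0, 1.0],
                     [0.0, 0.2, 0.5, 0.8, 1.0, -1.0]])
stats = pd.DataFrame(
    factors @ loadings + 0.1 * rng.normal(size=(n, 6)),
    columns=[f"stat_{i}" for i in range(6)],  # hypothetical names
)

# Chronological split, matching the forecasting setup.
split = int(n * 0.75)
train, test = stats.iloc[:split], stats.iloc[split:]

# A float n_components keeps the smallest number of components reaching that variance share.
reducer = make_pipeline(StandardScaler(), PCA(n_components=0.95, svd_solver="full"))
Z_train = reducer.fit_transform(train)  # fit on training rows only
Z_test = reducer.transform(test)        # reuse the fitted scaler and components

pca = reducer.named_steps["pca"]
n_kept = pca.n_components_
print("components kept:", n_kept)
print("explained variance ratio:", pca.explained_variance_ratio_)
```

The columns of `Z_train` and `Z_test` can then be passed to the regressor in place of the raw statistics, and the resulting errors compared against the model fit on the unreduced features.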

Red is the PCA model forecast, blue is the pre-PCA model forecast, and green is the actual series.