Feature Engineering Trade Network Statistics

In previous work I've calculated some basic network statistics on IMF Direction of Trade Statistics (DOTS) export data.

In this notebook I'll check whether these network statistics have any relationship with the percent change in the bilateral export series. TL;DR: so far, no linear relationships.

Table of Contents:

  1. Load and clean data
  2. For each trade series, run a univariate linear regression of the bilateral export series on the exporter's network statistics
    • This could be re-run on the importer's statistics
    • Recalculate the network with edges as nodes: example
  3. Sort by p-value, r^2, and AIC; check with plots
  4. Collapse the network statistics with PCA, then repeat steps 2 and 3 on the PCA series

    improvements / future work:

1. Load and clean data
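
The loading code itself is not reproduced here. As a minimal sketch of the cleaning step, assuming a long-format table with hypothetical columns `exporter`, `importer`, `period`, and `exports` (the toy values are made up, not DOTS data), the percent change of each bilateral series could be computed like this:

```python
import pandas as pd

# Toy long-format data; column names and values are illustrative assumptions,
# not the actual DOTS layout.
raw = pd.DataFrame({
    "exporter": ["A", "A", "A", "B", "B", "B"],
    "importer": ["B", "B", "B", "A", "A", "A"],
    "period": [2000, 2001, 2002, 2000, 2001, 2002],
    "exports": [100.0, 110.0, 99.0, 50.0, 50.0, 75.0],
})

# Sort so pct_change runs in time order within each exporter-importer pair.
raw = raw.sort_values(["exporter", "importer", "period"])
raw["exports_pct_change"] = (
    raw.groupby(["exporter", "importer"])["exports"].pct_change()
)

# The first period of each pair has no previous value, so drop it.
clean = raw.dropna(subset=["exports_pct_change"]).reset_index(drop=True)
```

Grouping by the exporter-importer pair keeps the percent change from leaking across the boundary between two different bilateral series.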

2. Loop and Linear Regress
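
A rough sketch of what one pass of this loop might look like, using `scipy.stats.linregress` on synthetic data. The feature names, the random data, and the helper `univariate_ols` are assumptions for illustration; the original loop runs over every trade series and may use a different library.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 40

# Synthetic stand-ins: one export percent-change series and three
# hypothetical exporter network statistics (names are assumptions).
y = rng.normal(size=n)
features = {
    "degree_centrality": rng.normal(size=n),
    "pagerank": rng.normal(size=n),
    "n_edges": rng.normal(size=n),
}

def univariate_ols(x, y):
    """Fit y = a + b*x and return slope, p-value, r^2 and AIC."""
    fit = stats.linregress(x, y)
    resid = y - (fit.intercept + fit.slope * x)
    rss = float(np.sum(resid ** 2))
    n_obs = len(y)
    k = 2  # estimated parameters: intercept and slope
    # Gaussian log-likelihood evaluated at the OLS estimates.
    llf = -0.5 * n_obs * (np.log(2 * np.pi * rss / n_obs) + 1)
    return {
        "slope": fit.slope,
        "p_value": fit.pvalue,
        "r2": fit.rvalue ** 2,
        "aic": 2 * k - 2 * llf,
    }

# One regression per network statistic for this single series.
results = {name: univariate_ols(x, y) for name, x in features.items()}
```

Wrapping this in an outer loop over trade series gives one row of statistics per (series, feature) pair, which is the shape the filtering step needs.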

3. Filter univariate regression results
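
One way the filtering could be sketched with pandas, assuming the regression output has been collected into a table with hypothetical columns `series`, `feature`, `p_value`, `r2`, and `aic`. The numbers below are made up purely to show the mechanics, and the 0.05 threshold is an arbitrary choice.

```python
import pandas as pd

# Made-up regression summaries, one row per (series, statistic) pair.
results = pd.DataFrame({
    "series": ["A->B", "A->C", "B->A", "B->C"],
    "feature": ["pagerank", "n_edges", "pagerank", "degree_centrality"],
    "p_value": [0.03, 0.40, 0.008, 0.20],
    "r2": [0.12, 0.02, 0.25, 0.05],
    "aic": [110.5, 118.2, 101.3, 115.0],
})

alpha = 0.05  # significance threshold, an arbitrary choice for this sketch

# Keep significant fits, then rank: low p-value, high r^2, low AIC.
candidates = (
    results[results["p_value"] < alpha]
    .sort_values(["p_value", "r2", "aic"], ascending=[True, False, True])
    .reset_index(drop=True)
)
```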

4. Visual Check

Let's do a visual check of the series that came back with anything remotely resembling a linear relationship.

We can see that many of the relationships are driven by outliers, so the regression statistics for them are misleading.
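
To illustrate why outliers make these numbers misleading, here is a small synthetic sketch (not the notebook's data): two unrelated series get a high r^2 once a single extreme point is added. The helper `r_squared` is a name introduced for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# 30 points with no underlying relationship between x and y.
x = rng.normal(size=30)
y = rng.normal(size=30)

def r_squared(x, y):
    """Squared Pearson correlation, equal to r^2 of a univariate OLS fit."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)

r2_clean = r_squared(x, y)

# Add one extreme observation far from the rest of the cloud.
x_out = np.append(x, 20.0)
y_out = np.append(y, 20.0)
r2_outlier = r_squared(x_out, y_out)

print(f"r^2 without outlier: {r2_clean:.3f}")
print(f"r^2 with one outlier: {r2_outlier:.3f}")
```

A scatter plot makes this kind of leverage point obvious, which is why the visual check matters before trusting a sorted table of p-values.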

5. PCA on Network Statistics to Reduce Dimensionality

Example of the relationship between the original features and the principal components. Each value can be interpreted as the correlation between an original feature and a component.
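
The correlation reading holds for loadings computed on standardized features and scaled by the square root of each component's eigenvalue; raw unit-length component vectors would need that scaling first. A minimal NumPy sketch on synthetic data (the matrix `X` is a made-up stand-in for the network statistics) checks this directly:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for the network-statistics matrix (rows: observations,
# columns: features): two correlated columns plus one independent column.
base = rng.normal(size=200)
X = np.column_stack([
    base + 0.3 * rng.normal(size=200),
    base + 0.3 * rng.normal(size=200),
    rng.normal(size=200),
])

# Standardize so every feature has mean 0 and unit variance.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via eigen-decomposition of the correlation matrix.
corr = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]  # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Z @ eigvecs                   # principal component series
loadings = eigvecs * np.sqrt(eigvals)  # feature-by-component loadings

# Direct check: correlation between each feature and each component.
direct = np.array([
    [np.corrcoef(Z[:, i], scores[:, j])[0, 1] for j in range(3)]
    for i in range(3)
])
```

Here `loadings` and `direct` agree up to floating-point error, so each entry of the loadings table can be read as a feature-to-component correlation.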

An attempt to interpret the principal components:

PC 0: Centrality/Degree measures -> "Connectivity"

PC 1: Macro features such as the number of edges and nodes, with a negative relationship to PageRank values

PC 2: A bit of everything; it overlaps PageRank and the number of edges/nodes, which are clearly separated in PC 1