Layer formula
H^(l+1) = σ(D̂^(-1/2) Â D̂^(-1/2) H^l W^l), where Â = A + I (adjacency with self-loops) and D̂ = diagonal degree matrix of Â.
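A minimal NumPy sketch of the formula above, for illustration only. The function names, the toy 3-node graph, the random weights, and the choice of ReLU for σ are all assumptions, not part of the notes.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D̂^(-1/2) Â D̂^(-1/2), with Â = A + I."""
    A_hat = A + np.eye(A.shape[0])        # add self-loops
    deg = A_hat.sum(axis=1)               # degrees of Â (always >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Scale row i by d_i^(-1/2) and column j by d_j^(-1/2)
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_norm, H, W):
    """One layer: H^(l+1) = σ(A_norm H^l W^l), with σ assumed to be ReLU."""
    return np.maximum(A_norm @ H @ W, 0.0)

# Hypothetical toy graph: path 0 - 1 - 2
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
rng = np.random.default_rng(0)
H0 = rng.normal(size=(3, 4))   # 3 nodes, 4 input features
W0 = rng.normal(size=(4, 2))   # maps 4 features to 2

A_norm = normalized_adjacency(A)
H1 = gcn_layer(A_norm, H0, W0)  # shape (3, 2)
```

Because of the self-loops, every degree in D̂ is at least 1, so the inverse square root is always defined.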
Interpretation
Each layer: aggregate features from neighbors and the node itself (mean-like, degree-normalized), transform via W, apply nonlinearity σ. Two layers ≈ 2-hop neighborhood info.
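The 2-hop claim can be checked on a small example. The sketch below is a simplification: it applies only the normalized propagation step (W taken as identity, no nonlinearity) to a hypothetical 4-node path graph, with a signal placed on node 0.

```python
import numpy as np

# Hypothetical path graph: 0 - 1 - 2 - 3
A = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])
A_hat = A + np.eye(4)                              # add self-loops
d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
P = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

x = np.array([1.0, 0.0, 0.0, 0.0])   # signal only on node 0
one_hop = P @ x                      # after one propagation step
two_hop = P @ one_hop                # after two propagation steps

reached_1 = np.nonzero(one_hop)[0].tolist()   # [0, 1]
reached_2 = np.nonzero(two_hop)[0].tolist()   # [0, 1, 2]
```

Node 3 is three hops from node 0, so it receives nothing after two steps. A third layer would be needed to reach it.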
Semi-supervised classification
Some nodes labeled, rest unlabeled. Softmax over classes at the output; cross-entropy computed on labeled nodes only.
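A sketch of the masked loss, assuming the network outputs one row of logits per node. The function names, the toy logits, and the convention of marking unlabeled nodes with -1 are illustrative assumptions.

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax over classes."""
    Z = Z - Z.max(axis=1, keepdims=True)   # subtract row max for stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def masked_cross_entropy(logits, labels, labeled_mask):
    """Mean cross-entropy over labeled nodes only."""
    probs = softmax(logits)
    idx = np.nonzero(labeled_mask)[0]      # indices of labeled nodes
    picked = probs[idx, labels[idx]]       # probability of the true class
    return float(-np.mean(np.log(picked)))

# Hypothetical output for 4 nodes, 2 classes
logits = np.array([[2.0, 0.0],
                   [0.0, 2.0],
                   [1.0, 1.0],
                   [0.5, -0.5]])
labels = np.array([0, 1, -1, -1])          # -1 marks an unlabeled node (assumed convention)
labeled_mask = labels >= 0

loss = masked_cross_entropy(logits, labels, labeled_mask)
```

Unlabeled nodes contribute nothing to the loss directly, but their features still shape the labeled nodes' predictions through the graph propagation.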
Limitations
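Doesn't distinguish importance of neighbors (aggregation weights are fixed by degree, not learned). Over-smoothing at depth: node features become nearly indistinguishable. Fixed graph structure needed.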
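Over-smoothing can be illustrated in a simplified linear setting. The sketch below repeatedly applies only the normalized propagation step (no W, no nonlinearity) on a hypothetical 5-node path graph with random features. The `spread` helper is an illustrative measure, not a standard metric.

```python
import numpy as np

# Hypothetical connected graph: path 0 - 1 - 2 - 3 - 4
n = 5
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

A_hat = A + np.eye(n)                      # add self-loops
deg = A_hat.sum(axis=1)
d_inv_sqrt = 1.0 / np.sqrt(deg)
P = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

rng = np.random.default_rng(0)
X = rng.normal(size=(n, 3))                # random initial node features

def spread(H):
    """Largest per-feature std across nodes, after dividing by sqrt(degree)."""
    return float((H / np.sqrt(deg)[:, None]).std(axis=0).max())

H = X.copy()
spread_before = spread(H)
for _ in range(200):                       # 200 propagation steps
    H = P @ H
spread_after = spread(H)                   # close to zero
```

In this linear setting on a connected graph, every node's features converge to the same vector up to a sqrt(degree) scale, so the nodes can no longer be told apart. Trained weights and nonlinearities change the details, so this shows the tendency rather than the behavior of any particular model.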