Causal Criticality of Layers in Chain-of-Thought
Late layers matter. A probe into Qwen2-1.5B with activation patching, a logit lens, and targeted ablations found that the mid-to-late layers — especially around layer 24 — often synthesize intermediate chain-of-thought steps into the final answer.