# Metadata

Source URL:: https://arxiv.org/abs/2207.02505
Topics:: #ai

---

# Pure Transformers are Powerful Graph Learners

We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning both in theory and practice. Given a graph, we simply treat all nodes and edges as independent tokens, augment them with token embeddings, and feed them to a Transformer. With an appropriate choice of token embeddings, we prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers, which is already more expressive than all message-passing Graph Neural Networks (GNNs). When trained on a large-scale graph dataset (PCQM4Mv2), our method, coined Tokenized Graph Transformer (TokenGT), achieves significantly better results than GNN baselines and competitive results compared to Transformer variants with sophisticated graph-specific inductive bias. Our implementation is available at https://github.com/jw9730/tokengt.

## Highlights

> [!quote]+ Updated on 210922_181637
>
> We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning both in theory and practice. Given a graph, we simply treat all nodes and edges as independent tokens, augment them with token embeddings, and feed them to a Transformer. With an appropriate choice of token embeddings, we prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers, which is already more expressive than all message-passing Graph Neural Networks (GNNs). When trained on a large-scale graph dataset (PCQM4Mv2), our method, coined Tokenized Graph Transformer (TokenGT), achieves significantly better results than GNN baselines and competitive results compared to Transformer variants with sophisticated graph-specific inductive bias. Our implementation is available at https://github.com/jw9730/tokengt.
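To make the tokenization idea concrete, below is a minimal PyTorch sketch of the approach the abstract describes, not the authors' implementation: every node and every edge becomes an independent token, each token is augmented with node-identifier and type embeddings, and the resulting sequence is fed to a plain Transformer encoder. The class name `TokenGTSketch`, the dimensions, the summation of the embeddings, and the use of normalized random vectors as node identifiers are all assumptions made for illustration; the paper itself specifies the "appropriate choice of token embeddings" that yields the 2-IGN expressiveness guarantee.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TokenGTSketch(nn.Module):
    """Illustrative sketch (hypothetical, not the official implementation):
    treat every node and every edge of a graph as one Transformer token."""

    def __init__(self, node_feat_dim: int, edge_feat_dim: int,
                 d_model: int = 128, n_heads: int = 8,
                 n_layers: int = 4, id_dim: int = 64):
        super().__init__()
        self.id_dim = id_dim
        self.node_proj = nn.Linear(node_feat_dim, d_model)
        self.edge_proj = nn.Linear(edge_feat_dim, d_model)
        # Each token carries a pair of node identifiers:
        # (v, v) for a node token, (u, v) for an edge token.
        self.id_proj = nn.Linear(2 * id_dim, d_model)
        # Type embedding distinguishes node tokens (index 0) from edge tokens (index 1).
        self.type_emb = nn.Embedding(2, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, node_feat, edge_feat, edge_index):
        # node_feat: [n, node_feat_dim]; edge_feat: [m, edge_feat_dim];
        # edge_index: [2, m] holding (source, target) node indices per edge.
        n = node_feat.size(0)
        # Random near-orthogonal node identifiers; a simple stand-in for the
        # token embeddings whose "appropriate choice" the paper analyzes.
        node_ids = F.normalize(
            torch.randn(n, self.id_dim, device=node_feat.device), dim=-1)

        # Node tokens: feature + (v, v) identifier pair + node type embedding.
        node_tok = (self.node_proj(node_feat)
                    + self.id_proj(torch.cat([node_ids, node_ids], dim=-1))
                    + self.type_emb.weight[0])

        # Edge tokens: feature + (u, v) identifier pair + edge type embedding.
        src, dst = edge_index
        edge_tok = (self.edge_proj(edge_feat)
                    + self.id_proj(torch.cat([node_ids[src], node_ids[dst]], dim=-1))
                    + self.type_emb.weight[1])

        # One sequence of n + m tokens, processed by a standard Transformer
        # with global self-attention and no graph-specific attention bias.
        tokens = torch.cat([node_tok, edge_tok], dim=0).unsqueeze(0)  # [1, n+m, d]
        return self.encoder(tokens).squeeze(0)
```

For a molecular graph with, say, 20 atoms and 22 bonds, this yields a 42-token sequence; any off-the-shelf Transformer stack can then process it unchanged, which is the practical appeal the abstract emphasizes over Transformer variants that bake in graph-specific inductive bias.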