Take any system where "things relate to each other," strip away every extra detail, and keep just two things: who (the nodes) and who connects to whom (the edges). A subway map is the classic case—it lies about distance, direction, and real geography, yet faithfully preserves "which stop reaches which stop." That deliberate act—forgetting geometry, remembering only connection—is the core move of graph theory.
So a graph isn't a picture; it's a set of relations. The same relations drawn in any shape count as the same graph: graph theory cares only about topology (how things connect), never about layout (where they sit). The two panels below are one graph drawn two ways.
$V$ is the set of nodes, $E$ the set of edges (pairs of nodes). $\deg(v)$, the degree of $v$, is how many edges meet it. Edges can carry direction (directed graphs) or numbers (weighted graphs). All of graph theory rests on this single pair of sets.
A graph is the minimal mathematical skeleton of the abstract idea of "relationship." The startling part is universality: social networks, molecules, the internet, neurons, citations, chess positions—strip away appearances and they are all the same object. Once translated into a graph, theorems about "connectivity," "shortest path," and "cycles" apply at once. This is mathematics' core magic: find the right level of abstraction, and one proof lights up a thousand fields. The plainest example is the handshake lemma $\sum_v \deg(v)=2|E|$: each edge adds 1 to the degree of both endpoints, so odd-degree nodes must come in pairs. An almost-obvious line—yet the key to the next concept.
Compilers build a dependency graph and run topological sort to fix the compile order; Google's PageRank treats web links as a random walk on a graph; graph neural networks (GNNs) do deep learning directly on graph structure, predicting drug-molecule properties and powering recommenders; deadlock detection in distributed systems and consensus in blockchains are, at heart, cycle and connectivity problems on graphs.
In 1736, Königsberg had seven bridges linking two banks and two islands. The townsfolk's pastime: could you take a stroll crossing each bridge exactly once? Euler didn't enumerate routes—he asked a deeper question: what structure even makes it possible?
His insight is startlingly simple: whenever you "enter" a landmass, another bridge must let you "leave." Bridges get used up in pairs. So except for the start and end points, every landmass must touch an even number of bridges—otherwise you'll eventually be trapped there. But all four of Königsberg's landmasses touch an odd number of bridges—so it's impossible. No trial needed; the structure itself rules it out.
An Eulerian path uses every edge exactly once. With 0 odd nodes you can return to the start (an Eulerian circuit); with exactly 2, those two odd nodes must be the endpoints. More than 2 odd nodes: impossible. Königsberg has 4 odd nodes, so it was doomed from the start.
This is the birth moment of topology. Euler proved the answer is independent of bridge lengths, island shapes, and geography—it hinges only on the parity of connections. A seemingly geometric question whose answer is pure combinatorial parity: this is Leibniz's dreamed-of analysis situs, the topology to come. It demonstrates one of math's most elegant moves: turning a "search" into a "criterion," collapsing exponential trial-and-error into a single line checkable in $O(V+E)$ time.
Eulerian circuits are the math of "drawing without lifting the pen," used in PCB routing, postman routes, and 3D-printer nozzle paths. The deepest application is biological: DNA sequencing builds a de Bruijn graph from a flood of short fragments, and reconstructing the full genome becomes "find an Eulerian path"—the heart of modern genome-assembly algorithms. The Chinese Postman Problem is its weighted generalization.
Picture a drop of water released at the start, seeping outward along every edge at a steady speed measured in "distance"—the near gets wet first, the far later. Dijkstra's algorithm is that wavefront: at each step it takes the currently nearest, not-yet-finalized node, declares its shortest distance settled, then uses it to "relax" (update) its neighbors.
Why does this greedy never err? The key intuition: since all edge weights are non-negative, the currently nearest pending node can never be reached more cheaply by some longer detour—detours only add length. So you can safely lock it in and never look back. That mild condition, non-negativity, is exactly the ground on which greed stands.
$d[v]$ is the current best estimate of the distance from start to $v$. Repeatedly lock in the pending node $u$ with the smallest $d$, then relax each outgoing edge $(u,v)$: if routing through $u$ is shorter, update. Here $w$ is the edge weight. With a priority queue the cost is $O\big((V+E)\log V\big)$.
It splits "global optimum" into a chain of "locally regret-free" decisions—greed actually yields the global optimum, a rare gift in the algorithmic world. For most problems greed errs; here "non-negative weights" supplies exactly the monotone structure that makes it valid: a locked-in distance is never overturned. Lovelier still is its unifying view: swap the priority-queue's ordering key and the same "priority queue + relaxation" skeleton grows into A\* search and Prim's minimum spanning tree—three classics, one soul.
The path-planning core of map navigation (Google Maps, accelerated with A\*); network routing's OSPF protocol uses Dijkstra to compute shortest forwarding paths; game-AI pathfinding; robot motion planning. When weights can be negative (e.g., log exchange-rates in arbitrage detection), you must switch to Bellman-Ford—which only proves how much the non-negativity assumption was doing.
One telephone is worthless, two can talk, a million form a network. The value lies not in the nodes but in the possible connections between them. Among $n$ nodes there are $\binom{n}{2}\approx n^2/2$ potential links—so a network's value grows roughly with the square of its node count (Metcalfe's law). This is the engine of winner-take-all: more users → more connections → more value → still more users.
But real networks are never uniform. A few hubs carry enormous numbers of edges while the vast majority hold only a handful—the degree distribution is a power law (a scale-free network), like an airport system (a few super-hubs plus countless small fields) rather than a road grid (where every junction looks alike).
Left: Metcalfe's law—value grows with the square of the node count. Right: the degree distribution of a scale-free network—the fraction of nodes with degree $k$ decays as a power of $k$ (with $\gamma$ usually between 2 and 3). A power law has no "typical" degree: from tiny airfields to giant hubs, spanning several orders of magnitude, all obey one curve—hence scale-free (no characteristic scale).
The same power law $P(k)\sim k^{-\gamma}$ appears in internet topology, citations, protein interactions, and social follows—pointing to one cross-domain generating mechanism: preferential attachment, i.e., "the rich get richer," as new nodes prefer to link to already-popular ones. Here hides one of complexity science's deepest motifs: simple local rules (randomness + preference) spontaneously give rise to global, reproducible statistical structure. Metcalfe's $n^2$ speaks of scale, the power-law tail of inequality; together they reveal why a network is unlike an ordinary market.
The theoretical backbone of valuing platform economies (social, payments, marketplaces); epidemic spread—scale-free networks have no spreading threshold, and hubs are super-spreaders, explaining why "immunize the highly-connected first" is wildly efficient; the same structure makes the internet extremely robust to random failure (most broken nodes are low-degree) yet strangely fragile to targeted attacks on hubs. Grasp the degree distribution and you grasp the shared grammar of monopoly, plague, and system resilience.