Skip to main content

Section 5.5 Connectedness and Trees

As we mentioned in Section 5.1 the power of graph theory is that it allows us to abstract only the relevant details about the structure underlying a given scenario, regardless of the original context of the problem. A tree is a special type of graph that arises in several real-world applications. For instance, it is a prominent data type in computer science; it is seen whenever the structure being described has an underlying hierarchy or order.

We first restate some definitions you've seen before and give a bit more detail. Recall that a path is a trail with no repeated vertices.

Definition 5.5.1. Path Graph.

A path graph is any graph isomorphic to a path on \(n\) vertices and is denoted by \(P_n\text{.}\)

Two different drawings of the graph \(P_6\text{:}\)

Video: Connectedness

A graph is connected if there are paths between any two of its vertices.

Definition 5.5.3. Connected Component.

A connected component of a graph \(G\) is any connected subgraph of \(G\) that is not contained in any other connected subgraph (i.e. the maximal connected subgraphs of \(G\)).

If there exists a \(u\)-\(v\) path on a graph \(G\text{,}\) we also say that \(u\) and \(v\) are connected on \(G\text{.}\)

Verify that the graph \(P_6\) is connected.

The graph \(G\) is not connected since not all pairs of vertices are endpoints of some path. For example, there is no path joining 1 and 6; nor is there one joining 3 and 4.

The graph \(G\) has two connected components: one is the \(C_3\) subgraph formed by the vertices \(\{1,2,3\}\text{,}\) and the other one is a path of length three (\(P_3\)) formed by the vertices \(\{4,5,6\}\text{.}\) These are the maximal connected subgraphs of \(G\text{.}\) (On the other hand, the path 1-2-3 is not a maximal connected subgraph, since it is a subgraph of the cycle 1-2-3-1.)

The graph \(H\) is connected, since between any two of its vertices one can find a path. For example, if we pick vertices 1 and 5, we can find a 1-5 path: 1-2-3-5. In fact there is another path (1-4-3-5) but we only need at least one for each vertex pair to satisfy the condition of connectedness.

Which of the following graphs are connected?

Prove that if \(G_1 = (V_1,E_1)\) and \(G_2 = (V_2,E_2)\) are isomorphic, and \(G_1\) is connected, then \(G_2\) must be connected as well.

If a graph is connected, and contains no copies of \(C_n\) for any \(n\) (i.e. it does not have cycles) then we call it a tree.

Video: Trees

Definition 5.5.8. Tree, Forest.

A tree is a simple connected graph that contains no cycles as a subgraph.

A forest is a simple graph that contains no cycles as a subgraph.

As to be expected, a forest is simply a collection of trees. Technically, by Definition 5.5.8 a single tree can be called a forest as well.

A drawing of an actual, non-graph tree.

A tree on six vertices (also a forest)

A forest on six vertices (not a tree)

A connected graph (not a tree)

A tree (not a graph)

Find all nonisomorphic trees on

  1. four vertices;
  2. five vertices.

In a sense, trees are the minimally connected graphs, since removing any edge from a tree results in a disconnected graph.

We proceed by contradiction. Assume that the graph \(G'\) remains connected after removing the edge \((u,v)\text{.}\) Then by the definition of connectnedness there is a \(u\)-\(v\) path in \(G'\text{.}\)

But this means that there had to have been a cycle in the original graph \(G\text{,}\) consisting of the \(u\)-\(v\) path in \(G'\) and the edge \((u,v)\text{.}\) This contradicts the assumption that \(G\) was a tree. Therefore removing any edge from \(G\) results in a disconnected graph.

Our main result in this section is the relationship between the number of vertices and the number of edges in a tree. We first give one more definition and then prove an auxiliary result.

Definition 5.5.12. Leaf.

A leaf in a tree is a vertex with degree 1.

Exploration 5.5.1. Proving Lemma 5.5.13.

Let's work through the proof of Lemma 5.5.13 together! This is a proof by contradiction, so we start by assuming the opposite of the desired conclusion. That is: let's assume that \(G = (V,E)\) is a tree on at least two vertices, such that it has only one leaf or none at all.

(a)

We say that a path in a graph is maximal if we cannot add any more vertices to either end to make it longer. (For example, the path \(v_2\)-\(v_1\)-\(v_4\)-\(v_5\) in Example 5.2.3 is maximal.)

Explain why we can always find a maximal path in a tree on at least two vertices (in fact, in any graph that has at least one edge).

(b)

Take a maximal path \(P\) in \(G\) and call its vertices \(x_1,x_2,\ldots,x_k\text{.}\)

Given the assumption that \(G\) has fewer than two leaves, what can we conclude about \(x_1\) and \(x_k\text{?}\) (Hint: what about their degrees?)

(c)

Show that your conclusion from (b) will inevitably lead to one of the following contradictions:

  • that the path \(P\) was not maximal after all;
  • that the graph \(G\) could not have been a tree to begin with.

In Exploration 5.5.1 we claimed that we can always find a maximal path in a graph with at least one edge. Find all maximal paths in each graph below. Do maximal paths of a graph always have the same number of vertices?

Lemma 5.5.13 gives the minimum number of leaves in any tree on at least two vertices. What is the maximum number of leaves in a tree on \(n\) vertices?

What is the minimum and maximum number of leaves in a forest that has \(k\) connected components (assume the components have \(n_1, n_2, \ldots, n_k\) vertices)?

Before we state our main result, try studying the trees of Checkpoint 5.5.14. Do you notice a relationship between the number of vertices and the number of edges of each graph?

Video: Characterization of Trees

We proceed by induction on the number of vertices \(|V|\) in the tree. The claim is:

\(P(n):\) All trees with \(n\) vertices have \(n-1\) edges.

Base case: \(P(1)\) is clearly true, since a tree on one vertex has no edge.

Induction step: Assume that \(P(k-1)\) holds; that is, any tree with \(k-1\) vertices has \(k-2\) edges.

To prove \(P(k)\) holds, let \(G\) be an arbitrary tree on \(k\) vertices. By Lemma 5.5.13, \(G\) must have a leaf, or a vertex \(v\) with \(d(v) = 1\text{.}\)

Remove the vertex \(v\) from the tree and the one edge it is an endpoint of. The resulting graph \(G'\) is still a tree (it must still be connected and have no cycles, since we only removed a leaf), but with \(k-1\) vertices. By the induction hypothesis, \(G'\) has \(k-2\) edges.

Therefore the original graph \(G\) must have had \(k-1\) edges (adding back what was removed with \(v\)).

This completes the proof.

It is interesting to note that the converse of Theorem 5.5.18 is actually true.

Let \(G\) be a connected graph on \(n\) vertices and \(n-1\) edges. Towards a contradiction, assume \(G\)has at least one cycle.

Construct a new graph \(G'\) by removing an edge from a cycle in \(G\text{.}\) Then \(G'\) remains connected (why?).

Continue this process until we obtain a graph \(H\) with no cycles. Then by definition, \(H\) must be a tree (connected and acyclic). Observe that \(H\) has \(n\) vertices but strictly fewer than \(n-1\) edges since we removed at least one edge from \(G\) to obtain \(H\text{.}\) This contradicts Theorem 5.5.18.

Hence \(G\) must have had no cycles to begin with. This implies \(G\) is a tree.

We now summarize both results into one nice characterization of trees.

Find an example of a simple graph on \(n\) vertices and \(n-1\) edges that is not a tree.