Data Structure Study Notes <7> Graph

I. Graph#

1. What is a Graph#

Represents a "many-to-many" relationship, consisting of:

A set of vertices: usually represented by V (Vertex) for the set of vertices
A set of edges: usually represented by E (Edge) for the set of edges
An undirected edge is a pair of vertices: (v, w) ∈ E, where v, w ∈ V
A directed edge <v,w> represents an edge pointing from v to w (single line)
Does not consider multiple edges and self-loops

Abstract Data Type Definition#

Type Name: Graph (Graph)
Data Object Set: G(V,E) consists of a non-empty finite set of vertices V and a finite set of edges E.
Operation Set: For any graph G ∈ Graph, and v ∈ V, e ∈ E

Graph Create(): creates and returns an empty graph;
Graph InsertVertex(Graph G, Vertex v): inserts vertex v into graph G
Graph InsertEdge(Graph G, Edge e): inserts edge e into graph G
void DFS(Graph G, Vertex v): performs depth-first traversal of graph G starting from vertex v;
void BFS(Graph G, Vertex v): performs breadth-first traversal of graph G starting from vertex v;
void ShortestPath(Graph G, Vertex v, int Dist[]): calculates the shortest distance from vertex v to any other vertex in graph G;
void MST(Graph G): calculates the minimum spanning tree of graph G
Common Terms:
Undirected graph, directed graph, network, etc.

How to Represent a Graph in a Program#

Adjacency Matrix#

Insert image description here

Benefits of Adjacency Matrix

Intuitive, simple, easy to understand
Convenient to check if there is an edge between any pair of vertices
Convenient to find all "adjacent points" (vertices directly connected by edges) of any vertex
Convenient to calculate the "degree" of any vertex (the number of edges emanating from that point is "out-degree", the number of edges pointing to that point is "in-degree")
- Undirected graph: the number of non-zero elements in the corresponding row (or column)
- Directed graph: the number of non-zero elements in the row is the out-degree, the number of non-zero elements in the column is the in-degree

Adjacency List#

Pointer array + linked list, very efficient when points are sparse
Insert image description here
Benefits of Adjacency List:

Convenient to find all "adjacent points" (vertices directly connected by edges) of any vertex
Saves space for sparse graphs
- Requires N head pointers + 2E nodes (each node has at least 2 fields)
Convenient to calculate the "degree" of any vertex in an undirected graph, but can only calculate the out-degree for directed graphs.

2. Graph Traversal#

DFS Depth First Search#

Depth First Search (DFS), explores each possible branch path until it can no longer go deeper, and each node can only be visited once.
Pseudocode description:

void DFS(Vertex V) {
	visited[V] = true;
	for each adjacent point W of v
		if (!visited[W])
			DFS(W);
}

BFS Breadth First Search#

Breadth First Search (BFS), implemented using a queue (first in, first out)
Pseudocode description:

void BFS(Vertex V) {
	visited[V] = true;
	Enqueue(V, Q); // Put this vertex into the queue
	while(!IsEmpty(Q)) { // End search when the queue is empty
		V = Dequeue(Q); // V is the front element
		for each adjacent point W of V {
			if (!visited[W]) {
				visited[W] = true; // Mark this point as visited
				Enqueue(W, Q); // Push this point into the queue
			}
		}
	}
}

What to do if the graph is disconnected?#

Path: A path from V to W is a set of vertices {V, V1, V2, ..., Vn, W}, where there is an edge in the graph between any pair of adjacent vertices. The length of the path is the number of edges in the path (if weighted, it is the sum of all edge weights). If all vertices between V and W are different, it is called a simple path
Connected: If there exists a (undirected) path from V to W, then V and W are said to be connected
Cycle: A path where the starting point equals the endpoint
Connected Graph: A graph where any two vertices are connected
Connected Component: The maximal connected subgraph of an undirected graph
- Maximal vertex count: adding one more vertex would make it disconnected
- Maximal edge count: contains all edges connecting all vertices in the subgraph
Strongly Connected: If there exists a bidirectional path between vertices V and W in a directed graph, then V and W are said to be strongly connected
Strongly Connected Graph: A directed graph where any two vertices are strongly connected
Strongly Connected Component: The maximal strongly connected subgraph of a directed graph

Each time DFS is called, it actually traverses the connected component containing V. The same goes for BFS.

void ListComponents ( Graph G ) { // Traverse connected components
	for (each V in G)
		if (!visited[V]) {
			DFS(V);
		}
}

3. How to Construct a Graph#

(1) Establishing a Graph Represented by an Adjacency Matrix#

Definition#

const int MaxVertexNum = 100;
typedef int DataType;
typedef bool WeightType;
typedef struct GNode *PtrToGNode;
struct GNode {
    int Nv; // Number of vertices
    int Ne; // Number of edges
    WeightType G[MaxVertexNum][MaxVertexNum];
    DataType Data[MaxVertexNum]; // Store vertex data
};
typedef PtrToGNode MGraph; // Graph type stored by adjacency matrix

Initialization#

Initialize a graph with VertexNum vertices but no edges

typedef int Vertex;
MGraph CreateGraph(int VertexNum) {
    Vertex V, W;
    MGraph Graph;
    Graph = (MGraph) malloc(sizeof(struct GNode));
    Graph->Nv = VertexNum;
    Graph->Ne = 0;
    // Vertex numbering starts from 0, up to Graph->Nv-1
    for (V = 0; V < Graph->Nv; V++) {
        for(W = 0; W < Graph->Nv; W++) {
            Graph->G[V][W] = 0; // In directed graphs, 0 can be changed to INF, etc.
        }
    }
    return Graph;
}

Inserting Edges into the Graph#

Definition of an edge

typedef struct ENode *PtrToENode;
struct ENode {
    Vertex V1, V2; // Directed edge <V1,V2>
    WeightType Weight; // Weight
};
typedef PtrToENode Edge;

Insertion operation

void InsertEdge(MGraph Graph, Edge E) {
    // Insert edge <V1,V2>
    Graph->G[E->V1][E->V2] = E->Weight;
    // If it is an undirected graph, also insert edge <V2,V1>
    Graph->G[E->V2][E->V1] = E->Weight;
}

Complete Establishment of an MGraph#

Input format

Nv Ne
V1 V2 Weight
……

MGraph BuildGraph() {
    MGraph Graph;
    Edge E;
    Vertex V;
    int Nv;
    cin >> Nv;
    Graph = CreateGraph(Nv); // Establish a graph with Nv vertices
    cin >> Graph->Ne; // Number of edges Ne
    if(Graph->Ne != 0) {
        E = (Edge)malloc(sizeof(struct ENode));
        for (int i = 0; i < Graph->Ne; i++) {
            cin >> E->V1 >> E->V2 >> E->Weight;
            InsertEdge(Graph, E);
        }
    }
    // If vertices have data, read in the data
    for(V = 0; V < Graph->Nv; V++) {
        cin >> Graph->Data[V];
    }
    return Graph;
}

(2) Establishing a Graph Represented by an Adjacency List#

Can be modified based on the adjacency matrix

Definition#

const int MaxVertexNum = 100;
typedef int DataType;
typedef int Vertex;
typedef bool WeightType;
// Definition of adjacency list
typedef struct AdjVNode *PtrToAdjVNode;
struct AdjVNode {
    Vertex AdjV; // Index of adjacent point
    WeightType Weight; // Edge weight
    PtrToAdjVNode Next;
};

typedef struct VNode {
    PtrToAdjVNode FirstEdge;
    DataType Data; // Store vertex data
} AdjList;

typedef struct GNode *PtrToGNode;
struct GNode {
    int Nv; // Number of vertices
    int Ne; // Number of edges
    AdjList G; // Adjacency list
};
typedef PtrToGNode LGraph; // Graph type stored by adjacency list

LGraph Initialization#

LGraph CreateGraph(int VertexNum) {
    Vertex V, W;
    LGraph Graph;
    Graph = (LGraph) malloc(sizeof(struct GNode));
    Graph->Nv = VertexNum;
    Graph->Ne = 0;
    // Vertex numbering starts from 0, up to Graph->Nv-1
    for (V = 0; V < Graph->Nv; V++) {
        Graph->G[V].FirstEdge = NULL;
    }
    return Graph;
}

Inserting Edges into LGraph#

typedef struct ENode *PtrToENode;
struct ENode {
    Vertex V1, V2; // Directed edge <V1,V2>
    WeightType Weight; // Weight
};
typedef PtrToENode Edge;
void InsertEdge(LGraph Graph, Edge E) {
    PtrToAdjVNode NewNode;
    // Insert edge <V1,V2>
    // Create a new adjacent point for V2
    NewNode = (PtrToAdjVNode) malloc(sizeof(struct AdjVNode));
    NewNode->AdjV = E->V2;
    NewNode->Weight = E->Weight;
    // Insert V2 adjacent point at the head of V1's list
    NewNode->Next = Graph->G[E->V1].FirstEdge;
    Graph->G[E->V1].FirstEdge = NewNode;
    
    // If it is an undirected graph, also insert edge <V2,V1>
    // Create a new adjacent point for V1
    NewNode = (PtrToAdjVNode) malloc(sizeof(struct AdjVNode));
    NewNode->AdjV = E->V1;
    NewNode->Weight = E->Weight;
    // Insert V1 adjacent point at the head of V2's list
    NewNode->Next = Graph->G[E->V2].FirstEdge;
    Graph->G[E->V2].FirstEdge = NewNode;
}

Complete Establishment of an LGraph#

Just replace MGraph with LGraph and make slight changes to store Data

LGraph BuildGraph() {
    LGraph Graph;
    Edge E;
    Vertex V;
    int Nv;
    cin >> Nv;
    Graph = CreateGraph(Nv); // Establish a graph with Nv vertices
    cin >> Graph->Ne; // Number of edges Ne
    if(Graph->Ne != 0) {
        E = (Edge)malloc(sizeof(struct ENode));
        for (int i = 0; i < Graph->Ne; i++) {
            cin >> E->V1 >> E->V2 >> E->Weight;
            InsertEdge(Graph, E);
        }
    }
    // If vertices have data, read in the data
    for(V = 0; V < Graph->Nv; V++) {
        cin >> Graph->G[V].Data;
    }
    return Graph;
}

II. Shortest Path Problem#

1. Concept Overview#

In a network, find the path with the minimum sum of edge weights among all paths between two different vertices
- This path is called the shortest path (Shortest Path) between the two points
- The first vertex is called the source (Source)
- The last vertex is called the destination (Destination)

2. Problem Classification#

Single-source shortest path problem: From a fixed source point, find the shortest path to all other vertices
- (Directed) unweighted graph
- (Directed) weighted graph
Multi-source shortest path problem: Find the shortest path between any two vertices

2. Single-source Shortest Path Algorithm for Unweighted Graphs#

Find the shortest path to each vertex in increasing order, similar to the idea of BFS!
Insert image description here
First, define an array dist, dist[W] stores the shortest distance from S to W, where S is the starting point, dist[S]=0, and dist needs to be initialized to -1 (or infinity) for later determination of whether it has been visited.
Next, define an array path, path[W] stores a vertex passed through from S to W.
Both dist and path arrays need to be initialized to -1, then set the dist[S] of the starting point to 0, and enqueue to start visiting.
Pseudocode:

void Unweighted(Vertex S) {
	Enqueue(S, Q);
	while(!IsEmpty(Q)) {
		V = Dequeue(Q);
		for each adjacent point W of V
			if(dist[W] == -1) {
				dist[W] = dist[V] + 1;
				path[W] = V;
				Enqueue(W, Q);
			}
	}
}

3. Single-source Shortest Path Algorithm for Weighted Graphs#

Insert image description here

Dijkstra's Algorithm!#

Let s={source s + vertices v~i~ that have already determined the shortest path}
For any unrecorded vertex v, define dist[v] as the shortest path length from s to v, but this path only passes through vertices in S, i.e., the minimum length of the path {s→(v~i~∈S)→v}
If the path is generated in increasing order, then
- The true shortest path must only pass through vertices in S (can be proven by contradiction)
- Each time, select one vertex with the minimum dist from the unrecorded vertices (greedy)
- Adding a vertex v into S may affect the dist value of another w! (So check all adjacent points w of v!)
  - dist[w] = min{ dist[w], dist[v] + <weight of edge (v,w)>}
    Initialize dist: the dist of all adjacent points W of S can be initialized to the weight from s to w, while others are defined as positive infinity.

Pseudocode description:

void Dijkstra(Vertex s) {
	while (1){
		V = the vertex with the minimum dist among unrecorded vertices;
		if (no such V exists)
			break;
		collected[V] = true;
		for each adjacent point W of V
			if(collected[W] == false)
				if(dist[V] + E<v,w> < dist[W]) {
					dist[W] = dist[V] + E<v,w>;
					path[W] = V;
				}
	}
} // Cannot solve cases with negative edges

In the pseudocode, dist[W]=dist[V]+E~<V,W>~; is not a simple assignment, but if a shorter distance is found, it needs to be updated to the shorter distance.
Insert image description here

Core Code of Dijkstra#

#include <iostream>
#include <algorithm>
#include <cstdio>
#include <cstring>
using namespace std;
const int maxn = 1005;
const int inf  = 0x3f3f3f;
int T,N,x,y,z;
int edge[maxn][maxn];
int dist[maxn];
bool vis[maxn];
void init() {
    for(int i = 1; i <= N; ++i) {
        for(int j = 1; j <= N; ++j) {
            edge[i][j] = inf;
        }
        edge[i][i] = 0;
    }
}
void Dijstra(int u) {
    for(int i = 1; i <= N; ++i) {
        vis[i] = false;
        dist[i] = edge[u][i];
    }
    vis[u] = true;
    for(int i = 1; i <= N; ++i) {
        int t, mindis = inf;
        for(int j = 1; j <= N; ++j) {
            if(!vis[j] && dist[j] < mindis) {
                mindis = dist[j];
                t = j;
            }
        }
        vis[t] = true;
        for(int j = 1; j <= N; ++j) 
            if(!vis[j] && dist[j] > edge[t][j] + dist[t]) 
                dist[j] = edge[t][j] + dist[t];
    }
}

III. Minimum Spanning Tree#

1. What is a Minimum Spanning Tree (MST)#

It is a tree
- No cycles
- A tree with |V| vertices must have |V|-1 edges
It is a spanning tree
- Contains all vertices
- |V|-1 edges are all in the graph
- Adding any edge to the spanning tree will definitely form a cycle
The sum of the edge weights is minimum

The minimum spanning tree is equivalent to the graph being connected.

2. Solving the Minimum Spanning Tree Problem#

Usually involves a greedy algorithm:

"Greedy": Always choose the best at each step
"Best": The edge with the smallest weight
Constraints needed:
- Only use edges that exist in the graph
- Must use exactly |V|-1 edges
- Cannot have cycles

This is discussed in another blog.
Graph Theory - Solving the Minimum Spanning Tree Problem (Kruskal's Algorithm & Prim's Algorithm)