MHB Find Algorithm for O(V*E) Transitive Closure of a Graph

  • Thread starter Thread starter evinda
  • Start date Start date
  • Tags Tags
    Matrix
AI Thread Summary
The discussion focuses on finding an efficient algorithm for calculating the transitive closure of a directed graph with a time complexity of O(V*E). The initial algorithm proposed was based on topological sorting, which is only applicable to directed acyclic graphs (DAGs) and resulted in a complexity of O(V^2). The conversation then shifted to using Depth-First Search (DFS) or Breadth-First Search (BFS) from each vertex, which can achieve the desired complexity by exploring reachable vertices and updating the reachability matrix accordingly. The final proposed algorithm initializes a matrix and performs BFS for each vertex, ensuring that the reachability is correctly marked, leading to a time complexity of O(V*E). The participants also discussed the implications of graph connectivity and the initialization of the adjacency list in achieving the desired performance.
evinda
Gold Member
MHB
Messages
3,741
Reaction score
0
Hello! (Wave)

Suppose that we are given a directed graph and we want to find out if a vertex $j$ is reachable from another vertex $i$ for all vertex pairs $(i, j)$ in the given graph.
Reachable mean that there is a path from vertex $i$ to $j$.
The reachability matrix is called transitive closure of a graph.I want to write an algorithm that runs in time $O(|V| \cdot |E|)$ and calculates the transitive closure of a directed graph $G=(V,E)$.

I have tried the following:

Code:
Transitive_Closure(G)
     compute a topological sorting {v_0, v_1,..., v_(|V|-1)}
     for v=1 to |V|
          for u=1 to |V|
                T(v,u)=0
     for v=1 to |V|
           C(v,v)
   
   
C(i,j)
     for each u in Adj[j]
          T(i,u)=1
          C(i,u)

But the time complexity of the algorithm will be $O(V^2)$, right? (Worried)
What could we change, in order to get the desired time complexity?
 
Technology news on Phys.org
Topological sorting can be computed only for DAGs.
 
Evgeny.Makarov said:
Topological sorting can be computed only for DAGs.

Oh yes, right... (Tmi)

We could also apply [m] DFS [/m] for each vertex and if for some vertices $u,v$ it holds that $[d[v],f[v]] \subset [d,f]$ then we deduce that $v$ is reachable from $u$. ($d$ is the discovery time of a vertex $s$ and $f$ is its finishing time)
Am I right?
If so, then the time complexity will be $\Theta(|V| \cdot (|V|+|E|))=\Theta(|V|^2+|V| \cdot |E|)$.
But then it only holds that the time complexity is $O(|V| \cdot |E|)$ if $|E| \geq |V|$ and we are not given such an information.. (Worried)Or do we have to modify Floyd-Warshall in order to calculate the transitive closure of a directed graph $G=(V,E)$? But I am not sure how we could modify the algorithm so that it is done in time $O(|V| \cdot |E|)$, since it has three for-loops and it requires $O(|V|^3)$ time..
 
I believe we could perform BFS or DFS from each vertex. Now, I agree that the complexity of both searches is $O(|E|+|V|)$, but I think it is also $O(|E|)$. Indeed, BFS and DFS explore only the connected component to which the starting vertex belongs, and in a connected graph $|V|\le |E|+1$ (equality holds for a tree). Therefore, $O(|V|+|E|)=O(|E|)$. I have no idea why some pages (such as this and this) say that BFS visits every vertex even in a disconnected graph (in such graphs $|E|$ can indeed be significantly smaller than $|V|)$.

Therefore, doing BFS for each vertex has complexity $O(|V|\cdot|E|)$.
 
Evgeny.Makarov said:
I believe we could perform BFS or DFS from each vertex. Now, I agree that the complexity of both searches is $O(|E|+|V|)$, but I think it is also $O(|E|)$. Indeed, BFS and DFS explore only the connected component to which the starting vertex belongs, and in a connected graph $|V|\le |E|+1$ (equality holds for a tree). Therefore, $O(|V|+|E|)=O(|E|)$. I have no idea why some pages (such as this and this) say that BFS visits every vertex even in a disconnected graph (in such graphs $|E|$ can indeed be significantly smaller than $|V|)$.

Therefore, doing BFS for each vertex has complexity $O(|V|\cdot|E|)$.

I see.. (Smile)

So do we have to call BFS for each vertex $v$ and if the color of an other vertex gets black, we know that it is reachable from $v$ and elsewhise we know that it isn't? So should the algorithm look as follows?

Code:
Transitive_Closure(G)
1.  for each vertex u in G.V
2.       for each vertex v in G.V
3.            T[u,v]=0
4.   for each vertex u in G.V
5.        BFS(G,u)
6.   return T

Code:
BFS(G,s)
1.  for each vertex u in G.V-{s}
2.       color[u]=WHITE
3.       d[u]=inf
4.       pi[u]=NIL
5.  color[s]=GRAY
6.  d[s]=0
7.  pi[s]=NIL
8.  Q=empty set
7.  ENQUEUE(Q,s)
8.  while Q!= empty set
9.          u=DEQUEUE(Q)
10.        for each v in G.Adj[u]
11.             if color[v]==WHITE
12.                color[v]=GRAY
13.                d[v]=d[u]+1
14.                pi[v]=u
17.                ENQUEUE(Q,v)
18.        color[u]=BLACK
19.        T[s,u]=1

We find the time complexity of the algorithm [m]Transitive_Closure[/m] as follows:

  • The lines 1-3 require time $O(|V|^2)=O(|V| \cdot |V|) \overset{|V| \leq |E|+1}{=} O((|E|+1) \cdot |V|)=O(|E| \cdot |V|)$
  • The lines 4-5 require time $O(|V| \cdot (|V|+|E|))=O(|V| \cdot |E|)$
  • The line $6$ requires time $O(1)$.

So the time complexity of the algorithm is $O(|E| \cdot |V|)+O(|V| \cdot |E|)+O(1)=O(|V| \cdot |E|)$.
Am I right? (Thinking)
 
Or should it be as follows? Should the command [m]T[u,v]=1[/m] be at the line 11 and not at the end of the algorithm BFS? (Thinking)
We don't want T[u,v] to be equal to "1" only if u and v are adjacent ($u \rightarrow v$) but T[u,v] should be "1" also when we have $u \rightarrow w \rightarrow v$, right? Do we get this result with the change I did? (Thinking)

Code:
    Transitive_Closure(G)
    1.  for each vertex u in G.V
    2.       for each vertex v in G.V
    3.            T[u,v]=0
    4.   for each vertex u in G.V
    5.        BFS(G,u)
    6.   return T

Code:
    BFS(G,s)
    1.  for each vertex u in G.V-{s}
    2.       color[u]=WHITE
    3.       d[u]=inf
    4.       pi[u]=NIL
    5.  color[s]=GRAY
    6.  d[s]=0
    7.  pi[s]=NIL
    8.  Q=empty set
    7.  ENQUEUE(Q,s)
    8.  while Q!= empty set
    9.         u=DEQUEUE(Q)
    10.        for each v in G.Adj[u]
    11.             T[u,v]=1
    12.             if color[v]==WHITE
    13.                color[v]=GRAY
    14.                d[v]=d[u]+1
    15.                pi[v]=u
    16.                ENQUEUE(Q,v)
    17.        color[u]=BLACK
 
I changed the algorithm and it gives the right output:

Code:
    Transitive_Closure(G)
    1.  for each vertex u in G.V
    2.       for each vertex v in G.V
    3.            T[u,v]=0
    4.   for each vertex u in G.V
    5.        BFS(G,u,T)
    6.   return T
   
    BFS(G,s,T)
    1.  for each vertex u in G.V-{s}
    2.       color[u]=WHITE
    3.  color[s]=GRAY
    4.  Q=empty set
    5.  ENQUEUE(Q,s)
    6.  while Q!= empty set
    7.         u=DEQUEUE(Q)
    8.        for each v in G.Adj[u]            
    9.             if color[v]==WHITE
    10.               T[s,v]=1
    11.               color[v]=GRAY
    12.               ENQUEUE(Q,v)
    13.        color[u]=BLACK
The initialization of [m]T[/m] requires $O(|V|^2)$ time, and so the time complexity of the algorithm is $O(|V|^2+|V|(|V|+|E|))=O(|V|^2+|V| \cdot |E|)$.

We don't know if the given graph is connected, so we cannot use the fact that $|V| \leq |E|+1$, right?
What could we do in order to get time complexity $O(|V| \cdot |E|)$?
Can the initialization be avoided? :confused:
 
I tried the following: http://pastebin.com/qNvvRCYf

Is the time complexity calculated as follows? Or am I wrong?
The time complexity of T=createGraph(V) is O(|V|). The time complexity of BFS is O(|V_i|+|E_i|)(the time complexity of addEdge is O(1)). So, the lines 2,3 are executed $\sum_{i=1}^{|V|} O(|V_i|+|E_i|)=\sum_{i=1}^{|V|} O(|E_i|) \leq \sum_{i=1}^{|V|} O(|E|)=O(|V| \cdot |E|)$ times and this is also the time complexity of the algorithm.

($|V_i|, |E_i|$ are the number of vertices and edges in the ith component respectively)
 
evinda said:
The time complexity of T=createGraph(V) is O(|V|).
What does createGraph(V) do?

evinda said:
So, the lines 2,3 are executed $\sum_{i=1}^{|V|} O(|V_i|+|E_i|)$... ($|V_i|, |E_i|$ are the number of vertices and edges in the ith component respectively)
Are you assuming the graph has $|V|$ components (since $i$ ranges from 1 to $|V|$ in the sum)?
 
  • #10
Evgeny.Makarov said:
What does createGraph(V) do?

With [m] createGraph(V) [/m] we initialize the matrix of adjacency lists.
Evgeny.Makarov said:
Are you assuming the graph has $|V|$ components (since $i$ ranges from 1 to $|V|$ in the sum)?

I did so because I thought that this is the worst case if we have $|V|$ components. Is it wrong?
 
  • #11
evinda said:
With [m] createGraph(V) [/m] we initialize the matrix of adjacency lists.
Isn't the graph given to you as input?

evinda said:
I did so because I thought that this is the worst case if we have $|V|$ components.
Then why did you write $|V_i|$ instead of 1 and $|E_i|$ instead of 0?
 
  • #12
Evgeny.Makarov said:
Isn't the graph given to you as input?

Oh yes right... (Nod) Doesn't the following suffice? I deleted the definitions of the structs..

Code:
Transitive_Closure(G)
        // T is the matrix of adjacency lists, which is at the beginning NULL
        1.   for each vertex u in G.V
        2.        BFS(G,u,T)
        3.   return T
       
       BFS(G,s,T)
        1.  for each vertex u in G.V-{s}
        2.       color[u]=WHITE
        3.  color[s]=GRAY
        4.  Q=empty set
        5.  ENQUEUE(Q,s)
        6.  while Q!= empty set
        7.         u=DEQUEUE(Q)
        8.        for each v in G.Adj[u]            
        9.             if color[v]==WHITE
        10.               addEdge(T, s, v)
        11.               color[v]=GRAY
        12.               ENQUEUE(Q,v)
        13.        color[u]=BLACK 
  
         
        
        void addEdge(T, position, vertex)
        {
            newNode->data=vertex;
            newNode->next = T[position].head;
            T[position].head = newNode;
        }

Evgeny.Makarov said:
Then why did you write $|V_i|$ instead of 1 and $|E_i|$ instead of 0?

Rethinking about it, couldn't we say that the time complexity is $O(|V|)+\sum_{i}O(|V|(|V_i|+|E_i|))= \sum_{i}O(|V|(|V_i|+|E_i|))=\sum_{i}O(|V|(|E_i|))=O(|V| \sum_i |E_i|) $ and since $ \sum |E_i| = |E|$ we have that $\sum_{i}O(|V|(|V_i|+|E_i|))=O(|V| \cdot |E|)$?
 
Last edited:
Back
Top