Find total number of republications

evinda · Dec 2, 2020

Hello! (Wave)

Twitter is testing a new function where each publication that someone makes gets republished automatically by each of his followers. The functions works <<sequentially>>, i.e. each republication from a follower causes republication also from his followers and so on. However, in any case, the publication/republication of the news is done at most once from the account of each user. I want to model the problem using graphs and write and analyze an algorithm that computes for each user the total number of publications/republications that occurs with each of his publication.

I have thought the following.

We consider a graph, where the users of twitter will be represented by the vertices. If someone follows someone else, then the vertices will be connected with an edge.

So we begin with the first user that makes the publication. This will be a vertex with no incoming edges.

Then we visit the neighbors of the vertex, that means the followers of the user.

Since, when one user makes a publications, all of hisfollowers make the same publication, we follow the idea of the DFS algorithm.

The number of times that the publication will be replubished will be equal to the finishing time found at the last visited vertex to which there is a path from the inital vertex.

So, the algorithm that counts the total number of republications that occur from a publication of a user is the following:

Code:

Algorithm Republications
Input G=(V,E)
Output array F with |V| elements
for each v in G.V do
      color[v]<-WHITE
time<-0
for each v in G.V do
     if color[v]=WHITE then
        dfs_visit(G,v)
for each v in G.V do
      print(v,f(v))

Does this algorithm computes for each user the total number of republications that occur with his publication? (Thinking)

I like Serena · Dec 2, 2020

evinda said:

So we begin with the first user that makes the publication. This will be a vertex with no incoming edges.

Wouldn't this first user possibly have incoming edges as well?
They can follow other people too that may independently be making publications.

So, the algorithm that counts the total number of republications that occur from a publication of a user is the following:
Code:
Algorithm Republications
Input G=(V,E)
Output array F with |V| elements
for each v in G.V do
      color[v]<-WHITE
time<-0
for each v in G.V do
     if color[v]=WHITE then
        dfs_visit(G,v)
for each v in G.V do
      print(v,f(v))
Does this algorithm computes for each user the total number of republications that occur with his publication?

So V is the set of all users of twitter, and E is the set that represents who is following who.
Where is the "first user"? Shouldn't that be an input as well?

What does the output array F contain?
Shouldn't the "total number of republications that occur from a publication of a user" be an output?

Where does the output F get a value?
And shouldn't "time" be used somewhere?

evinda · Dec 2, 2020

Klaas van Aarsen said:

Wouldn't this first user possibly have incoming edges as well?
They can follow other people too that may independently be making publications.

Oh yes, right! If there were no incoming edges, that would mean that the first user wouldn't follow anyone.

Klaas van Aarsen said:

So V is the set of all users of twitter, and E is the set that represents who is following who.
Where is the "first user"? Shouldn't that be an input as well?

Ah, I see... We would have to call the desired function for each user seperately, in order to find the number of republications that results from a publication of him, right? (Thinking)

Klaas van Aarsen said:

What does the output array F contain?
Shouldn't the "total number of republications that occur from a publication of a user" be an output?

Where does the output F get a value?
And shouldn't "time" be used somewhere?

I thought that at the array F we would include the finishing times that we would get from the last visited vertices of all the independent vertices.
But applying this algorithm, we can just find the number of republications of a publication that the first user makes, right?

As for the varibale "time", I thought that it would just be used at the application of the dfs-algorithm and that we initialize it for this reason.

I like Serena · Dec 2, 2020

evinda said:

Ah, I see... We would have to call the desired function for each user seperately, in order to find the number of republications that results from a publication of him, right?

Yes.
A possible alternative is to provide another input with the set of publications combined with the time they are made.
But then we add an extra level of complexity to the algorithm.
Either way, the algorithm must have an input that describes the publication.

I thought that at the array F we would include the finishing times that we would get from the last visited vertices of all the independent vertices.
But applying this algorithm, we can just find the number of republications of a publication that the first user makes, right?

Indeed. According to your problem statement we only need to find a count.
So that count should then be the only output.

Then again, you also wrote that you wanted to compute for each user the total number of publications/republications that occurs with each of his publication.
That suggests that you do want a count for every vertex. And you apparently also want multiple publications at possibly different times, and possibly by the same person as well.

As for the variable "time", I thought that it would just be used at the application of the dfs-algorithm and that we initialize it for this reason.

If "time" is used at the application of the dfs-algorithm, then it should be passed as both input and output, shouldn't it?

evinda · Dec 3, 2020

Klaas van Aarsen said:

Yes.
A possible alternative is to provide another input with the set of publications combined with the time they are made.
But then we add an extra level of complexity to the algorithm.
Either way, the algorithm must have an input that describes the publication. Indeed. According to your problem statement we only need to find a count.
So that count should then be the only output.

Then again, you also wrote that you wanted to compute for each user the total number of publications/republications that occurs with each of his publication.
That suggests that you do want a count for every vertex. And you apparently also want multiple publications at possibly different times, and possibly by the same person as well. If "time" is used at the application of the dfs-algorithm, then it should be passed as both input and output, shouldn't it?

So we consider the graph G that contains all the vertices that are connected to u, which represents the first user that makes a publication, and there is an edge if the user follows or is followed by someone.

What kind of variable will the publication be?

I have thought of the following algorithm:

Code:

Algorithm Republications
   Input G=(V,E), vertex u, publication p
   Output number_of_republications
   for each v in G.V do
        color[v]<-WHITE
   time<-0
   for each v in G.V do
        if color[v]=WHITE then
           dfs_visit(G,v,time)
number_of_republications=time
return number_of_republications

And we would need to call this function for each publication that a user makes and for each user.
Is my idea right? (Thinking)

I like Serena · Dec 3, 2020

evinda said:

What kind of variable will the publication be?

So we have vertex u to represent the user who makes one publication at time=0.
Then we don't need a publication p as a separate input do we? It suffices to specify that there will be one publication by u at the start.

I have thought of the following algorithm:

And we would need to call this function for each publication that a user makes and for each user.
Is my idea right?

How are the number_of_republications and the time related to each other?
As it is now, isn't the number_of_republications simply the number of nodes that we are visiting?
If so then we don't need a time do we?

How about:

Code:

Algorithm Republications
   Input G=(V,E), vertex u in V of user that makes one initial publication
   Output number_of_republications
   for each v in G.V do
        color[v]<-WHITE
   number_of_republications<-0
   for each v in G.V do
        if color[v]=WHITE then
           number_of_republications += dfs_visit(G,v)
   return number_of_republications

where we assume that dfs_visit returns the number of WHITE nodes visited, and where we also assume that dfs_visit removes the WHITE color of visited nodes? :unsure:

Furthermore, the user u is not supposed to republish their own publication.
So shouldn't we make sure that does not happen?

evinda · Dec 3, 2020

Klaas van Aarsen said:

So we have vertex u to represent the user who makes one publication at time=0.
Then we don't need a publication p as a separate input do we? It suffices to specify that there will be one publication by u at the start.

Ok... :)

Klaas van Aarsen said:

How are the number_of_republications and the time related to each other?
As it is now, isn't the number_of_republications simply the number of nodes that we are visiting?
If so then we don't need a time do we?

Does it hold that the dfs-algorithm returns the finishing time of the last visited vertex, which is the number of nodes that have been visited?

Klaas van Aarsen said:
How about:
Code:
Algorithm Republications
   Input G=(V,E), vertex u in V of user that makes one initial publication
   Output number_of_republications
   for each v in G.V do
        color[v]<-WHITE
   number_of_republications<-0
   for each v in G.V do
        if color[v]=WHITE then
           number_of_republications += dfs_visit(G,v)
   return number_of_republications
where we assume that dfs_visit returns the number of WHITE nodes visited, and where we also assume that dfs_visit removes the WHITE color of visited nodes? :unsure:

If that what I said above about the dfs-algorithm is true, then don't we have to check at the point where we enter the iteration:

Code:

for each v in G.V do
        if color[v]=WHITE then
           number_of_republications += dfs_visit(G,v)

if there is an edge betwenn u and v? Because otherwise if there is a user that isn't connected to the first user, we will add number of republications that are not connected with the publication of the first user. Or am I wrong?

Klaas van Aarsen said:

Furthermore, the user u is not supposed to republish their own publication.
So shouldn't we make sure that does not happen?

Do we check this in an other algorithm, where we call the function Republications? (Thinking)
Because in this algorithm all the publications occur from the publication of the first user. Or am I wrong?

I like Serena · Dec 3, 2020

evinda said:

Does it hold that the dfs-algorithm returns the finishing time of the last visited vertex, which is the number of nodes that have been visited?

The Depth-first search algorithm doesn't actually do anything other than visiting all nodes.
We need to specify the behavior we want from it.
That could be to return the number of nodes it visited. :unsure:

If that what I said above about the dfs-algorithm is true, then don't we have to check at the point where we enter the iteration:
Code:
for each v in G.V do
        if color[v]=WHITE then
           number_of_republications += dfs_visit(G,v)
if there is an edge betwenn u and v? Because otherwise if there is a user that isn't connected to the first user, we will add number of republications that are not connected with the publication of the first user. Or am I wrong?

Ah, we forgot to use u and instead we started from every v in the graph.
We should only start from u.
So perhaps instead:

Code:

number_of_republications ← dfs_visit(G, u)

:unsure:

Do we check this in an other algorithm, where we call the function Republications?
Because in this algorithm all the publications occur from the publication of the first user. Or am I wrong?

No.
I misinterpreted the for-loop, which shouldn't be there to begin with as I've just explained.
With the correction, u will be counted as the first node visited, which is the publication, and it will be marked as discovered.
After that the algorithm won't try to do a republication for u if it happens to return to it.

evinda · Dec 3, 2020

Klaas van Aarsen said:

The Depth-first search algorithm doesn't actually do anything other than visiting all nodes.
We need to specify the behavior we want from it.
That could be to return the number of nodes it visited. :unsure:

The number of nodes visited will be equal to the finishing time of the last visited vertex, given that from the vertex that we give us input to DFS there is a path to each other vertex, right? (Thinking)
So, the algorithm would be as follows, right?

Code:

Algorithm dfs_visit(G,u,time)
   time<-time+1
   color[u]<-GRAY
   for each v in G.adjacent(u) do
        if color[v]==WHITE then
            dfs_visit(G,v,time)
  color[u]<-BLACK
  time<-time+1
 return time

Klaas van Aarsen said:
Ah, we forgot to use u and instead we started from every v in the graph.
We should only start from u.
So perhaps instead:
Code:
number_of_republications += dfs_visit(G, u)
:unsure:No.
I misinterpreted the for-loop, which shouldn't be there to begin with as I've just explained.
With the correction, u will be counted as the first node visited, which is the publication, and it will be marked as discovered.
After that the algorithm won't try to do a republication for u if it happens to return to it.

Applying the DFS with u as input, we find the number of republications that occur from a publication of u.
Then, if we apply the DFS for all the other vertices, we could also find the number of republications that occur from their publication, but we need to ignore all the black vertices, that mean the vertices of which we have already found the desired amount. Right? (Thinking)

Or have I understood it wrong and we just apply the DFS-algorithm for the first user u? (Thinking)

I like Serena · Dec 3, 2020

evinda said:

The number of nodes visited will be equal to the finishing time of the last visited vertex, given that from the vertex that we give us input to DFS there is a path to each other vertex, right?
So, the algorithm would be as follows, right?

Code:

Algorithm dfs_visit(G,u,time)
1   time<-time+1
2   color[u]<-GRAY
3   for each v in G.adjacent(u) do
4        if color[v]==WHITE then
5            dfs_visit(G,v,time)
6  color[u]<-BLACK
7  time<-time+1
8  return time

In line 5 the call to dfs_visit will return the accumulated time in that visit.
That still has to be added to the time we return.

In line 7 time is incremented, but that was already done in line 1.
So it is incremented twice for the same user, which shouldn't happen.

There is no real need to distinguish GRAY and BLACK is there?
We can remove line 6 and the output of the algorithm will be the same.

Applying the DFS with u as input, we find the number of republications that occur from a publication of u.
Then, if we apply the DFS for all the other vertices, we could also find the number of republications that occur from their publication, but we need to ignore all the black vertices, that mean the vertices of which we have already found the desired amount. Right?

Or have I understood it wrong and we just apply the DFS-algorithm for the first user u?

We just apply the DFS-algorithm for the first user u.
Any node that is visited will be originally WHITE at which it is counted (as either the initial publication or as a republication).
Then it is changed in color, which ensures it is not counted again, which is what we want. :geek:

evinda · Dec 3, 2020

Klaas van Aarsen said:
Code:
Algorithm dfs_visit(G,u,time)
1   time<-time+1
2   color[u]<-GRAY
3   for each v in G.adjacent(u) do
4        if color[v]==WHITE then
5            dfs_visit(G,v,time)
6  color[u]<-BLACK
7  time<-time+1
8  return time
In line 5 the call to dfs_visit will return the accumulated time in that visit.
That still has to be added to the time we return.

In line 7 time is incremented, but that was already done in line 1.
So it is incremented twice for the same user, which shouldn't happen.

There is no real need to distinguish GRAY and BLACK is there?
We can remove line 6 and the output of the algorithm will be the same. We just apply the DFS-algorithm for the first user u.
Any node that is visited will be originally WHITE at which it is counted (as either the initial publication or as a republication).
Then it is changed in color, which ensures it is not counted again, which is what we want. :geek:

So the algorithm will be as follows:

Code:

Algorithm Republications
   Input G=(V,E), vertex u in V of user that makes one initial publication
   Output number_of_republications
   for each v in G.V do
        color[v]<-WHITE
   number_of_republications <- dfs_visit(G,v)
   return number_of_republications

Since the variable number_of_republications will be equal to the finishing time of the last visited vertex, which will be the desired number of republications. Right?

And dfs_visit will be as follows:

Code:

Algorithm dfs_visit(G,u)
  time<-0
color[u]<-GRAY
for each v in G.adjacent(u) do
if color[v]==WHITE then
time<-dfs_visit(G,v)
return time

Is it right like that? (Thinking)
If so the running time will be equal to O(|V|)+the running time of the dfs_visit which is O(|E|), so in total O(|V|+|E|).

Is evrything right or have I done something wrong? :unsure:

I like Serena · Dec 3, 2020

evinda said:

And dfs_visit will be as follows:
Is it right like that?

Code:

1 Algorithm dfs_visit(G,u)
2  time<-0
3  color[u]<-GRAY
4  for each v in G.adjacent(u) do
5    if color[v]==WHITE then
6      time<-dfs_visit(G,v)
7  return time

In line 6 the time of a previous result of dfs_visit() is overwritten now. It should be added instead.

If so the running time will be equal to O(|V|)+the running time of the dfs_visit which is O(|E|), so in total O(|V|+|E|).

Correct. (Nod)

evinda · Dec 3, 2020

Klaas van Aarsen said:
Code:
1 Algorithm dfs_visit(G,u)
2  time<-0
3  color[u]<-GRAY
4  for each v in G.adjacent(u) do
5    if color[v]==WHITE then
6      time<-dfs_visit(G,v)
7  return time
In line 6 the time of a previous result of dfs_visit() is overwritten now. It should be added instead.

Why should it be added? I thought that the result of dfs_visit(v) that corresponds to the last visited vertex $v$ is the number of the republications...
When adding all the results of the dfs_visit-calls, don't we count the number of visits of the vertices more times?
Or have I understood it wrong?

Could you explain it to me? (Thinking)

I like Serena · Dec 3, 2020

evinda said:

Why should it be added? I thought that the result of dfs_visit(v) that corresponds to the last visited vertex $v$ is the number of the republications...
When adding all the results of the dfs_visit-calls, don't we count the number of visits of the vertices more times?
Or have I understood it wrong? Could you explain it to me?

I have just noticed that the time is not increased anywhere, so we are always returning time=0.

Line 2 should be "time← 1" to fix that.
That is, when we visit a node, we need to count that node once.

Now suppose u is followed by v who is followed in turn by w.
And suppose u is also followed by x who does not have followers.

Then at the top level we will see:
time ← 1
time ← dfs_visit(G,v) = 2
time ← dfs_visit(G,x) = 1
return time = 1

But we should have a total time of 1+2+1=4 shouldn't we?

We can fix it by making it:
time ← 1
time ← time + dfs_visit(G,v) = 1+2 = 3
time ← time + dfs_visit(G,x) = 3 + 1 = 4
return time = 4

evinda · Dec 5, 2020

Klaas van Aarsen said:

I have just noticed that the time is not increased anywhere, so we are always returning time=0.
Line 2 should be "time← 1" to fix that.
That is, when we visit a node, we need to count that node once.

I thought to set at the beginning the variable "time" equal to 0, since we are looking for the number of republications. If the user has no followers, we will have no republication. Is it wrong like that?

Klaas van Aarsen said:

Now suppose u is followed by v who is followed in turn by w.
And suppose u is also followed by x who does not have followers.

Then at the top level we will see:
time ← 1
time ← dfs_visit(G,v) = 2
time ← dfs_visit(G,x) = 1
return time = 1

But we should have a total time of 1+2+1=4 shouldn't we?

We can fix it by making it:
time ← 1
time ← time + dfs_visit(G,v) = 1+2 = 3
time ← time + dfs_visit(G,x) = 3 + 1 = 4
return time = 4

The graph would be as follows, right? (Thinking)

Suppose that we make a DFS-traversal and we start with the neighbor v of u, then we visit w, and then x.
If the time would be 1 for example at the vertex u, the time of v would be 1, the time of w 2 and the time of x would be 3.
Isn't that the idea of DFS? Or have I understood it wrong?

I like Serena · Dec 5, 2020

evinda said:

I thought to set at the beginning the variable "time" equal to 0, since we are looking for the number of republications. If the user has no followers, we will have no republication. Is it wrong like that?

I'm afraid it's wrong.
The "problem" is that dfs_visit is a recursive function.
When dfs_visit is recursively called to visit v, the time is set to zero as well for v, while this is a republication.

The graph would be as follows, right?

Suppose that we make a DFS-traversal and we start with the neighbor v of u, then we visit w, and then x.
If the time would be 1 for example at the vertex u, the time of v would be 1, the time of w 2 and the time of x would be 3.
Isn't that the idea of DFS? Or have I understood it wrong?

Yep. That is the graph. (Nod)

The idea of DFS is simply that we visit all nodes exactly once.
And we can return something we want to track - in this case the time spent in each subgraph.

Either way, when would 'time' become different from 0 in the algorithm as it is now? :unsure:

evinda · Dec 6, 2020

Klaas van Aarsen said:

I'm afraid it's wrong.
The "problem" is that dfs_visit is a recursive function.
When dfs_visit is recursively called to visit v, the time is set to zero as well for v, while this is a republication.

Oh yes, I see... (Nod)

Klaas van Aarsen said:

Yep. That is the graph. (Nod)

The idea of DFS is simply that we visit all nodes exactly once.
And we can return something we want to track - in this case the time spent in each subgraph.

Either way, when would 'time' become different from 0 in the algorithm as it is now? :unsure:

So in order to count the number of republications, we need to increase the variable "time" each time we visit a new vertex, right?

Should the "time" be an input of the dfs_visit, i.e. should it be as follows?

Code:

Algorithm Republications
   Input G=(V,E), vertex u in V of user that makes one initial publication
   Output number_of_republications
   for each v in G.V do
        color[v]<-WHITE
   time<-0
   number_of_republications <- dfs_visit(G,u,time)
   return number_of_republications

And dfs_visit would be as follows? (Thinking)

Code:

Algorithm dfs_visit(G,u,time)
  color[u]<-GRAY
  for each v in G.adjacent(u) do
        if color[v]==WHITE then
           time<-time+1
           dfs_visit(G,v)
 return time

I like Serena · Dec 6, 2020

evinda said:

So in order to count the number of republications, we need to increase the variable "time" each time we visit a new vertex, right?

Yes. (Nod)

Should the "time" be an input of the dfs_visit, i.e. should it be as follows?

And dfs_visit would be as follows?

Code:

1 Algorithm Republications
2   Input G=(V,E), vertex u in V of user that makes one initial publication
3   Output number_of_republications
4   for each v in G.V do
5        color[v]<-WHITE
6   time<-0
7   number_of_republications <- dfs_visit(G,u,time)
8   return number_of_republications

Code:

1 Algorithm dfs_visit(G,u,time)
2  color[u]<-GRAY
3  for each v in G.adjacent(u) do
4        if color[v]==WHITE then
5           time<-time+1
6           dfs_visit(G,v)
7  return time

We can make time an input, but we don't have to.
As it is now, it is inconsistent, since inside dfs_visit there is a recursive call that does not have time as parameter.
And it does not use the return value either.

We have a couple of options.

We can measure the time spent inside each recursive call to dfs_visit, which would start at zero each time.
In this case we don't need an input parameter.
We would return the time spent inside the recursive call as output.
And we would have to add the times together of the recursive calls.
We can make time both an input parameter and a returned output.
In this case the recursive call in line 6 of dfs_visit also needs time as an input parameter.
And the returned value needs to be assigned to time again.
We can make time a variable with extended scope that is accessible from within dfs_visit.
In this case we should not pass time as an input parameter, and we should not return it either from dfs_visit.
Line 6 of Republications should then simply return time.
We can make time an input-output parameter, which we should then specify in the specification of dfs_visit.
In this case we should not return time from dfs_visit.
And line 6 of Republications should simply return time.

evinda · Dec 6, 2020

Klaas van Aarsen said:

We can make time an input, but we don't have to.
As it is now, it is inconsistent, since inside dfs_visit there is a recursive call that does not have time as parameter.
And it does not use the return value either.

Oy yes, I forgot to include it when calling dfs_visit...
At which point do you mean that the return value is not used? (Thinking)

Klaas van Aarsen said:

We have a couple of options.

We can measure the time spent inside each recursive call to dfs_visit, which would start at zero each time.
In this case we don't need an input parameter.
We would return the time spent inside the recursive call as output.
And we would have to add the times together of the recursive calls.

But how will we get the right times, when "time" will always be zero when applying the dfs_visit at each new vertex?

Klaas van Aarsen said:

[*]We can make time both an input parameter and a returned output.
In this case the recursive call in line 6 of dfs_visit also needs time as an input parameter.
And the returned value needs to be assigned to time again.

What do you mean with "And the returned value needs to be assigned to time again."?

Klaas van Aarsen said:

[*]We can make time a variable with extended scope that is accessible from within dfs_visit.
In this case we should not pass time as an input parameter, and we should not return it either from dfs_visit.
Line 6 of Republications should then simply return time.

You mean that we would consider the variable "time" as a global variable?

Klaas van Aarsen said:

4.We can make time an input-output parameter, which we should then specify in the specification of dfs_visit.
In this case we should not return time from dfs_visit.
And line 6 of Republications should simply return time.

How would we specify "time" in the specification of dfs_visit?

I like Serena · Dec 6, 2020

evinda said:

Oy yes, I forgot to include it when calling dfs_visit...
At which point do you mean that the return value is not used?

The same line: line 6 in dfs_visit.
It read dfs_visit(G, v), but it should read time ← dfs_visit(G,v,time).

evinda said:

1.
But how will we get the right times, when "time" will always be zero when applying the dfs_visit at each new vertex?

We can do:

Code:

Algorithm dfs_visit(G, u)
  time ← 1
  color[u] = GRAY
  for each v in G.adjacent(u) do
    if color[v] == WHITE
      time ← time + dfs_visit(G, v)
  return time

That is, we add the return value of dfs_visit to time.
We do have to start at 1 instead of 0 to make it work though.

evinda said:

2.
What do you mean with "And the returned value needs to be assigned to time again."?

Something like this:

Code:

Algorithm dfs_visit(G,u,time)
  color[u] ← GRAY
  for each v in G.adjacent(u) do
        if color[v]==WHITE then
           time ← time + 1
           time ← dfs_visit(G, v, time)
  return time

That is, dfs_visit return the time as output.
The time that was changed inside the call to dfs_visit will no longer be accessible.
So we need to pick up the return value and overwrite the previous value that we had.

evinda said:

3.
You mean that we would consider the variable "time" as a global variable?

Basically yes. (Nod)

Note that many programming languages allow a local variable to be accessible inside another local function.
That is, we could have:

Code:

Algorithm Republications
   Input G=(V,E), vertex u in V of user that makes one initial publication
   Output number_of_republications

   time<-0
   for each v in G.V do
        color[v] ← WHITE

   Algorithm dfs_visit(G, u)
     color[u] ← GRAY
     for each v in G.adjacent(u) do
         if color[v]==WHITE then
            time ← time + 1
            dfs_visit(G, v)
   EndAlgorithm

   dfs_visit(G, u, time)
   return time

For the record, we are already doing the same thing with color.

evinda said:

4.
How would we specify "time" in the specification of dfs_visit?

Something like:

Code:

Algorithm dfs_visit(G, u, time)
  Input: G=(V,E)
  Input: vertex u in V of user that makes either a publication or a republication
  Input-Output: time to which the time spent is added
  color[u] ← GRAY
  for each v in G.adjacent(u) do
        if color[v] == WHITE then
           time ← time + 1
           dfs_visit(G, v, time)

To be fair, I haven't seen such a combined Input-Output in algorithm-pseudocode syntax yet, but many programming languages support it, so it makes sense that we can use it if we want to.

evinda · Dec 6, 2020

Klaas van Aarsen said:
We can do:
Code:
Algorithm dfs_visit(G, u)
  time ← 1
  color[u] = GRAY
  for each v in G.adjacent(u) do
    if color[v] == WHITE
      time ← time + dfs_visit(G, v)
  return time
That is, we add the return value of dfs_visit to time.
We do have to start at 1 instead of 0 to make it work though.
Something like this:
Code:
Algorithm dfs_visit(G,u,time)
  color[u] ← GRAY
  for each v in G.adjacent(u) do
        if color[v]==WHITE then
           time ← time + 1
           time ← dfs_visit(G, v, time)
  return time
That is, dfs_visit return the time as output.
The time that was changed inside the call to dfs_visit will no longer be accessible.
So we need to pick up the return value and overwrite the previous value that we had.

The last algorithm gives the right output, but doesn't the previous one give the output it should+1?

Klaas van Aarsen said:
Basically yes. (Nod)

Note that many programming languages allow a local variable to be accessible inside another local function.
That is, we could have:
Code:
Algorithm Republications
   Input G=(V,E), vertex u in V of user that makes one initial publication
   Output number_of_republications

   time<-0
   for each v in G.V do
        color[v] ← WHITE

   Algorithm dfs_visit(G, u)
     color[u] ← GRAY
     for each v in G.adjacent(u) do
         if color[v]==WHITE then
            time ← time + 1
            dfs_visit(G, v)
   EndAlgorithm

   dfs_visit(G, u, time)
   return time
For the record, we are already doing the same thing with color.

So if "time" is a global variable and we want to write the two functions seperately, we would have the following, right? (Thinking)

Code:

Algorithm dfs_visit(G, u)
  color[u] = GRAY
  for each v in G.adjacent(u) do
    if color[v] == WHITE
      time<-time+1
      dfs_visit(G, v)

Code:

Algorithm Republications
   Input G=(V,E), vertex u in V of user that makes one initial publication
   Output number_of_republications
   time<-0
   for each v in G.V do
        color[v] ← WHITE
   dfs_visit(G, u)
   return time

I like Serena · Dec 6, 2020

evinda said:

The last algorithm gives the right output, but doesn't the previous one give the output it should+1?

If we want, we can subtract 1 from the result when we call dfs_visit from the Republications algorithm.

evinda said:

So if "time" is a global variable and we want to write the two functions separately, we would have the following, right?

Yep. (Nod)

evinda · Dec 6, 2020

Klaas van Aarsen said:

If we want, we can subtract 1 from the result when we call dfs_visit from the Republications algorithm.

Yep. (Nod)

Nice, thank you very much! (Clapping)

Find total number of republications

1. How do you find the total number of republications?

2. Why is it important to find the total number of republications?

3. What factors should be considered when finding the total number of republications?

4. Are there any limitations to finding the total number of republications?

5. How can the total number of republications be used in research?

Similar threads

Hot Threads

Recent Insights