Why does not this Erdos-Renyi C code work?

AI Thread Summary
The discussion centers around issues with a C implementation of an Erdos-Renyi random graph using an adjacency matrix. The code fails to compile correctly due to improper handling of indices and the use of global variables, which leads to confusion about the purpose of certain arrays like `neighbour` and `neighbours`. Participants suggest that the `CONNECTIVITY` variable should represent the probability of edge existence and emphasize the need for clearer comments in the code to explain its functionality. Additionally, there is a recommendation to simplify the depth-first search (DFS) implementation to avoid unnecessary recursion and to clarify the goal of finding direct versus indirect connections. The conversation highlights the importance of debugging and understanding the structure of the graph being modeled.
valesdn
Messages
42
Reaction score
1

Homework Statement



I need to write an Erdos-Renyi random graph, by using the adjacency matrix (or alternatively list) and calculate the fitness of the graph.
Definition: G(n, p) is a random graph with n vertices where each possible edge has probability p of existing.

Homework Equations



The number of edges in a G(n,p) graph is a random variable with a binomial expected value.

The Attempt at a Solution



I wrote the following C code:
Mod note: Edited by mentor to change code tag to code=c tag, and deleted extraneous spaces between lines.
C:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define VOL 30
#define CONNECTIVITY 1

int main(){

    int i , j , neighbour[VOL], connected[VOL];
    int matrix[VOL][VOL];

    for ( i = 0; i < VOL; i ++ ){
        for (  j = 0; j < VOL; j ++ ){
            matrix[i][j] = 0;
            matrix[j][i] = 0;
        }
    }

    srand((unsigned)time(NULL));

    int num_edges = 0;

    for(i = 0; i < VOL; i++){
        connected[i] = 0;
        for(j=VOL-1;j>=i;j--){
            double prob=(rand()/(double)(RAND_MAX));
            if(prob < CONNECTIVITY){
                j = prob*VOL;
                if(i != j ){
                    matrix[i][j] = 1;
                    matrix[j][i] = 1;
                    neighbour[i] = j;
                    neighbour[j] = i;
                    num_edges++;
                    printf("matrix[%d][%d]\n", i,j);
                }
                else{
                    j = prob*VOL;
                    matrix[i][j] = 1;
                    matrix[j][i] = 1;
                    neighbour[i] = j;
                    neighbour[j] = i;
                    connected[i] = 1;
                }
            }
        }
        return 0; 
   }
}

However there is something wrong. Any advice or help would be greatly appreciated. Thank you.
 
Last edited:
Physics news on Phys.org
valesdn said:

Homework Statement



I need to write an Erdos-Renyi random graph, by using the adjacency matrix (or alternatively list) and calculate the fitness of the graph.
Definition: G(n, p) is a random graph with n vertices where each possible edge has probability p of existing.

Homework Equations



The number of edges in a G(n,p) graph is a random variable with a binomial expected value.

The Attempt at a Solution



I wrote the following C code:

C:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define VOL 30
#define CONNECTIVITY 1

int main(){

    int i , j , neighbour[VOL], connected[VOL];
    int matrix[VOL][VOL];

    for ( i = 0; i < VOL; i ++ ){
        for (  j = 0; j < VOL; j ++ ){
            matrix[i][j] = 0;
            matrix[j][i] = 0;
        }
    }

    srand((unsigned)time(NULL));

    int num_edges = 0;

    for(i = 0; i < VOL; i++){
        connected[i] = 0;
        for(j=VOL-1;j>=i;j--){
            double prob=(rand()/(double)(RAND_MAX)); 
            if(prob < CONNECTIVITY){
                j = prob*VOL;
                if(i != j ){
                    matrix[i][j] = 1;
                    matrix[j][i] = 1;
                    neighbour[i] = j;
                    neighbour[j] = i;
                    num_edges++;
                    printf("matrix[%d][%d]\n", i,j);
                }
                else{
                    j = prob*VOL;
                    matrix[i][j] = 1;
                    matrix[j][i] = 1;
                    neighbour[i] = j;
                    neighbour[j] = i;
                    connected[i] = 1;
                }
            }
        }
        return 0;   
   }
}

However there is something wrong. Any advice or help would be greatly appreciated. Thank you.
What exactly is wrong? Does your program compile? If it compiles, do the results it produces differ from what you think it should be produced? If so, tell us what it should produce and what it actually produces.

The normal course of action is to use a debugger to single-step through the program, noting the values of variables, and comparing the expected values of them with their actual values as produced by your code.
 
Last edited:
Thank you, Mark44 for your reply.
It doesn't compile the adjacency matrix: the indices of the matrix are the edges between the nodes, but I didn't want to do that. I wanted something 'cleaner' and elegant, like an adjacency list or matrix in order to calculate the fitness (...but I don't know how to do that).
 
Code:
    for(i = 0; i < VOL; i++){
        connected[i] = 0;
        for(j=VOL-1;j>=i;j--){
            double prob=(rand()/(double)(RAND_MAX));
            if(prob < CONNECTIVITY){
                j = prob*VOL;

This assignment to j can't be right. You are already uning j for the loop, and I have also no idea why this statement is here at all.

I assume that CONNECTIVITY is 1 for debugging and that you will use some value <1 later.

I don't know what neighbour is supposed to mean. A node can have more than one neighbour. The information will get overwritten.
 
Thank you, Willem. I didn't understand what you mean about the CONNECTIVITY.
Yes, you have supposed right about neighbour: a node can have more than one neighbour. The probability of edges may change.

If I consider the following dfs function, I don't find the only nearest neighbour of my code:

Code:
int dfs(int i)
{
    int j;
    neighbours[i]=1;

    printf("%d->",i+1);

     for(j=0;j<VOL;j++)
   {
   if(matrix[j][i]==1&&neighbours[j]==0)
            dfs(j);
    }
    return j;
}
 
Last edited:
valesdn said:
Thank you, Willem. I didn't understand what you mean about the CONNECTIVITY.

It would appear that CONNECTIVITY is supposed to be the probability that any pair of nodes is connected (the constant p from G(n,p)).

Code:
for(i = 0; i < VOL; i++){
        connected[i] = 0;
        for(j=VOL-1;j>=i;j--){
            double prob=(rand()/(double)(RAND_MAX));
            if(prob < CONNECTIVITY) {
              martix[i][j] = 1;
              matrix[j][i] = 1;

You enumerated all the possible edges (i,j) here, and you use a random number, to find out wether they should be connected.
You seem to be about done here.
Unfortunately I still have no idea wat neigbours or dfs(i) is supposed to represent, so I have no idea wether this is correct.
 
willem2 said:
Unfortunately I still have no idea wat neigbours or dfs(i) is supposed to represent, so I have no idea wether this is correct.

dfs(i) is the depth first search function.
I would like to find the neighbourhood of a node(my source). However it doesn't work, because it prints all the numbers and not only the neighbours of a selected vertex.
 
valesdn said:
dfs(i) is the depth first search function.
I would like to find the neighbourhood of a node(my source). However it doesn't work, because it prints all the numbers and not only the neighbours of a selected vertex.
Your description of dfs() is still very vague. What does the parameter mean, and what does the return value mean?

If you want the neigbourhood from the node i, you need to test all potential edges from node i to all other nodes to see if they exist.
I can't see why you would want to use a recursive function for this. You can just use a simple loop.

Do you want to find ALL the other nodes connected to node, including indirect connections? dfs() seems to do that. How is the neighbours[] array initialized and reset?
 
willem2 said:
Your description of dfs() is still very vague. What does the parameter mean, and what does the return value mean?

This function returns the printf, where there is all information about the connection of all nodes with a selected one (source). However I should find only the nearest neighbours and possibile consider a directed graph.

willem2 said:
If you want the neigbourhood from the node i, you need to test all potential edges from node i to all other nodes to see if they exist.
I can't see why you would want to use a recursive function for this. You can just use a simple loop.

Sorry, Willem: what should it be into the loop? Because I didn't understand how to check the potential edges from a node i to all others ones.

[/QUOTE]

willem2 said:
Do you want to find ALL the other nodes connected to node, including indirect connections? dfs() seems to do that. How is the neighbours[] array initialized and reset?

Yes, I do. I would like to understand how I can consider the direct and indirect connections respectively.
neighbours[] array in the dfs() function is all there. It was initialised equals to 0.
 
  • #10
willem2 said:
Your description of dfs() is still very vague. What does the parameter mean, and what does the return value mean?

valesdn said:
This function returns the printf, where there is all information about the connection of all nodes with a selected one (source). However I should find only the nearest neighbours and possibile consider a directed graph.
"Returns the printf" makes no sense.
Here's the definition of dfs() that you showed in post #5.
C:
int dfs(int i)
{
    int j;
    neighbours[i]=1;

    printf("%d->",i+1);

     for(j=0;j<VOL;j++)
    {
       if(matrix[j][i] == 1 && neighbours[j] == 0)
            dfs(j);
    }
    return j;
}
Some questions:
1) How does this code tie into main() as you showed in post #1? There is no mention of dfs() in that code.
2) The code just above references the neighbours and matrix arrays. main() has local arrays named neighbour (different from neighbours) and matrix. Since neighbours and matrix aren't passed as parameters to dfs(), there must be some global arrays named neighbours and matrix somewhere that we haven't seen. It's very bad practice for a function to modify global variables as your dfs() function appears to be doing.
3) Your code for dfs() should have comments, one of which would give a bried explanation of what it does, and another that explains the meaning of the parameter i. In fact, there is not a single comment in any of the code you have shown here, which makes it very difficult for people reading your code to understand what you're trying to do. Further, comments can help the person writing the code keep focused on what the code is supposed to be doing.
willem2 said:
If you want the neigbourhood from the node i, you need to test all potential edges from node i to all other nodes to see if they exist.
I can't see why you would want to use a recursive function for this. You can just use a simple loop.
I don't see the purpose of the recursive call, either.

To get a better understanding of what should happen, I recommend that you start with a small graph of, say, 6 nodes, and draw paths between some of them. By hand, fill out your arrays. This might help you understand what your code needs to be doing.
 
  • Like
Likes valesdn

Similar threads

Replies
5
Views
3K
Replies
23
Views
8K
Replies
1
Views
4K
Replies
8
Views
2K
Replies
9
Views
4K
Replies
14
Views
4K
Replies
3
Views
2K
Back
Top