Why does not this Erdos-Renyi C code work?

Click For Summary

Discussion Overview

The discussion revolves around a C programming issue related to implementing an Erdos-Renyi random graph using an adjacency matrix. Participants are exploring the code's functionality, its compilation issues, and the logic behind the implementation of graph properties such as connectivity and neighbor identification.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Homework-related

Main Points Raised

  • One participant shares their C code for generating an Erdos-Renyi random graph and seeks help with errors encountered.
  • Another participant questions the logic behind the assignment of values to the variable 'j' within the nested loop, suggesting it may lead to confusion.
  • Concerns are raised about the use of the 'neighbour' array, with one participant noting that a node can have multiple neighbors, which may lead to data being overwritten.
  • There is a discussion about the meaning of the 'CONNECTIVITY' constant, with some participants suggesting it represents the probability of edge creation between nodes.
  • A participant describes the purpose of a depth-first search (dfs) function intended to find the neighborhood of a node but expresses confusion about its current implementation and output.
  • Another participant suggests that the dfs function's description is vague and questions how the neighbors array is initialized and reset.
  • There is a suggestion to use a simple loop instead of recursion for finding neighbors, with a request for clarification on how to check potential edges from a node to others.

Areas of Agreement / Disagreement

Participants express differing views on the implementation details, particularly regarding the handling of neighbors and the use of the CONNECTIVITY constant. The discussion remains unresolved, with multiple competing perspectives on how to properly implement the graph generation and neighbor identification.

Contextual Notes

There are limitations in the current code regarding the initialization of variables and the handling of edge probabilities, which have not been fully addressed. The discussion also highlights potential misunderstandings about the intended functionality of certain code components.

valesdn
Messages
42
Reaction score
1

Homework Statement



I need to write an Erdos-Renyi random graph, by using the adjacency matrix (or alternatively list) and calculate the fitness of the graph.
Definition: G(n, p) is a random graph with n vertices where each possible edge has probability p of existing.

Homework Equations



The number of edges in a G(n,p) graph is a random variable with a binomial expected value.

The Attempt at a Solution



I wrote the following C code:
Mod note: Edited by mentor to change code tag to code=c tag, and deleted extraneous spaces between lines.
C:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define VOL 30
#define CONNECTIVITY 1

int main(){

    int i , j , neighbour[VOL], connected[VOL];
    int matrix[VOL][VOL];

    for ( i = 0; i < VOL; i ++ ){
        for (  j = 0; j < VOL; j ++ ){
            matrix[i][j] = 0;
            matrix[j][i] = 0;
        }
    }

    srand((unsigned)time(NULL));

    int num_edges = 0;

    for(i = 0; i < VOL; i++){
        connected[i] = 0;
        for(j=VOL-1;j>=i;j--){
            double prob=(rand()/(double)(RAND_MAX));
            if(prob < CONNECTIVITY){
                j = prob*VOL;
                if(i != j ){
                    matrix[i][j] = 1;
                    matrix[j][i] = 1;
                    neighbour[i] = j;
                    neighbour[j] = i;
                    num_edges++;
                    printf("matrix[%d][%d]\n", i,j);
                }
                else{
                    j = prob*VOL;
                    matrix[i][j] = 1;
                    matrix[j][i] = 1;
                    neighbour[i] = j;
                    neighbour[j] = i;
                    connected[i] = 1;
                }
            }
        }
        return 0; 
   }
}

However there is something wrong. Any advice or help would be greatly appreciated. Thank you.
 
Last edited:
Physics news on Phys.org
valesdn said:

Homework Statement



I need to write an Erdos-Renyi random graph, by using the adjacency matrix (or alternatively list) and calculate the fitness of the graph.
Definition: G(n, p) is a random graph with n vertices where each possible edge has probability p of existing.

Homework Equations



The number of edges in a G(n,p) graph is a random variable with a binomial expected value.

The Attempt at a Solution



I wrote the following C code:

C:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define VOL 30
#define CONNECTIVITY 1

int main(){

    int i , j , neighbour[VOL], connected[VOL];
    int matrix[VOL][VOL];

    for ( i = 0; i < VOL; i ++ ){
        for (  j = 0; j < VOL; j ++ ){
            matrix[i][j] = 0;
            matrix[j][i] = 0;
        }
    }

    srand((unsigned)time(NULL));

    int num_edges = 0;

    for(i = 0; i < VOL; i++){
        connected[i] = 0;
        for(j=VOL-1;j>=i;j--){
            double prob=(rand()/(double)(RAND_MAX)); 
            if(prob < CONNECTIVITY){
                j = prob*VOL;
                if(i != j ){
                    matrix[i][j] = 1;
                    matrix[j][i] = 1;
                    neighbour[i] = j;
                    neighbour[j] = i;
                    num_edges++;
                    printf("matrix[%d][%d]\n", i,j);
                }
                else{
                    j = prob*VOL;
                    matrix[i][j] = 1;
                    matrix[j][i] = 1;
                    neighbour[i] = j;
                    neighbour[j] = i;
                    connected[i] = 1;
                }
            }
        }
        return 0;   
   }
}

However there is something wrong. Any advice or help would be greatly appreciated. Thank you.
What exactly is wrong? Does your program compile? If it compiles, do the results it produces differ from what you think it should be produced? If so, tell us what it should produce and what it actually produces.

The normal course of action is to use a debugger to single-step through the program, noting the values of variables, and comparing the expected values of them with their actual values as produced by your code.
 
Last edited:
Thank you, Mark44 for your reply.
It doesn't compile the adjacency matrix: the indices of the matrix are the edges between the nodes, but I didn't want to do that. I wanted something 'cleaner' and elegant, like an adjacency list or matrix in order to calculate the fitness (...but I don't know how to do that).
 
Code:
    for(i = 0; i < VOL; i++){
        connected[i] = 0;
        for(j=VOL-1;j>=i;j--){
            double prob=(rand()/(double)(RAND_MAX));
            if(prob < CONNECTIVITY){
                j = prob*VOL;

This assignment to j can't be right. You are already uning j for the loop, and I have also no idea why this statement is here at all.

I assume that CONNECTIVITY is 1 for debugging and that you will use some value <1 later.

I don't know what neighbour is supposed to mean. A node can have more than one neighbour. The information will get overwritten.
 
Thank you, Willem. I didn't understand what you mean about the CONNECTIVITY.
Yes, you have supposed right about neighbour: a node can have more than one neighbour. The probability of edges may change.

If I consider the following dfs function, I don't find the only nearest neighbour of my code:

Code:
int dfs(int i)
{
    int j;
    neighbours[i]=1;

    printf("%d->",i+1);

     for(j=0;j<VOL;j++)
   {
   if(matrix[j][i]==1&&neighbours[j]==0)
            dfs(j);
    }
    return j;
}
 
Last edited:
valesdn said:
Thank you, Willem. I didn't understand what you mean about the CONNECTIVITY.

It would appear that CONNECTIVITY is supposed to be the probability that any pair of nodes is connected (the constant p from G(n,p)).

Code:
for(i = 0; i < VOL; i++){
        connected[i] = 0;
        for(j=VOL-1;j>=i;j--){
            double prob=(rand()/(double)(RAND_MAX));
            if(prob < CONNECTIVITY) {
              martix[i][j] = 1;
              matrix[j][i] = 1;

You enumerated all the possible edges (i,j) here, and you use a random number, to find out wether they should be connected.
You seem to be about done here.
Unfortunately I still have no idea wat neigbours or dfs(i) is supposed to represent, so I have no idea wether this is correct.
 
willem2 said:
Unfortunately I still have no idea wat neigbours or dfs(i) is supposed to represent, so I have no idea wether this is correct.

dfs(i) is the depth first search function.
I would like to find the neighbourhood of a node(my source). However it doesn't work, because it prints all the numbers and not only the neighbours of a selected vertex.
 
valesdn said:
dfs(i) is the depth first search function.
I would like to find the neighbourhood of a node(my source). However it doesn't work, because it prints all the numbers and not only the neighbours of a selected vertex.
Your description of dfs() is still very vague. What does the parameter mean, and what does the return value mean?

If you want the neigbourhood from the node i, you need to test all potential edges from node i to all other nodes to see if they exist.
I can't see why you would want to use a recursive function for this. You can just use a simple loop.

Do you want to find ALL the other nodes connected to node, including indirect connections? dfs() seems to do that. How is the neighbours[] array initialized and reset?
 
willem2 said:
Your description of dfs() is still very vague. What does the parameter mean, and what does the return value mean?

This function returns the printf, where there is all information about the connection of all nodes with a selected one (source). However I should find only the nearest neighbours and possibile consider a directed graph.

willem2 said:
If you want the neigbourhood from the node i, you need to test all potential edges from node i to all other nodes to see if they exist.
I can't see why you would want to use a recursive function for this. You can just use a simple loop.

Sorry, Willem: what should it be into the loop? Because I didn't understand how to check the potential edges from a node i to all others ones.

[/QUOTE]

willem2 said:
Do you want to find ALL the other nodes connected to node, including indirect connections? dfs() seems to do that. How is the neighbours[] array initialized and reset?

Yes, I do. I would like to understand how I can consider the direct and indirect connections respectively.
neighbours[] array in the dfs() function is all there. It was initialised equals to 0.
 
  • #10
willem2 said:
Your description of dfs() is still very vague. What does the parameter mean, and what does the return value mean?

valesdn said:
This function returns the printf, where there is all information about the connection of all nodes with a selected one (source). However I should find only the nearest neighbours and possibile consider a directed graph.
"Returns the printf" makes no sense.
Here's the definition of dfs() that you showed in post #5.
C:
int dfs(int i)
{
    int j;
    neighbours[i]=1;

    printf("%d->",i+1);

     for(j=0;j<VOL;j++)
    {
       if(matrix[j][i] == 1 && neighbours[j] == 0)
            dfs(j);
    }
    return j;
}
Some questions:
1) How does this code tie into main() as you showed in post #1? There is no mention of dfs() in that code.
2) The code just above references the neighbours and matrix arrays. main() has local arrays named neighbour (different from neighbours) and matrix. Since neighbours and matrix aren't passed as parameters to dfs(), there must be some global arrays named neighbours and matrix somewhere that we haven't seen. It's very bad practice for a function to modify global variables as your dfs() function appears to be doing.
3) Your code for dfs() should have comments, one of which would give a bried explanation of what it does, and another that explains the meaning of the parameter i. In fact, there is not a single comment in any of the code you have shown here, which makes it very difficult for people reading your code to understand what you're trying to do. Further, comments can help the person writing the code keep focused on what the code is supposed to be doing.
willem2 said:
If you want the neigbourhood from the node i, you need to test all potential edges from node i to all other nodes to see if they exist.
I can't see why you would want to use a recursive function for this. You can just use a simple loop.
I don't see the purpose of the recursive call, either.

To get a better understanding of what should happen, I recommend that you start with a small graph of, say, 6 nodes, and draw paths between some of them. By hand, fill out your arrays. This might help you understand what your code needs to be doing.
 
  • Like
Likes   Reactions: valesdn

Similar threads

  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 5 ·
Replies
5
Views
3K
  • · Replies 23 ·
Replies
23
Views
9K
  • · Replies 1 ·
Replies
1
Views
5K
  • · Replies 8 ·
Replies
8
Views
2K
  • · Replies 9 ·
Replies
9
Views
4K
  • · Replies 14 ·
Replies
14
Views
4K
  • · Replies 4 ·
Replies
4
Views
2K
  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 21 ·
Replies
21
Views
4K