A C program to determine the most common birthday

SUMMARY

The discussion focuses on a C program designed to determine the most common birthday from a file containing dates in the format mm/dd/yyyy. The initial approach involved using separate arrays for counting occurrences of months and days, which was identified as flawed. The solution proposed involves using a 2D array to track the counts of specific day/month combinations and suggests reading the filename as a command-line argument for better file handling. Key issues identified include segmentation faults due to array index out-of-bounds errors and unnecessary initializations of arrays.

PREREQUISITES
  • Understanding of C programming syntax and structures
  • Familiarity with array manipulation in C
  • Knowledge of file handling in C, including command-line arguments
  • Basic debugging techniques for runtime errors in C
NEXT STEPS
  • Learn about C command-line arguments and how to handle file input
  • Study array bounds checking and its importance in C programming
  • Explore debugging techniques for identifying segmentation faults in C
  • Research efficient data structures for counting occurrences, such as hash tables
USEFUL FOR

C programmers, computer science students, and anyone interested in file processing and data analysis using C.

toforfiltum

Homework Statement


Write a program that opens a file of the user's choice that contains a list of birthdays. Extract from this file two things: (1) the date with the most common birthday (all of them) and (2) the month with the most people born. We will not test for a tie in either of these statistics. The file tested will always be in the format:
mm/dd/yyyy

Homework Equations


The file bday1.txt is uploaded; the second one is too large to upload but has the same format.

The Attempt at a Solution


OK, at first I tried to use separate arrays of counters for the month and the day, thinking that the highest count in each would give me the most common birthday. But that logic is flawed: the most frequent month and the most frequent day, counted separately, need not occur together, while some other month/day combination may occur more often.

So I realize I must create a 2D array of counters for month/day combinations, and another array of counters for months.
I hope my logic here is right so far.

Therefore, this is my code:
C:
#include <stdio.h>

int main ( void )
{
    int month [12] = {0}, m_d [12][31] = {0},i = 0, flag, store_m = 0, store_d = 0, count_month [12] = {0}, count_m_d [12][31] = {0}, trash;
    int largest_m, largest_bd, a, j;
   
    for ( i = 0; i < 12; i++ )
    {
        month [i] = i+1;
    }
   
    for ( i = 0; i < 31; i++ )
    {
        for ( int j = 0; j < 31; j++ )
        {
            m_d [i][j] = j+1;
        }
    }
   
    flag = scanf("%d/%d", &store_m, &store_d );
   
    while ( flag != EOF )
    {
        for ( i = 0; i < 12; i++ )
        {
            if ( store_m == month [i] )
            {
                count_month [i]++;
                for ( int j = 0; j < 31; j++ )
                {
                    if ( store_d  == m_d [i][j] )
                    {
                        count_m_d [i][j]++;
                    }
                }
            }
        }
   
       
       
    scanf("/%d", &trash );
    flag = scanf("%d/%d", &store_m, &store_d );
   
    }
   
    largest_m = count_month [0];
   
    for ( int i = 0; i < 12; i++ )
    {
        if ( largest_m < count_month [i] )
        {
            largest_m = count_month [i];
           
        }
    }
    largest_m = month [i];
   
    largest_bd = count_m_d [0][0];
   
    for ( a = 0; a < 12; a++ )
    {
        for ( j = 0; j < 31; j++ )
        {
            if ( largest_bd < count_m_d [a][j] )
            {
                largest_bd = count_m_d [a][j];
       
            }
        }
    }
    largest_bd = m_d [a][j];
    a = month [a];
   
    printf("Most common birthday: \n");
    printf("%2d/%2d\n", a, largest_bd );
   
    printf("Most common birthday month: \n");
    printf("%d", largest_m );
   
    return 0;
}

The code compiles, but at runtime it has a segmentation fault. Running it on UNIX, I can redirect stdin to read from the file using '<'. I suspect the fault lies in a = month [a], because that's the value that I want, not the total count, but I need the count to determine the month. Am I doing this right?
 

Code:
for ( i = 0; i < 31; i++ )
    {
        for ( int j = 0; j < 31; j++ )
        {
            m_d [i][j] = j+1;
        }
    }
You access m_d with a first index beyond 11: the outer loop runs i up to 30, but m_d has only 12 rows.

I don't understand what you want to do in this step. What is the point of m_d if you just want m_d[i][j] to be j+1?

count_m_d is all you need. count_m is optional. Everything else is unnecessary.
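
As a rough illustration of why the month counter is optional, the per-month totals can be recovered from the day/month table by summing each row. The function name month_totals and the totals array are made up for this sketch; only count_m_d comes from the posted code.

```c
/* Sketch only: derive per-month totals from the 12x31 day/month table.
 * month_totals and totals are hypothetical names, not from the thread. */
void month_totals(int count_m_d[12][31], int totals[12])
{
    for (int m = 0; m < 12; m++) {
        totals[m] = 0;
        for (int d = 0; d < 31; d++)
            totals[m] += count_m_d[m][d];   /* add every day of month m */
    }
}
```

Keeping a separate count_month array saves this extra pass, at the cost of one more array to keep in sync.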
 
1) As mfb said, your month[j] is always equal to j+1, and your m_d[i][j] is always j+1, so why use those arrays?
2) You have bounds-checked store_m and store_d by comparing each to every valid value. But why not simply make sure that each is greater than or equal to 1 and less than or equal to 12 or 31, respectively?
3) Once you have bounds-checked store_m and store_d, you can increment count_month[store_m-1] and count_m_d[store_m-1][store_d-1]. Or, if you don't like subtracting the 1's, dimension your arrays by 13 and 32.
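
The three points above might be sketched as follows. This is a minimal sketch, not the assignment's required solution: the function name count_birthdays is invented here, and reading all three fields with one fscanf call (instead of the separate trash read in the posted code) is an assumption about how one could structure the loop.

```c
#include <stdio.h>

/* Sketch only: read mm/dd/yyyy dates from `in` and tally them.
 * count_birthdays is a hypothetical name; the array names follow the post.
 * Returns the number of dates that passed the range check. */
int count_birthdays(FILE *in, int count_month[12], int count_m_d[12][31])
{
    int m, d, y, accepted = 0;

    /* fscanf returns the number of fields converted, so the loop stops
     * at end of file or at the first line that does not match. */
    while (fscanf(in, "%d/%d/%d", &m, &d, &y) == 3) {
        if (m < 1 || m > 12 || d < 1 || d > 31)
            continue;                   /* skip out-of-range values */
        count_month[m - 1]++;           /* month 1..12 -> index 0..11 */
        count_m_d[m - 1][d - 1]++;      /* day 1..31   -> index 0..30 */
        accepted++;
    }
    return accepted;
}
```

Both arrays are expected to be zeroed by the caller, for example with int count_month[12] = {0}. The range check is what keeps the indices inside the arrays, so no lookup tables are needed.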
 
toforfiltum said:
The code compiles, but at runtime it has a segmentation fault.

Well just at a glance: 1) you're writing to array elements that don't exist, and 2) you create two arrays that are presumably meant to count how many times each month and each day/month pair appear in the file, but then you have two loops at the start of your program in which you do month [i] = i+1; and m_d [i][j] = j+1; -- why? If you want to count them then surely you want the counters to start at zero, no? Then at the end just scan both arrays to find which month and which day/month pair have the highest counts.
Running it on UNIX, I can redirect stdin to read from the file using '<'.

You're probably meant to open and read from the file from C, which you can do with standard library functions.

This means you need to read the filename from somewhere. The normal way to do this would be to read the filename as a command-line argument, so you would run your program from the terminal like this:
Code:
$ ./my_program my_file.txt
and the program would try to open "my_file.txt" (if it exists). C provides an easy and standard way to access the command-line arguments your program was called with (if any); googling for "C command line arguments" or "argc argv" should turn up explanations.

Just make sure to check for errors if you do this (i.e., if the program was run without a command line argument or there's an error attempting to open the file) and do reasonable things in these cases (e.g., read from standard input by default and/or quit with an error message).
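
A minimal sketch of that advice, assuming the filename is the first command-line argument and that falling back to standard input is acceptable. The helper name open_input is made up for illustration; fopen and perror are standard library functions.

```c
#include <stdio.h>

/* Sketch only: choose the input stream from the command line.
 * open_input is a hypothetical helper name.
 * Returns stdin when no filename was given, the opened file when one was,
 * or NULL (after printing a message) when the file cannot be opened. */
FILE *open_input(int argc, char *argv[])
{
    if (argc < 2)
        return stdin;               /* default: read standard input */

    FILE *in = fopen(argv[1], "r");
    if (in == NULL)
        perror(argv[1]);            /* report why the open failed */
    return in;
}
```

In main, this could be called as FILE *in = open_input(argc, argv); followed by an early return with a nonzero status if in is NULL. A file opened this way should be closed with fclose once reading is done, but stdin should be left alone.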
 
Unrelated to the current segfault, but as the problem will come up anyway later:
Code:
    largest_m = count_month [0];
   
    for ( int i = 0; i < 12; i++ )
    {
        if ( largest_m < count_month [i] )
        {
            largest_m = count_month [i];
           
        }
    }
    largest_m = month [i];
What do you expect this code to do?
Follow its execution step by step: What will it actually do? In particular, what will the value of "i" be when the last line is executed? And what do all the other lines do?

Same problem for the days.
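
For comparison, one way to avoid the problem is to remember the position of the largest count rather than its value. The sketch below assumes the counter arrays from the posted code; the function names busiest_month and busiest_date are invented here, and ties resolve to the earliest entry (the assignment says ties are not tested).

```c
/* Sketch only: return the month (1..12) with the largest count.
 * busiest_month is a hypothetical name. */
int busiest_month(const int count_month[12])
{
    int best = 0;                        /* index of the largest count so far */
    for (int i = 1; i < 12; i++)
        if (count_month[i] > count_month[best])
            best = i;
    return best + 1;                     /* index 0..11 -> month 1..12 */
}

/* Sketch only: store the month and day with the largest count.
 * busiest_date is a hypothetical name. */
void busiest_date(int count_m_d[12][31], int *month, int *day)
{
    int best_m = 0, best_d = 0;
    for (int m = 0; m < 12; m++)
        for (int d = 0; d < 31; d++)
            if (count_m_d[m][d] > count_m_d[best_m][best_d]) {
                best_m = m;              /* remember where the maximum is, */
                best_d = d;              /* not just how large it is */
            }
    *month = best_m + 1;
    *day = best_d + 1;
}
```

Because the index is saved inside the loop, nothing depends on the value of the loop variable after the loop has finished.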
 
