Solving R Homework: Sunspots Data

  • Thread starter: _N3WTON_
  • Tags: Data, Homework
AI Thread Summary
The discussion revolves around an R programming assignment focused on analyzing sunspots data, requiring tasks such as calculating sample mean and variance, creating histograms, and performing multiple replications. The user encountered errors related to data types, specifically that the mean function was applied to non-numeric values due to incorrect column labeling in the dataset. Suggestions were made to print the data values for debugging and to ensure the correct columns were referenced. The conversation also touched on the learning curve of programming, with participants sharing their own experiences and encouraging perseverance. Ultimately, the user successfully completed the assignment after troubleshooting the issues.
_N3WTON_

Homework Statement


I have an upcoming assignment for a Statistics/Probability class that requires me to write a program in R. The assignment requires me to do the following:
1. Obtain the sample mean x̄ and sample variance s² of the sunspots data.
2. Provide a histogram of the data.
3. For 10000 replications, randomly sample n sunspots observations from the given dataset. For each replication, obtain the sample mean. That is, you will have 10000 sample means. Compute the sample variance of these sample means.
4. Repeat the above process with n=10, 20, 30, 40, 50, 60, 70, 80, 90, and 100.
5. Obtain juxtaposed plots of the histograms of the means corresponding to n=10 and n=100.
6. Plot the variances as the function of n. What are your observations?
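
The six steps above can be sketched in R roughly as follows. This is only an illustration under stated assumptions: `spots` is simulated stand-in data (the real sunspots file is not reproduced here), and the names `sizes`, `reps`, `means_by_n` and `var_of_means` are made up for the example. Sampling is done without replacement, as in the attempt below.

```r
# Sketch only: `spots` is simulated stand-in data, not the real sunspots file.
set.seed(1)
spots <- rpois(1053, lambda = 45)   # hypothetical: 1053 non-negative counts

# Steps 1-2: sample mean, sample variance, histogram
x_bar <- mean(spots)
s2 <- var(spots)
hist(spots, main = "Sunspots (simulated)", xlab = "count")

# Steps 3-4: for each n, draw 10000 samples of size n and keep the means
sizes <- seq(10, 100, by = 10)
reps <- 10000
means_by_n <- lapply(sizes, function(n) {
  replicate(reps, mean(sample(spots, n, replace = FALSE)))
})
names(means_by_n) <- sizes            # names become "10", "20", ..., "100"
var_of_means <- sapply(means_by_n, var)

# Step 5: side-by-side histograms for n = 10 and n = 100
old_par <- par(mfrow = c(1, 2))
hist(means_by_n[["10"]], main = "n = 10", xlab = "sample mean")
hist(means_by_n[["100"]], main = "n = 100", xlab = "sample mean")
par(old_par)

# Step 6: variance of the sample means as a function of n
plot(sizes, var_of_means, type = "b",
     xlab = "n", ylab = "variance of sample means")
```

For step 6, the thing to look for is that the variance of the sample means shrinks as n grows, roughly like s²/n.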

Homework Equations

The Attempt at a Solution


This is the code I have come up with thus far:
Code:
filename <- "C://Users//Colin//Desktop//Project1Data.txt"
data <- read.table(filename)
colnames(data) <- c("id","x")

#Questions 1 and 2
x <- data$x
mean(x)
var(x)
summary(data)
hist(x)

#partial answer to Questions 3 and 4
p <- numeric()

for (i in 1:10000){
s <- data[sample(1:1053,10,replace=FALSE),]
x <- s$x
y[i] <- mean(x)
p[i] <- y[i]
assign(paste("sample",i,sep=""),s)
assign(paste("mean",i,sep=""),y)
}
Unfortunately, this code is generating a number of errors that I am unsure how to deal with (I was hoping somebody here could help). I suppose I should begin with the first one I am receiving:
Warning message:
In mean.default(x) : argument is not numeric or logical: returning NA
Any help or advice would be greatly appreciated, thanks.
 
Can you print the x values? That might give you a clue.

When you read in the file, the data object has three columns, so data$x is an array of values. The question is whether they are strings or numbers. The mean() function is looking for numbers.
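
A minimal sketch of that kind of check, using a tiny made-up data frame (`d`, `label` and `value` are hypothetical names, not the poster's actual data):

```r
# Sketch: a tiny made-up data frame standing in for the real file's contents
d <- data.frame(label = c("1", "2", "3"),
                value = c(33, 81, 7),
                stringsAsFactors = FALSE)

print(head(d))            # look at the first rows
str(d)                    # column names and types at a glance
print(class(d$label))     # "character": mean() cannot average this
print(class(d$value))     # "numeric": safe to pass to mean()
print(is.numeric(d$value))
```

Running `str()` right after `read.table()` is usually the quickest way to see whether a column came in as text or as numbers.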
 
jedishrfu said:
Can you print the x values? That might give you a clue.

When you read in the file, the data object has three columns, so data$x is an array of values. The question is whether they are strings or numbers. The mean() function is looking for numbers.
Here is a sample of the data set:
"x" "id"
"1" 33
"2" 81
"3" 7
"4" 38
"5" 113
"6" 92
"7" 18
"8" 24
"9" 100
"10" 89
"11" 14
"12" 26
"13" 19
"14" 32
"15" 7
"16" 58
"17" 1
"18" 30
"19" 41
"20" 32
The "x" values go up to 1053, so I don't want to post them all here, but I have the data saved in the above format in a .txt file.
 
Okay, you can see right there that the x value is a string, not a number, so that's why it's failing. I think they want you to average the second column, which is numeric. It makes no sense to find the mean of x here since it is just a row counter.
 
I see a mismatch: you defined your columns as id and x, whereas your data says x and id. Try switching the column labeling in your program at line 3.
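
One way to avoid labeling the columns by hand, assuming the file really starts with the header line shown in the sample, is to let read.table() take the names from the file with header = TRUE. The sketch below recreates a few lines of the posted format in a temporary file, since the real Project1Data.txt is not available here; `tmp`, `raw`, `dat` and `spots` are names made up for the example.

```r
# Sketch: recreate a few lines in the posted format in a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c('"x" "id"', '"1" 33', '"2" 81', '"3" 7', '"4" 38'), tmp)

# Without header = TRUE the header row is read as data, so the columns
# are no longer numeric
raw <- read.table(tmp)
print(sapply(raw, class))

# With header = TRUE the first line supplies the column names
dat <- read.table(tmp, header = TRUE)
str(dat)

# In this layout the counts sit in the column labelled "id"
spots <- dat$id
print(mean(spots))
print(var(spots))
```

If the header line is being read as a data row, that alone would be enough to make mean() return NA, whichever way the columns are labeled.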
 
Thank you for that. However, I'm a little confused as to why I am getting results for variance but not for mean, despite the code being the same for both...
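
The thread does not settle this question. As a hedged illustration of the mean() side only: mean() refuses a non-numeric column and returns NA with the warning quoted in the first post, while the same values converted to numbers average fine. Whether var() also complains depends on the column's type and the R version, so a number coming back from var() does not by itself show that the column is numeric. The vector `chars` below is made up for the example.

```r
# Sketch: made-up values stored as text, as they would be if a header row
# had been read in as data
chars <- c("33", "81")

m_text <- suppressWarnings(mean(chars))   # NA: mean() will not average text
print(m_text)

nums <- as.numeric(chars)                 # explicit conversion to numbers
print(mean(nums))                         # 57
```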
 
I apologize if some of these questions are sort of elementary; my only real knowledge of R comes from a crash course a few days ago, from a tutorial I found online :/
 
  • #10
Don't feel bad, we all start programming somewhere, and that means we get tripped up by some very simple things. My first programming was at my high school on a fancy programmable desktop calculator, and I couldn't figure out how to turn it on. The teacher had a chuckle but was impressed with my first program to compute the nth root of any number.
 
  • #11
jedishrfu said:
Don't feel bad, we all start programming somewhere, and that means we get tripped up by some very simple things. My first programming was at my high school on a fancy programmable desktop calculator, and I couldn't figure out how to turn it on. The teacher had a chuckle but was impressed with my first program to compute the nth root of any number.
That's impressive. In my high school programming class we had a project to write a program on a TI-84 that finds the area under the curve using Riemann Sums. It finds the left sum, right sum, midpoint sum, trapezoidal sum and the definite integral. I still use it to this day :D
 
  • #12
My project was on a very limited desktop calculator circa 1970 that had programmable features for math only. You were really limited in what it could do and how much memory it had. In my program, I ran out of registers, so I had to hit the enter key repeatedly for the next iteration because I had no register for the loop counter. It used something akin to the Newton approximation technique, optimized for the machine.
 
  • #13
jedishrfu said:
You were really limited in what it could do and how much memory it had.
I always forget how spoiled we are today: for most basic programs memory really doesn't enter into the equation at all. I have a lot of respect for people good at programming, it takes a ton of patience. Right now I'm about ready to throw my computer out a window into the snow because I can't get this program working haha :D
 
  • #14
Learn how to use the print statement. It's one of the best debugging tools in a new environment like this. Don't trust your code: write a few lines and test them, and eventually you'll get through it.
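
As a rough sketch of that write-a-little, test-a-little approach applied to a sampling loop: run only a handful of iterations first and print what each one produces. The values in `vals` are made up, and `n_reps` and `means` are hypothetical names; note that the result vector is created before the loop fills it.

```r
# Sketch with made-up values; `vals` stands in for the numeric data column
set.seed(42)
vals <- c(33, 81, 7, 38, 113, 92, 18, 24, 100, 89)

n_reps <- 5                      # start small, scale up once it works
means <- numeric(n_reps)         # create the result vector before the loop

for (i in seq_len(n_reps)) {
  s <- sample(vals, 4, replace = FALSE)
  means[i] <- mean(s)
  cat("iteration", i, "sample:", s, "mean:", means[i], "\n")
}

print(means)
```

Once the printed samples and means look sensible, the cat() line can be removed and `n_reps` raised to the full 10000.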
 
  • #15
jedishrfu said:
Learn how to use the print statement. It's one of the best debugging tools in a new environment like this. Don't trust your code: write a few lines and test them, and eventually you'll get through it.
Thanks again for the advice. After a lot of trying I was able to successfully complete this assignment a few minutes ago :D
 
  • #16
That's great. Welcome to the programmers' guild!
 
