
R Programming

  1. Feb 21, 2015 #1
    1. The problem statement, all variables and given/known data
    I have an upcoming assignment for a Statistics/Probability class that requires me to write a program in R. The assignment requires me to do the following:
    1. Obtain the sample mean x̄ and sample variance s² of the sunspots data.
    2. Provide a histogram of the data.
    3. For 10000 replications, randomly sample n sunspots observations from the given dataset. For each replication, obtain the sample mean. That is, you will have 10000 sample means. Compute the sample variance of these sample means.
    4. Repeat the above process with n=10, 20, 30, 40, 50, 60, 70, 80, 90, and 100.
    5. Obtain juxtaposed plots of the histograms of the means corresponding to n=10 and n=100.
    6. Plot the variances as the function of n. What are your observations?

    2. Relevant equations


    3. The attempt at a solution
    This is the code I have come up with thus far:
    Code (Text):
    filename <- "C://Users//Colin//Desktop//Project1Data.txt"
    data <- read.table(filename)
    colnames(data) <- c("id","x")

    #Questions 1 and 2
    x <- data$x
    mean(x)
    var(x)
    summary(data)
    hist(x)

    #partial answer to Questions 3 and 4
    p <- numeric()

    for (i in 1:10000){
    s <- data[sample(1:1053,10,replace=FALSE),]
    x <- s$x
    y[i] <- mean(x)
    p[i] <- y[i]
    assign(paste("sample",i,sep=""),s)
    assign(paste("mean",i,sep=""),y)
    }
    Unfortunately, this code generates a number of errors that I am unsure how to deal with (I was hoping somebody here could help). I suppose I should begin with the first error I am receiving:
    "Warning message:
    In mean.default(x) : argument is not numeric or logical: returning NA"
    Any help or advice would be greatly appreciated, thanks.
     
  3. Feb 21, 2015 #2

    jedishrfu

    Staff: Mentor

    Can you print the x values? That might give you a clue.

    When you read in the file, the data object has three columns, so data$x is an array of values. The question is whether they are strings or numbers: mean() expects numbers.
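    A minimal sketch of that check, using a made-up character vector in place of the real column (the values below are assumptions, not taken from the poster's file). It prints the values and their class, then converts them to numbers before averaging:

```r
# Hypothetical stand-in for data$x: numbers that were read in as text.
x <- c("5.0", "11.2", "16.7")   # assumed values, not from the real file

print(head(x))     # look at the first few values
print(class(x))    # "character" here, so mean(x) would return NA with a warning
str(x)             # compact summary of the type and contents

# Convert to numeric before averaging.
x_num <- as.numeric(x)
x_mean <- mean(x_num)
print(x_mean)
```

    If the column turns out to be a factor instead (the default for text columns in read.table before R 4.0), converting with as.numeric(as.character(x)) avoids getting the internal level codes back.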
     
  4. Feb 21, 2015 #3
    Here is a sample of the data set:
    the "x" values go until 1053 so I don't want to post them all here, but I have it saved in the above format in a .txt file
     
  5. Feb 21, 2015 #4

    jedishrfu

    Staff: Mentor

    Okay, you can see right there that the x value is a string, not a number, so that's why it's failing. I think they want you to average the second column, which is numeric. It makes no sense to find the mean of x here since it is just a row counter.
     
  6. Feb 21, 2015 #5

    jedishrfu

    Staff: Mentor

    I see a mismatch: you defined your columns as id and x, whereas your data says x and id. Try switching the column labels on line 3 of your program.
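    As a rough illustration of how a header row can cause this, here is a sketch with invented file contents (the layout and values are assumptions; the real file was not posted in full). When the first line holds the names x and id and read.table is called without header = TRUE, that line is read as data and every column becomes text:

```r
# Hypothetical file contents: a header row, then a row counter and a value.
raw <- "x id\n1 5.0\n2 11.2\n3 16.7"

# Without header = TRUE the header row is treated as data,
# so both columns are read as text.
no_header <- read.table(text = raw, stringsAsFactors = FALSE)
print(sapply(no_header, class))

# With header = TRUE the names come from the file and the columns are numeric.
with_header <- read.table(text = raw, header = TRUE)
print(sapply(with_header, class))

# Average the second column, which holds the values.
second_col_mean <- mean(with_header[[2]])
print(second_col_mean)
```

    Letting the file supply the names with header = TRUE also removes the need for a separate colnames() call, so the labels cannot drift out of order.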
     
  7. Feb 21, 2015 #6
    Thank you for that. However, I'm a little confused as to why I am getting results for variance but not mean despite the code being the same for both....
     
  8. Feb 21, 2015 #7
    I apologize if some of these questions are sort of elementary, my only real knowledge of R comes from a crash course a few days ago from a tutorial I found online :/
     
  9. Feb 21, 2015 #8

    jedishrfu

    Staff: Mentor

  10. Feb 21, 2015 #9
  11. Feb 21, 2015 #10

    jedishrfu

    Staff: Mentor

    Don't feel bad; we all start programming somewhere, and that means we get tripped up by some very simple things. My first programming was at my high school on a fancy programmable desktop calculator, and I couldn't figure out how to turn it on. The teacher had a chuckle but was impressed with my first program to compute the nth root of any number.
     
  12. Feb 21, 2015 #11
    That's impressive. In my high school programming class we had a project to write a program on a TI-84 that finds the area under the curve using Riemann Sums. It finds the left sum, right sum, midpoint sum, trapezoidal sum and the definite integral. I still use it to this day :D
     
  13. Feb 21, 2015 #12

    jedishrfu

    Staff: Mentor

    My project was on a very limited desktop calculator circa 1970 that had programmable features for math only. You were really limited in what it could do and how much memory it had. In my program, I ran out of registers, so I had to hit the enter key repeatedly for the next iteration because I had no register for the loop counter. It used something akin to the Newton approximation technique optimized for the machine.
     
  14. Feb 21, 2015 #13
    I always forget how spoiled we are today: for most basic programs memory really doesn't enter into the equation at all. I have a lot of respect for people good at programming, it takes a ton of patience. Right now I'm about ready to throw my computer out a window into the snow because I can't get this program working haha :D
     
  15. Feb 21, 2015 #14

    jedishrfu

    Staff: Mentor

    Learn how to use the print statement. It's one of the best debugging tools in a new environment like this. Don't trust your code: write a few lines, test them, and eventually you'll get through it.
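    One way to apply that advice to the sampling part of the assignment is to build it in small steps and print each intermediate result. This is only a sketch: the data are simulated stand-ins for the 1053 sunspot values, and the helper name sample_means is made up for illustration.

```r
set.seed(1)                      # make the sketch reproducible
x <- rpois(1053, lambda = 50)    # hypothetical stand-in for the sunspot values

# Step 1: draw one sample and check its mean by eye before looping.
one_sample <- sample(x, size = 10, replace = FALSE)
print(one_sample)
print(mean(one_sample))

# Step 2: wrap the step in a function and repeat it many times.
sample_means <- function(x, n, reps = 10000) {
  replicate(reps, mean(sample(x, size = n, replace = FALSE)))
}

# Step 3: repeat for each n and keep the variance of the sample means.
sizes <- seq(10, 100, by = 10)
variances <- sapply(sizes, function(n) var(sample_means(x, n)))
print(data.frame(n = sizes, variance = variances))
```

    Sampling from the vector directly avoids hard-coding the row count, and storing the results in one vector avoids creating thousands of separate variables with assign(). From here, plot(sizes, variances) would give the variance as a function of n.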
     
  16. Feb 22, 2015 #15
    Thanks again for the advice. After a lot of trying I was able to successfully complete this assignment a few minutes ago :D
     
  17. Feb 22, 2015 #16

    jedishrfu

    Staff: Mentor

    That's great. Welcome to the programmers guild!
     