R for Beginners: Textbooks & Resources for Neurobiology

  • Thread starter cobalt124
  • Tags
    Beginner
  • #1
TL;DR Summary
R programming advice for a beginner
My daughter has just started her first year at university studying Neurobiology. She doesn’t have much experience with programming and is having to learn R (RStudio 3.0.2) from scratch for the statistical modelling/graph plotting parts of the course. She knows what she needs to do with her data but not how to express the requirement in R. Are there any textbooks/online resources anyone could recommend for a beginner using R?
 

Answers and Replies

  • #2
Google is your friend: searching for "R tutorial" will bring up many good resources for her to look at. In particular, Tutorialspoint is a good starting place:

https://www.tutorialspoint.com/r/index.htm

Is R a course prerequisite?

If not, there are other tools that may fit the bill, such as MATLAB, Julia, or Python. Plenty of statistics modules are available for these more commonly used tools.

There are also some YouTube videos on R; here's one such course:

 
  • #4
Thanks for the links; I’ll pass them on. Yes, RStudio is what she has to use. I did a Google search and checked the forum resources before posting. I mistakenly assumed R resources would be as common as those for Python, but that doesn’t seem to be the case. I think we’ll have to search further. Thanks for the start.
 
  • #6
Look at "R in a Nutshell". The O'Reilly Nutshell series of books are usually good and the R book is a good reference.
 
  • #7
BTW the last time I looked for R books in a bookstore, they were not with the other programming books but were, instead, filed over in biology. Just in case you want to go browse books in person. Took me a while to find them.
 
  • #8
Your daughter might also benefit from browsing https://rosettacode.org. She can check how a given problem can be solved in R, and then review solutions to the same problem in, e.g., Mathematica or Python.

Here's an example:

rosettacode.org said:
R being a statistical computing language, the chi-squared test is built in with the function "chisq.test".
Code:
dset1 <- c(199809, 200665, 199607, 200270, 199649)
dset2 <- c(522573, 244456, 139979, 71531, 21461)

# TRUE when the chi-squared test does not reject uniformity at the given level
chi2IsUniform <- function(dataset, significance = 0.05) {
  chisq.test(dataset)$p.value > significance
}

for (ds in list(dset1, dset2)) {
  print(c("Data set:", ds))
  print(chisq.test(ds))
  print(paste("uniform?", chi2IsUniform(ds)))
}
Output:
[1] "Data set:" "199809" "200665" "199607" "200270" "199649"

Chi-squared test for given probabilities

data: ds
X-squared = 4.1463, df = 4, p-value = 0.3866

[1] "uniform? TRUE"
[1] "Data set:" "522573" "244456" "139979" "71531" "21461"

Chi-squared test for given probabilities

data: ds
X-squared = 790063.3, df = 4, p-value < 2.2e-16

[1] "uniform? FALSE"
The same problem solved in Mathematica:
rosettacode.org said:
This code explicitly assumes a discrete uniform distribution, since the chi-square test is a poor choice for continuous distributions. It requires Mathematica version 2 or later.
Code:
discreteUniformDistributionQ[data_, {min_Integer, max_Integer}, confLevel_: .05] :=
If[$VersionNumber >= 8, 
  confLevel <= PearsonChiSquareTest[data, DiscreteUniformDistribution[{min, max}]],
  Block[{v, k = max - min, n = Length@data},
   v = (k + 1) (Plus @@ (((Length /@ Split[Sort@data]))^2))/n - n;
   GammaRegularized[k/2, 0, v/2] <= 1 - confLevel]]
 
discreteUniformDistributionQ[data_] :=discreteUniformDistributionQ[data, data[[Ordering[data][[{1, -1}]]]]]
The code used to create the test data requires Mathematica version 6 or later:

uniformData = RandomInteger[10, 100];
nonUniformData = Total@RandomInteger[10, {5, 100}];
{discreteUniformDistributionQ[uniformData],discreteUniformDistributionQ[nonUniformData]}
Output:
{True,False}
The same problem solved in Python:
rosettacode.org said:
Implements the Chi Square Probability function with an integration. I'm sure there are better ways to do this. Compare to OCaml implementation.
Python:
import math


def GammaInc_Q(a, x):
    # Numerically integrates t**(a-1) * exp(-t) from 0 up to x and
    # normalises by gamma(a), i.e. the regularized lower incomplete gamma.
    a1 = a - 1
    a2 = a - 2

    def f0(t):
        return t**a1 * math.exp(-t)

    def df0(t):
        return (a1 - t) * t**a2 * math.exp(-t)

    # Stop integrating early once the integrand has become negligible.
    y = a1
    while f0(y) * (x - y) > 2.0e-8 and y < x:
        y += .3
    if y > x:
        y = x

    h = 3.0e-4
    n = int(y / h)
    h = y / n
    hh = 0.5 * h
    gamax = h * sum(f0(t) + hh * df0(t) for t in (h * j for j in range(n - 1, -1, -1)))

    return gamax / gamma_spounge(a)


c = None  # cached coefficients for gamma_spounge


def gamma_spounge(z):
    # Gamma function via Spouge's approximation.
    global c
    a = 12

    if c is None:
        k1_factrl = 1.0
        c = []
        c.append(math.sqrt(2.0 * math.pi))
        for k in range(1, a):
            c.append(math.exp(a - k) * (a - k)**(k - 0.5) / k1_factrl)
            k1_factrl *= -k

    accm = c[0]
    for k in range(1, a):
        accm += c[k] / (z + k)
    accm *= math.exp(-(z + a)) * (z + a)**(z + 0.5)
    return accm / z


def chi2UniformDistance(dataSet):
    # Chi-squared statistic against equal expected counts.
    expected = sum(dataSet) * 1.0 / len(dataSet)
    cntrd = (d - expected for d in dataSet)
    return sum(x * x for x in cntrd) / expected


def chi2Probability(dof, distance):
    return 1.0 - GammaInc_Q(0.5 * dof, 0.5 * distance)


def chi2IsUniform(dataSet, significance):
    dof = len(dataSet) - 1
    dist = chi2UniformDistance(dataSet)
    return chi2Probability(dof, dist) > significance


dset1 = [199809, 200665, 199607, 200270, 199649]
dset2 = [522573, 244456, 139979, 71531, 21461]

for ds in (dset1, dset2):
    print("Data set:", ds)
    dof = len(ds) - 1
    distance = chi2UniformDistance(ds)
    print("dof: %d distance: %.4f" % (dof, distance), end=" ")
    prob = chi2Probability(dof, distance)
    print("probability: %.4f" % prob, end=" ")
    print("uniform?", "Yes" if chi2IsUniform(ds, 0.05) else "No")
Output:
Data set: [199809, 200665, 199607, 200270, 199649]
dof: 4 distance: 4.1463 probability: 0.3866 uniform? Yes
Data set: [522573, 244456, 139979, 71531, 21461]
dof: 4 distance: 790063.2759 probability: 0.0000 uniform? No
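As a cross-check on the listings above: with five categories there are 4 degrees of freedom, and for an even number of degrees of freedom the chi-squared upper-tail probability has a simple closed form, so no numerical integration is needed. Below is a minimal Python 3 sketch using only the standard library. The function names (`chi2_statistic`, `chi2_sf_even_dof`, `looks_uniform`) are made up for illustration and do not come from Rosetta Code or any library, and the shortcut is an assumption that only holds for even degrees of freedom.

```python
import math


def chi2_statistic(counts):
    """Pearson chi-squared statistic against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 for c in counts) / expected


def chi2_sf_even_dof(stat, dof):
    """Upper-tail probability P(X > stat) for a chi-squared variable.

    Assumption: dof is even, so the closed form
    exp(-x/2) * sum_{k=0}^{dof/2 - 1} (x/2)**k / k!  applies.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("closed form needs a positive, even dof")
    half = stat / 2.0
    term = 1.0   # (x/2)**k / k! for k = 0
    total = 1.0
    for k in range(1, dof // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def looks_uniform(counts, significance=0.05):
    """True when uniformity is not rejected at the given significance."""
    stat = chi2_statistic(counts)
    return chi2_sf_even_dof(stat, len(counts) - 1) > significance


dset1 = [199809, 200665, 199607, 200270, 199649]
dset2 = [522573, 244456, 139979, 71531, 21461]

for ds in (dset1, dset2):
    stat = chi2_statistic(ds)
    p = chi2_sf_even_dof(stat, len(ds) - 1)
    print(f"X-squared = {stat:.4f}, p-value = {p:.4g}, uniform? {looks_uniform(ds)}")
```

For the first data set this should agree with the X-squared and p-value that R's chisq.test reports above. For an odd number of degrees of freedom the shortcut does not apply, which is one more reason to lean on the built-in R function.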
 
  • #9
R is respected and well established as a statistics package. I would not advise her to worry about other languages unless it was unavoidable.
 
  • #10
R is respected and well established as a statistics package. I would not advise her to worry about other languages unless it was unavoidable.
I think it's clear from the direct comparison that the chi-square test is more easily implemented in R, and the OP said that R was mandatory anyway. I presume that merely reading other languages casually for comparison, as distinct from coding in them, would not distract his daughter too much. I agree that it's important to stay focused when learning a new programming language; however, I also think that seeing how things are done in other languages can give a valuable sense of perspective.
 
