R for Beginners: Textbooks & Resources for Neurobiology

cobalt124 · Oct 17, 2020

My daughter has just started her first year at university studying Neurobiology. She doesn’t have much experience with programming and is having to learn R (RStudio 3.0.2) from scratch for the statistical modelling/graph plotting parts of the course. She knows what she needs to do with her data but not how to express the requirement in R. Are there any textbooks/online resources anyone could recommend for a beginner using R?

jedishrfu · Oct 17, 2020

Google is your friend: searching for "R tutorial" will bring up many good resources for her to look at. In particular, tutorialspoint is a good starting place:

https://www.tutorialspoint.com/r/index.htm

Is R a course prerequisite?

If not there are other tools that may fit the bill such as MATLAB, Julia, or Python. A lot of stat modules are available for these more commonly used tools.

There are also some YouTube video courses on R.

jedishrfu · Oct 18, 2020

Another summarized resource for R:

https://learnxinyminutes.com/docs/r/

cobalt124 · Oct 19, 2020

Thanks for the links, I’ll pass them on. Yes, RStudio is what she has to use. I did a Google search and checked the forum resources before posting. I mistakenly assumed R resources would be as common as those for Python, but that doesn’t seem to be the case. I think we’ll have to search further; thanks for the start.

pbuk · Oct 19, 2020

I think this 2-day course would be good: https://training.csx.cam.ac.uk/bioinformatics/course/bioinfo-introRbio, however it is not free and the only one currently timetabled is full.

FactChecker · Oct 30, 2020

Look at "R in a Nutshell". The O'Reilly Nutshell books are usually good, and the R one is a good reference.

harborsparrow · Oct 31, 2020

BTW the last time I looked for R books in a bookstore, they were not with the other programming books but were, instead, filed over in biology. Just in case you want to go browse books in person. Took me a while to find them.

sysprog · Oct 31, 2020

Your daughter might also benefit from browsing Rosetta Code (https://rosettacode.org), where she can check how a given problem can be solved in R and then review solutions to the same problem in, e.g., Mathematica or Python.

Here's an example:

https://rosettacode.org/wiki/Verify_distribution_uniformity/Chi-squared_test#R

rosettacode.org said:

R being a statistical computing language, the chi-squared test is built in with the function "chisq.test".

Code:

# Observed counts for two data sets
dset1 <- c(199809, 200665, 199607, 200270, 199649)
dset2 <- c(522573, 244456, 139979, 71531, 21461)

# TRUE when the chi-squared test does not reject uniformity at the given level
chi2IsUniform <- function(dataset, significance = 0.05) {
  chisq.test(dataset)$p.value > significance
}

for (ds in list(dset1, dset2)) {
  print(c("Data set:", ds))
  print(chisq.test(ds))
  print(paste("uniform?", chi2IsUniform(ds)))
}

Output:

[1] "Data set:" "199809" "200665" "199607" "200270" "199649"

Chi-squared test for given probabilities

data: ds
X-squared = 4.1463, df = 4, p-value = 0.3866

[1] "uniform? TRUE"
[1] "Data set:" "522573" "244456" "139979" "71531" "21461"

Chi-squared test for given probabilities

data: ds
X-squared = 790063.3, df = 4, p-value < 2.2e-16

[1] "uniform? FALSE"

The same problem solved in Mathematica:

https://rosettacode.org/wiki/Verify_distribution_uniformity/Chi-squared_test#Mathematica

rosettacode.org said:

This code explicitly assumes a discrete uniform distribution, since the chi-square test is a poor test choice for continuous distributions. It requires Mathematica version 2 or later.

Code:

discreteUniformDistributionQ[data_, {min_Integer, max_Integer}, confLevel_: .05] :=
If[$VersionNumber >= 8, 
  confLevel <= PearsonChiSquareTest[data, DiscreteUniformDistribution[{min, max}]],
  Block[{v, k = max - min, n = Length@data},
   v = (k + 1) (Plus @@ (((Length /@ Split[Sort@data]))^2))/n - n;
   GammaRegularized[k/2, 0, v/2] <= 1 - confLevel]]
 
discreteUniformDistributionQ[data_] :=discreteUniformDistributionQ[data, data[[Ordering[data][[{1, -1}]]]]]
(* Code used to create test data; requires Mathematica version 6 or later *)

uniformData = RandomInteger[10, 100];
nonUniformData = Total@RandomInteger[10, {5, 100}];
{discreteUniformDistributionQ[uniformData],discreteUniformDistributionQ[nonUniformData]}

Output:

{True,False}

The same problem solved in Python:

https://rosettacode.org/wiki/Verify_distribution_uniformity/Chi-squared_test#Python

rosettacode.org said:

Implements the Chi Square Probability function with an integration. I'm sure there are better ways to do this. Compare to OCaml implementation.

Python:

import math
import random
 
def GammaInc_Q( a, x):
    a1 = a-1
    a2 = a-2
    def f0( t ):
        return t**a1*math.exp(-t)
 
    def df0(t):
        return (a1-t)*t**a2*math.exp(-t)
 
    y = a1
    while f0(y)*(x-y) >2.0e-8 and y < x: y += .3
    if y > x: y = x
 
    h = 3.0e-4
    n = int(y/h)
    h = y/n
    hh = 0.5*h
    # Integrate from 0 to y, summing the smallest terms (far end) first
    gamax = h * sum( f0(t)+hh*df0(t) for t in ( h*j for j in range(n-1, -1, -1)))

    return gamax/gamma_spounge(a)

c = None
def gamma_spounge( z):
    # Spouge's approximation to the gamma function; coefficients are cached in c
    global c
    a = 12

    if c is None:
       k1_factrl = 1.0
       c = []
       c.append(math.sqrt(2.0*math.pi))
       for k in range(1,a):
          c.append( math.exp(a-k) * (a-k)**(k-0.5) / k1_factrl )
          k1_factrl *= -k

    accm = c[0]
    for k in range(1,a):
        accm += c[k] / (z+k)
    accm *= math.exp( -(z+a)) * (z+a)**(z+0.5)
    return accm/z

def chi2UniformDistance( dataSet ):
    # Pearson chi-squared statistic against equal expected counts
    expected = sum(dataSet)*1.0/len(dataSet)
    cntrd = (d-expected for d in dataSet)
    return sum(x*x for x in cntrd)/expected

def chi2Probability(dof, distance):
    return 1.0 - GammaInc_Q( 0.5*dof, 0.5*distance)

def chi2IsUniform(dataSet, significance):
    dof = len(dataSet)-1
    dist = chi2UniformDistance(dataSet)
    return chi2Probability( dof, dist ) > significance

dset1 = [ 199809, 200665, 199607, 200270, 199649 ]
dset2 = [ 522573, 244456, 139979,  71531,  21461 ]

for ds in (dset1, dset2):
    print("Data set:", ds)
    dof = len(ds)-1
    distance = chi2UniformDistance(ds)
    print("dof: %d distance: %.4f" % (dof, distance), end=" ")
    prob = chi2Probability( dof, distance)
    print("probability: %.4f" % prob, end=" ")
    print("uniform? ", "Yes" if chi2IsUniform(ds, 0.05) else "No")

Output:

Data set: [199809, 200665, 199607, 200270, 199649]
dof: 4 distance: 4.146280 probability: 0.3866 uniform? Yes
Data set: [522573, 244456, 139979, 71531, 21461]
dof: 4 distance: 790063.275940 probability: 0.0000 uniform? No
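
As the Rosetta Code note says, there are better ways to do this than numerical integration. In practice one would probably reach for SciPy's scipy.stats.chisquare, but as a rough standard-library sketch: when the degrees of freedom are even, the chi-squared upper-tail probability has a closed form, exp(-x/2) times a short finite sum. The function names below are made up for illustration and are not from the Rosetta Code page, and the shortcut only covers even degrees of freedom (both data sets here have 4).

```python
import math

def chi2_statistic(counts):
    """Pearson chi-squared statistic against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 for c in counts) / expected

def chi2_sf_even_dof(stat, dof):
    """Upper-tail probability P(X > stat) for a chi-squared X; even dof only."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form needs a positive even dof")
    y = stat / 2.0
    term = 1.0   # y**j / j!, starting at j = 0
    total = 1.0
    for j in range(1, dof // 2):
        term *= y / j
        total += term
    return math.exp(-y) * total

dset1 = [199809, 200665, 199607, 200270, 199649]
dset2 = [522573, 244456, 139979, 71531, 21461]

for ds in (dset1, dset2):
    stat = chi2_statistic(ds)
    p = chi2_sf_even_dof(stat, len(ds) - 1)
    print("X-squared = %.4f, p-value = %.4f, uniform? %s" % (stat, p, p > 0.05))
```

For the first data set this should agree with the R output earlier in this post (X-squared = 4.1463, p-value = 0.3866); for the second, the p-value underflows to zero, which corresponds to R's "p-value < 2.2e-16".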

FactChecker · Oct 31, 2020

R is respected and well established as a statistics package. I would not advise her to worry about other languages unless it was unavoidable.

sysprog · Oct 31, 2020

FactChecker said:

R is respected and well established as a statistics package. I would not advise her to worry about other languages unless it was unavoidable.

I think it's clear from the direct comparison that the chi-squared test is more easily implemented in R, and the OP said that R was mandatory anyway. I presume that merely casually reading other languages for comparison purposes, as distinguished from coding in them, would not cause his daughter to become too distracted. I agree that it's important to keep focused when learning a new programming language; however, I also suppose that seeing how things are done in other languages can help to bring about a valuable sense of perspective.
