Programming language in Biology

In summary: I think it can be useful to learn C, as it's a very versatile language. It can be used for a lot of different tasks, and is very fast. However, it's not always the most convenient language to use, and you may find that you need to learn a few other languages in addition to it to do certain tasks.
  • #1
Monique
I am wondering, does it give one an edge to know a programming language in Biomolecular Sciences? For instance, if a certain analysis, maybe a statistical one, needs to be done and you can write your own little program.

If so, which languages would be the best to master? Visual Basic, C++, Oracle?
 
  • #2
I guess Oracle is not a programming language but a database. What different kinds of databases are there?
 
  • #3
Most databases nowadays are relational, at least in part. And most of them accept some dialect of the database language SQL (pronounced "sequel" and standing for Structured Query Language). Oracle and SQL Server are proprietary database products, and MySQL is an open source (I believe) database. The Microsoft product Access is sometimes called a database product, making old-line DBAs like yours truly smile.

An older type of database was the network type, of which I believe IDMS still survives.
 
  • #4
C is probably the ideal language for writing little, fast programs.
 
  • #5
Originally posted by Monique
I am wondering, does it give one an edge to know a programming language in Biomolecular Sciences? For instance, if a certain analysis, maybe a statistical one, needs to be done and you can write your own little program.

If so, which languages would be the best to master? Visual Basic, C++, Oracle?

Being able to program can often give one an edge in any scientific field. But unless you do specifically computational research, you often have students or assistants or something who can write programs for you (unless you are the student or assistant yourself).

As for analyses, most working scientists use sophisticated analysis software as opposed to programming the raw analysis themselves. For instance, statistical analyses are often done in something like R, S, or S-PLUS. Differential equations are often solved analytically in Mathematica or Maple, numerically in MATLAB, etc. Of course, those packages are themselves specialized programming languages, but for common analyses, the required programming ability is minimal: you just type a command (or select one from a menu), and it does it. If you do something complicated enough, you will need to learn how to program them.

Raw programming in general-purpose programming languages is still necessary sometimes, though. Many of the scientific analysis languages are slow, so knowledge of a fast language like C++ is very useful. I wouldn't bother with something like Visual Basic; it's not that fast, and if speed isn't a concern, it's better to work with a special-purpose analysis package that has lots of built-in features relevant to your task. (VB isn't bad if you need to make something with graphical user interfaces and such, but there is also software that does things like that, such as LabVIEW.)

The Perl and Python programming languages, while not nearly as fast as C, are gaining popularity in bioinformatics because they're easier to learn than C, and have very good string-processing abilities (searching, matching, comparing, etc. -- good for genomics). They're not as good for pure number-crunching calculations.
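To give a feel for the string-processing point, here is a minimal Python sketch of three common sequence tasks: GC content, reverse complement, and motif search. The function names are made up for illustration and are not from any bioinformatics library; the sketch assumes plain DNA strings containing only A, C, G, and T.

```python
import re

# Translation table mapping each base to its complement (assumes A/C/G/T only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq):
    """Return the fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return the 0-based start positions of every match, including overlapping ones."""
    # A zero-width lookahead lets the regex engine report overlapping matches.
    pattern = re.compile("(?=" + re.escape(motif.upper()) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]
```

Each function is only a few lines long, which is the convenience being described. The equivalent C would need manual buffer handling.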

Overall, I think C/C++ is a good general language to pick up, as kind of a default -- there are a lot of code libraries available for it, and it's fast, even though it's not always the most convenient.

It all depends on your needs.. I use Python a lot for my research, which doesn't involve a lot of number-crunching. For a colleague of mine, I spent a couple weekends cobbling together a little C program to do a statistical data analysis for a kind of non-standard problem he is studying -- since he doesn't know how to program, he couldn't do a proper analysis himself. It can be very useful at times ... but other people never need to know how to program.

So.. I guess, if you have the time, pick some toy problem to analyze, and teach yourself enough of a language like C++ or Python to do the analysis.. it may pay off in the future.

(If you tell me more specifically what kinds of analysis a molbio person is likely to want to do, I might be able to give more specific advice.)
 
  • #6
By the way, unless you go into some heavily data-processing intensive field like bioinformatics, you're not likely to work with databases. And even then, unless you're working on a really big dataset, you probably won't need to learn to program databases in SQL or anything; most programming languages (C, Perl, Python, etc.) include bindings to easily manipulate databases from within the language, so you don't have to interact with them directly.
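As a rough sketch of what such a binding looks like, the example below uses Python's standard-library `sqlite3` module with an in-memory database. The table name, column names, and the function `mean_expression_by_gene` are hypothetical, chosen only for illustration. Note that with this particular binding you still pass short SQL strings, but everything else stays ordinary Python.

```python
import sqlite3


def mean_expression_by_gene(rows):
    """Store (gene, expression) pairs in a temporary database and
    return a dict mapping each gene to its mean expression value."""
    conn = sqlite3.connect(":memory:")  # throwaway database, nothing written to disk
    try:
        conn.execute("CREATE TABLE measurements (gene TEXT, expression REAL)")
        # The ? placeholders let the binding handle quoting of the values.
        conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT gene, AVG(expression) FROM measurements "
            "GROUP BY gene ORDER BY gene"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()
```

Replacing `":memory:"` with a file path would keep the data between runs. A server database such as Oracle would need its own driver, which is not shown here.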
 
  • #7
Whoa, thank you for being so helpful. I have a question though: what is meant by a 'fast' programming language?

I still have to process all that you just said, but I can ask the following: is it useful to start by learning C before going to C++?
I tagged along with a friend who was taking an introductory C course. The course wasn't that well set up: a new programming problem was simply given each week that had to be solved (e.g., you have ten students; write a program to enter their grades and calculate the mean).
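For reference, an exercise of that kind might look something like this in Python. This is only a sketch, not the course's actual solution; the function names and the `read_line` parameter (a stand-in for keyboard input, so the code can be tried without typing) are assumptions made for illustration.

```python
def mean_grade(grades):
    """Return the arithmetic mean of a non-empty list of grades."""
    if not grades:
        raise ValueError("need at least one grade")
    return sum(grades) / len(grades)


def read_grades(count, read_line=input):
    """Ask for `count` grades, one per student, and return them as floats.

    read_line defaults to the built-in input(); any function that takes
    a prompt and returns a string can be passed in instead.
    """
    grades = []
    for student in range(1, count + 1):
        grades.append(float(read_line(f"Grade for student {student}: ")))
    return grades
```

Calling `mean_grade(read_grades(10))` reproduces the ten-student version of the problem.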

VB has been helpful for me in Excel, where with a little programming knowledge I was able to modify a macro so that it would loop infinitely (with some lingo I had learned from the C class).

I also ran into Oracle at work, where a computer engineer had developed a database for me for entering and safely storing data (Access didn't cut it and Excel is too easy to mess up). I would have loved to modify the database myself.
 
  • #8
This will be a very silly question

With the databases/languages you mentioned, in what kind of environment are they written and operated?

Is an ordinary PC with Windows XP able to handle them? I think with the C course we had to log into a different computer through telnet and download software from the internet (the compiler I guess).

The computer engineer who worked on the Oracle database was sitting at a Windows NT computer.. but I could use the database at an XP computer, and the data was sent to again a different computer which I didn't have access to..
 
  • #9
"Fast" languages are ones whose programs execute in a relatively short amount of time. They are usually compiled (meaning that they are translated into the computer's native machine language before execution). Interpreted languages, which read a line of text, translate it, execute it, then read another line of text, etc. are slower, but often more flexible.
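One way to see the difference for yourself is to compute the same sum twice in Python: once with an explicit loop that the interpreter steps through, and once with the built-in `sum`, which in the standard CPython implementation is written in compiled C. The helper names below are made up for this sketch, and the timings depend entirely on your machine, so no particular numbers are promised.

```python
import time


def loop_sum(n):
    """Add up 0..n-1 with an explicit loop executed by the interpreter."""
    total = 0
    for i in range(n):
        total += i
    return total


def builtin_sum(n):
    """Add up 0..n-1 with the built-in sum (compiled C in CPython)."""
    return sum(range(n))


def time_call(func, n):
    """Run func(n) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(n)
    return result, time.perf_counter() - start
```

Running `time_call(loop_sum, 10_000_000)` and `time_call(builtin_sum, 10_000_000)` gives the same result both times, and on a typical setup the built-in version finishes sooner.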

C is essentially a subset of C++ (apart from a few minor incompatibilities), so you can start out just by learning C, maybe mixing in a few C++ features here and there when convenient. (E.g., C++'s input/output routines are more convenient than C's.) The main difference between the languages is C++'s addition of object-oriented programming, but you don't need to start out worrying about that.

VB is a good language if you want to write something to control other Microsoft software; if you don't, then scripting languages like Python or Perl are usually better.

I stand corrected: I've never met a practicing scientist who liked keeping their data in a database, as opposed to a spreadsheet or a plain text file. Big databases are often accessed from a central server over a network, but you can set up a small database on your own computer. I think most of the good free databases are for Unix, though. Most other major software, like programming languages, you can get for Windows, especially the commercial software.

(There is a lot of commercial scientific software that is only for Unix, but that's less true for biology-related software. Biologists are some of the most Windows-oriented scientists, whereas physicists and astronomers are some of the most Unix-oriented.)

As for downloading, well, Windows doesn't come with much in the way of programming languages or utilities, so you will either have to buy or download software. (Unix always comes with a C compiler, and usually a bunch of scripting languages like Python or Perl, and generally is more programmer-friendly than Windows.)
 
  • #10
I echo Ambitwistor's comments on C vs C++. While you will probably only need C to do what you want to do, C++ offers simpler and more convenient alternatives to several C features. Ambitwistor mentioned input/output. In addition, C++-style memory allocation/deallocation is a little simpler to write, and C++ has a real string type, a feature that C lacks. String manipulation in C is fast and flexible, but it is probably one of the most difficult things to learn as a beginning programmer.


But in any case, you will probably want to ask people who work in your field; if every biomolecular scientist in the world uses Perl for everything, then you should definitely start learning that, simply because it would be easier to understand and use what others have written. I don't know if that would be important to you at all, though...


If you want to download a free C/C++ compiler for Windows, I know of two options...

(a) Download "Cygwin", which is a Unix-like environment for Windows. You can then use "gcc" in this environment, which is probably the best free C/C++ compiler.

(b) Download "DJGPP", which is a port of gcc to DOS. I don't remember how up to date it is, nor how good a port it is. On the upside, with it you can get the free integrated development environment "RHIDE".
 
  • #11
Well, as someone who has been programming almost my whole life (nothing professional... do you remember the good old C64 fatty :), and the Spectrum), I can tell you that the safest choice is C++. Any kind of foundation is good: BASIC, Pascal, Visual Basic, C, etc., because you will find very similar or identical logic everywhere, just with different syntax. Of course there's a thing called object-oriented programming that you'll face when dealing with C++, but it's not some kind of great obstacle; after a while you'll find that way of thinking useful and economical.

Even today I see many physicists working with Fortran for their simulations (even though there are sometimes better solutions), but there's no mainstream of that kind in molecular biology (i.e., for problems being solved with computers). But C++ is mainstream in today's serious programming (you could use it for many things, and it's a great starting point for some other great languages such as PHP...)
+ There are a bunch of great eBooks about C/C++ (almost every good book can be found in eBook form), or if you're more comfortable with real books (who isn't :)), it's no problem learning it on your own, believe me...
 

1. What is the purpose of programming in biology?

Programming allows biologists to analyze and manipulate large amounts of biological data quickly and efficiently. It also helps in developing models and simulations that aid in understanding complex biological processes.

2. What are the most commonly used programming languages in biology?

The most commonly used programming languages in biology are R, Python, and MATLAB. These languages have a wide range of libraries and tools designed specifically for biological data analysis and modeling.

3. How is programming used in genetic research?

In genetic research, programming is used to analyze and interpret large genomic datasets, identify genetic variations, and develop predictive models for disease risk. It is also used in gene expression analysis and genome annotation.

4. Can programming be used in other areas of biology besides genetics?

Yes, programming is used in other areas of biology such as ecology, evolution, and bioinformatics. Many researchers use it to analyze ecological data, simulate evolution, and develop computational tools for biological data analysis.

5. Do biologists need programming skills for their research?

Not all biologists need advanced programming skills, but a basic understanding of programming is beneficial for data analysis and model development. Many universities now offer bioinformatics courses in which biologists learn programming specifically for biology-related research.
