
Programming language in Biology

  1. Nov 1, 2003 #1

    Monique

    Staff Emeritus
    Science Advisor
    Gold Member

    I am wondering: does knowing a programming language give one an edge in the biomolecular sciences? For instance, if a certain analysis (maybe a statistical one) needs to be done, you could write your own little program..

    If so, which languages would be the best to master? Visual Basic, C++, Oracle..?
     
  3. Nov 1, 2003 #2

    Monique

    Staff Emeritus
    Science Advisor
    Gold Member

    I guess Oracle is not a programming language but a database.. What different kinds of databases are there?
     
  4. Nov 1, 2003 #3

    selfAdjoint

    Staff Emeritus
    Gold Member
    Dearly Missed

    Most databases nowadays are relational, at least in part. And most of them accept some dialect of the database language SQL (pronounced "sequel", short for Structured Query Language). Oracle and SQL Server are proprietary database products, and MySQL is an open source (I believe) database. The Microsoft product Access is sometimes called a database product, making old-line DBAs like yours truly smile.

    An older type of database was the network type, of which I believe IDMS still survives.
     
  5. Nov 1, 2003 #4

    Hurkyl

    Staff Emeritus
    Science Advisor
    Gold Member

    C is probably the ideal language for writing little, fast programs.
     
  6. Nov 1, 2003 #5
    Being able to program can often give one an edge in any scientific field. But unless you do specifically computational research, you often have students or assistants or something who can write programs for you (unless you are the student or assistant yourself).

    As for analyses, most working scientists use sophisticated analysis software as opposed to programming the raw analysis themselves. For instance, statistical analyses are often done in something like R, S, or S-PLUS. Differential equations are often solved analytically in Mathematica or Maple, numerically in Matlab, etc. Of course, those packages are themselves specialized programming languages, but for common analyses the required programming ability is minimal: you just type a command (or select one from a menu), and it does it. Only if you do something complicated enough will you need to learn how to program them.

    Raw programming in general-purpose programming languages is still necessary sometimes, though. Many of the scientific analysis languages are slow, so knowledge of a fast language like C++ is very useful. I wouldn't bother with something like Visual Basic; it's not that fast, and if speed isn't a concern, it's better to work with a special-purpose analysis package that has lots of built-in features relevant to your task. (VB isn't bad if you need to make something with graphical user interfaces and such, but there is also software that does things like that, such as LabVIEW.)

    The Perl and Python programming languages, while not nearly as fast as C, are gaining popularity in bioinformatics because they're easier to learn than C, and have very good string-processing abilities (searching, matching, comparing, etc. -- good for genomics). They're not as good for pure number-crunching calculations.
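To give a feel for the string handling mentioned above, here is a minimal Python sketch of three common sequence chores: reverse-complementing a DNA string, finding every occurrence of a motif, and computing GC content. The function names and the example sequence are made up for illustration; real projects would more likely lean on an established bioinformatics library.

```python
import re

# Translation table mapping each base to its complement (assumes plain A/C/G/T input).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return the 0-based start positions of every match, including overlapping ones."""
    # A lookahead matches without consuming characters, so overlaps are reported too.
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [m.start() for m in pattern.finditer(seq)]


def gc_content(seq):
    """Return the fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


dna = "ATGCGATATAGC"  # hypothetical example sequence
print(reverse_complement(dna))
print(find_motif(dna, "ATA"))
print(gc_content(dna))
```

Each function is only a few lines long, which is much of the appeal of these languages for sequence work.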

    Overall, I think C/C++ is a good general language to pick up, as kind of a default -- there are a lot of code libraries available for it, and it's fast, even though it's not always the most convenient.

    It all depends on your needs.. I use Python a lot for my research, which doesn't involve a lot of number-crunching. For a colleague of mine, I spent a couple weekends cobbling together a little C program to do a statistical data analysis for a kind of non-standard problem he is studying -- since he doesn't know how to program, he couldn't do a proper analysis himself. It can be very useful at times ... but other people never need to know how to program.

    So.. I guess, if you have the time, pick some toy problem to analyze, and teach yourself enough of a language like C++ or Python to do the analysis.. it may pay off in the future.

    (If you tell me more specifically what kinds of analysis a molbio person is likely to want to do, I might be able to give more specific advice.)
     
  7. Nov 1, 2003 #6
    By the way, unless you go into some heavily data-processing intensive field like bioinformatics, you're not likely to work with databases. And even then, unless you're working on a really big dataset, you probably won't need to learn to program databases in SQL or anything; most programming languages (C, Perl, Python, etc.) include bindings to easily manipulate databases from within the language, so you don't have to interact with them directly.
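As a rough illustration of such bindings, the sketch below uses Python's standard `sqlite3` module with a throwaway in-memory database. The table layout, gene names, and numbers are invented for the example. Note that short SQL strings still appear, but the binding takes care of the connection, parameter passing, and converting results into ordinary Python values.

```python
import sqlite3

# An in-memory database: nothing is written to disk and no server is needed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples (id INTEGER PRIMARY KEY, gene TEXT, expression REAL)"
)

# Hypothetical measurements, inserted with parameter placeholders.
rows = [("BRCA1", 2.5), ("TP53", 4.0), ("BRCA1", 3.5)]
conn.executemany("INSERT INTO samples (gene, expression) VALUES (?, ?)", rows)
conn.commit()


def mean_expression(connection, gene):
    """Return the average expression for a gene, or None if it has no rows."""
    cur = connection.execute(
        "SELECT AVG(expression) FROM samples WHERE gene = ?", (gene,)
    )
    return cur.fetchone()[0]


print(mean_expression(conn, "BRCA1"))
```

The same pattern (connect, execute, fetch) carries over to bindings for larger database servers, though the connection details differ.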
     
  8. Nov 1, 2003 #7

    Monique

    Staff Emeritus
    Science Advisor
    Gold Member

    Whoa, thank you for being so helpful. I have a question though: what is meant by a 'fast' programming language?

    I still have to process all that you just said, but I can ask the following: is it useful to start by learning C before going on to C++?
    I tagged along with a friend who was taking an introductory C course. The course wasn't that well set up; a new programming problem was simply given each week that had to be solved (e.g., you have ten students, write a program to enter their grades and calculate the mean, etc.)
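For comparison, an exercise of that kind (ten students, their grades, the mean) could be sketched in Python roughly as follows. The grades are invented and hard-coded rather than typed in, and the `summarize` helper is a hypothetical name, not something from the course.

```python
from statistics import mean, stdev


def summarize(grades):
    """Return the count, mean, and sample standard deviation of a list of grades."""
    if not grades:
        raise ValueError("need at least one grade")
    return {
        "n": len(grades),
        "mean": mean(grades),
        # The sample standard deviation needs at least two values.
        "stdev": stdev(grades) if len(grades) > 1 else 0.0,
    }


# Hypothetical grades for ten students.
grades = [72, 85, 90, 66, 78, 94, 81, 59, 88, 77]
summary = summarize(grades)
print(summary["n"], summary["mean"], round(summary["stdev"], 2))
```

In C the same task needs an explicit loop and manual input handling, which is part of what such a course is meant to teach.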

    VB has been helpful for me in Excel, where with a little programming knowledge I was able to modify a macro so that it would loop infinitely (with some lingo I had learned from the C class).

    I also ran into Oracle at work, where a computer engineer had developed a database for me for entering and safely storing data (Access didn't cut it and Excel is too easy to mess up). I would have loved to modify the database myself..
     
  9. Nov 1, 2003 #8

    Monique

    Staff Emeritus
    Science Advisor
    Gold Member

    This will be a very silly question

    With the databases/languages you mentioned, in what kind of environment are they written and operated?

    Is an ordinary PC with Windows XP able to handle them? I think with the C course we had to log into a different computer through telnet and download software from the internet (the compiler I guess).

    The computer engineer who worked on the Oracle database was sitting at a Windows NT computer.. but I could use the database from an XP computer, and the data was sent on to yet another computer which I didn't have access to..
     
  10. Nov 1, 2003 #9
    "Fast" languages are ones whose programs execute in a relatively short amount of time. They are usually compiled (meaning that they are translated into the computer's native machine language before execution). Interpreted languages, which read a line of text, translate it, execute it, then read another line of text, etc. are slower, but often more flexible.

    C is essentially a subset of C++, so you can start out just by learning C, maybe mixing in a few C++ features here and there when convenient. (E.g., C++'s input/output routines are more convenient than C's.) The main difference between the languages is C++'s addition of object-oriented programming, but you don't need to start out worrying about that.

    VB is a good language if you want to write something to control other Microsoft software; if you don't, then scripting languages like Python or Perl are usually better.

    I stand corrected: I've never met a practicing scientist who liked keeping their data in a database, as opposed to a spreadsheet or a plain text file. Big databases are often accessed from a central server over a network, but you can set up a small database on your own computer. I think most of the good free databases are for Unix, though. Most other major software, like programming languages, you can get for Windows, especially the commercial software.

    (There is a lot of commercial scientific software that is only for Unix, but that's less true for biology-related software. Biologists are some of the most Windows-oriented scientists, whereas physicists and astronomers are some of the most Unix-oriented.)

    As for downloading, well, Windows doesn't come with much in the way of programming languages or utilities, so you will either have to buy or download software. (Unix always comes with a C compiler, and usually a bunch of scripting languages like Python or Perl, and generally is more programmer-friendly than Windows.)
     
  11. Nov 1, 2003 #10

    Hurkyl

    Staff Emeritus
    Science Advisor
    Gold Member

    I echo Ambitwistor's comments on C vs C++. While you will probably only need C to do what you want to do, C++ offers simpler, more convenient alternatives to several C features. Ambitwistor mentioned input/output. In addition, C++ style memory allocation/deallocation is a little simpler to write, and C++ has a real string type, a feature that C lacks... string manipulation in C is fast and flexible, but it is probably one of the most difficult things to learn as a beginning programmer.


    But in any case, you will probably want to query people who work in your field; if every biomolecular scientist in the world uses Perl for everything, then you should definitely start learning that, simply because it would be easier to understand/use what others have written. I don't know if that would be important to you at all, though...


    If you want to download a free C/C++ compiler for Windows, I know of two options...

    (a) Download "cygwin", which provides a Unix-like environment on Windows. You can then use "gcc" in this environment, which is probably the best free C/C++ compiler.

    (b) Download "djgpp", which is a port of gcc to DOS. I don't remember how up to date it is, nor how good a conversion it is. On the upside, with it you can get the free integrated development environment "rhide".
     
    Last edited: Nov 1, 2003
  12. Nov 2, 2003 #11
    Well, as someone who has been programming almost my whole life (nothing professional... do you remember the good old C64 fatty :), and the Spectrum?), I can tell you that the safest choice is C++. Any kind of foundation is good: BASIC, Pascal, Visual Basic, C, etc., because you will find very similar or identical logic everywhere, just with different syntax... Of course, there's a thing called object-oriented programming that you'll face when dealing with C++, but it's not some kind of great obstacle; after a while you'll find that way of thinking useful and economical.

    Even today I see many physicists working with Fortran for their simulations (even though there are sometimes better solutions), but there's no mainstream language of that kind in molecular biology (i.e., for problems being solved with computers). But C++ is mainstream in today's serious programming (you can use it for many things, and it's a great starting point for some other great languages such as PHP...)
    + There are a bunch of great eBooks about C/C++ (almost every good book can be found in eBook form), or if you're more comfortable with real books (who isn't :)), it's no problem learning it on your own, believe me...
     