> What are your thoughts on this?

Dead on. From my experience, most scientists and engineers make for incredibly bad programmers. My opinion: the only reason computer science majors are not used to write our scientific/engineering programs is that most computer scientists fare even worse at doing science and engineering than we do at doing programming.
It isn't that hard to program well. We know what a good engineering design or a solid scientific theory looks like. We can learn what a well-constructed program looks like. It does take some training, however. It is a bit arrogant on our part to think that training is not required.
> It is a bit arrogant on our part to think that training is not required.

It's also a bit arrogant to think that it's not hard to program well.
> If they already have the programs that "work" why not just give them to a programmer and have them make it work well?

Because bad programs cannot be fixed, they must be written as if the original had never existed. The difference between a program that works and one that works well is vastly greater than that between a program that works and one that doesn't work at all.
> Because bad programs cannot be fixed, they must be written as if the original had never existed. The difference between a program that works and one that works well is vastly greater than that between a program that works and one that doesn't work at all.

It sounds like part of the problem was that they couldn't explain to the programmers exactly what they wanted, so the programmers couldn't do what the already-written programs could do.
> This is why you give the programs to the programmers and have them see what the program actually does. They can then make a program that gives the same results, but works much better and has more features.

That sounds awesome in theory, but it works out rather badly when you actually have to figure out somebody's scientific computing code, full of all sorts of crazy math, almost no comments, and lots of hacks to keep the code from crashing. I spend a good chunk of time using and rewriting a labmate's code to make it robust enough for my purposes, and it's like pulling teeth to get an explanation of the code that makes any sense to me.
> The people who write the astronomical codes discussed in the article are predominantly astronomy grad students.

In theory, code cleanup would be a great task to farm out to undergrads, but the math involved makes it totally infeasible a lot of the time.
> Figuring out how to store, reference, and analyze that is all a programming task of large magnitude.

One of the fun things about working with very large datasets is that I'm usually the only person in the room who cares about the space complexity as much as (if not more than) the time complexity of the algorithms used to do the number crunching.
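The space-versus-time concern can be sketched in a few lines (a hypothetical illustration, not anyone's actual pipeline): Welford's online algorithm computes the mean and variance of an arbitrarily large data stream in O(1) memory, where a naive approach would materialize the entire dataset first just to call a library routine.

```python
# Welford's online algorithm: mean and variance over a data stream
# using O(1) memory, instead of holding the whole dataset in RAM.
def streaming_mean_var(stream):
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    variance = m2 / n if n > 0 else 0.0  # population variance
    return mean, variance

# The generator below stands in for a dataset too large to hold in memory.
mean, var = streaming_mean_var(float(i) for i in range(1_000_000))
```

Because the input is a generator, peak memory stays constant no matter how many values flow through; the cost is a single pass over the data.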
> If they already have the programs that "work" why not just give them to a programmer and have them make it work well?

Because if you can precisely explain exactly what equations need to be programmed, then you've already written the program.
> It is not hard to write effective code, merely to write efficient code. This was an issue 30 years ago when memory was expensive. This is no longer true. You can now write horribly inefficient code and no one cares, aside from waiting for it to process.

Haha, so true. But what about code for supercomputers, code that might take several days to run? Or code that requires 8 GB of RAM to process? (Seriously, I once had to run code that required 8 GB of RAM for certain parameters.)
> It is not hard to write effective code, merely to write efficient code. This was an issue 30 years ago when memory was expensive. This is no longer true. You can now write horribly inefficient code and no one cares, aside from waiting for it to process.

I have a number of problems with the above. I am having a very hard time parsing your first sentence. For one thing, you are using two words, "effective" and "efficient", that are synonyms or near-synonyms of one another. For another, that remark is a bit hard to parse. I think you are saying "It is not hard to write effective code. What is hard is writing efficient code." If that is the correct interpretation, I take exception to it.
> Very few require students of science and engineering to take anything beyond an introductory computer programming class. A lot don't require *any* classes in computer science.

And personally I think that's a good thing, since CS classes are often terrible at teaching application programming. What's not surprising is the number of astronomy Ph.D.s who are terrible programmers; what is more surprising is the number of CS Ph.D.s who are terrible programmers.
> I trust an astrophysicist's 'plodding' algorithms more than I would ever trust a programmer's ability to figure out what it is they are trying to calculate. Yes, the astrophysicist will not write programs as efficiently as an IT major, but they still work.

And it's likely to be much faster. Working on high-performance computing is not part of the typical IT major's curriculum.
> It is not hard to write effective code, merely to write efficient code. This was an issue 30 years ago when memory was expensive.

It's actually quite hard, and getting harder. The key to CPU programming is keeping everything in the L1 cache, which is quite limited and requires a lot of tricks. Then there is GPU and multi-core/multi-threaded programming, which adds a different level of complexity.
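One standard example of the kind of cache trick being described is loop tiling (blocking), sketched here in Python purely to show the access pattern; in an interpreted language the interpreter overhead swamps the effect, but the same tiling written in C or Fortran is what keeps the working set inside L1.

```python
def transpose_naive(a):
    # Straightforward transpose: the reads of a[j][i] stride across rows,
    # which in a compiled language means a cache miss on nearly every access.
    n = len(a)
    return [[a[j][i] for j in range(n)] for i in range(n)]

def transpose_blocked(a, block=32):
    # Tiled transpose: process block x block tiles so that both the reads
    # and the writes for a tile stay within a cache-sized working set.
    n = len(a)
    out = [[0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for jj in range(0, n, block):
            for i in range(ii, min(ii + block, n)):
                for j in range(jj, min(jj + block, n)):
                    out[j][i] = a[i][j]
    return out

a = [[i * 1000 + j for j in range(120)] for i in range(120)]
t = transpose_blocked(a, block=32)
```

Both functions compute the same result; only the order in which memory is touched differs, which is exactly the kind of detail that separates "code that works" from code that runs well on real hardware.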
> This is no longer true. You can now write horribly inefficient code and no one cares, aside from waiting for it to process.

People do care in astrophysics and finance. A simulation can take two weeks, and a factor-of-2 speedup makes the difference between a calculation you can't do and one you can. In finance, what options you can sell is often limited by how much compute power you have.
> And personally I think that's a good thing, since CS classes are often terrible at teaching application programming. What's not surprising is the number of astronomy Ph.D.s who are terrible programmers; what is more surprising is the number of CS Ph.D.s who are terrible programmers.

What about applied math courses?
And what about these ones? If you know computer systems, could that make you better at CPU programming?

AMATH 581 Scientific Computing (5)
Project-oriented computational approach to solving problems arising in the physical/engineering sciences, finance/economics, medical, social, and biological sciences. Problems requiring use of advanced MATLAB routines and toolboxes. Covers graphical techniques for data presentation and communication of scientific results.
AMATH 582 Computational Methods for Data Analysis (5)
Exploratory and objective data analysis methods applied to the physical, engineering, and biological sciences. Brief review of statistical methods and their computational implementation for studying time series analysis, spectral analysis, filtering methods, principal component analysis, orthogonal mode decomposition, and image processing and compression. Offered: W.
AMATH 583 High-Performance Scientific Computing (5)
Introduction to hardware, software, and programming for large-scale scientific computing. Overview of multicore, cluster, and supercomputer architectures; procedure and object oriented languages; parallel computing paradigms and languages; graphics and visualization of large data sets; validation and verification; and scientific software development. Offered: Sp.
AMATH 584 Applied Linear Algebra and Introductory Numerical Analysis (5)
Numerical methods for solving linear systems of equations, linear least squares problems, matrix eigenvalue problems, nonlinear systems of equations, interpolation, quadrature, and initial value ordinary differential equations. Offered: jointly with MATH 584; A.
AMATH 585 Numerical Analysis of Boundary Value Problems (5)
Numerical methods for steady-state differential equations. Two-point boundary value problems and elliptic equations. Iterative methods for sparse symmetric and non-symmetric linear systems: conjugate-gradients, preconditioners. Prerequisite: AMATH 581 or MATH 584 which may be taken concurrently. Offered: jointly with MATH 585; W.
AMATH 586 Numerical Analysis of Time Dependent Problems (5)
Numerical methods for time-dependent differential equations, including explicit and implicit methods for hyperbolic and parabolic equations. Stability, accuracy, and convergence theory. Spectral and pseudospectral methods. Prerequisite: AMATH 581 or AMATH 584. Offered: jointly with ATM S 581/MATH 586; Sp.
CSE 410 Computer Systems (3)
Structure and components of hardware and software systems. Machine organization, including central processor and input-output architectures; assembly language programming; operating systems, including process, storage, and file management. Intended for non-majors. No credit to students who have completed CSE 351, CSE 378, or CSE 451. Prerequisite: CSE 373.
CSE 417 Algorithms and Computational Complexity (3)
Design and analysis of algorithms and data structures. Efficient algorithms for manipulating graphs and strings. Fast Fourier Transform. Models of computation, including Turing machines. Time and space complexity. NP-complete problems and undecidable problems. Intended for non-majors. Prerequisite: CSE 373.
CSE 446 Machine Learning (3)
Methods for designing systems that learn from data and improve with experience. Supervised learning and predictive modeling: decision trees, rule induction, nearest neighbors, Bayesian methods, neural networks, support vector machines, and model ensembles. Unsupervised learning and clustering. Prerequisite: either CSE 326 or CSE 332; either STAT 390, STAT 391, or CSE 312.
CSE 415 Introduction to Artificial Intelligence (3) NW
Principles and programming techniques of artificial intelligence: LISP, symbol manipulation, knowledge representation, logical and probabilistic reasoning, learning, language understanding, vision, expert systems, and social issues. Intended for non-majors. Not open for credit to students who have completed CSE 473. Prerequisite: CSE 373.
CSE 373 Data Structures and Algorithms (3)
Fundamental algorithms and data structures for implementation. Techniques for solving problems by programming. Linked lists, stacks, queues, directed graphs. Trees: representations, traversals. Searching (hashing, binary search trees, multiway trees). Garbage collection, memory management. Internal and external sorting.
> There are engineers who specialize in writing scientific code. They are fully capable of taking the differential equation (or whatever) and programming the discretized solution. The better ones can do it on any type of hardware. They are fully capable of debugging the physics on their own as long as the physicists supply the test problem and expected answer.

If you take a PDE and give it to someone who doesn't understand PDEs, the code won't work. Also, this type of work is something physics Ph.D.s get hired to do.
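As a minimal sketch of what "taking the differential equation and programming the discretized solution" means in practice: an explicit finite-difference scheme for the 1D heat equation u_t = alpha * u_xx. This is illustrative only; a production solver would treat boundary conditions, stability, and performance far more carefully, and getting it right requires understanding the PDE, not just the loop.

```python
def heat_1d_explicit(u0, alpha, dx, dt, steps):
    # Forward-time, centered-space scheme:
    #   u[i] += r * (u[i-1] - 2*u[i] + u[i+1]),  r = alpha*dt/dx**2
    # Stable only when r <= 0.5 -- exactly the kind of constraint someone
    # who doesn't understand the PDE will miss.
    r = alpha * dt / dx**2
    assert r <= 0.5, "explicit scheme is unstable for this dt/dx"
    u = list(u0)
    for _ in range(steps):
        u_new = u[:]  # endpoints stay fixed: Dirichlet boundary conditions
        for i in range(1, len(u) - 1):
            u_new[i] = u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
        u = u_new
    return u

# Diffuse an initial spike of heat placed at the center of the rod.
u0 = [0.0] * 11
u0[5] = 1.0
u = heat_1d_explicit(u0, alpha=1.0, dx=1.0, dt=0.25, steps=1)
```

After one step the spike spreads symmetrically to its neighbors while the total heat in the interior is conserved, which is the sort of sanity check (supplied test problem, expected answer) the parent comment describes.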
> The terabytes/day of data problem is totally different. For this you must use computer scientists. It's just not what engineers or physicists do.

Some do. Astrophysical CFD simulations can and do generate gigabytes of data per second, and if you work on one of those projects, you get familiar with the nitty-gritty of data storage very quickly. People who work on geological systems routinely deal with multi-terabyte databases. And then there are the bioinformatics people: once you've sequenced the human genome, storing that information is non-trivial.
> What about applied math courses?

The big problem with those courses is that they generally don't give you experience working on hundred-person project teams with millions of lines of source code. Coding is a form of writing, and you learn to write by writing.