Chemical Engineering Computation - Rectangular Matrix Formulation

AI Thread Summary
The discussion centers around a chemical engineering homework problem involving the least-squares fit of data using two correlation functions. The main challenge is constructing the initial rectangular matrix for the least-squares analysis in Excel, particularly how to derive the functions for each column corresponding to the x-values. Participants emphasize the importance of understanding regression techniques and matrix algebra, specifically the formation of normal equations to solve for unknown coefficients. There is also concern about the clarity of the professor's provided Excel template, which lacks explanatory notes on the calculations. Overall, the conversation highlights the need for a solid grasp of numerical analysis and regression methods in chemical engineering computations.
SweetBabyLou
Messages
6
Reaction score
0
Hello all,

I've got a somewhat mixed concept question from a Chemical Engineering computational methods class. I'm completely lost with this question, as I do not fully understand what it entails. The question (and data) is this:

Homework Statement


Data Set Given
X Y
0.00 -2.17
0.05 -1.34
0.10 -0.43
0.15 0.32
0.20 0.84
0.25 1.12
0.30 1.74
0.35 1.83
0.40 2.13
0.45 2.54
0.50 2.44
0.55 2.57
0.60 2.43
0.65 2.37
0.70 2.08
0.75 1.68
0.80 1.28
0.85 0.83
0.90 0.32
0.95 -0.17
1.00 -1.19

Homework Equations

Use the rectangular-matrix formulation and Excel matric functionality to obtain a least-squares fit to these data with each of the following two correlations:

f1(x;c1,c2,c3) = c1 + c2x + c3x2

f2(x;c1,c2,c3) = c1 + c2xcos(∏x) + c3sin(∏x)

Make two plots: data points plus f1(x) and data points plus f2(x)

The Attempt at a Solution



I do know the excel end of the function - of how I am supposed to carry out the rest of this problem (taking the inverse of the matrix formulated, finding the boundaries, and finally, finding the coefficients) and I do know that I need a few columns with each row corresponding to the given data. These columns would form the matrix needed to carry out the rest of the problem. What I am hung up on is how I go about building the initial matrix that undergoes these operations. I would think that it would have something to do with ordinary differential equations (ODEs) or basis functions.

At first, I tried deriving the "correlations" until I reached a constant, but quickly realized that I would need C2 to find C3 and vice versa. Unfortunately, I don't have much knowledge in the way of differential equations (save finding general solutions/ solutions to simple ODEs).

Some other knowledge I do know is that these correlations resemble power series, as well.If anyone can spare some extra brain power to explain the concept/give advice as to how I should go about completing this problem, I would greatly, GREATLY appreciate it.

Thanks
 
Last edited:
Physics news on Phys.org
This link:

http://www.newagepublishers.com/samplechapter/001938.pdf

shows the method for setting up and calculating least squares fits for parabolas. It's a very straightforward procedure (no derivative evaluations) to set up a system of equations where the unknowns are the coefficients of the parabola. This procedure is very easy to program in Excel or any other spreadsheet.

The method should be easy to adapt to the second equation format with a little algebra.
 
Thanks for the reply, SteamKing

I took a look at the link you posted. It was highly informative... however, I knew that the matrix was made up as a summation of the data values posted. I just realized that I had addressed the wrong concerns I had with this problem. I apologize for dragging this out a little longer... and please excuse me for the following utterly stupid question:

Attached is an excel file posted by a professor on how the process should be carried out in excel. This file does not use the same correlation functions as used in the assignment, but the sample should still be applicable to my case. I understand that with the correlation provided, the x-values are summed up and put together with an exponent the further along in the series we go. HOWEVER, when I look at the template/model/sample, I see a very LARGE matrix/array with columns labeled Am1, Am2...etc. The values in these columns have been defined by the functions in the cells. How the FUNCTIONS were derived, however, I am stumped.

Furthermore, instead of summing up the values, I see that professor has multiplied this large matrix on its transpose in order to retrieve the "summation" matrix needed.

To make a long, stupid question short: what are the functions I need to place in each column in order to get a corresponding row for each x-value?

Again, thanks for your patience with my idiot brain. Any help given is MUCH appreciated.
Thanks
 

Attachments

I can see that you have not studied any regression techniques in your courses (I hope you will take a numerical analysis course, if you haven't taken one already). The method for fitting a parabola to data is basically the same as fitting a straight line, although with an additional coefficient to calculate.

The equations used to fit the data determine the unknown coefficients. We can write one equation for each set of (X, Y) values. In almost all cases, we have more equations than unknowns and thus an over-determined system to solve, albeit one where we want to minimize the square of the differences between the fitted value of Y and the actual Y from the data.

By means of some matrix algebra I won't go into here, multiplying both sides of the over-determined system by the transpose of the matrix of coefficients gives what are called the 'normal equations', which can then be solved by the usual procedures like Gaussian elimination. The normal equations are equal in number to the number of unknown coefficients in the model equation, for the parabola this number is 3. In most circumstances, solving the normal equations gives the desired regression coefficients. However, there is sometimes a price to be paid for using normal equations in that the accuracy of the solution can sometimes be reduced and the resulting regression is not accurate. There are other methods of solution which don't involve the formation of the normal equations, but I leave a discussion of those methods to your courses in numerical analysis or linear algebra.

I'll take a look at your attachments and get back to you.
 
I'm not a big fan of teaching important concepts using 'black boxes', which is what that spreadsheet is.
Certainly, this spreadsheet could use some explanatory notes and other illustrations to explain the calculations.

If you go to Column I 'Exact', it appears that the spreadsheet is trying to fit a fourth order polynomial and a trigonometric equation to provide a fit for y = cosh(4x-2)/cosh(2), which equation is unfamiliar to me. The values from the fit are used to calculate 'Conc', which apparently is equal to 3*y. Beyond forming the normal equations at the bottom of the sheet and solving for the unknown regression coefficients, I cannot explain what occurs with the A values nor what the 'Weight for Boundary conditions' means.

Regression analyses are important tools to use in analyzing experimental data. It is too bad that your professor has chosen a rather obscure method of explaining and illustrating this method of analysis.
 

Similar threads

Back
Top