Comp Sci Subtracting the mean from a column in an array

AI Thread Summary
The discussion revolves around the concept of subtracting the mean from each column in a NumPy array. It clarifies that this process involves calculating the mean for each column and then subtracting that mean from every element within that column, effectively centering the data around zero. The conversation also highlights the importance of understanding the context of the data, such as whether it represents observations or features. Additionally, it mentions NumPy's broadcasting feature, which simplifies operations between arrays of different dimensions. Overall, the participants emphasize the need for clarity on the task and the utility of NumPy for such operations.
ver_mathstats
Homework Statement
Assume that a is a 2-dimensional NumPy array. Subtract the mean from each column of a.
Relevant Equations
Python
Python:
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
print(a)
print()
# Mean of the entire array (all six elements), not of each column
print(a.mean())

I know how to take the mean of the entire array. However, I am having trouble understanding what it means to subtract the mean from each column. Does this mean subtracting it from each element in the column? Thank you.
 
If this is a homework question, try asking your teacher. I will give my educated guess, though. Suppose the matrix came from a data table of observations (maybe you want it in a matrix instead of a dataframe to do some operations). If each row is an observation, then each column represents some feature. If the observations are people, perhaps you have a column for height, then one for weight, and another for age.

In this situation, you might want to find the mean of each column, and subtract that mean from each value, so that each column is "centered" around zero.

Again, that is how I interpret the context.
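A minimal sketch of that interpretation, using the array from the question. The names col_means and centered are illustrative and not part of the assignment, and this assumes "column mean" is what the exercise intends:

```python
import numpy as np

# Array from the question: 2 rows (observations), 3 columns (features).
a = np.array([[1, 2, 3], [4, 5, 6]])

# axis=0 averages down the rows, giving one mean per column.
col_means = a.mean(axis=0)

# Subtract each column's mean from every element in that column.
centered = a - col_means

print(col_means)
print(centered)
```

After this step, each column of centered should average to zero, which is a quick way to check the result.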
 
Then again, maybe they want the mean of the entire array. In either situation, you would be finding a mean, and that number is then subtracted from each element in the corresponding object (the whole array or a single column). NumPy has a nice feature called broadcasting, which allows you to perform operations between two objects whose shapes seem incompatible "dimension wise".

Off the top of my head, I don't remember - I'd have to look it up.
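For reference, here is a rough sketch of how broadcasting handles both readings of the question. The variable names are made up for illustration, and the per-row case is an extra example that the thread does not ask about; it is included only because it is the case where the shapes do not line up without help:

```python
import numpy as np

a = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])  # shape (2, 3)

# Whole-array mean: a single number, applied to every element.
minus_overall = a - a.mean()

# Per-column means: shape (3,), which lines up with the last axis of a.
minus_columns = a - a.mean(axis=0)

# Per-row means: shape (2,) would not line up with (2, 3), so
# keepdims=True keeps the result as shape (2, 1), which broadcasts
# across the columns.
minus_rows = a - a.mean(axis=1, keepdims=True)

print(minus_overall)
print(minus_columns)
print(minus_rows)
```

In each case NumPy stretches the smaller operand to match the shape of a, so no explicit loop over rows or columns is needed.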
 
scottdave said:
If this is a homework question, try asking your teacher. I will give my educated guess, though. Suppose the matrix came from a data table of observations (maybe you want it in a matrix instead of a dataframe to do some operations). If each row is an observation, then each column represents some feature. If the observations are people, perhaps you have a column for height, then one for weight, and another for age.

In this situation, you might want to find the mean of each column, and subtract that mean from each value, so that each column is "centered" around zero.

Again, that is how I interpret the context.
Unfortunately, the question I wrote down was the only information given. I understand what you are saying, however. I did figure out how to take the mean of each column, and I asked the teacher for clarification about what it means to subtract the mean. All of that makes sense. Thank you for the reply.
 
scottdave said:
Then again, maybe they want the mean of the entire array. In either situation, you would be finding a mean, and that number is then subtracted from each element in the corresponding object (the whole array or a single column). NumPy has a nice feature called broadcasting, which allows you to perform operations between two objects whose shapes seem incompatible "dimension wise".

Off the top of my head, I don't remember - I'd have to look it up.
Yes, thank you. I just learned about broadcasting, so I understand what you are saying.
 