Subtracting the mean from a column in an array


Discussion Overview

The discussion revolves around the concept of subtracting the mean from a column in a NumPy array. Participants explore the implications of this operation, particularly in the context of data analysis and matrix manipulation.

Discussion Character

  • Exploratory, Homework-related, Technical explanation

Main Points Raised

  • One participant expresses uncertainty about whether to subtract the mean of each column or the mean of the entire array from each element.
  • Another participant suggests that if the array represents observations, subtracting the mean of each column would center the data around zero.
  • Some participants mention the concept of broadcasting in NumPy, which allows operations between arrays of different dimensions, though they do not recall the specifics.
  • A later reply indicates that the original poster has learned how to calculate the mean of each column and has sought clarification from their teacher regarding the subtraction process.

Areas of Agreement / Disagreement

Participants do not reach a consensus on whether to subtract the mean of each column or the mean of the entire array, indicating multiple competing views on the interpretation of the task.

Contextual Notes

There are limitations in the provided information, such as the lack of clarity on the context of the data and the specific requirements of the task.

ver_mathstats
Homework Statement
Assume that a is a 2-dimensional NumPy array. Subtract the mean from each column of a.
Relevant Equations
Python
Python:
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
print(a)  # the original printed an undefined name, s
print()
print(a.mean())  # mean of the entire array: 3.5

I know how to take the mean of the entire array. However, I am having trouble understanding what it means to subtract the mean from each column. Does this mean subtracting it from each element in the column? Thank you.
 
If this is a homework question, try to ask your teacher. I will give my educated guess, though. Suppose the matrix came from a data table of observations (maybe you want it in a matrix instead of a dataframe to do some operations). If each row is an observation, then each column represents some feature. If the observations are people, perhaps you have a column for height, one for weight, and another for age.

In this situation, you might want to find the mean of each column, and subtract that mean from each value, so that each column is "centered" around zero.

Again, that is how I interpret the context.
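A minimal sketch of that interpretation, assuming the assignment wants each column centered on its own mean. The array `a` is the one from the original post; `col_means` and `centered` are names made up here for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# axis=0 averages down the rows, giving one mean per column
col_means = a.mean(axis=0)      # shape (3,)

# Broadcasting stretches the (3,) vector across both rows of the (2, 3) array
centered = a - col_means

print(col_means)
print(centered)
print(centered.mean(axis=0))    # each column of the result averages to zero
```

If the teacher instead wants the mean of the entire array, `a - a.mean()` would be the corresponding one-liner.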
 
Then again, maybe they want the mean of the entire array. In either situation, you would find a mean and then subtract that number from each element of the corresponding object (the column or the whole array). NumPy has a nice feature called broadcasting, which allows you to perform operations between two objects that seem to be incompatible "dimension-wise".

Off the top of my head, I don't remember the specifics; I'd have to look them up.
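For reference, here is a rough sketch of how broadcasting handles both readings of the task, using the 2×3 array from the original post. The variable names are illustrative only, and the per-row case is an extra not asked for in the homework, included to show a shape that needs a little help to broadcast.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# Whole-array mean: a single scalar, broadcast against every element
overall = a - a.mean()

# Per-column mean: shape (3,) lines up with the last axis of the (2, 3) array
per_column = a - a.mean(axis=0)

# Per-row mean: keepdims=True keeps shape (2, 1) so it broadcasts across columns;
# without it, the shape would be (2,), which does not align with (2, 3)
per_row = a - a.mean(axis=1, keepdims=True)

print(overall)
print(per_column)
print(per_row)
```

Broadcasting compares shapes from the trailing dimension backwards, and a dimension is compatible when the sizes match or one of them is 1, which is why the `(3,)` and `(2, 1)` results both work against a `(2, 3)` array.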
 
scottdave said:
If this is a homework question, try to ask your teacher. I will give my educated guess, though. If the matrix came from a data table of observations (maybe you want it in a matrix, instead of a dataframe to do some operations). If each row is an observation, then each column represents some feature. If it is people, perhaps you have a column for height, then one for weight, and another for age.

In this situation, you might want to find the mean of each column, and subtract that mean from each value, so that each column is "centered" around zero.

Again, that is how I interpret the context.
Unfortunately, the question I wrote down was the only information given. I understand what you are saying, however. I did figure out how to take the mean of each column, and I asked the teacher for clarification about what it means to subtract the mean. All of that makes sense. Thank you for the reply.
 
scottdave said:
Then again, maybe they want the mean of the entire array. In either situation, you would be finding a mean, then that number is subtracted from each element in the corresponding object. Numpy has some nice features called Broadcasting which allow you to perform operations between 2 objects which seem to be incompatible "dimension wise".

Off the top of my head, I don't remember - I'd have to look it up.
Yes, thank you. I just learned about broadcasting, so I understand what you are saying.
 