Information loss as a grid is coarsened

In summary, the conversation discusses the concept of information loss when aggregating data from a 4x4 grid to a 2x2 grid. Information theory is brought up, and it is noted that defining a probability distribution is necessary for calculating information loss, and that there are subjective aspects involved in defining information. A book recommendation on the subject of information theory is also given.
  • #1
wvguy8258
Hi,

Not sure if this is the correct sub-forum or not. Perhaps, general math is better. Anyways..

In the following, a simple reference covering what I am after would be very helpful.

Let's say you have a 4x4 grid of cells, where each cell contains either a 1 or a 0. Let's say it is this:

0101
1010
0101
1010


And it covers a certain spatial area of, let's say, 4 m x 4 m, so the resolution of each cell is 1 m by 1 m.

If I coarsen the resolution of the grid so that it is now a 2x2 grid covering the same area, then I take the average value of each block of four cells collapsed by the aggregation and assign that average to the new cell. So we have

0.5 0.5
0.5 0.5
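
Just to be concrete, the aggregation I mean is plain block averaging; a minimal sketch in Python/NumPy:

Code:
import numpy as np

# the original 4x4 binary grid (1 m x 1 m cells)
fine = np.array([[0, 1, 0, 1],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 0, 1, 0]])

# coarsen to 2x2 by averaging each non-overlapping 2x2 block
coarse = fine.reshape(2, 2, 2, 2).mean(axis=(1, 3))
print(coarse)   # [[0.5 0.5]
                #  [0.5 0.5]]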

Is there a way to capture the information lost in this aggregation? I suppose it could be thought of as the aggregate grid being known, the original 4x4 grid being a signal, and then determining how much information is in the original grid given that you know the aggregate values.

Further, if I attempted to estimate the values in the 2x2 grid using some method and came up with

0.4 0.6
0.3 0.2

Is there a way to determine the amount of information held in the 2x2 grid of 0.5 values given that we know the estimate? I suppose this is the amount of "surprise" in the values of the 2x2 grid given the model.
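
For what it's worth, one way I have thought about formalizing that "surprise" (I'm not sure it is the standard construction) is to treat each estimated value as a predicted probability that a fine cell inside that coarse cell is a 1, and compare it to the actual fraction with a KL divergence. A rough sketch:

Code:
import numpy as np

observed  = np.array([0.5, 0.5, 0.5, 0.5])   # actual coarse-cell averages
predicted = np.array([0.4, 0.6, 0.3, 0.2])   # the model's estimates

def kl_bernoulli_bits(p, q):
    # D( Bernoulli(p) || Bernoulli(q) ) in bits, elementwise
    p = np.clip(p, 1e-12, 1 - 1e-12)
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return p * np.log2(p / q) + (1 - p) * np.log2((1 - p) / (1 - q))

surprise = kl_bernoulli_bits(observed, predicted)
print(surprise)         # per-cell "surprise" in bits
print(surprise.sum())   # total over the four coarse cells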

The reason I am asking this: I model land use change using satellite imagery to determine land cover and then try to predict locations of change by using information on things that likely influence land cover change (like road location, topographic slope, etc.). You can often increase model accuracy by aggregating the satellite imagery. So, you gain predictive ability but on a data set where information has been lost. I am trying to better understand this trade-off so that recommendations can be made regarding the appropriate level of data coarsening.

Thanks for reading,

Seth
 
  • #2
The discipline of "Information Theory" uses methods that assign a measure of information to situations involving probability. For example, on a map where the values in your 4x4 grid are overwhelmingly likely to be all zeroes, the reduction to a 2x2 grid involves less of a loss of information than on another map where 1s and 0s occur with equal frequency. If you want a measure of information based on Information Theory, you need to make assumptions about probabilities.

Gain and loss of information will depend on what probability distributions you use as your "before" and "after" cases. Defining a probability distribution includes defining the random variable(s) that it involves. For example, you might regard your 4x4 matrix of data as being generated by an even finer grid, or even by a spatially continuous random variable. The entropy of this underlying distribution can be calculated. Given a particular 4x4 matrix, the conditional probability distribution for the underlying data given that matrix will typically have less entropy than the unconditional distribution. If we average this entropy over all 4x4 matrices (weighted by their probability of occurrence), then we get the average entropy of the various conditional distributions. The difference between this average entropy and the entropy of the unconditional distribution is a measure of how much entropy loss (= gain in certainty) we get from having the 4x4 matrix data.
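
In symbols (writing U for the underlying data and M for the observed 4x4 matrix), the average conditional entropy and the resulting measure described above are the standard conditional entropy and mutual information:

$$H(U \mid M) = \sum_{m} P(M = m)\, H(U \mid M = m), \qquad I(U; M) = H(U) - H(U \mid M).$$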

A similar calculation can be done for the 2x2 matrices.

You might want to avoid defining an underlying probability distribution for the data and only define a probability distribution on the 4x4 matrices. Thus you assume that knowing the 4x4 matrix is "knowing everything", so knowing it reduces the entropy to zero. You can then calculate the average entropy of the conditional distributions given the various 2x2 matrices and call that the gain in entropy (= increase in uncertainty) from summarizing the data in 2x2 form.
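
As a toy instance of this second approach (a minimal sketch, assuming a uniform prior over all 2^16 binary 4x4 grids, so that the 4x4 data carries 16 bits): the uncertainty that remains after seeing a particular 2x2 summary is the log of the number of 4x4 grids consistent with it.

Code:
import numpy as np
from itertools import product

def coarsen(fine):
    # average each non-overlapping 2x2 block of a 4x4 array
    return fine.reshape(2, 2, 2, 2).mean(axis=(1, 3))

target = np.array([[0.5, 0.5],
                   [0.5, 0.5]])   # the 2x2 summary from the example

# Count the binary 4x4 grids whose block averages match the summary.
consistent = sum(
    np.array_equal(coarsen(np.array(bits).reshape(4, 4)), target)
    for bits in product([0, 1], repeat=16)
)
print(consistent)                 # 6**4 = 1296
print(np.log2(consistent))        # about 10.34 bits still unknown
print(16 - np.log2(consistent))   # about 5.66 bits retained by this summary

The average described above would weight this conditional entropy over all possible 2x2 summaries by their probabilities; the sketch only evaluates it for the single summary in the example.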

This shows that there are subjective aspects involved in defining information.

This is a good and inexpensive book on the subject: An Introduction to Information Theory by Fazlollah M. Reza.
 
  • #3
Thank you for your detailed response.

If the only thing you know about the original data is that it is binary (0/1) and you do not assume any underlying distribution, then would the information in bits just be 16?
 
  • #4
wvguy8258 said:
If the only thing you know about the original data is that it is binary (0/1) and you do not assume any underlying distribution, then would the information in bits just be 16?

I think the numbers that people give for "information in a bit" are based on the assumption that 0 and 1 are equiprobable. So, I'd have to say "No". If you don't assume any probability distribution, you don't get any measure of information.
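
In symbols: if each of the 16 cells is modeled as an independent Bernoulli variable with P(cell = 1) = p, the grid's entropy is 16 H(p) bits, where

$$H(p) = -p \log_2 p - (1-p)\log_2(1-p),$$

and this equals 16 bits only in the equiprobable case p = 1/2.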
 
  • #5


Dear Seth,

Thank you for reaching out with your question. I can provide some insight into the concept of information loss as a grid is coarsened. This is a common issue in data analysis and modeling, especially in the field of remote sensing and land use change studies.

First, let's define information loss. In simple terms, it refers to the reduction of detail or resolution in a dataset. In your example, the original 4x4 grid has a higher resolution compared to the coarsened 2x2 grid. This means that the original grid contains more information about the spatial distribution of 1s and 0s, while the coarsened grid only provides a general overview of the data.

When we aggregate data, we lose some information because we are combining multiple data points into one. In your case, by taking the average value of each 2x2 block, you are effectively losing some information about the original 4x4 grid. This is a trade-off that is often necessary in data analysis, as it can improve the efficiency and accuracy of models.

To capture the information lost in the aggregation process, you can use a measure called entropy. Entropy is a measure of uncertainty or disorder in a dataset. In simple terms, it tells us how much information is needed to describe a dataset. In your example, the original 4x4 grid has a higher entropy compared to the coarsened 2x2 grid because it contains more information and is more complex. By calculating the entropy of both grids, you can quantify the amount of information lost in the aggregation process.
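
One simple (and admittedly crude) way to make that concrete is to compute the Shannon entropy of each grid's empirical histogram of cell values; as the earlier replies stress, this is only one of several possible conventions:

Code:
import numpy as np
from collections import Counter

def histogram_entropy_bits(values):
    # Shannon entropy (in bits) of the empirical histogram of the cell values
    counts = np.array(list(Counter(values).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

fine   = [0, 1, 0, 1,  1, 0, 1, 0,  0, 1, 0, 1,  1, 0, 1, 0]   # 4x4 grid, flattened
coarse = [0.5, 0.5, 0.5, 0.5]                                   # 2x2 grid, flattened

print(histogram_entropy_bits(fine))    # 1.0 bit per cell
print(histogram_entropy_bits(coarse))  # 0.0 bits per cell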

In your second scenario, when you attempt to estimate the values in the 2x2 grid using a model, you can use a measure often called information gain. Information gain is the reduction in entropy (uncertainty) about the data that comes from conditioning on the model's estimate. This measure tells us how much information the model has provided about the data.

In summary, information loss is an inevitable trade-off in data analysis when we aggregate or coarsen data. However, by using measures such as entropy and information gain, we can quantify the amount of information lost or gained in the process. This can help us make informed decisions about the appropriate level of data coarsening in our studies.

I hope this helps answer your question. If you need further clarification or have any other questions, please do not hesitate to ask.
 

Related to Information loss as a grid is coarsened

1. What is information loss as a grid is coarsened?

Information loss as a grid is coarsened refers to the reduction in the amount of detailed information when a grid or data set is simplified or reduced in size. This process can result in the loss of important details and nuances in the data.

2. How does information loss occur during grid coarsening?

Information loss occurs during grid coarsening due to the reduction of resolution and the merging of data points. When a grid is coarsened, smaller cells are combined into larger cells, resulting in a loss of detailed information contained within those smaller cells.

3. What are the implications of information loss when a grid is coarsened?

The implications of information loss as a grid is coarsened can vary depending on the specific data and its intended use. However, it can lead to inaccuracies and errors in data analysis and modeling, as well as the potential for important details to be overlooked or misrepresented.

4. Can information loss as a grid is coarsened be avoided?

In some cases, information loss during grid coarsening can be avoided by using more advanced techniques, such as adaptive grid refinement, which selectively refines certain areas of the grid while keeping others at a coarser resolution. However, in many cases, some level of information loss is inevitable when simplifying a grid.

5. How can the effects of information loss as a grid is coarsened be minimized?

The effects of information loss can be minimized by carefully considering the level of coarsening needed and using appropriate techniques to preserve important features of the data. Additionally, sensitivity analysis can be used to evaluate the impact of grid coarsening on the results of a study or analysis.
