Suppose there is a 1024x1024 array of 32-bit numbers that needs to be normalized by columns: the algorithm goes through each column, finds the max, and divides every number in that column by the max. Would it certainly be wiser to store the array by column? My rationale: 1 MB (2^20 bytes) of main memory is allocated and each page is 4 KB, as given. One row of 1024 numbers takes exactly 4 KB, i.e. one page, so if the array is stored by rows, only 256 rows fit in memory at a time. So, by my count, for each column there will be 3 page faults when reading the numbers and 3 page faults when writing back the normalized numbers.
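
To sanity-check the estimate, here is a rough simulation sketch in Python. It rests on assumptions that are not part of the problem statement: LRU page replacement, pure demand paging, the array starting on a page boundary, and all 256 frames of the 1 MB being available to the array. The names `page_of` and `faults_for_columns` are made up for illustration. It counts every individual page fault, so in the row-wise case the per-column number comes out much higher than 3, because each of the 1024 elements in a column sits on a different page and only 256 pages fit in memory.

```python
from collections import OrderedDict

ELEM_BYTES = 4          # 32-bit numbers
PAGE_BYTES = 4096       # 4 KB pages
MEMORY_BYTES = 2 ** 20  # 1 MB of main memory
N = 1024                # the array is N x N


def page_of(i, j, layout):
    """Page holding element (i, j), assuming the array starts on a page boundary."""
    if layout == "row":
        offset = (i * N + j) * ELEM_BYTES
    else:  # "column"
        offset = (j * N + i) * ELEM_BYTES
    return offset // PAGE_BYTES


def faults_for_columns(layout, num_columns, frames=MEMORY_BYTES // PAGE_BYTES):
    """Count page faults while normalizing the first num_columns columns (LRU assumed)."""
    resident = OrderedDict()  # keys are resident pages, least recently used first
    faults = 0
    for j in range(num_columns):
        # Pass 1 reads the column to find the max; pass 2 writes the normalized values.
        for _ in range(2):
            for i in range(N):
                page = page_of(i, j, layout)
                if page in resident:
                    resident.move_to_end(page)  # mark as most recently used
                else:
                    faults += 1
                    if len(resident) >= frames:
                        resident.popitem(last=False)  # evict the least recently used page
                    resident[page] = None
    return faults


print(faults_for_columns("row", 1))     # prints 2048
print(faults_for_columns("column", 1))  # prints 1
```

With row-wise storage, both passes over a column fault on every access under these assumptions (1024 + 1024), while with column-wise storage the whole column lives in a single page and faults once. A different replacement policy or fewer available frames would change the exact numbers, but not which layout comes out ahead.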