I am doing retrospective research on breast cancer using multivariate estimates. The aim is to calculate the probability of finding breast cancer given multiple independent variables (IVs), so the outcome is binary (cancer versus benign). Some of the IVs are correlated. My question: should I run factor analysis (FA) on the IVs before fitting my logistic regression?

I know that FA and PCA are used to reduce noise in the data and to get better estimates by keeping the directions of highest variance, i.e. the principal components. But intuitively, I would guess that if I have multiple variables, even though they may be correlated, they should still be able to strongly differentiate cancer from benign cases. I inferred this second point from Bayes' equation, where the posterior odds of cancer are proportional to the product of the sensitivities of each variable, which seems to give credit to the presence of correlated IVs. So, how can these two ideas be reconciled with each other?
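
To make the Bayes argument concrete, here is a minimal sketch of the odds form I have in mind. It assumes each variable is a binary test and that the tests are conditionally independent given the diagnosis, which is exactly what correlated IVs would violate. The prevalence, sensitivity, and specificity values are made up for illustration, and the function names are my own, not from any library.

```python
def positive_lr(sensitivity, specificity):
    """Likelihood ratio of a positive result: P(+ | cancer) / P(+ | benign)."""
    return sensitivity / (1.0 - specificity)


def posterior_odds(prior_prob, likelihood_ratios):
    """Prior odds times the product of likelihood ratios.

    The plain product is only valid if the tests are conditionally
    independent given the true class (the naive Bayes assumption).
    """
    odds = prior_prob / (1.0 - prior_prob)
    for lr in likelihood_ratios:
        odds *= lr
    return odds


def odds_to_prob(odds):
    """Convert odds back to a probability."""
    return odds / (1.0 + odds)


# Hypothetical numbers, for illustration only.
prior = 0.10                   # assumed prevalence of cancer
lr = positive_lr(0.90, 0.80)   # assumed sensitivity 0.90, specificity 0.80

# Two positive tests treated as independent: both ratios are multiplied in.
naive_odds = posterior_odds(prior, [lr, lr])

# Extreme correlation: the second test is a copy of the first, so it adds
# no new information and only one ratio should count.
duplicate_odds = posterior_odds(prior, [lr])

print("independent tests:", odds_to_prob(naive_odds))
print("duplicated test:  ", odds_to_prob(duplicate_odds))
```

With these made-up inputs, the product form gives a posterior probability of about 0.69, while counting the duplicated test only once gives about 0.33. So the product seems to reward correlated IVs only by double counting them, which is the part I cannot square with my intuition.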