Help: Convert Matlab to Python code

I'm working on a convolutional neural network project that classifies histopathological images of breast tumors. All my code is in Python. Part of the image preprocessing involves appearance normalisation to account for the variation in histological staining. The only code I have found for this preprocessing function is written in MATLAB, which I do not know or understand. This is the GitHub link:

https://github.com/mitkovetta/staining-normalization/blob/master/normalizeStaining.m

How do I do this in Python?

jedishrfu
Mentor
By hand, I'm afraid. You'll have to map the MATLAB code to the equivalent Python NumPy calls for the matrix operations.

.Scott
Homework Helper
I am taking a quick look through that MatLab code to see what you might not catch on to.
The size function should be obvious. I is a three-dimensional array and h and w are set to its first two dimensions. That third dimension is the Red/Green/Blue values.
The reshape call simply makes that image array two dimensional, essentially a list of RGB values.
The log function is the natural logarithm. So OD will have the same shape as I (a 2D matrix), but every element in OD will be -log((x+1)/240) of the corresponding value x in I.
Line 79 rather shows off one of MATLAB's strong points. ODhat ends up being the rows of OD in which every color is greater than or equal to beta; any row with at least one color below beta is dropped.
Line 82 computes the covariance matrix of ODhat and saves the first output of the eig function, the matrix of eigenvectors, in V.
Here are a couple of links:
https://www.mathworks.com/help/matlab/ref/cov.html
https://www.mathworks.com/help/matlab/ref/eig.html
Lines 86, 94, 95 involve actual matrix multiplication with a piece of the V matrix.
https://www.mathworks.com/help/stats/prctile.html
HE becomes a 2 element array.
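The key thing with line 106 is the apostrophe at the end. It sets Y to the transpose of the reshaped OD. (Strictly, the apostrophe is the complex conjugate transpose, but for real data that is just the transpose.)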
Here's a link for that backslash operator:
https://www.mathworks.com/help/matlab/math/systems-of-linear-equations.html
The bsxfun function is described here:
https://www.mathworks.com/help/matlab/ref/bsxfun.html
Basically, lines 114 and 115 are doing an element by element multiply and divide.
Note the apostrophe in line 124.
Everything else you should be able to parse.
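
For readers following along in Python, here is a rough NumPy sketch of some of the steps described above: the reshape, the optical density, the apostrophe (transpose), and the backslash operator. The image values and the HE matrix are made up, and HE is assumed to hold two stain vectors as columns (3 x 2); none of these numbers come from the linked file.

```python
import numpy as np

# Hypothetical 2x2 RGB image standing in for a real slide (h x w x 3, uint8).
img = np.array([[[200, 150, 180], [90, 60, 140]],
                [[240, 235, 238], [120, 80, 160]]], dtype=np.uint8)
Io = 240  # the 240 in -log((x+1)/240)

h, w = img.shape[:2]                        # size(I,1) and size(I,2)
pixels = img.astype(float).reshape(-1, 3)   # one RGB row per pixel
OD = -np.log((pixels + 1) / Io)             # np.log is the natural logarithm

Y = OD.T                                    # the apostrophe: 3 x n_pixels

# Backslash: HE \ Y solves HE @ C = Y in the least-squares sense.
# HE is an invented 3x2 matrix, used only to show the call.
HE = np.array([[0.65, 0.07],
               [0.70, 0.99],
               [0.29, 0.11]])
C, *_ = np.linalg.lstsq(HE, Y, rcond=None)  # C has shape 2 x n_pixels
```

NumPy stores the pixels in row-major order while MATLAB uses column-major order, so the rows of OD come out in a different order than in MATLAB. Each row is still one pixel, which is all the later steps rely on, as long as the final reshape back to h x w x 3 uses the same convention.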

FactChecker
Gold Member
There are three general approaches that you might consider:
1) make or get python versions of all the MATLAB functions used and make a python procedure like the MATLAB code.
2) use Mathworks tools to generate C or C++ code and make a python version of that code
3) study the calculations of the MATLAB code and implement equivalent calculations using whatever python libraries and code are available.

jedishrfu
Mentor
The problem with those approaches is that you still may have to implement some Matlab functions used by the code. One such function is this bsxfun.
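
For bsxfun specifically, NumPy's ordinary broadcasting usually covers it without any helper function. A small sketch with made-up numbers (the names scale and ref are hypothetical, chosen only for this example):

```python
import numpy as np

# Made-up matrix: 2 rows x 4 columns.
C = np.array([[1.0, 2.0, 4.0, 8.0],
              [3.0, 6.0, 9.0, 12.0]])

# One factor per row, kept as a column (shape (2, 1)) so that it
# broadcasts across all columns of C.
scale = C.max(axis=1, keepdims=True)
ref = np.array([[2.0], [0.5]])       # hypothetical reference values

C_div = C / scale                    # like bsxfun(@rdivide, C, scale)
C_scaled = C_div * ref               # like bsxfun(@times, C_div, ref)
```

The important detail is the shape: a (2, 1) column broadcasts along the columns of a (2, 4) array, whereas a flat array of length 2 would not line up.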

jedishrfu
Mentor
Here are some resources that map MATLAB to Python:

https://www.numfys.net/howto/matlab-to-python/

http://www.eas.uccs.edu/~mwickert/ece5650/notes/NumPy2MATLAB.pdf

www.pyzo.org: a Python IDE.

Octave: an open-source alternative to MATLAB. Or FreeMat, another open-source alternative that supports only the core functions.

Julia: an open-source cousin of MATLAB with similar but slightly different syntax. Julia plays well with Python.

Something to explore: SMOP says it can convert MATLAB to Python, although you might have to build it, and your code is pretty small, so converting by hand would be less troublesome. It could be useful if you discover a lot more MATLAB code.

scottdave
Is there any reason you can't use both? Keep your main program in python, and transfer data in/out of Matlab? I'm assuming there is some programmatic way to pass info into Matlab, and get the results back into your python code?

jedishrfu
Mentor
Yes, one approach is via a simple file format. CSV files or HDF file formats work.

I use custom text-formatted files, with labels identifying each array of numbers, plus code to write them out from MATLAB and read them back in Python. Text lets me check the data by eye and bypasses the endian issues.

HDF and NetCDF files are designed to handle scientific data and have APIs for MATLAB, Python, and other languages.
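
As a rough illustration of the CSV route on the Python side: np.savetxt stands in here for whatever MATLAB would use to write the file, and the file name and values are invented for the example.

```python
import os
import tempfile
import numpy as np

# Stand-in for a matrix that MATLAB would have written out as CSV.
result = np.array([[3.6889, 3.6889, 3.6889],
                   [4.7875, 4.7875, 4.7875]])

path = os.path.join(tempfile.mkdtemp(), "od.csv")  # hypothetical file name
np.savetxt(path, result, delimiter=",")            # plays the role of the MATLAB writer

loaded = np.loadtxt(path, delimiter=",")           # the Python reader
```

Plain CSV carries no array names or shape metadata beyond rows and columns, so a 3D image would need to be flattened to 2D before writing and reshaped after reading.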

scottdave
Homework Helper
Great post @scottdave
I wish I had written that, but it was actually @.Scott
And yes it was a great post.

jedishrfu
Mentor
My apologies to both of you.

scottdave
Thank you! @scottdave and everyone else. I think I am going to try translating it semantically on my own.

Could I run the function as it is using matlab.engine from Python, converting the image to a NumPy array that I then feed into the function? Will MATLAB recognize a NumPy array as a matrix?

Alright so I'm stuck at this line:

% remove transparent pixels
ODhat = OD(~any(OD < beta, 2), :);

I'm not even entirely sure what this does. What do the 'any' and the 2 stand for? How would I write this in Python?
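
A possible NumPy reading of that line, as a sketch with made-up values: any(X, 2) tests along dimension 2 (across the columns), giving one true/false value per row, and ~ negates it. In NumPy that dimension is axis=1.

```python
import numpy as np

beta = 3.3  # example threshold
OD = np.array([[3.6889, 3.6889, 3.6889],
               [3.1781, 3.1781, 3.1781],   # every channel below beta
               [3.5000, 3.2000, 3.9000],   # one channel below beta
               [4.0943, 4.0943, 4.0943]])

below = OD < beta              # boolean array, same shape as OD
drop = np.any(below, axis=1)   # True for rows with at least one channel below beta
ODhat = OD[~drop, :]           # keep only rows where no channel is below beta
```

Here the second and third rows are dropped, so ODhat keeps the first and last rows.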

FactChecker
Gold Member
scottdave
I've had a couple of issues while translating the code.

1. In the line "That = ODhat*V(:,2:3);"
I get matrix multiplication errors when I use numpy matmul. Say my input is a 10X10X3 RGB image for simplicity. We flatten it to a 100X3 list of RGB values. After removing any transparent pixels, let's say it becomes 98X3. The shape of the eigenvector matrix of the covariance of the list of RGB values is then 98X98. If I take the last two eigenvectors as V, then V has a shape of (98,2). How could I possibly multiply this with ODhat, which has a shape of (98X3)? The only way this is possible is if ODhat is transposed. Does MATLAB do this implicitly?

2. "phi = atan2(That(:,2), That(:,1));"
numpy's atan2 doesn't seem to work with That[: , 2] and That[:, 1] as inputs. It generates an error saying something along the lines of it being an invalid data type. The NumPy documentation says that atan2 takes two array-like inputs, so I'm not entirely sure what the problem is.
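
One likely explanation for both issues, sketched with random stand-in data. np.cov treats each row as a variable by default, whereas MATLAB's cov treats each column as a variable, which would account for the 98X98 result. For the second issue, the array function in NumPy is np.arctan2, and MATLAB's 1-based columns 2 and 1 become 1 and 0. Whether this matches the exact error seen is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
ODhat = rng.random((98, 3)) + 3.3    # made-up stand-in: 98 pixels x 3 channels

# Default np.cov(ODhat) would be 98 x 98 (rows as variables).
# rowvar=False treats columns as variables, as MATLAB's cov does.
cov = np.cov(ODhat, rowvar=False)    # shape (3, 3)

# eigh is meant for symmetric matrices and returns eigenvalues in
# ascending order, so the last two columns of V go with the two largest.
eigvals, V = np.linalg.eigh(cov)

That = ODhat @ V[:, 1:3]             # MATLAB V(:,2:3); shape (98, 2)

# MATLAB That(:,2) -> That[:, 1], That(:,1) -> That[:, 0].
# np.arctan2 works element-wise on arrays; math.atan2 accepts scalars only.
phi = np.arctan2(That[:, 1], That[:, 0])
```

Eigenvectors are only defined up to sign, so the columns of V may come out negated relative to MATLAB's; that would flip the signs in That and shift phi accordingly.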

.Scott
Homework Helper
I started with a 4x4 image. Hope this helps.
Code:
>> I = double(I)

I(:,:,1) =

5     6     7     8
1     2     3     4
9     8     7     6
6     5     4     3

I(:,:,2) =

5     6     7     8
1     2     3     4
9     8     7     6
6     5     4     3

I(:,:,3) =

5     6     7     8
1     2     3     4
9     8     7     6
6     5     4     3

>> I = reshape(I, [], 3)

I =

5     5     5
1     1     1
9     9     9
6     6     6
6     6     6
2     2     2
8     8     8
5     5     5
7     7     7
3     3     3
7     7     7
4     4     4
8     8     8
4     4     4
6     6     6
3     3     3

>> OD = -log((I+1)/240)

OD =

3.6889    3.6889    3.6889
4.7875    4.7875    4.7875
3.1781    3.1781    3.1781
3.5347    3.5347    3.5347
3.5347    3.5347    3.5347
4.3820    4.3820    4.3820
3.2834    3.2834    3.2834
3.6889    3.6889    3.6889
3.4012    3.4012    3.4012
4.0943    4.0943    4.0943
3.4012    3.4012    3.4012
3.8712    3.8712    3.8712
3.2834    3.2834    3.2834
3.8712    3.8712    3.8712
3.5347    3.5347    3.5347
4.0943    4.0943    4.0943

>> ODhat = OD(~any(OD<3.3,2), :)

ODhat =

3.6889    3.6889    3.6889
4.7875    4.7875    4.7875
3.5347    3.5347    3.5347
3.5347    3.5347    3.5347
4.3820    4.3820    4.3820
3.6889    3.6889    3.6889
3.4012    3.4012    3.4012
4.0943    4.0943    4.0943
3.4012    3.4012    3.4012
3.8712    3.8712    3.8712
3.8712    3.8712    3.8712
3.5347    3.5347    3.5347
4.0943    4.0943    4.0943

>> [V, ~] = eig(cov(ODhat));
>> V

V =

0.4082    0.7071    0.5774
0.4082   -0.7071    0.5774
-0.8165         0    0.5774

>> That = ODhat*V(:,2:3)

That =

0.0000    6.3893
0.0000    8.2922
0.0000    6.1223
0.0000    6.1223
0.0000    7.5899
0.0000    6.3893
0.0000    5.8910
0.0000    7.0916
0.0000    5.8910
0.0000    6.7051
0.0000    6.7051
0.0000    6.1223
0.0000    7.0916

>> phi = atan2(That(:,2), That(:,1))

phi =

1.5708
1.5708
1.5708
1.5708
1.5708
1.5708
1.5708
1.5708
1.5708
1.5708
1.5708
1.5708
1.5708

>>