Deleting Duplicate Values Of Matrix In Mathematica?

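In summary, the thread shows how to keep only the pairs of rows that share a first-column value, using Sort, Split, Select, and Flatten, and then how to filter those pairs by channel type with MemberQ. The expert also suggests reading books and practicing to learn the Mathematica language, offers some book suggestions, and explains that the language is huge and takes a lot of time and effort to learn. They also mention that there are many bad Mathematica books out there.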
  • #1
jasonpatel
Hey guys,

I have an n-row by 3-column matrix (it's big and the number of rows varies).

I want to...

...Keep ONLY pairs of rows with the same value in their first column and delete the rest. Meaning if a row has a unique first column value or if three rows have the same value in their first column, they should be deleted.

I think this is pretty simple, but I have read and looked for hours and just could not figure it out. Any help would be greatly appreciated! I attached a sample matrix. Thanks in advance!
 

Attachments

  • abcd.txt
  • #2
1. Sort the entries on the first element. (If you can guarantee that your list will always be sorted on the first element, you can skip this step.)
2. Split the entries into groups on the first element.
3. Select those groups with two items.
4. Flatten to get rid of the extra layer of {}.

Note: Sort changes the order. If the order must be maintained, then an extra layer will be needed to put the rows back into the original order.

In[1]:=
t={{0,26,4728}, {0,9,7111}, {1,18,4292}, {1,16,4069}, {1,15,4092}, {1,14,4931},
{1,12,3937}, {1,10,4768}, {1,9,6860}, {1,3,11304}, {2,56,12727}, {2,34,9427},
{2,26,4954}, {2,9,7827}, {3,9,6099}, {4,9,8408}, {5,9,7023}, {6,26,6290}, {6, 9,5565},
{7,26,4630}, {8,57,12798}, {8,56,12633}, {8,34,11512}, {8, 9,4905}, {9,9,5863},
{10,26,4386}, {10,10,4640}, {10,9,6889}, {11,9,5841}, {12,26,4335}, {12,24,11688},
{12,23, 11793}, {12,9,6523}, {13,9,5137}, {14,9,6660}, {15,10,4901}, {15,9,7152},
{16,34,11659}, {16,26,5339}, {16,25,12489}, {16,24,11601}, {16,23,11824}, {16,22, 12250},
{16,21,11903}, {16,19,12927}, {16,9,5692}, {16,2,11727}, {16,1,11276}, {17,26,5864},
{17,9,5809}, {18,9, 6683}, {19,26,4788}, {19,9,8497}, {20,9,6001}, {21,26, 5338},
{21,22,11620}, {21,9,7638}, {22,26,5644}, {22,9,6466}, {23,57,9669}, {23,26,5039},
{23,16,8929}, {23,9,8135}, {24,26,4805}, {25,56,12404}, {25,9,5348}, {26,9,5425},
{27,57,12543}, {27, 26,4470}, {27,9,5067}, {28,57,9897}, {28,9,6840}, {29, 26,4515},
{29,10,4640}, {29,9,8759}, {30,9,6432}, {31,9,7121}, {32,9, 5464}, {33,66,13433},
{33,64,11891}, {33,9,6022}, {34,10,4599}, {34,9,5505}, {35,60,3928}, {35,18,4152},
{35,14,4329}, {35, 9,8435}, {35,3,9568}, {36,26,4377}, {36,9,5208}, {37,10, 7191},
{37,9,5856}, {38,9,9040}, {39,9,5963}, {40,10,4692}, {40,9,5291}, {41,12,4730},
{41,10,4791}, {41,9,4963}, {42,26,4355}, {42,18,5017}, {42,10,7506}, {42, 9,6221},
{43,10,6732}, {43,9,6788}, {44,10,9567}, {45,65,10079}, {45,10,8591}, {45,9,7140},
{46,26,4544}, {46,10,6186}, {46,9,5624}, {47,26,4352}, {47,10,5918}, {47,9,7523},
{48,10,5316}, {48,9,4968}, {49,56,12455}, {49,10,5012}, {49,9,6620}, {50,10,8305},
{50,9,6685}, {51,26,4762}, {51,10,6331}, {51,9,5342}, {52,10,6211}, {52,9,5373},
{53,26,4461}, {53,9,8686}, {54,26,4909}, {54,9,9899}, {55,64,11378}, {55,9,5085},
{56,9,6115}, {57,9,6345}, {58,64,12127}, {58, 26,4638}, {58,9,8528}, {59,26,4664},
{59,9,5191}, {60,24,11795}, {60,21,11849}, {60,9,6786}, {61,26,4789}, {61,23,11513},
{61,9,7752}, {62,26,5046}, {62,24, 11795}, {62,9,7135}, {63,9,7123}, {64,26,4368},
{64,9,8483}, {65,26,4912}, {65,9,5939}, {66,34,9153}, {66,9,5376}, {67,26,4361}, {67,9, 6403},
{68,26,4458}, {69,10,4925}, {69,9,5038}, {70,26,4354}, {70,9,6615}, {70,2,11643},
{71,26,4492}, {71,10,4540}, {71,9,6835}, {72,9,6554}, {73,10,4822}, {73,9,6071},
{74,67,3729}, {74,25,10844}, {74,9,7038}, {75,65,12234}, {75,64,9213}, {75,9,5115},
{76,67,4839}, {76,65,11794}, {76,57,12793}, {76,56,12737}, {76,26,5259}, {76,9,4920},
{77,9,7081}, {78,26,5034}, {78,9,6640}, {79,68,5495}, {79,23,10883}, {79,9,5438},
{80,9,5324}, {81, 26,4393}, {82,26,5050}, {82,18,4161}, {82,16,4011}, {82,15, 3997},
{82,14,4632}, {82,10,4574}, {82,9,6427}, {83,26,4417}, {83,9,5113}, {84,9,7614},
{85,66,12358}, {85,60,5821}, {85,53,6901}, {85,19,12958}, {85,18,4446}, {85,9,7819},
{86,19,11022}, {86,9, 5690}, {87,9,5806}, {88,9,5846}, {89, 9,5583}, {90,64,11487},
{90,24,10515}, {90,22,12108}, {90,9,6876}, {91,9, 9245}, {92,56,12772}, {92,34,11759},
{92,24,11653}, {92, 23,11753}, {92,22,12229}, {92,21,11898}, {92,20,12019}, {92,19,12982},
{92,9,9065}, {93,57, 12778}, {93,56,12772}, {93,19,12941}, {93,9,8544}, {94,9, 6319},
{95,24,11772}, {95,22,9014}, {95,10,7549}, {95,9,7275}, {96,26, 4406}, {96,9,5441},
{97,26,4691}, {98,26,4549}, {98,9,6709}, {99,9,5199}, {100,9,5564}, {101,23,11165}, {101,9,6683}};

In[2]:=
Flatten[Select[Split[Sort[t,OrderedQ[{First[#1], First[#2]}]&],First[#1]==First[#2]&],Length[#]==2&] ,1]

Out[2]=
{{0,26,4728},{0,9,7111},{6,26,6290},{6,9,5565}, {15,10,4901},{15,9,7152},{17,26,5864},{17,9,5809},
{19,26,4788},{19,9,8497},{22,26,5644},{22,9,6466}, {25,56,12404},{25,9,5348},{28,57,9897},{28,9,6840},
{34,10,4599},{34,9,5505},{36,26,4377},{36,9,5208}, {37,10,7191},{37,9,5856},{40,10,4692},{40,9,5291},
{43,10,6732},{43,9,6788},{48,10,5316},{48,9,4968}, {50,10,8305},{50,9,6685},{52,10,6211},{52,9,5373},
{53,26,4461},{53,9,8686},{54,26,4909},{54,9,9899}, {55,64,11378},{55,9,5085},{59,26,4664},{59,9,5191},
{64,26,4368},{64,9,8483},{65,26,4912},{65,9,5939}, {66,34,9153},{66,9,5376},{67,26,4361},{67,9,6403},
{69,10,4925},{69,9,5038},{73,10,4822},{73,9,6071}, {78,26,5034},{78,9,6640},{83,26,4417},{83,9,5113},
{86,19,11022},{86,9,5690},{96,26,4406},{96,9,5441}, {98,26,4549},{98,9,6709},{101,23,11165},{101,9,6683}}
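
If sorting is not wanted, one alternative worth trying is GatherBy, which groups rows by a key without sorting first. This is only a sketch on a small made-up list (`sample`, `groups`, `pairs`, and `result` are names chosen here for illustration, not from the thread). Groups come out in order of first appearance, with the two rows of each pair kept next to each other.

```mathematica
(* Made-up sample data; the first column is the key *)
sample = {{3, 9, 10}, {1, 5, 11}, {2, 7, 12}, {1, 6, 13},
          {3, 8, 14}, {3, 2, 15}, {4, 1, 16}, {2, 4, 17}};

(* Group rows that share a first element; no sorting is involved *)
groups = GatherBy[sample, First];

(* Keep only the groups with exactly two rows *)
pairs = Select[groups, Length[#] == 2 &];

(* Remove the extra layer of {} *)
result = Flatten[pairs, 1]
(* {{1, 5, 11}, {1, 6, 13}, {2, 7, 12}, {2, 4, 17}} *)
```

Key 3 appears three times and key 4 once, so both are dropped; keys 1 and 2 each appear exactly twice and are kept.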
 
  • #3
Another method, perhaps simpler to understand.

Select[t, Count[t, {First[#], __}] == 2 &]

This will probably be slower for really large lists because it will have to do n passes over the list and do n pattern match comparisons on each pass.
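
For long lists, the repeated counting can be avoided by tallying the first column once and then looking up each row's key in that tally. A minimal sketch, assuming a Mathematica version that has Counts (version 10 or later); `sample`, `tally`, and `result` are illustrative names, and the data is made up.

```mathematica
(* Made-up sample data; the first column is the key *)
sample = {{3, 9, 10}, {1, 5, 11}, {2, 7, 12}, {1, 6, 13},
          {3, 8, 14}, {3, 2, 15}, {4, 1, 16}, {2, 4, 17}};

(* One pass: an association from each first-column value to its count *)
tally = Counts[sample[[All, 1]]];

(* Keep the rows whose key occurs exactly twice *)
result = Select[sample, tally[First[#]] == 2 &]
(* {{1, 5, 11}, {2, 7, 12}, {1, 6, 13}, {2, 4, 17}} *)
```

Because Select walks the list from front to back, the surviving rows stay in their original order, so the two rows of a pair are not necessarily adjacent.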
 
  • #4
Bill Simpson, you're amazing! I knew there was probably an easy way to do this but for the life of me could not figure it out! If I may ask, how did you know to use those specific functions?



I have one last thing I need help on. Now that the data has been parsed into pairs of rows (1st column common values), I need to:

Check the second column integer (which refers to a channel number) of each of the paired rows:

The row is categorized as "ANODE" if the 2nd column value is within: {1 to 35}
The row is categorized as "CATHODE" if the 2nd column value is within: {36 to 70}

Now, what I need Mathematica to do is KEEP every pair of rows (1st column common values) whose 2nd column values are "ANODE and CATHODE" or vice versa, and delete the rest. Any pair that is "CATHODE and CATHODE" or "ANODE and ANODE" should be deleted.

Seriously, thanks in advance...I'm new to Mathematica and have been working hours on this!
 
  • #5
jasonpatel said:
Bill Simpson, you're amazing! I knew there was probably an easy way to do this but for the life of me could not figure it out! If I may ask, how did you know to use those specific functions?

Hundreds and thousands of hours of studying and sharpening the tools.

I remember 35 years ago reading a thin little book on FORTRAN 4 in half an hour. It was less than 1 cm. thick. I think it might have been yellow. I would buy one for old times if I could remember the title or recognize the cover and find a copy on the net.

After I was done I went and asked others "Was that it, was that all there was to this programming thing?" When they told me that was it I replied "How do you get anything done if that is all there is?" I have my copy of the last printed Mathematica reference manual. It is 6 cm. thick, and it covered only a small fraction of the language as it was a decade ago. Since that time they have added a thousand or more new commands with every new version, and they have said it is impossible to ever print a new reference manual; it would take up a whole shelf.

Why am I telling you this? Because the language is huge. Buy yourself a few good books and read them cover to cover a few times. "Mathematica Navigator" is good. "The Mathematica Cookbook" is good. "Applied Mathematica: Getting Started, Getting It Done" is very old and should be available really cheaply from the discount book dealers. I learned a lot from that fifteen years ago. This much reading should help get you started with the first few hundred hours of starting to learn the tool. Trott's epic tomes are likely too advanced and specialized for almost any reader and there are a number of other really really bad Mathematica books out there.

jasonpatel said:
I have one last thing I need help on. Now that the data has been parsed into pairs of rows (1st column common values), I need to:

Check the second column integer (which refers to a channel number) of each of the paired rows:

The row is categorized as "ANODE" if the 2nd column value is within: {1 to 35}
The row is categorized as "CATHODE" if the 2nd column value is within: {36 to 70}

Now, what I need Mathematica to do is KEEP every pair of rows (1st column common values) whose 2nd column values are "ANODE and CATHODE" or vice versa, and delete the rest. Any pair that is "CATHODE and CATHODE" or "ANODE and ANODE" should be deleted.

Seriously, thanks in advance...I'm new to Mathematica and have been working hours on this!

Take almost what we had before to find your pairs, but don't break them up into individual items yet.

r1 = Select[Split[Sort[t, OrderedQ[{First[#1], First[#2]}] &], First[#1] == First[#2] &], Length[#] == 2 &]

Next filter out the acceptable pairs.

r2 = Select[r1, #[[1, 2]] < 35.5 && #[[2,2]] > 35.5 || #[[1, 2]] > 35.5 && #[[2, 2]] < 35.5 &]

And finally strip off one layer of extra {}, which could be done several different ways.

r3 = Flatten[r2, 1]

leaving you with

{{25, 56, 12404}, {25, 9, 5348}, {28, 57, 9897}, {28, 9, 6840}, {55, 64,11378}, {55, 9, 5085}}
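
The same filter can also be written with the channel ranges spelled out, which may be easier to read later. This is a sketch, not the code used above: `side`, `mixedQ`, `pairsSample`, and `result` are made-up names, and unlike the 35.5 threshold it also rejects channels outside 1 to 70.

```mathematica
(* Classify a channel number using the ranges given in the question *)
side[ch_Integer] := Which[
  1 <= ch <= 35, "ANODE",
  36 <= ch <= 70, "CATHODE",
  True, "UNKNOWN"]

(* True when a pair has exactly one ANODE row and one CATHODE row *)
mixedQ[pair_] := Sort[side /@ pair[[All, 2]]] === {"ANODE", "CATHODE"}

(* Three of the pairs found earlier, used here as sample input *)
pairsSample = {{{25, 56, 12404}, {25, 9, 5348}},
               {{66, 34, 9153}, {66, 9, 5376}},
               {{28, 57, 9897}, {28, 9, 6840}}};

result = Flatten[Select[pairsSample, mixedQ], 1]
(* {{25, 56, 12404}, {25, 9, 5348}, {28, 57, 9897}, {28, 9, 6840}} *)
```

The pair with first column 66 has channels 34 and 9, both in the ANODE range, so it is dropped.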
 
  • #6
Awesome...once again thank you! I will be taking a careful look at this code over the next few days to make sure I know exactly what is going on!
 
  • #7
Thanks for your help Bill! Now, I am curious...

How about if I wanted to define the ANODE and CATHODE channels differently, as lists of values that are not contiguous ranges? For example:

-The row is categorized as "ANODE" if the 2nd column value is within: {5,6,7,8,9,10,11,12,13,14,15,16,17,18,26,27,28,29,30,31,32,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,60,61,62,63,67,68,69,70}

-The row is categorized as "CATHODE" if the 2nd column value is within: {1,2,3,4,19,20,21,22,23,24,25,33,34,37,38,39,40,56,57,58,59,64,65,66}

I've tried...

A. Replacing the inequality with a double equals and pasting the values:

r2 = Select[r1, #[[1, 2]] =={VALUES1} && #[[2,2]]=={VALUES2} || #[[1, 2]] =={VALUES2} && #[[2, 2]]== {VALUES1} &]

B. Defining an anode and cathode variable with those values:

r2 = Select[r1, #[[1, 2]] ==var_an && #[[2,2]] ==var_cat || #[[1, 2]] ==var_cat 35.5 && #[[2, 2]] ==var_an &]

C. Just pasting the list as I did in A but with || "ors" instead of commas.

D. I used arrows instead of double equals, which seemed not to give errors, but did not give the desired result (gave a null result of {}).

None of these worked haha.
 
  • #8
  • #9
Thanks for the hint...very helpful and I doubt I would have ever guessed to use MemberQ.

Anyways, here is what worked for me:


-Defining an anode variable (an) and a cathode variable (cat).

-r2 = Select[r1, MemberQ[an,#[[1, 2]]] && MemberQ[cat,#[[2,2]]]|| MemberQ[cat,#[[1, 2]]] && MemberQ[an,#[[2, 2]]]&];

I attached the file in its entirety. Do you have any helpful suggestions or corrections?
 

Attachments

  • Example.txt
  • #10
There is no 35 or 36 in an or cat. Other than that I see nothing other than suggesting you carefully and extensively document your code so you will understand it next time.
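
As one illustration of what documented code might look like, here is a commented sketch of the MemberQ filter. It assumes the channel lists from post #7, written more compactly with Range and Join; `mixedQ`, `pairsSample`, and `result` are made-up names, and the sample pairs are taken from the data earlier in the thread. Select keeps only the elements for which the test gives True, and comparing a single number against a whole list with == does not give True, which is why a membership test is needed.

```mathematica
(* Channel lists from post #7; 35 and 36 are in neither list *)
an = Join[Range[5, 18], Range[26, 32], Range[41, 55],
          Range[60, 63], Range[67, 70]];
cat = Join[Range[1, 4], Range[19, 25], {33, 34}, Range[37, 40],
           Range[56, 59], Range[64, 66]];

(* True when one row of the pair is on an anode channel and the
   other is on a cathode channel, in either order. && binds tighter
   than ||, so the parentheses are only there for readability. *)
mixedQ[{a_, b_}] :=
  (MemberQ[an, a[[2]]] && MemberQ[cat, b[[2]]]) ||
  (MemberQ[cat, a[[2]]] && MemberQ[an, b[[2]]])

(* Sample pairs with the same shape as r1 *)
pairsSample = {{{25, 56, 12404}, {25, 9, 5348}},
               {{15, 10, 4901}, {15, 9, 7152}},
               {{66, 34, 9153}, {66, 9, 5376}}};

(* Keep the mixed pairs, then remove the extra layer of {} *)
result = Flatten[Select[pairsSample, mixedQ], 1]
(* {{25, 56, 12404}, {25, 9, 5348}, {66, 34, 9153}, {66, 9, 5376}} *)
```

With these lists channel 34 counts as a cathode, so the pair with first column 66 is now kept, while the pair with channels 10 and 9 (both anode) is dropped.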
 
  • #11
Good eye! But that was done on purpose...

Yes, I can't say I fully understand how r1 is done...but I will be studying it. Hope you don't mind if I bug you with more questions hahahaha :)

Thanks for all your help though. You seriously saved me hours of work and the hairs on my head!
 

What is the purpose of deleting duplicate values of a matrix in Mathematica?

The purpose of deleting duplicate values of a matrix in Mathematica is to simplify and clean up the data in the matrix, making it easier to analyze and manipulate.

How do I delete duplicate values of a matrix in Mathematica?

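To delete duplicate rows of a matrix in Mathematica, you can use the function DeleteDuplicates, which keeps the first occurrence of each row and removes the later repeats. Note that this is a different task from the one in the thread above, which keeps only the rows whose first-column value occurs exactly twice.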

Will deleting duplicate values affect the structure of my matrix?

Applied to a matrix, DeleteDuplicates compares whole rows, so the result can have fewer rows than the original. The rows that remain keep their columns and their original relative order.

What happens if my matrix contains both duplicate and non-duplicate values?

In this case, only the repeated copies are removed. The first occurrence of each duplicated row and all non-duplicate rows stay, in their original order.

Can I delete duplicate values from a specific row or column in my matrix?

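Yes. DeleteDuplicatesBy[m, f] removes rows that give a repeated value of f, so DeleteDuplicatesBy[m, First] keeps the first row for each distinct first-column value. To remove duplicate values within a single column, apply DeleteDuplicates to that column, for example DeleteDuplicates[m[[All, 2]]].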
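
A short sketch of how these functions behave on a small made-up matrix (`m`, `rowsOnce`, `byFirst`, and `col1` are illustrative names; DeleteDuplicatesBy is assumed to be available, which requires version 10 or later):

```mathematica
(* Made-up matrix with one repeated row and a repeated first-column value *)
m = {{1, 2}, {3, 4}, {1, 2}, {1, 5}};

(* Whole rows are compared; the first copy of each row is kept *)
rowsOnce = DeleteDuplicates[m]
(* {{1, 2}, {3, 4}, {1, 5}} *)

(* Rows are compared by their first element only *)
byFirst = DeleteDuplicatesBy[m, First]
(* {{1, 2}, {3, 4}} *)

(* Distinct values of the first column *)
col1 = DeleteDuplicates[m[[All, 1]]]
(* {1, 3} *)
```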
