Spreadsheet help: compare two lists to find exceptions

  • Thread starter Thread starter DaveC426913
  • Start date Start date
  • Tags Tags
    Compare
Click For Summary
The discussion revolves around comparing two patient lists in OpenOffice Calc or Excel to identify missing entries from each list. The lists are not normalized, complicating direct comparisons due to differing formats and additional data fields. Users suggest using functions like VLOOKUP and MATCH to find discrepancies, emphasizing the importance of correctly formatting data as numbers or text to avoid errors. The MATCH function is highlighted as a preferred method, particularly when set to return exact matches. There are challenges with derived fields and formatting issues that prevent successful comparisons. Some participants recommend using Excel's Power Query for better data management and suggest considering a database approach for handling large datasets. The conversation also touches on the difficulties of working in a virtual environment and the need for solutions that can be maintained after the original creator departs. Ultimately, the goal is to automate the identification of non-matching records efficiently, given the large volume of data involved.
  • #31
pbuk said:
Coming up: the formulas to achieve this.
Doesn't seem easy in OpenOffice, I'll give you the entries in row 3, the rest just copy down.
Code:
C3 =MATCH(A3;B$2:B$25;0)
D3 =IF(ISNA(C3);0;MAX(D2:D$2)+1)
E3 =E2+1
F3 =MATCH(E3;D$2:D$25;0)
G3 =IF(ISNA(F3);"";INDEX(A$2:A$25;$F3))
 
Computer science news on Phys.org
  • #32
jack action said:
Yes, it does:
It didn't for me (OpenOffice on Windows) - that looks like LibreOffice on Linux to me.
 
  • Like
Likes jack action and DaveC426913
  • #33
Yeah, works in LibreOffice on Windows too, and we can display the formulas - see below (different formulas as we don't need the guard rows now).

1675811998151.png


Can you persuade your client to move to LibreOffice, apart from these there are a number of technical and commercial reasons to drop OpenOffice since the fork?
 
  • #34
pbuk said:
Can you persuade your client...
Well, client (hospital) has Excel on their image for virtual servers, but it is such a pain to tunnel in via VPN to do this kind of finicky work.

The biggest headache is that the virtual environment actively discourages copy/pasting of any sort. I mean, literally copying a number from a Notepad doc and pasting it in to a spreadsheet cell will not work most of the time. There is some sort of chron job or utility that clears the clipboard once per second. If I hit Ctrl-C ten times in a row, and then paste within about two seconds I have maybe a 50% chance of pasting what was in the clipboard. Crazy stuff.
 
  • #35
DaveC426913 said:
Indeed. Too bad it doesn't work in OpenOffice Calc. :frown:
In my case (LibreOffice), after I begin writing a function, when I click on a column heading, it writes B1:B1048576, and then it transforms into B:B after I get out of the cell.

Maybe you can just write the full column range by default?
 
  • #36
jack action said:
Maybe you can just write the full column range by default?
Yeah. I just set it to A$1:A$500, which will cover the largest list I have.

But this really gets into the weeds of how I'm going to move forward with this. If this whole process has to survive me (which I verified is the desire, I'm going to have to use the client's preferred software, which is Excel.

Still, no idea where it's going to live or who will be the keeper of the lore. There's no "owner" and no central repository. But that's not an issue that will be solved here...
 
  • #37
DaveC426913 said:
But this really gets into the weeds of how I'm going to move forward with this. If this whole process has to survive me (which I verified is the desire, I'm going to have to use the client's preferred software, which is Excel.
Oh I have misunderstood, I didn't realise you were being paid for this - my bill is in the post.
 
  • Haha
Likes DaveC426913
  • #38
pbuk said:
Oh I have misunderstood, I didn't realise you were being paid for this - my bill is in the post.
It is technically for my (salaried) job, but I don't get expenses. I'm just expected to produce this report.
 
  • #39
OK, I've switched to Excel. I've got a primitive solution but since it uses the MATCH function, it spits out matches. Any matched records display as #N/A.

What I'm after is the opposite. I need to display non matches.

I've looked at VLOOKUP and HLOOKUP but that's not what I'm trying to do - at least they don't seem to work across two sheets.)

So:

Sheet "RCP":
7072Alice
4154Andy
etc.
Sheet "PSS":
7033Bill
7058Dave
7072Alice
7103Charlie
7113Annie
etc.
I hope to get:

RCP NO matches in PSS:
4154 Andy

PSS NO matches in RCP:
7033 Bill
7058 Dave
7103 Charlie
7113 Annie
 
  • #40
With 16K ids on the list, upload each list as a separate table into MS Access and do an outer join
 
  • #41
BWV said:
With 16K ids on the list, upload each list as a separate table into MS Access and do an outer join
I found an even better solution:
conscript a gaggle of interns to do it with pencil and paper.
 
  • #42
DaveC426913 said:
I found an even better solution:
conscript a gaggle of interns to do it with pencil and paper.
Well you could use the query to check the work, than fire the lot of them for wasting time doing it manually and not knowing how to do a simple query
 
  • Haha
Likes DaveC426913

Similar threads

Replies
18
Views
3K
  • · Replies 6 ·
Replies
6
Views
3K
  • · Replies 7 ·
Replies
7
Views
2K
  • · Replies 38 ·
2
Replies
38
Views
4K
  • · Replies 1 ·
Replies
1
Views
3K
Replies
7
Views
2K
  • · Replies 12 ·
Replies
12
Views
6K
  • Sticky
  • · Replies 48 ·
2
Replies
48
Views
66K
Replies
2
Views
4K