Spreadsheet help: compare two lists to find exceptions

  • Thread starter Thread starter DaveC426913
  • Start date Start date
  • Tags Tags
    Compare
Click For Summary
SUMMARY

This discussion focuses on comparing two patient lists in OpenOffice Calc to identify missing entries. The user is attempting to automate this process due to the large volume of data, which includes 16,000 patients across 70 doctors. Key functions discussed include MATCH and VLOOKUP, with a strong recommendation to use MATCH for better accuracy. The conversation also highlights the importance of normalizing data fields and ensuring consistent formatting to avoid errors in comparisons.

PREREQUISITES
  • Understanding of OpenOffice Calc functions, specifically MATCH and VLOOKUP
  • Knowledge of data normalization techniques for spreadsheet comparison
  • Familiarity with handling large datasets in spreadsheets
  • Basic skills in using conditional formatting for data visualization
NEXT STEPS
  • Learn how to use Power Query in Excel for advanced data manipulation
  • Research data normalization methods for effective spreadsheet comparisons
  • Explore the use of INDEX and MATCH functions together for data retrieval
  • Investigate scripting options in Python for automating complex data tasks
USEFUL FOR

This discussion is beneficial for data analysts, healthcare administrators, and anyone involved in managing large patient databases who need to efficiently compare and analyze data across multiple lists.

  • #31
pbuk said:
Coming up: the formulas to achieve this.
Doesn't seem easy in OpenOffice, I'll give you the entries in row 3, the rest just copy down.
Code:
C3 =MATCH(A3;B$2:B$25;0)
D3 =IF(ISNA(C3);0;MAX(D2:D$2)+1)
E3 =E2+1
F3 =MATCH(E3;D$2:D$25;0)
G3 =IF(ISNA(F3);"";INDEX(A$2:A$25;$F3))
 
Computer science news on Phys.org
  • #32
jack action said:
Yes, it does:
It didn't for me (OpenOffice on Windows) - that looks like LibreOffice on Linux to me.
 
  • Like
Likes   Reactions: jack action and DaveC426913
  • #33
Yeah, works in LibreOffice on Windows too, and we can display the formulas - see below (different formulas as we don't need the guard rows now).

1675811998151.png


Can you persuade your client to move to LibreOffice, apart from these there are a number of technical and commercial reasons to drop OpenOffice since the fork?
 
  • #34
pbuk said:
Can you persuade your client...
Well, client (hospital) has Excel on their image for virtual servers, but it is such a pain to tunnel in via VPN to do this kind of finicky work.

The biggest headache is that the virtual environment actively discourages copy/pasting of any sort. I mean, literally copying a number from a Notepad doc and pasting it in to a spreadsheet cell will not work most of the time. There is some sort of chron job or utility that clears the clipboard once per second. If I hit Ctrl-C ten times in a row, and then paste within about two seconds I have maybe a 50% chance of pasting what was in the clipboard. Crazy stuff.
 
  • #35
DaveC426913 said:
Indeed. Too bad it doesn't work in OpenOffice Calc. :frown:
In my case (LibreOffice), after I begin writing a function, when I click on a column heading, it writes B1:B1048576, and then it transforms into B:B after I get out of the cell.

Maybe you can just write the full column range by default?
 
  • #36
jack action said:
Maybe you can just write the full column range by default?
Yeah. I just set it to A$1:A$500, which will cover the largest list I have.

But this really gets into the weeds of how I'm going to move forward with this. If this whole process has to survive me (which I verified is the desire, I'm going to have to use the client's preferred software, which is Excel.

Still, no idea where it's going to live or who will be the keeper of the lore. There's no "owner" and no central repository. But that's not an issue that will be solved here...
 
  • #37
DaveC426913 said:
But this really gets into the weeds of how I'm going to move forward with this. If this whole process has to survive me (which I verified is the desire, I'm going to have to use the client's preferred software, which is Excel.
Oh I have misunderstood, I didn't realise you were being paid for this - my bill is in the post.
 
  • Haha
Likes   Reactions: DaveC426913
  • #38
pbuk said:
Oh I have misunderstood, I didn't realise you were being paid for this - my bill is in the post.
It is technically for my (salaried) job, but I don't get expenses. I'm just expected to produce this report.
 
  • #39
OK, I've switched to Excel. I've got a primitive solution but since it uses the MATCH function, it spits out matches. Any matched records display as #N/A.

What I'm after is the opposite. I need to display non matches.

I've looked at VLOOKUP and HLOOKUP but that's not what I'm trying to do - at least they don't seem to work across two sheets.)

So:

Sheet "RCP":
7072Alice
4154Andy
etc.
Sheet "PSS":
7033Bill
7058Dave
7072Alice
7103Charlie
7113Annie
etc.
I hope to get:

RCP NO matches in PSS:
4154 Andy

PSS NO matches in RCP:
7033 Bill
7058 Dave
7103 Charlie
7113 Annie
 
  • #40
With 16K ids on the list, upload each list as a separate table into MS Access and do an outer join
 
  • #41
BWV said:
With 16K ids on the list, upload each list as a separate table into MS Access and do an outer join
I found an even better solution:
conscript a gaggle of interns to do it with pencil and paper.
 
  • Haha
Likes   Reactions: Tom.G
  • #42
DaveC426913 said:
I found an even better solution:
conscript a gaggle of interns to do it with pencil and paper.
Well you could use the query to check the work, than fire the lot of them for wasting time doing it manually and not knowing how to do a simple query
 
  • Haha
Likes   Reactions: DaveC426913

Similar threads

Replies
18
Views
3K
  • · Replies 6 ·
Replies
6
Views
3K
  • · Replies 7 ·
Replies
7
Views
2K
  • · Replies 38 ·
2
Replies
38
Views
4K
  • · Replies 1 ·
Replies
1
Views
3K
Replies
7
Views
2K
  • · Replies 12 ·
Replies
12
Views
6K
  • Sticky
  • · Replies 48 ·
2
Replies
48
Views
68K
  • · Replies 2 ·
Replies
2
Views
4K