Merging columns from two text files

Discussion Overview

The discussion revolves around merging the columns of two text files into a single file. Participants explore several methods, including programming solutions in Python and C#, command-line utilities in bash, and text editors with rectangular selection. The focus is on practical approaches to text file manipulation.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Exploratory

Main Points Raised

  • One participant suggests reading the first file into a 2D array and then merging it with the second file's data before writing to a new file.
  • Another participant proposes using Python's pandas library to read both files and concatenate them, providing sample code for clarity.
  • Some participants express a preference for using Excel for this task, citing its ease of use despite potential manual work involved.
  • A participant mentions using bash commands like 'paste' and 'awk' for a straightforward command-line solution.
  • One participant shares a C# code snippet that utilizes parallel processing to merge the files, emphasizing its efficiency.
  • Another participant raises concerns about the need for automation versus occasional manual merging, suggesting text editors with rectangular selection capabilities for infrequent tasks.
  • There is a discussion about file handling issues related to the participant's code, with questions about whether files are being properly closed after use.

Areas of Agreement / Disagreement

Participants express a range of opinions on the best approach to merging text files, with no clear consensus on a single preferred method. Some favor programming solutions, while others advocate for using text editors or Excel.

Contextual Notes

Participants mention limitations related to the handling of open files in their code, as well as the potential for different file lengths between the two text files being merged. There are also concerns about the manual effort required when using Excel.

Who May Find This Useful

This discussion may be useful for individuals looking for various methods to merge text files, particularly those interested in programming, command-line utilities, or text editing techniques.

ChrisVer
Suppose I have two .txt files:
Code:
>>text1.txt
X    Y    Z
x1  y1   z1
x2  y2   z2
       ...
xN  yN  zN

>>text2.txt
W   R
w1   r1
w2   r2
   ...
wN   rN

Is it possible to merge the columns into a single txt file like:
Code:
>>merged.txt
X    Y    Z   W   R
x1  y1   z1   w1  r1
x2  y2   z2   w2  r2
       ...
xN  yN  zN   wN  rN
?
Either with some C++ or Python code, or with bash or vim.
 
Yes. The first solution that comes to mind, though maybe an inefficient one, is as follows:
Read the first file and put the values into a 2D array (think of it as a matrix). Then read the second file and put its values into the corresponding rows of the same array, in the columns after those of the first file. Finally, write a new file from this array.
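
A minimal Python sketch of the array approach described above. It assumes whitespace-separated columns and the same number of rows in both files; the names `read_table` and `merge_columns` are illustrative and not from the thread.

```python
def read_table(path):
    """Read a whitespace-separated text file into a list of rows (a 2D list)."""
    with open(path) as handle:
        # Skip blank lines so a trailing empty line does not create an empty row.
        return [line.split() for line in handle if line.strip()]


def merge_columns(path1, path2, out_path, sep="\t"):
    """Append the columns of path2 to the rows of path1 and write out_path."""
    table1 = read_table(path1)
    table2 = read_table(path2)
    if len(table1) != len(table2):
        raise ValueError("files have different numbers of rows")
    # Row i of the result is row i of the first file followed by row i of the second.
    merged = [row1 + row2 for row1, row2 in zip(table1, table2)]
    with open(out_path, "w") as handle:
        for row in merged:
            handle.write(sep.join(row) + "\n")
    return merged
```

Calling `merge_columns("text1.txt", "text2.txt", "merged.txt")` would write one tab-separated row per input row. Both files are held in memory at once, which is fine for small tables but is the inefficiency mentioned above.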
 
I hate excel, but for this kind of thing I would be inclined to use it.
 
A simple way to do this in python is to use pandas.

Code:
>>> import pandas as pd
>>> text1 = pd.read_csv("text1.txt", sep="\t")
>>> text1
    X   Y   Z
0  x1  y1  z1
1  x2  y2  z2
>>> text2 = pd.read_csv("text2.txt", sep="\t")
>>> text2
    W   R
0  w1  r1
1  w2  r2
>>> result = pd.concat([text1, text2], axis=1)
>>> result
    X   Y   Z   W   R
0  x1  y1  z1  w1  r1
1  x2  y2  z2  w2  r2
>>> # index=False keeps the 0, 1, ... row labels out of the saved file
>>> result.to_csv("save_me.txt", sep="\t", index=False)

Also if you want to use bash then this works too:

Code:
paste text1.txt text2.txt | awk '{print $1,$2,$3,$4,$5}'

Or, if you want to write the result to another file:

Code:
paste text1.txt text2.txt | awk '{print $1,$2,$3,$4,$5}' > result.txt
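
For comparison, here is a rough Python sketch of what `paste` does with two files: it joins corresponding lines with a tab and, when one file is shorter, keeps going with an empty field. The function name `paste_files` is hypothetical, and this covers only the two-file default case, not the other options of the real utility.

```python
from itertools import zip_longest


def paste_files(path1, path2, out_path, delimiter="\t"):
    """Join corresponding lines of two files, padding the shorter one with ''."""
    with open(path1) as f1, open(path2) as f2, open(out_path, "w") as out:
        # zip_longest keeps going until the longer file is exhausted.
        for line1, line2 in zip_longest(f1, f2, fillvalue=""):
            out.write(line1.rstrip("\n") + delimiter + line2.rstrip("\n") + "\n")
```

This matters when the two files may have different lengths: no lines are silently dropped, but the extra rows will have missing columns.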
 
Assuming that the two files are laid out as you show, with three columns of data in text1.txt and two columns of data in text2.txt, and both files have N lines of data, here is an algorithm:

Code:
While (text1.txt contains data AND text2.txt contains data)
    Read x, y, z from text1.txt
    Read w, r from text2.txt
    Write x, y, z, w, r to merged.txt
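
The loop above can be sketched in Python roughly as follows. This is one possible rendering, not the only one: the function name `merge_line_by_line` is made up for illustration, and each line is copied whole rather than split into x, y, z and w, r.

```python
def merge_line_by_line(path1, path2, out_path):
    """Merge two files one line at a time without loading them into memory."""
    lines_written = 0
    with open(path1) as f1, open(path2) as f2, open(out_path, "w") as out:
        # zip() stops as soon as either file runs out of lines, which matches
        # the "text1 has data AND text2 has data" loop condition.
        for line1, line2 in zip(f1, f2):
            out.write(line1.rstrip("\n") + " " + line2.rstrip("\n") + "\n")
            lines_written += 1
    return lines_written
```

Because the files are read in lockstep, memory use stays constant however large N is, and the `with` statement closes all three files when the loop finishes.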
 
Did you say vim? As in a one-time job with an editor? I don't know vim, but good editors like Notepad++ (on Windows), NEdit (on Linux) and others let you select the two columns of one file as a rectangular selection; then you simply paste them into the other file.
 
gsal said:
Did you say vim?
I said vim because I know that vim allows a visual block selection (I guess it's the same as the rectangular selection you mentioned).
So I was guessing that if it can select blocks, it should also be able to paste them at a particular position (like appending a column).
 
bash means Linux. There are a lot of excellent text manipulation utilities in Linux. My point is do not write code when a command does it.

Try paste. You don't even need a shell script; you can simply run the one-line paste command on the command line.
Code:
paste text1.txt text2.txt > newfile.txt

IMO writing code for this is counterproductive.
 
I really love C#'s parallel processing library; it works like a charm. You should give it a try! :wink:
C#:
// Requires: using System; using System.IO; using System.Threading.Tasks;
private static void TextFileMerge(string fPath1, string fPath2, string fOutPath)
{
    try
    {
        string[] fileContent1 = File.ReadAllLines(fPath1);
        string[] fileContent2 = File.ReadAllLines(fPath2);
        if (fileContent1.Length != fileContent2.Length)
        {
            throw new ArgumentException("Lengths of two files are different");
        }
        using (StreamWriter sw = new StreamWriter(fOutPath))
        {
            // Append line x of the second file to line x of the first.
            // Each iteration touches only its own index, so running in parallel is safe.
            Parallel.For(0, fileContent1.Length, x =>
            {
                fileContent1[x] += " " + fileContent2[x];
            });
            // Write the merged lines sequentially to preserve their order.
            foreach (string s in fileContent1)
            {
                sw.WriteLine(s);
            }
        }
    }
    catch (Exception)
    {
        // handle (note that this also catches the length-mismatch exception above)
    }
}
 
  • #10
I would import to Excel, then copy and paste. Might require only minutes.
 
  • #11
Is this something you need to automate or just an occasional need? If it is an occasional need find a text editor that does rectangular select. You can just copy and paste the data. For Windows I recommend Notepad++.

BoB
 
  • #12
harborsparrow said:
I would import to Excel, then copy and paste. Might require only minutes.
1. It needs Excel and too much manual work (txt -> Excel -> txt).
2. Excel is not able to accept the columns. I think it would put everything into a single cell.

rbelli1 said:
Is this something you need to automate
Yes... I want the program to do the calculations and so on, with the results saved into text files in a form that can immediately be compiled into a LaTeX table.
Because the code I am running has problems with closing files, I get a system error about too many open files. To work around this, I run the code several times with fewer input files. So the results are saved into different txt files, which I want to merge before compiling.
 
  • #13
ChrisVer said:
Because the code I am running has problems with closing files I get a system error of too many opened files.
When you are finished with a file, are you closing it? Since you started this thread asking about C++ and python, my answer is in that context. If you use either of those languages (and others besides those two), good housekeeping dictates that you close a file once you're done with it.
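
In Python, one common way to guarantee that a file is closed is the `with` statement. The sketch below is only an illustration of that habit, not a reconstruction of the code being discussed; the function name `count_lines_in_files` and its task (counting lines) are assumptions chosen to keep the example short.

```python
def count_lines_in_files(paths):
    """Return {path: line count}, holding only one file open at a time."""
    counts = {}
    for path in paths:
        # The with-block closes the file when it exits, even if an exception
        # is raised inside it, so open handles never accumulate across files.
        with open(path) as handle:
            counts[path] = sum(1 for _ in handle)
    return counts
```

Processing many inputs this way keeps the number of simultaneously open files at one, which avoids the operating system's limit on open file descriptors.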
 
  • #14
Mark44 said:
When you are finished with a file, are you closing it?
No, there was generally an issue with how the code handled the open files. I know they solved the problem in the new version of their FW but my analysis is based on the old one [when I tried to update funny things happened].
 
  • #15
ChrisVer said:
No, there was generally an issue with how the code handled the open files.
Which code are you talking about?
ChrisVer said:
I know they solved the problem in the new version of their FW but my analysis is based on the old one [when I tried to update funny things happened].
What is FW?
 
