Python code for mapping numbers in a text file

AI Thread Summary
The discussion revolves around a programming challenge involving data extraction from a text file using Python. The user seeks to map pairs of numbers that appear within the same brackets, specifically the numbers "4758" and "3895," while ignoring any occurrences where these numbers are in different brackets. The desired output format includes a count of occurrences for each key. Initial attempts at coding the solution were unsuccessful, prompting the user to seek advice. A response provided a more refined approach using regular expressions to parse the data effectively. The suggested code iterates through each line of the file, matches patterns to extract relevant numbers, and counts their occurrences. Despite implementing the provided code, the user encountered issues with the search functionality, specifically when trying to find the numbers "4758" and "3895." The user expressed gratitude for the help and indicated a willingness to learn from the code, while also requesting further assistance to resolve the search issue. Regular expressions were highlighted as a complex yet valuable tool for such tasks.
Bala06
Dear Programmers

I have a text file containing a large amount of data.

From the text file I would like to map two numbers that are present in the same brackets; the values should not be considered if they are present in different brackets.

For example, I want to map the numbers "4758 & 3895" when they are in the same brackets.

I have attached a copy of the data for your reference.

I am expecting results something like this:

712 0
713 0
714 0
1049 4758 3895 1
1050 4758 3895 1
1051 0
1052 0
1053 0
1054 4758 3895 1
1055 4758 3895 1

where "1" represents the count.

I'm trying Python code like this, but it's not working.

Code:
def main():
    #Open file for reading
    infile = open('numbers.txt', 'r')
    #Read the contents of the file into a list
    number = infile.readlines()
    #Convert each element into an int
    index = 0
    while index !=len(account_number):
        number = [int(4758 3895) for num in number]
        index +=1
    #Get an the search value
    search = raw_input('Enter a number to search for: ')
    #Determine whether this record matches the search value.
    if search in number:
        print 'Number found.', search
    else:
        print 'The number was not found.'
    #Close the file.
    infile.close()        

#Call the main function.
main()
I'm not familiar with Python; kindly advise.

Many Thanks
Balaji
 

I gave it a crack. I'm sure there's more work for you to do, but that should give you an idea of at least one way to do what you want. I wouldn't be surprised if some Python expert could reduce it to 5 lines or something, but that's what I've got.

Trying to explain all this to you would take a loooong time, and I think you'll learn a lot if you just see some code. I generally try not to post code in this way, so all I ask is that you try to understand what I've written so that you learn and it is not wasted. Time spent learning regular expressions is time well spent.

Of course, ask away if you can't figure out something I've done. I put in a few comments that should help. And yes, regular expressions are hard to read and hard to learn. However, you will never regret learning them. They are powerful and supremely useful.

Code:
import re

def main():
    numberDict = {}
    #Open file for reading
    infile = open('numbers.txt', 'r')
    # Iterate over each line in the file
    for line in infile:
        # Match all lines, isolating the key number and data values using
        # regular expression groups
        mo = re.match(r"^(\d+)-hydrogen-bond-frame\.dat\.c\.d\.lw\s\d+-hydrogen-bond-frame\.dat\.c\.d\.pw\s\[(.*)\]", line)
        if mo != None:
            key = mo.group(1)
            data = mo.group(2)
            data = re.sub('[\(\)]', '', data)
            #print "Key: ", key
            #print "Data: ", data
            # Isolate each group of 3 values that go together in the data
            subs = re.findall(r"\d+,\s\'\w+\',\s\'[\w\s]+\'", data)
            numbers = [];
            # Iterate over each substring (group of 3 values) and isolate
            # the first integer, which is what we're interested in
            for substr in subs:
                submo = re.match(r"(\d+),\s\'\w+\',\s\'[\w\s]+\'", substr)
                if submo != None:
                    num = submo.group(1)
                    #print num
                    # Collect each number (still a string) in the list
                    numbers.append(num)
            #print "Value: ", numbers 
            # Add all the numbers as the value in the dictionary using the
            # previously isolated key
            # IMPORTANT NOTE: No attempt is made to ensure there are no
            # repeated values.
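            if key in numberDict:
                # Key seen before: bump the count stored at index 0
                numberDict[key][0] += 1
                # extend() changes the list in place and returns None,
                # so its result must not be assigned back to the dict
                numberDict[key].extend(numbers)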
            else:
                numbers.insert(0, 1)
                numberDict[key] = numbers
        else:
            print 'No match.'

    # Done with file, so close it
    infile.close();

    #print numberDict

    #Get the search value
    search = raw_input('Enter a number to search for: ')
    #Determine whether this record matches the search value.
    if search in numberDict:
        print 'Number found.', search
        print 'Values: ', numberDict[search]
    else:
        print 'The number was not found.'

#Call the main function.
main()
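
To make the two regular expressions in the code above easier to follow, here is a minimal sketch that runs them on a single line. The sample line is hypothetical: the attached data file is not reproduced in the thread, so its layout is only inferred from the patterns themselves. The names `SAMPLE`, `LINE_RE`, `TRIPLE_RE` and `parse_line` are made up for this illustration, and the sketch uses Python 3 syntax, whereas the posted code is Python 2.

```python
import re

# Hypothetical sample line (an assumption): the layout is inferred from the
# regular expressions in the reply, not taken from the real data file.
SAMPLE = ("1049-hydrogen-bond-frame.dat.c.d.lw "
          "1049-hydrogen-bond-frame.dat.c.d.pw "
          "[(4758, 'HOH', 'O H1'), (3895, 'ASP', 'OD1 HA')]")

# Group 1 is the leading key number, group 2 is everything inside [ ... ].
LINE_RE = re.compile(
    r"^(\d+)-hydrogen-bond-frame\.dat\.c\.d\.lw\s"
    r"\d+-hydrogen-bond-frame\.dat\.c\.d\.pw\s\[(.*)\]")

# One "number, 'word', 'words'" triple; only the number is captured.
TRIPLE_RE = re.compile(r"(\d+),\s'\w+',\s'[\w\s]+'")


def parse_line(line):
    """Return (key, [numbers]) for a matching line, or None otherwise."""
    mo = LINE_RE.match(line)
    if mo is None:
        return None
    key = mo.group(1)
    # Drop the parentheses, as the original code does, before scanning.
    data = re.sub(r"[()]", "", mo.group(2))
    # With a single capture group, findall returns just the captured numbers.
    return key, TRIPLE_RE.findall(data)


print(parse_line(SAMPLE))  # ('1049', ['4758', '3895'])
```

Both the key and the numbers come back as strings, which matters later when they are compared with text typed in by the user.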
 
Dear Grep

Thanks a lot for the code.

Sure, I'll try to understand it and get back to you if I have any doubts.

I ran the code with the file in the respective directory and entered 4758 3895 as the number to search for, expecting the pair to be mapped when both numbers are in the same brackets and eliminated when they are in different brackets, but it displays that the number was not found.

Kindly advise.

Many Thanks
Balaji
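
A likely reason for the "not found" message: in the posted code, `search in numberDict` tests the dictionary keys, which are the leading numbers such as 1049, so the text "4758 3895" can never match a key. The sketch below is one possible way to get the requested report. It assumes that "same brackets" means the same square-bracket list on one line, and it uses hypothetical sample lines because the attached file is not shown in the thread. The names `count_pair`, `format_report` and `SAMPLE_LINES` are invented for this illustration, and the code is Python 3.

```python
import re

# Same patterns as in the reply above, split over two lines for readability.
LINE_RE = re.compile(
    r"^(\d+)-hydrogen-bond-frame\.dat\.c\.d\.lw\s"
    r"\d+-hydrogen-bond-frame\.dat\.c\.d\.pw\s\[(.*)\]")
TRIPLE_RE = re.compile(r"(\d+),\s'\w+',\s'[\w\s]+'")

# Hypothetical input lines (an assumption about the real file layout).
PREFIX = "{0}-hydrogen-bond-frame.dat.c.d.lw {0}-hydrogen-bond-frame.dat.c.d.pw "
SAMPLE_LINES = [
    PREFIX.format(712) + "[(4758, 'HOH', 'O'), (1201, 'ASP', 'OD1')]",
    PREFIX.format(1049) + "[(4758, 'HOH', 'O'), (3895, 'ASP', 'OD1')]",
]


def count_pair(lines, first, second):
    """Map each key to the number of bracket lists holding both numbers."""
    counts = {}
    for line in lines:
        mo = LINE_RE.match(line)
        if mo is None:
            continue  # skip lines that do not follow the expected layout
        key = mo.group(1)
        numbers = set(TRIPLE_RE.findall(re.sub(r"[()]", "", mo.group(2))))
        counts.setdefault(key, 0)
        # Count only when both numbers sit in the same bracket list.
        if first in numbers and second in numbers:
            counts[key] += 1
    return counts


def format_report(counts, first, second):
    """Build lines such as '1049 4758 3895 1' or '712 0'."""
    report = []
    for key, n in counts.items():
        if n:
            report.append("%s %s %s %d" % (key, first, second, n))
        else:
            report.append("%s 0" % key)
    return report


for row in format_report(count_pair(SAMPLE_LINES, "4758", "3895"),
                         "4758", "3895"):
    print(row)
```

To run this on the real data, the list of sample lines could be replaced by an open file object, for example `with open('numbers.txt') as infile: counts = count_pair(infile, "4758", "3895")`. The numbers are passed as strings because the regular expressions return strings.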
 