Python code for mapping numbers in a text file

  1. Jun 20, 2011 #1
    Dear Programmers

    I have a text file containing a large amount of data.

    From the text file, I would like to map two numbers that appear in the same brackets. They should not be counted if they appear in different brackets.

    For example, I want to map the numbers 4758 and 3895 when they are in the same brackets.

    I have attached a copy of the data for your reference.

    I am expecting results something like this:

    712 0
    713 0
    714 0
    1049 4758 3895 1
    1050 4758 3895 1
    1051 0
    1052 0
    1053 0
    1054 4758 3895 1
    1055 4758 3895 1

    where "1" represents the count.

    I'm trying Python code like this, but it's not working.

    Code (Text):
    def main():
        #Open file for reading
        infile = open('numbers.txt', 'r')
        #Read the contents of the file into a list
        number = infile.readlines()
        #Convert each element into an int
        index = 0
        while index !=len(account_number):
            number = [int(4758 3895) for num in number]
            index +=1
        #Get the search value
        search = raw_input('Enter a number to search for: ')
        #Determine whether this record matches the search value.
        if search in number:
            print 'Number found.', search
        else:
            print 'The number was not found.'
        #Close the file.
        infile.close()

    #Call the main function.
    main()

    I'm not familiar with Python; kindly advise.

    Many Thanks

  2. jcsd
  3. Jun 20, 2011 #2
    I gave it a crack. I'm sure there's more work for you to do, but that should give you an idea of at least one way to do what you want. Wouldn't be surprised if some python expert can reduce it to 5 lines or something, but that's what I've got. :biggrin:

    Trying to explain all this to you would take a loooong time, and I think you'll learn a lot if you just see some code. I generally try not to post code in this way, so all I ask is that you try to understand what I've written, so that you learn and it isn't wasted. Time spent learning regular expressions is time well spent.

    Of course, ask away if you can't figure out something I've done. I put in a few comments that should help. And yes, regular expressions are hard to read and hard to learn. However, you will never regret learning them. They are powerful and supremely useful.

    Code (Text):

    import re

    def main():
        numberDict = {}
        #Open file for reading
        infile = open('numbers.txt', 'r')
        # Iterate over each line in the file
        for line in infile:
            # Match all lines, isolating the key number and data values using
            # regular expression groups
            mo = re.match(r"^(\d+)-hydrogen-bond-frame\.dat\.c\.d\.lw\s\d+-hydrogen-bond-frame\.dat\.c\.d\.pw\s\[(.*)\]", line)
            if mo != None:
                key = mo.group(1)
                data = mo.group(2)
                data = re.sub('[\(\)]', '', data)
                #print "Key: ", key
                #print "Data: ", data
                # Isolate each group of 3 values that go together in the data
                subs = re.findall(r"\d+,\s\'\w+\',\s\'[\w\s]+\'", data)
                numbers = []
                # Iterate over each substring (group of 3 values) and isolate
                # the first integer, which is what we're interested in
                for substr in subs:
                    submo = re.match(r"(\d+),\s\'\w+\',\s\'[\w\s]+\'", substr)
                    if submo != None:
                        num = submo.group(1)
                        #print num
                        # Collect each number in the list
                        numbers.append(num)
                #print "Value: ", numbers
                # Add all the numbers as the value in the dictionary using the
                # previously isolated key
                # IMPORTANT NOTE: No attempt is made to ensure there are no
                # repeated values.
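                if key in numberDict:
                    # Key seen before: bump the count stored at index 0
                    # and add the new numbers to the existing list
                    numberDict[key][0] += 1
                    numberDict[key].extend(numbers)
                else:
                    # First time this key is seen: store a count of 1
                    # followed by the numbers
                    numbers.insert(0, 1)
                    numberDict[key] = numbers
            else:
                print 'No match.'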

        # Done with file, so close it
        infile.close()

        #print numberDict

        #Get the search value
        search = raw_input('Enter a number to search for: ')
        #Determine whether this record matches the search value.
        if search in numberDict:
            print 'Number found.', search
            print 'Values: ', numberDict[search]
        else:
            print 'The number was not found.'

    #Call the main function.
    main()
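
The regular-expression steps in the code above can be tried in isolation. This is only a minimal sketch, written for Python 3: the sample line is hypothetical (its layout is inferred from the pattern in the code, and the names inside the tuples are made up), so the real data file may differ.

```python
import re

# Hypothetical line: layout inferred from the pattern used in the code,
# the 'HOH' and 'O' names are invented for illustration.
line = ("1049-hydrogen-bond-frame.dat.c.d.lw "
        "1049-hydrogen-bond-frame.dat.c.d.pw "
        "[(4758, 'HOH', 'O'), (3895, 'HOH', 'O')]")

pattern = re.compile(
    r"^(\d+)-hydrogen-bond-frame\.dat\.c\.d\.lw\s"
    r"\d+-hydrogen-bond-frame\.dat\.c\.d\.pw\s\[(.*)\]"
)

mo = pattern.match(line)
key = mo.group(1)    # leading number on the line, kept as a string
data = mo.group(2)   # everything between the outer [ and ]

# First integer of each "(number, 'name', 'name')" tuple
numbers = re.findall(r"\((\d+),", data)

print(key)      # 1049
print(numbers)  # ['4758', '3895']
```

Group 1 is the number at the start of the line and becomes the dictionary key, while the numbers inside the brackets end up in the value.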
  4. Jun 20, 2011 #3

    Dear Grep

    Thanks a lot for the code.

    Sure, I'll try to understand it and get back to you if I have any doubts.

    I ran the code with the file in the appropriate directory and entered 4758 3895 as the number to search for. The two numbers should be mapped when they are in the same brackets and excluded when they are in different brackets, but it displays that the number was not found.

    Kindly advise.

    Many Thanks
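
A likely reason for the "not found" message: the code in post #2 looks the search string up among the dictionary keys, and those keys are the numbers at the start of each line (712, 1049, ...), not the values inside the brackets. So "4758 3895" can never match a key. Below is a minimal sketch, for Python 3, of counting the pair per line instead. The sample lines, the names inside the tuples, and the helper names `count_pair` and `report` are all hypothetical, and it assumes "same brackets" means the same [...] group on a line.

```python
import re

# Hypothetical sample lines; the layout is inferred from the regular
# expression in post #2 and the names inside the tuples are made up.
SAMPLE = """\
712-hydrogen-bond-frame.dat.c.d.lw 712-hydrogen-bond-frame.dat.c.d.pw [(4758, 'HOH', 'O'), (1200, 'HOH', 'O')]
1049-hydrogen-bond-frame.dat.c.d.lw 1049-hydrogen-bond-frame.dat.c.d.pw [(4758, 'HOH', 'O'), (3895, 'HOH', 'O')]
"""

def count_pair(line, a, b):
    """Return (frame, count): count is the number of [...] groups on
    the line whose tuples start with both a and b. None if the line
    does not begin with a number."""
    frame = re.match(r"(\d+)", line)
    if frame is None:
        return None
    count = 0
    for group in re.findall(r"\[([^\[\]]*)\]", line):
        # First integer of each "(number, 'name', 'name')" tuple
        numbers = set(re.findall(r"\((\d+),", group))
        if a in numbers and b in numbers:
            count += 1
    return frame.group(1), count

def report(text, a, b):
    """Build output rows in the format asked for in post #1."""
    rows = []
    for line in text.splitlines():
        result = count_pair(line, a, b)
        if result is None:
            continue
        frame, count = result
        rows.append(f"{frame} {a} {b} {count}" if count else f"{frame} 0")
    return rows

for row in report(SAMPLE, "4758", "3895"):
    print(row)
# 712 0
# 1049 4758 3895 1
```

The numbers are compared as strings because that is how they come out of the regular expression. To run it on the real file, pass `open('numbers.txt').read()` in place of `SAMPLE`.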