Algorithm Optimization [Python]

Click For Summary
SUMMARY

The discussion focuses on optimizing the algorithm for finding two integers in a list that sum to a specified value using Python. The initial implementation, which uses a nested loop, is inefficient for large lists, especially those exceeding 10 million elements. The user explores alternative methods, including organizing numbers into positive and negative lists and considering data structures like hashtables for faster lookups. The goal is to reduce the execution time from 12,000 milliseconds by leveraging more efficient algorithms and data structures.

PREREQUISITES
  • Understanding of Python programming (Python 3.x)
  • Familiarity with algorithm complexity and optimization techniques
  • Knowledge of data structures, particularly lists and hashtables
  • Basic understanding of sorting algorithms
NEXT STEPS
  • Implement a hashtable-based solution for the sum_pairs function
  • Research the time complexity of different data structures in Python
  • Learn about sorting algorithms and their impact on algorithm performance
  • Explore Python's built-in functions for optimizing list operations
USEFUL FOR

Python developers, algorithm enthusiasts, and anyone looking to improve their skills in algorithm optimization and data structure utilization.

Jamison Lahman
Messages
142
Reaction score
35

Homework Statement


Given a list of integers and a single sum value, return the first two values (parse from the left please) in order of appearance that add up to form the sum.
Python:
sum_pairs([11, 3, 7, 5],         10)
#              ^--^      3 + 7 = 10
== [3, 7]

sum_pairs([4, 3, 2, 3, 4],         6)
#          ^-----^         4 + 2 = 6, indices: 0, 2 *
#             ^-----^      3 + 3 = 6, indices: 1, 3
#                ^-----^   2 + 4 = 6, indices: 2, 4
#  * entire pair is earlier, and therefore is the correct answer
== [4, 2]

sum_pairs([0, 0, -2, 3], 2)
#  there are no pairs of values that can be added to produce 2.
== None/nil/undefined (Based on the language)

sum_pairs([10, 5, 2, 3, 7, 5],         10)
#              ^-----------^   5 + 5 = 10, indices: 1, 5
#                    ^--^      3 + 7 = 10, indices: 3, 4 *
#  * entire pair is earlier, and therefore is the correct answer
== [3, 7]
Negative numbers and duplicate numbers can and will appear.

NOTE: There will also be lists tested of lengths upwards of 10,000,000 elements. Be sure your code doesn't time out.This is a problem off of codewars.com. I have made two versions that satisfy the eight examples but there is a 12000ms time restraint that I need to overcome. I am not versed in CS so I don't know much about algorithms or how to optimize them.

2. Homework Equations

The examples are:
Python:
l1= [1, 4, 8, 7, 3, 15]
l2= [1, -2, 3, 0, -6, 1]
l3= [20, -13, 40]
l4= [1, 2, 3, 4, 1, 0]
l5= [10, 5, 2, 3, 7, 5]
l6= [4, -2, 3, 3, 4]
l7= [0, 2, 0]
l8= [5, 9, 13, -3]

print(sum_pairs(l1, 8) == [1, 7]) #"Basic: %s should return [1, 7] for sum = 8" % l1)
print(sum_pairs(l2, -6) == [0, -6]) #"Negatives: %s should return [0, -6] for sum = -6" % l2)
print(sum_pairs(l3, -7) == None) #"No Match: %s should return None for sum = -7" % l3)
print(sum_pairs(l4, 2) == [1, 1]) #"First Match From Left: %s should return [1, 1] for sum = 2 " % l4)
print(sum_pairs(l5, 10) == [3, 7]) #"First Match From Left REDUX!: %s should return [3, 7] for sum = 10 " % l5)
print(sum_pairs(l6, 8) == [4, 4]) #"Duplicates: %s should return [4, 4] for sum = 8" % l6)
print(sum_pairs(l7, 0) == [0, 0]) #"Zeroes: %s should return [0, 0] for sum = 0" % l7)
print(sum_pairs(l8, 10) == [13, -3]) #"Subtraction: %s should return [13, -3] for sum = 10" % l8)

They should all return True

The Attempt at a Solution


My first attempt is simple and straight forward:
Python:
def sum_pairs(ints, s):
    prev_list = []
    for num in ints:
        for test in prev_list:
            if num+test == s:
                return ([test, num])
        prev_list += [num]
My second attempt is the same structure however I store and organize previous test numbers by either positive or negative:
Python:
def sum_pairs(ints, s):
    neg_list = []
    pos_list = []
    for num in ints:
        if num > s:
            for neg in neg_list:
                if neg == s - num:
                    return [neg,num]
        else:
            for pos in pos_list:
                if pos == s - num:
                    return [pos,num]
        if num < 0:
            neg_list += [num]
        else:
            pos_list += [num]
With this new method, I went from 50 total possible returns to 31. I am not sure if this is the right direction to go. It very well may be I'm doing more work by organizing than I am saving.

From here, my plan was to make four lists: two negatives with one where abs(number) >= abs(s) and the other abs(number) < abs(s) with a similar pattern for positive numbers. I think it might be helpful to find the sign of 's' at the very beginning but I haven't figured out how to use that to my advantage..
 
Last edited:
Technology news on Phys.org
You just need to look at all the numbers once, and store them is such a way while doing this, you can immediately look up if the new number forms a pair with any number that you have already seen.
This can be done in a lot of ways: keeping the list sorted, storing the list as a hashtable or tree, or use an array of all the possible values that can occur.
 
willem2 said:
You just need to look at all the numbers once, and store them is such a way while doing this, you can immediately look up if the new number forms a pair with any number that you have already seen.
This can be done in a lot of ways: keeping the list sorted, storing the list as a hashtable or tree, or use an array of all the possible values that can occur.
Ahhh I think I was trying to do this intuitively but lacked the programming experience. I think I will try to sort the list first:
Code:
if prev_num not in prev_list:
    prev_list += [prev_num]
    prev_list = sorted(prev_list)
After that, I will be able search the list backwards or forwards until (s-(num+test)) changes signs.

If that doesn't do the trick, I will try hashtables. The book I used to learn python briefly mentioned how fast hashtables were but didn't go over them in-depth.
Thanks
 

Similar threads

  • · Replies 28 ·
Replies
28
Views
5K
Replies
4
Views
2K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 6 ·
Replies
6
Views
2K
  • · Replies 16 ·
Replies
16
Views
2K
Replies
3
Views
2K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 5 ·
Replies
5
Views
2K
  • · Replies 34 ·
2
Replies
34
Views
4K
  • · Replies 3 ·
Replies
3
Views
2K