Permutation Counting Algorithm for Large Integers

  • Thread starter: DrDu
  • Tags: Permutation
SUMMARY

The discussion focuses on developing an efficient permutation counting algorithm, specifically constructing a vector b from a vector a of permuted numbers, where b_i counts the elements before position i that are larger than a_i. The proposed solutions include using a binary tree structure to track higher numbers efficiently, achieving better than O(n^2) complexity, and employing a modified mergesort algorithm that operates in O(n log n) time. Additionally, the theoretical possibility of achieving O(n) complexity with unbounded integers and bitmask operations is explored.

PREREQUISITES
  • Understanding of binary tree data structures
  • Familiarity with the mergesort algorithm
  • Knowledge of time complexity analysis
  • Concept of unbounded integers and bitmask operations
NEXT STEPS
  • Implement a binary tree structure for counting higher elements in Python
  • Study the mergesort algorithm and its applications in counting inversions
  • Research unbounded integers and their implications for algorithm design
  • Explore advanced data structures like Fenwick Trees or Segment Trees for similar problems
USEFUL FOR

Software developers, algorithm designers, and computer scientists interested in optimizing permutation counting and understanding advanced data structures and algorithms.

DrDu
Science Advisor
Remark: This is not homework

Given a vector a of permuted numbers from 1 to n, I want to construct the vector b which contains in the i-th position the number of elements a_j with j < i which are larger than a_i. E.g.
a=(5,2,3,1,4), b=(0,1,1,3,1). Is there an algorithm which scales faster than O(n^2)? I think yes, maybe a variant of mergesort or the like.
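
For reference, the definition translates directly into an O(n^2) double loop (a quick illustrative sketch, 0-based indexing, not part of the original post):
Python:
a = [5, 2, 3, 1, 4]
# b[i] = number of earlier elements larger than a[i]
b = [sum(1 for j in range(i) if a[j] > a[i]) for i in range(len(a))]
print(b)  # [0, 1, 1, 3, 1]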

Thank you
DrDu
 
There is no element with j<i if i=1.
 
Just blue-skying here without a lot of thought.

Make the first array 2-dimensional, with the 2nd dimension containing the element index - 1.
Sort the full 2-dimensional array based on the original data.
The 2nd-dimension data, carrying the original index, will end up at position <No.-of-smaller-values>.

Cheers,
Tom

EDIT: This will give you the ≤ function rather than <. Hope that's adequate. If not, a second pass over the sorted data to massage the duplicates is needed.
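
If I read this right, the construction amounts to the following (an illustrative Python sketch of one reading, not Tom's code): pair each value with its 0-based index, sort by value, and the carried index ends up at position <No.-of-smaller-values>.
Python:
a = [5, 2, 3, 1, 4]
pairs = sorted((v, i) for i, v in enumerate(a))  # i is the 0-based "element index - 1"
inv = [i for v, i in pairs]  # position p holds the original index of the (p+1)-th smallest value
print(inv)  # [3, 1, 2, 4, 0]
This yields each value's rank over the whole array (the 0-based inverse permutation), rather than the number of larger elements preceding it.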
 
Dear Tom,

if I understand correctly you are constructing the inverse permutation. It cannot coincide with the vector b I am looking for, as the vector b usually contains repeated entries, as in the example.
 
Can you use a tree? Each node has a number and at most two children, one with a lower number and one with a higher number. Each node keeps track of how many higher children it has.

You start with a root node containing 5. Then you add 2 as the "lower child" of 5; then 3 as the "higher child" of 2 (it's lower than 5, 5 already has a lower child, and 2 doesn't yet have a higher child); then 1 as the "lower child" of 2 (5 already has a lower child, but 2 doesn't); then 4 as the "higher child" of 3 (5 has a lower child, 2 has a higher child, but 3 doesn't).

If you keep track of how many child nodes have been added on the higher subtree of each one, I think you can work out how many higher numbers you've already seen without having to check every one, which ought to be better than ##O(n^2)##.
 
Python:
class TreeNode:
    def __init__(self,v):
        self.v=v # A number from the original list
        self.nGreater=0 # The number of higher numbers we've seen.
        self.lowerChild=None
        self.greaterChild=None
    def addNode(self,newV):
        if newV<self.v:
            if self.lowerChild is None:
                # New number is lower than this node and we have not seen any such numbers before. Return 1 (this node) plus how many greater-than-this-node children
                self.lowerChild=TreeNode(newV)
                return 1+self.nGreater
            else:
                # New number is lower than this node. Return 1 (this node) plus how many greater-than-this-node children plus anything between the new node and this node
                return 1+self.nGreater+self.lowerChild.addNode(newV)
        else:
            self.nGreater+=1
            if self.greaterChild is None:
                # Greater than this node and we haven't seen such a number before - return that there are no larger numbers below this node
                self.greaterChild=TreeNode(newV)
                return 0
            else:
                # Greater than this node - return how many greater-than-the-new-node values are found in the subtree
                return self.greaterChild.addNode(newV)
    def printTree(self):
        chstr=""
        if self.lowerChild is None:
            chstr+="-"
        else:
            chstr+=str(self.lowerChild.v)
        if self.greaterChild is None:
            chstr+=" -"
        else:
            chstr+=" "+str(self.greaterChild.v)
        print "Node:",self.v," Children:",chstr
        if self.lowerChild is not None:
            self.lowerChild.printTree()
        if self.greaterChild is not None:
            self.greaterChild.printTree()

a=[5,2,3,1,4]
rootNode=TreeNode(a[0])
b=[0]
for a_i in a[1:]:
    b.append(rootNode.addNode(a_i))
print(b)
rootNode.printTree()
This has not been extensively tested, and was put together in half an hour on a tiny phone. No guarantees it's right!
 
Thank you Ibix!
In the meanwhile, I figured out how to do this using mergesort:

1. Initialize the vector b to n times 0.
2. When joining two sub-lists which are already sorted in decreasing order, create the merged list by always taking the larger of the two leading elements. Keep a counter of the number of elements already in the merged list which have been taken from the left sub-list. Whenever an element from the right sub-list is joined to the merged list, increase the entry of b at that element's original position by the value of the counter.

Edit: replaced right with left.
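
A minimal sketch of this merge-and-count scheme (illustrative code, not from the original post; it sorts in decreasing order and carries each element's original index alongside its value so the right entry of b can be incremented):
Python:
def count_larger_before(a):
    # Return b with b[i] = number of j < i such that a[j] > a[i].
    b = [0] * len(a)

    def sort_desc(items):  # items: list of (value, original index) pairs
        if len(items) <= 1:
            return items
        mid = len(items) // 2
        left = sort_desc(items[:mid])
        right = sort_desc(items[mid:])
        merged = []
        i = j = taken_from_left = 0
        while i < len(left) and j < len(right):
            if left[i][0] > right[j][0]:
                # take the larger leading element, which comes from the left sub-list
                merged.append(left[i])
                taken_from_left += 1
                i += 1
            else:
                # every element already taken from the left sub-list is larger than
                # this one and precedes it in the original array
                b[right[j][1]] += taken_from_left
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])
        while j < len(right):  # remaining right elements get the full counter
            b[right[j][1]] += taken_from_left
            merged.append(right[j])
            j += 1
        return merged

    sort_desc([(v, i) for i, v in enumerate(a)])
    return b

print(count_larger_before([5, 2, 3, 1, 4]))  # [0, 1, 1, 3, 1]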
 
Btw, a note on this: mergesort is O(n log n).

I do believe this is O(n) if one can use unbounded (i.e. infinitely long) integers, by using 2 bitmasks. Of course this is theoretical, as there's no machine that can implement unbounded integers.

Then one could consider variable-length integers: the algorithm itself is still O(n), but on current hardware each bitmask operation on a variable-length integer is itself O(n), which makes the algorithm O(n^2) in practice, though one could argue that's a hardware limitation. On (theoretical) hardware where bitmask operations on variable-length integers were O(1), the algorithm would be O(n).
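
For concreteness, one way to play with this idea in Python, whose integers are already arbitrary-precision (an illustrative sketch, not necessarily the two-bitmask scheme meant above; each shift and popcount over an n-bit mask is of course O(n) on real hardware, as noted):
Python:
a = [5, 2, 3, 1, 4]
seen = 0  # bitmask with bit v set for every value v encountered so far
b = []
for x in a:
    # shifting away bits 0..x-1 leaves exactly the bits of previously seen larger values
    b.append(bin(seen >> x).count("1"))
    seen |= 1 << x  # mark x as seen
print(b)  # [0, 1, 1, 3, 1]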
 
