MHB Divide Binary Search Tree at Key k: Algorithm & Analysis

evinda · Nov 23, 2014

Hello! (Wave)

Given a binary search tree $B$, I want to write an algorithm, that divides $B$ into two new trees $B_1, B_2$, so that the first one contains all the keys of $B$ that are smaller than $k$ and the second one contains all the keys of $B$ that are greater than $k$.
Hint : Execute a search of $k$, "cutting out" the pointers that "hang" at the left side and the right side of the path. After that, merge the trees of the forest, you get.

I tried to find like that, the position, at which $k$ is:

Code:

if (B==k) p=B;
else if (B>k){
      while (B->left>x){
               B=B->left;
      }
      p=B;
else if (B<x){
      while (B->right<x){
               B=B->right;
      }
      p=B;
}

Is it right? Or have I done something wrong? (Thinking)

evinda · Nov 24, 2014

Or do we have to do it in an other way, according to the hint? (Thinking)

I like Serena · Nov 24, 2014

evinda said:
Hello! (Wave)

Given a binary search tree $B$, I want to write an algorithm, that divides $B$ into two new trees $B_1, B_2$, so that the first one contains all the keys of $B$ that are smaller than $k$ and the second one contains all the keys of $B$ that are greater than $k$.
Hint : Execute a search of $k$, "cutting out" the pointers that "hang" at the left side and the right side of the path. After that, merge the trees of the forest, you get.

I tried to find like that, the position, at which $k$ is:
Code:
if (B==k) p=B;
else if (B>k){
      while (B->left>x){
               B=B->left;
      }
      p=B;
else if (B<x){
      while (B->right<x){
               B=B->right;
      }
      p=B;
}
Is it right? Or have I done something wrong? (Thinking)

Hey! (Smile)

evinda said:

Or do we have to do it in an other way, according to the hint? (Thinking)

I think you are supposed to set up a recursive algorithm.
Something like:

Code:

function algorithm(B, *B1, *B2)
  if B == NULL
    return
  extractLess(B, B->data, B1)
  extractGreater(B, B->data, B2)function extractLess(node, k, *destination)
  if node == NULL
    return
  if node->data < k
    addToTree(node->data, destination)

  extractLess(node->left, k, destination)
  extractLess(node->right, k, destination)

...

(Wasntme)

evinda · Nov 28, 2014

I like Serena said:

Hey! (Smile)

I think you are supposed to set up a recursive algorithm.
Something like:

Code:

function algorithm(B, *B1, *B2)
  if B == NULL
    return
  extractLess(B, B->data, B1)
  extractGreater(B, B->data, B2)function extractLess(node, k, *destination)
  if node == NULL
    return
  if node->data < k
    addToTree(node->data, destination)

  extractLess(node->left, k, destination)
  extractLess(node->right, k, destination)

...

(Wasntme)

If we have, for example, this tree $B$:

View attachment 3614

and we want to divide it into two new trees $B1,B2$, so that the first one contains all the keys of $B$ that are smaller than $5$ and the second one contains all the keys of $B$ that are greater than $5$, won't we compare all the values of the nodes with $12$, instead of $5$? (Worried)
Or have I understood it wrong? (Thinking)

I like Serena · Nov 29, 2014

evinda said:

If we have, for example, this tree $B$:

and we want to divide it into two new trees $B1,B2$, so that the first one contains all the keys of $B$ that are smaller than $5$ and the second one contains all the keys of $B$ that are greater than $5$, won't we compare all the values of the nodes with $12$, instead of $5$? (Worried)
Or have I understood it wrong? (Thinking)

I misread the problem statement, treating $k$ as the value of the root node, which it doesn't have to be. (Blush)

johng1 · Nov 29, 2014

I found this to be an interesting exercise. The following somewhat lengthy C program solves the problem. I recommend that you read this, even if you're not familiar with C. There are several binary search tree functions that are applicable in a lot of places.
If anyone can help me, I have noticed empirically, that the heights of the split trees are never more than the height of the original tree. I can't prove this, though.

Code:

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#define  MAX (1000)

typedef struct node_tag {
    int key;
    struct node_tag *left_child,*right_child;
}
node;

int insertIntoSearchTree(node* root,int x);
node* buildRandomTree(int n);
node* getNode(void);
int isSearchTree(node* root);
void splitTree(node* root,int x,node** lessRoot,node** greaterRoot);
node* appendToTree(node* root,node* p);
int countNodes(node* root);
int findMinKey(node* root);
int findMaxKey(node* root);
node* copyTree(node* root);

node* getNode(){
    node *p;
    if ((p=(node *) malloc(sizeof(node)))==NULL) {
        fprintf(stderr,"Memory allocation failure\n");
        exit(1);
    }
    p->key=0;
    p->left_child=p->right_child=NULL;
    return(p);
}

// Just frees all the nodes in the tree pointed to by root.

void destroy(node *root) {
    if (root==NULL) {
        return;
    }
    destroy(root->left_child);
    destroy(root->right_child);
    free(root);
}/* Inserts a new node (as a leaf) with key x into the NON-EMPTY binary search tree with root root.
   Return 1 if successful, but 0 if node with key x already in tree.
*/
int insertIntoSearchTree(node* root,int x) {
    node *p,*parent;
    p=root;
    while (p!=NULL && p->key!=x) {
        parent=p;
        if (x<p->key) {
            p=p->left_child;
        }
        else {
            p=p->right_child;
        }
    }
    if (p!=NULL && p->key==x) {
        return(0);
    }
    p=getNode();
    p->key=x;
    if (x<parent->key) {
        parent->left_child=p;
    }
    else {
        parent->right_child=p;
    }
    return(1);
}

/* Following builds a "random" tree with n nodes.  Found by first generating a random permutation and then
inserting the nodes in turn into an initially empty binary tree.
*/
node* buildRandomTree(int n) {
    node* t;
    int i,j,temp;
    int randomPerm[MAX];
    for (i=0;i<n;randomPerm[i]=2*i,i++);
    for (i=n-1;i>0;i--) {
        j=rand()%i;
        temp=randomPerm[i];
        randomPerm[i]=randomPerm[j];
        randomPerm[j]=temp;
    }
    t=getNode();
    t->key=randomPerm[0];
    for (i=1;i<n;i++) {
        insertIntoSearchTree(t,randomPerm[i]); // all components of randomPerm different
    }
    return(t);
}

/*  Finds the minimum key in tree root; artificially returns INT_MAX for
    an empty tree.  Convenient for comparisons made else where.
*/

int findMinKey(node* root) {
    node* p,*parent;
    if (root==NULL) {
        return(INT_MAX);
    }
    for (parent=root,p=root->left_child;p!=NULL;parent=p,p=p->left_child);
    return(parent->key);
}

// Dual of findMinKey

int findMaxKey(node* root) {
    node* p,*parent;
    if (root==NULL) {
        return(INT_MIN);
    }
    for (parent=root,p=root->right_child;p!=NULL;parent=p,p=p->right_child);
    return(parent->key);
}/* isSearchTree returns true iff the binary tree with root root is a binary search tree.
*/
int isSearchTree(node* root) {
    node* p,*parent;
    int min,max;
    if (root==NULL) {
        return(1);
    }
    int valid=isSearchTree(root->left_child) && isSearchTree(root->right_child);
    if (!valid) {
        return(0);
    }
    max=findMaxKey(root->left_child);
    min=findMinKey(root->right_child);
    return(max<root->key && root->key<min);
}

/* SPECIAL append function does NOT work for arbitrary trees.  Here the search
   tree rooted at p is appended to the search tree rooted at root with the result
   a search tree. The condition on p is that ANY node in p with key k would be inserted
   into root at the same place.  This is called only by splitTree.
*/

node* appendToTree(node* root,node* p) {
    int x;
    node *q,*parent;
    if (root==NULL) {
        return(p);
    }
    if (p==NULL) {
        return(root);
    }
    x=p->key;
    parent=NULL;
    q=root;
    while (q!=NULL) {
        parent=q;
        if (x<q->key) {
            q=q->left_child;
        }
        else {
            q=q->right_child;
        }
    }
    if (x<parent->key) {
        parent->left_child=p;
    }
    else {
        parent->right_child=p;
    }
    return(root);
}

/*  The tree at root is split into two trees lessRoot and greaterRoot with the key
    of any node in lessRoot < x and the key of any node in greaterRoot is >= x.  So if
    x is a key in the original tree root, x is the least key in greaterRoot; however,
    it is still valid if x is not a key in root.  No new nodes are allocated, so the original
    tree is invalid after the split.
*/

void splitTree(node* root,int x,node** lessRoot,node** greaterRoot) {
    node* p,*parent;
    p=root;
    while (p!=NULL && p->key!=x) {
        parent=p;
        if (x<p->key) {
            p=p->left_child;
            parent->left_child=NULL;
            *greaterRoot=appendToTree(*greaterRoot,parent);
        }
        else {
            p=p->right_child;
            parent->right_child=NULL;
            *lessRoot=appendToTree(*lessRoot,parent);
        }
    }
    if (p!=NULL) {
        *lessRoot=appendToTree(*lessRoot,p->left_child);
        p->left_child=NULL;
        *greaterRoot=appendToTree(*greaterRoot,p);
    }
}

// Just counts the nodes

int countNodes(node* root) {
    int n;
    if (root==NULL) {
        return(0);
    }
    n=countNodes(root->left_child);
    n+=countNodes(root->right_child);
    return(n+1);
}

/* Debugging aid to make certain above splitTree works. splitTree was called with pivot
   x and the origingal tree had total nodes.
*/

int checkSplitTree(int x,int total,node* lessRoot,node* greaterRoot) {
    int n,max,min;
    n=countNodes(lessRoot)+countNodes(greaterRoot);
    // first make sure no nodes got lost
    if (n!=total) {
        return(0);
    }
    // now the two split trees better be binary search trees
    if (!isSearchTree(lessRoot) || !isSearchTree(greaterRoot)) {
        return(0);
    }
    // finally, the split worked
    max=findMaxKey(lessRoot);
    min=findMinKey(greaterRoot);
    return(max<x && x<=min);
}

// Just copies the tree into a whole new tree.
node* copyTree(node* root) {
    node* p;
    if (root==NULL) {
        return(NULL);
    }
    p=getNode();
    p->key=root->key;
    p->left_child=copyTree(root->left_child);
    p->right_child=copyTree(root->right_child);
    return(p);
}

// Compute the height of the tree:
int height(node* root) {
    int h1,h2;
    if (root==NULL) {
        return(-1);
    }
    h1=height(root->left_child);
    h2=height(root->right_child);
    return((h1<=h2) ? h2+1 : h1+1);
}int main(){
    node *root,*root1;
    node *rootLess,*rootGreater;
    int i,n=800,valid;
    int ht,htLess,htGreater,h;
    //    root=buildComplete(1,n);
    root1=buildRandomTree(n);
    valid=1;
    ht=height(root1);
    htLess=htGreater=INT_MIN;
    for (i=0;i<2*n;i++) {
        root=copyTree(root1);
        rootLess=NULL;
        rootGreater=NULL;
        splitTree(root,i,&rootLess,&rootGreater);
        if (!checkSplitTree(i,n,rootLess,rootGreater)) {
            printf("Oops, something wrong! at %d\n",i);
            valid=0;
            break;
        }
        h=height(rootLess);
        if (h>htLess) {
            htLess=h;
        }
        h=height(rootGreater);
        if (h>htGreater) {
            htGreater=h;
        }
        destroy(rootLess);
        destroy(rootGreater);
    }
    if (valid) {
        printf("everything ok!\n");
        printf("original ht:%d , max of rootLess: %d, max of rootGreater: %d\n",ht,htLess,htGreater);
    }
    return(0);
}

evinda · Dec 1, 2014

Could you describe me what we have to do? (Thinking)

I like Serena · Dec 1, 2014

evinda said:

Could you describe me what we have to do? (Thinking)

The algorithm I suggested is still a good way to go.
We should just pass $k$ to it, and also pass $k$ on to the recursive calls. (Wasntme)

evinda · Dec 1, 2014

I like Serena said:

The algorithm I suggested is still a good way to go.
We should just pass $k$ to it, and also pass $k$ on to the recursive calls. (Wasntme)

So, does the function have to be of this form?

Code:

function algorithm(B, k,B1, B2)
  if B == NULL
    return
  extractLess(B, k,B->data, B1)
  extractGreater(B, k,B->data, B2)

If so, what type of parameters should the functions [m] extractLess [/m] and [m] extractGreater [/m] have? (Thinking)

johng1 · Dec 1, 2014

As I read the original question and the hint, I assumed that the problem was to split the given tree into two new trees with no new nodes allocated. As I understand ILikeSerena's suggestion, his algorithm creates new trees with new nodes. It also appears that his suggestion is of order n (number of nodes in the original tree), whereas the solution I gave is of order the height of the original tree -- for a reasonably balanced tree, this will result in order ln(n).

Btw, here's some unsolicited advice. I think the study of algorithms is just like mathematics; it's not a spectator sport. Surely you must know some programming language (Java, C++, Python etc.). Implement your proposed solution in your favorite language. When I used to teach algorithms, I almost always expected a working program that implemented a given exercise.

evinda · Dec 2, 2014

johng said:

As I read the original question and the hint, I assumed that the problem was to split the given tree into two new trees with no new nodes allocated. As I understand ILikeSerena's suggestion, his algorithm creates new trees with new nodes. It also appears that his suggestion is of order n (number of nodes in the original tree), whereas the solution I gave is of order the height of the original tree -- for a reasonably balanced tree, this will result in order ln(n).

Btw, here's some unsolicited advice. I think the study of algorithms is just like mathematics; it's not a spectator sport. Surely you must know some programming language (Java, C++, Python etc.). Implement your proposed solution in your favorite language. When I used to teach algorithms, I almost always expected a working program that implemented a given exercise.

I want the algorithm to be of order the height of the original tree.. (Nod)

So, in pseudocode should it be like that?

Code:

pointer appendToTree(pointer root,poiter p) {
    int x;
    pointer q,parent;
    if (root==NULL) {
        return(p);
    }
    if (p==NULL) {
        return(root);
    }
    x=p->key;
    parent=NULL;
    q=root;
    while (q!=NULL) {
        parent=q;
        if (x<q->key) {
            q=q->left_child;
        }
        else {
            q=q->right_child;
        }
    }
    if (x<parent->key) {
        parent->left_child=p;
    }
    else {
        parent->right_child=p;
    }
    return(root);
}
void splitTree(pointer root,int x,pointer lessRoot,pointer greaterRoot) {
    pointer p,*parent;
    p=root;
    while (p!=NULL && p->key!=x) {
        parent=p;
        if (x<p->key) {
            p=p->left_child;
            parent->left_child=NULL;
            greaterRoot=appendToTree(greaterRoot,parent);
        }
        else {
            p=p->right_child;
            parent->right_child=NULL;
            lessRoot=appendToTree(lessRoot,parent);
        }
    }
    if (p!=NULL) {
        lessRoot=appendToTree(lessRoot,p->left_child);
        p->left_child=NULL;
        greaterRoot=appendToTree(greaterRoot,p);
    }
}

Or is it wrong? (Thinking)

At the algorithm [m] splitTree [/m], why is [m] p [/m] a pointer, but [m]parent[/m] a pointer to a pointer? (Worried)

Can the type of a function in pseudocode be [m] pointer [/m] ?

MHB Divide Binary Search Tree at Key k: Algorithm & Analysis

Attachments

Similar threads

How to increase phone signal strength by lying about it

A Crisis for Newly Minted CompSci Majors -- entry level jobs gone

How to calculate Tension for a series of connected points?

Learning Assembly and computer architecture for x86

Sequential Analog Computers?

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight

Insights Relativator (Circular Slide-Rule): Simulated with Desmos - Insight

Insights Fixing Things Which Can Go Wrong With Complex Numbers