# [C] about delete in BST

1. Jan 27, 2013

### abramhum.c.l

Hi:
I was in a maze about delete in binary search tree. The followings
is its codes:
Code (Text):

void delete(struct bst_node** node) {
struct bst_node* old_node = *node;
if ((*node)->left == NULL) {
*node = (*node)->right;
free_node(old_node);
} else if ((*node)->right == NULL) {
*node = (*node)->left;
free_node(old_node);
} else {
delete node with two children
}
}

This problem is from the current node and its parents. In the above codes, obviously,
I cannot image how the parent node connect to the current node's right child when if (*node)->left == NULL; In theoretically, to find out the parent node, I should write codes like following:
Code (Text):

struct bst_node* p;
if(current_node!=null)
{
p=current_node; current_node=current->right;
}

and then when delete the current, the function will connect the parent node's right child to
current node's right child. However, the codes in top section not including this part, I didn't
know why and how it work. Any instruction is appreciated, thanks.

2. Jan 30, 2013

### chiro

Hey abramhum.c.l.

My guess is this command recursively deletes all children which means they all get freed from memory and the parent node sets whatever child pointer to null (as it does in the branch statement).

3. Jan 30, 2013

### abramhum.c.l

Not exactly, this is a part of BST, and its purpose is to delete the
node that user designated.
The problem make me feel puzzled is the process of recursive.
for example, if these exist four nodes like A, B,C, D; A is the parents
of B, and C is the right child of B, and then D is the left child of C.
If we want to delete B, then in BST algorithm, since the left child of
B is null, it will find C first, and then connect A to C, make the C become
the right child of A, and then delete B.

But in the code snippet I posted, i don't know how it find out the A,
and it seems make no sense.
Thanks a lot

4. Jan 30, 2013

### rcgldr

The function names are misleading. It appears that for this case, you pass a pointer to A as a parameter to the delete() function, which then deletes one of the child nodes of A.

The code is missing is what is supposed to be done if A has two child nodes. What if each of those two child nodes also have two child nodes? This creates a dilemma because A would need 4 node links if it's two child nodes were deleted.

5. Jan 30, 2013

### abramhum.c.l

6. Jan 30, 2013

### D H

Staff Emeritus

Three key points here:
1. The function delete takes a pointer to a pointer to the node to be deleted.
2. The input to delete is the output of a call to search.
3. It's the data that needs to be deleted from the tree, not necessarily the node itself.

Suppose a binary search tree whose data comprises single characters is created by inserting, in order, the letters D (in node #1), F (node #2), B (node #3), C (node #4), E (node #5), and A (node #6). Graphically, the tree will look like this:
Code (Text):

D
/   \
B     F
/ \   /
A   C E
The nodes will contain the following:
Code (Text):

root: &node #1

node #1:
data: D
left: &node #3
right: &node #2

node #2:
data: F
left: &node #5
right: null

node #3:
data: B
left: &node #6
right: &node #4

node #4:
data: C
left: null
right: null

node #5:
data: E
left: null
right: null

node #6:
data: A
left: null
right: null
Suppose you want to delete the "A" from the tree. The things need to be done to accomplish this are:
- The left child of node #3 needs to be set to null, removing node #6 from the tree.
- Node #6 and the allocated data that it contains needs to be freed.

The way to do this is to call delete with a pointer to the left element of node #3. That is exactly what the function search will return when asked to search for "A". Deleting C or F from this tree is similarly easy. Deleting E also is surprisingly easy.

What's tricky is deleting B or D from the tree. One way to delete D from the tree is to make it look like
Code (Text):

C
/ \
B   F
/   /
A   E
This is what the code at the referenced site accomplishes.

7. Jan 31, 2013

### abramhum.c.l

Hi:
DH, if I hope to delete the point F, based on the codes, it should perform the following
section:
Code (Text):

void delete(struct bst_node** node) {
struct bst_node* old_node = *node;
...
if ((*node)->right == NULL) {
*node = (*node)->left;
free_node(old_node);
}

since old_node is the original one, like F, it's left child should be E, however, how to
find its parents D, *node = (*node)->left; this clause seems to make F to E, but, in the finally state, the D's left child should be E. I can not figure out how this be done on the upper code snippet. That's why I feel confused. Thanks a lot.

8. Jan 31, 2013

### D H

Staff Emeritus

The function delete doesn't need to find the parent of the node containing F.

The function will change something in that parent node. In this case is will reset the right child element of that parent node to point to the node containing E. So how can delete change the contents of that right child element given that there's no way to traverse up the tree? The answer is simple: What you must provide as input to delete is the address of that right child element. This is exactly what the function search will return when asked to find "F". The key to this whole puzzle is that delete expects to be fed the output of the search function.

So what if you want to delete D from the tree rather than F? In this case, the function search will return a pointer to the root pointer. What this version of a binary search tree does is to
1. Find the pointer to the node that contains the immediate predecessor to D. In this case, that node is node #4, and the thing that points to it is node #3's right child element.

2. Swap the data contents of node #1 and node #4. This step makes the tree look like this (which violates the heap principle):
Code (Text):

C
/   \
B     F
/ \   /
A   D E
3. The above step makes the tree not be a tree; it violates the heap principle. The way to fix this problem, and to simultaneous achieve the goal of deleting "D" from the tree is to recursively call delete with the pointer from step 1 (the pointer to the right child element of node #3, which points to node #4). This recursive call to delete will make the tree look like this:
Code (Text):

C
/ \
B   F
/   /
A   E

9. Jan 31, 2013

### abramhum.c.l

I still feel confused about delete F. I think it is somehow the process
like link list, but seems something different. In link-list, if I hope to delete
a node, like B in this case: A->B->C, if the structure is following:
Code (Text):

struct list_node {
void* data;
struct bst_node* right;
};

what I need to do is set A->right=B->right; and then delete B.
But the process which in deleting F in BST is different.
What is the theorem that cause the algorithm can reduce the above process.
Thanks a lot.

10. Feb 1, 2013

### D H

Staff Emeritus

There's not much more that I can say without repeating myself other than to execute the code by hand or to execute the code in a debugger.

11. Feb 3, 2013

### abramhum.c.l

12. Feb 3, 2013

### D H

Staff Emeritus
Did you walk through the code, step by step, with a debugger?

Better yet, execute the code by hand: act as if YOU are the computer. It's the best way to understand what is going on.

This code is a bit different than the first one you find. The first version required the caller to pass the correct handle, a pointer to a pointer, to insert and delete. This new version doesn't use handles, but instead requires that the outside caller reassign the root pointer to the result returned by insert and delete.