Intersection-find Data Structure

In summary, the union-find data structure can be used to store an indexed set of sets and support the intersection operation in sub-linear time.
  • #1
mXSCNT
315
1
Is there a data structure like the union-find data structure, but for intersections (and the sets needn't be disjoint)?

I'm looking for a data structure that stores an indexed set of sets A_1, A_2, ..., A_n, and supports the following operations:
replace the sets A_i, A_j with their intersection A_i n A_j, in sub-linear time, and returns the index of the intersection thus formed.
given the index of a set, list its contents in linear time
 
Technology news on Phys.org
  • #2
You can do it in sub-linear time only if the elements have a position in some K-dimensional space...allowing you to store them in a BST that can be partitioned sub-linearly for intersection. Otherwise, you're going to be stuck with linear time.
 
  • #3
Well, suppose the elements are integers (therefore they are in a 1 dimensional space). What would you do with that?

Or did you mean that the {A_i} are "simply shaped" regions in a k-dimensional space (such as cubes or intervals)?
 
  • #4
mXSCNT said:
Well, suppose the elements are integers (therefore they are in a 1 dimensional space). What would you do with that?

Or did you mean that the {A_i} are "simply shaped" regions in a k-dimensional space (such as cubes or intervals)?

In this case you can store the set as a list of subranges, and the intersection time will be O( (m+n) log (m+n) ), for intersecting subsets that have m and n subranges, respectively.

For example, if your sets are:

A = {1,2,3,4,9,10,11}
B = {3,4,5,6,12,13,14}

Then, represented as subranges you have,
A = { 1-4, 9-11 }
B = {3-6, 12-14 }

Sorting takes O(k log k) time (k=m+n),

1as 3bs 4ae 6be 9as 11ae 12bs 14be

(note: 1as = subrange from A starting with 1)

Then you can perform the final intersection in a final O(k) pass by choosing start point after both subranges have started, and end point as soon as either subrange ends. Result:

AintB = { 3-4 }
 
  • #5
True, the problem is not difficult if you are only dealing with interval sets. I meant the more general case.
 
  • #6
mXSCNT said:
True, the problem is not difficult if you are only dealing with interval sets. I meant the more general case.

It is obviously impossible to do this in sub-linear time in the general case because you want to process a linear number of elements. At the very least you will need to consider each element since there is no structure, and that means the algorithm must be at least linear time. QED

However when using integers, you can use Bucket sort to individually sort the two sets of integers in O(N) time, and you can then perform the intersection in one pass in O(N) time using similar method to what I described above. Note that this isn't truly general because it only works on integers and assumes a bounded range on those integers.

If you are repeatedly doing intersections like this, it would make more sense to use a priority queue that keeps them always sorted and then merging can be done in O(N) time for the truly general case (ie, floating point numbers), and with a much lower constant factor.
 

1. What is an intersection-find data structure?

An intersection-find data structure is a type of data structure used in computer science to efficiently store and retrieve data related to intersections between two or more sets of data elements. It allows for quick determination of whether two or more sets have any common elements, and if so, which elements they share.

2. How does an intersection-find data structure work?

An intersection-find data structure typically uses a combination of hash tables and/or binary search trees to store the data elements of the sets being compared. These structures allow for efficient lookup and comparison of elements, making it easier to identify common elements between the sets.

3. What are the benefits of using an intersection-find data structure?

One major benefit of using an intersection-find data structure is its efficiency in determining the existence and contents of intersections between sets. This can be particularly useful in applications such as database management or graph algorithms, where identifying common elements is a common task.

4. What are some common use cases for an intersection-find data structure?

Intersection-find data structures are commonly used in fields such as database management, graph algorithms, and computational geometry. They can also be useful in applications involving large datasets, where efficient identification of common elements can improve performance.

5. Is an intersection-find data structure suitable for all types of data?

No, an intersection-find data structure may not be suitable for all types of data. It works best with data that can be efficiently compared and hashed, such as numbers or strings. It may not be as effective for complex data structures or data that cannot be easily compared.

Similar threads

  • Programming and Computer Science
Replies
10
Views
1K
  • Programming and Computer Science
Replies
1
Views
245
  • Programming and Computer Science
Replies
25
Views
2K
Replies
10
Views
947
  • Engineering and Comp Sci Homework Help
Replies
10
Views
1K
  • Engineering and Comp Sci Homework Help
Replies
7
Views
3K
  • Programming and Computer Science
Replies
1
Views
1K
  • Programming and Computer Science
3
Replies
75
Views
4K
Replies
19
Views
1K
Back
Top