Calculating string given index in Cartesian Set

In summary, Sophie can find the x'th character in a set of a 95 characters by taking the Cartesian product of the source set and then ordering the set based on the order of the source set.
  • #1
SophieP
8
0
Hello all,

I'm new, so please go easy on me if this is a silly question.

If I have a source set of characters, say:

"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~ "​
And I compute the Cartesian product of this source set (sort of).
This gives me every combination of characters in that set, starting with single character strings, then every pair, then every triplet, up to a limit of 3 characters.

This I can do, and have done, but is there a way to calculate the string that will occur at a given index, without computing the whole set and then taking the string from that?

Edit: It's also worth noting that the set resulted by the computation is, for a set of every combination of this source set of 95 characters, up to 4 letters is of size 95^4 + 95^3 + 95^2 + 95^1.

I hope my question isn't too poorly worded, or breaks any rules, it's not homework, it's a project I'm doing alone.

Many thanks,

Sophie
 
Last edited:
Physics news on Phys.org
  • #2
Hi SophieP and welcome to Physicsforums! :smile:

Before I try to answer your question, I need sure I understood is correctly.

For example, if I have a source set {0,1}, then you want to form the set

{0,1,00,01,10,11,000,001,010,011,100,101,111}

(which isn't really a cartesian product by the way). And you want to be able to say "at place 8, I can find 001" without calculating the entire set?

If this is what you mean, then I see a small problem, I can also form the set

{0,1,00,10,01,11,000,100,010,110,001,101,111}

then at place 8 I will find 100. And there are other possible ways of ordering the set. So before we can answer your question, you will have to say how exactly you order your set...
 
  • #3
Hello Micromass!

For example, if I have a source set {0,1}, then you want to form the set

{0,1,00,01,10,11,000,001,010,011,100,101,111}
This is exactly what I'm after.

I did suspect that it wasn't the Cartesian Product, but I took a stab anyway.

As for ordering the set, I'm ordering it based on the order of the source set, which is usually an array of about 100 characters.

So given the toy source set of {a, b}

I'd want {a, b, aa, ab, ba, bb, aaa, aab, aba, abb, baa, bab, bba, bbb}

But given the source {b, a}

I'd get {b, a, bb, ba, ab, aa, bbb, bba, bab, baa, abb, aba, aab, aaa}

Edit: this bit I can do, I just need to know if it's possible to work out, given the source set and the size of the resulting set, what string will occur at point x.

Hope this clears things up, many thanks,

SophieP
 
Last edited:
  • #4
OK, time for some mathematics. I won't give an exact formula because it'll probably be too annoying.

I'll describe the procedure though:

Say you have a source set A with n characters.
Say you want to find the x'th character. Now

  • If [itex]1\leq x\leq n[/itex], then this character will have 1 element.
  • If [itex]n+1\leq x\leq n^2[/itex], then this character will have 2 elements.
  • ...
  • If [itex]n^k+1\leq x\leq n^{k+1}[/itex], then this character will have k elements.

In general, take [itex]\log_n(x)[/itex] and round the result up, call this k. Then x will have k elements. (this does not work for x=1, so you'll need to treat this separately).

Now that we know that x has k elements, we let

[tex]L=x-n^{k-1}-n^{k-2}-...-n-1=L-\frac{n^k-1}{n-1}[/tex]

For example, if n=5 and x=9, then the word contains 2 characters and the constant L equals 3.

Now, all you need to do is expand L in base n, this will tell you what characters you have.
For example, if n=5 and x=9, then L=3. Expanding L in base 5 just yields 3 (but we need 2 characters so we write this as 03). So, you will need the first element in your set A and then the 4th element in your set A.

Another example if n=15 and x=543. We see that k=3, thus our word must have 3 characters. Our L is

[tex]543-15^2-15-1=302[/tex]

Expanding this in base 15 gives me (1,5,2). Thus you need to pick the second element in the set A, then the 6th element in the set A and then the third element in the set A.
 
  • #5
Micromass,

You sir, are a gentleman and a scholar.

When I wake up in the morning (it's 2 a.m. here) I'll have a serious word with this and see if I can code it up.

I owe you a great debt, thank you so much!

I hope I can repay you someday :)

SophieP
 
  • #6
Glad that I could have been of service! :smile:

Once you coded it up, it's best to try some specific examples though. I'm certain the general method is correct, but I might have made some minor mistakes with indices...

Ask if something is not clear!
 
  • #7
Hmmm... I seem to be having a few issues coding this up.

I converted the following into code:

micromass said:
  • If [itex]1\leq x\leq n[/itex], then this character will have 1 element.
  • If [itex]n+1\leq x\leq n^2[/itex], then this character will have 2 elements.
  • ...
  • If [itex]n^k+1\leq x\leq n^{k+1}[/itex], then this character will have k elements.

And the first two work fine, 1 and 2 character length strings are picked up, but 3+ lengths aren't, strangely.

Code:
                for (int k = 3; k <= depthCount; k++)
                {
                    Console.Write("(c " + k + ")");
                    if (((Math.Pow(n, k) + 1) <= x) && (x <= (Math.Pow(n, (k + 1)))))
                    {
                        return k;
                    }
                }

Above I've got the code snippet from the third line, working out if the string has k elements, but this never seems to execute. It's in C#, I hope it's vaguely readable, but it's mainly the following line I'm talking about:

if ( ( (Math.Pow(n, k) + 1) <= x) && (x <= (Math.Pow(n, (k + 1))) ) )

Do you think this is an accurate translation of the logic?

Hope this is coherent,

Many thanks,

SophieP
 
  • #8
I'm sorry, I made a mistake (I knew this would happen). It should be

if [itex]n^{k-1}+1\leq x\leq n^k[/itex], then the string has k elements. What I've written before

For the rest, the code seems to be correct (but I don't understand why you put int k = 3 in the beginning, but I don't know C# so well).
 
  • #9
May I offer another approach? If you have 95 characters, for example, consider them to be the "digits" in a base-95 positional number system. Then to find the nth string, just convert n to base 95. I *think* that gives the sequence you want.

Here is Python code to convert a positive integer x to base "base":

Code:
def convert(x, base):
	while x > 0:
		print x % base
		x = x / base

Note that % is the Python modulus operator and "/" in this case is integer division with truncation. Also note that the "digits" are printed out least significant digit first; e.g., convert(4,2)
prints
0
0
1
which means "100" is 4 in binary.
 
  • #10
awkward said:
Note that % is the Python modulus operator and "/" in this case is integer division with truncation.
Two minor corrections:
  • The result of % has the same sign as the modulus, and division rounds appropriately for this convention. In particular, if the modulus is positive, then division rounds towards negative infinity, not towards zero as you suggest.
  • "/" meaning integer division is deprecated. Starting with python 3.0 (or 2.2, if you import division from __future__), "/" is true division and "//" is integer division. So,
    >>> print 3/2
    1.5
    >>> print 3//2
    1​
 
  • #11
Hurkyl said:
Two minor corrections:
  • The result of % has the same sign as the modulus, and division rounds appropriately for this convention. In particular, if the modulus is positive, then division rounds towards negative infinity, not towards zero as you suggest.
  • "/" meaning integer division is deprecated. Starting with python 3.0 (or 2.2, if you import division from __future__), "/" is true division and "//" is integer division. So,
    >>> print 3/2
    1.5
    >>> print 3//2
    1​
Thanks, that's a good point about //. I used to know that, but I momentarily forgot.
 

1. How is a string calculated given an index in a Cartesian set?

The string is calculated by converting the index into its corresponding Cartesian coordinates, and then using those coordinates to locate the string within the set.

2. What is a Cartesian set?

A Cartesian set is a mathematical concept that represents a collection of ordered pairs or coordinates in a geometric plane.

3. How do you determine the index of a string in a Cartesian set?

The index of a string in a Cartesian set is determined by its position within the set, with the first string having an index of 0 and subsequent strings having increasing indices.

4. Can the index of a string in a Cartesian set be negative?

No, the index of a string in a Cartesian set cannot be negative as it represents a position within the set, and positions are always represented by positive numbers.

5. Is there a specific formula for calculating a string given an index in a Cartesian set?

Yes, the formula for calculating a string given an index in a Cartesian set is: string = set[index mod (length of set)]. This takes into account that the index may be larger than the length of the set, and uses the remainder to locate the correct string within the set.

Similar threads

  • Set Theory, Logic, Probability, Statistics
Replies
1
Views
986
  • Set Theory, Logic, Probability, Statistics
Replies
4
Views
3K
  • Set Theory, Logic, Probability, Statistics
Replies
5
Views
3K
  • Programming and Computer Science
Replies
6
Views
1K
  • Programming and Computer Science
2
Replies
55
Views
4K
  • Set Theory, Logic, Probability, Statistics
Replies
2
Views
1K
  • Engineering and Comp Sci Homework Help
Replies
10
Views
1K
  • Programming and Computer Science
Replies
19
Views
2K
  • Set Theory, Logic, Probability, Statistics
Replies
6
Views
5K
  • Programming and Computer Science
Replies
2
Views
2K
Back
Top