Question about the efficiency of using an array or individual variables

yungman · Aug 17, 2020

I have two questions here. In the program, I have two different boolean expressions, one using array A[10], the other just using individual variablesB(0..9). They both do the same thing as in the program. My question which one is faster in real life hardware? My guess is B(0..9) is faster even though it looks stupid. I think each variable use individual address, so it's one step access. The one using array has to take the starting address, add the index to get the address before accessing the value. I put a lot more emphasis on efficiency than make the code looks "nice".

Also, in the process of declaring the bool A[10], I found out I had to initialize with {1, 1, 1, 1, 1, 1, 1, 1, 1, 1} INSTEAD of {1} for all 10 elements. Why?

C++:

//Efficiency
#include <iostream>
using namespace std;

int main()
{
    bool A[10] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
    bool B0 = true, B1 = true, B2 = true, B3 = true, B4 = true, B5 = true,
        B6 = true, B7 = true, B8 = true, B9 = true;
    bool An, Bn; // result of AND

    An = A[0] && A[1] && A[2] && A[3] && A[4] && A[5] && A[6] && A[7] && A[8] && A[9];
    Bn = B0 && B1 && B2 && B3 && B4 && B5 && B6 && B7 && B8 && B9;
    cout << " An = " << An << "    Bn = " << Bn << "\n\n";
    return 0;
}

Thanks

jedishrfu · Aug 17, 2020

This is a hard and implementation dependent question. My guess would be arrays would be faster from a memory caching point of view with prefetch of data for the cache.

yungman · Aug 17, 2020

I have another question on array
I just went through the chapter on array, when initialization of an array int A[size]. The size has to be a constant, it cannot be a variable programmed by program later. I even tried to declare const int size = sz; where sz is variable. Still C++ doesn't like it.

The only other way is to use vector(dynamic array) initiated with undersize vector and use push_back to make the size variable and get to the size needed.

Any other suggestion?

Mark44 · Aug 17, 2020

yungman said:

I even tried to declare const int size = sz; where sz is variable. Still C++ doesn't like it.

A variable declared as const has to have its value known at compile time. Since sz is a variable, your declaration won't work.

yungman · Aug 17, 2020

Mark44 said:

A variable declared as const has to have its value known at compile time. Since sz is a variable, your declaration won't work.

Yeh, it's one thing to learn and write the notes, it's when the rubber hits the road by writing a complete program that one really see the details and the limitations. That's why I am NOT following the book on all the search and sorts, I read the flow and logic to understand what they are trying to do, the method, but I then close the book, just get the question and write totally on my own. I am doing the bubble sort right now, I need an array of boolean to keep track of the steps of comparisons. Then I realize the limitations of using array. I have to use vector now in order to leave the choice for user input the number of numbers for sorting. The example of the book use a fixed array of numbers, that's too easy. I want user to be able to enter the amount of numbers as they want and sort.

It's fun to actually write a complete program.

Mark44 · Aug 17, 2020

yungman said:

Also, in the process of declaring the bool A[10], I found out I had to initialize with {1, 1, 1, 1, 1, 1, 1, 1, 1, 1} INSTEAD of {1} for all 10 elements. Why?

If you declare you array this way, bool A[10] = {1};, the 0-index element is initialized to 1, and the other 9 are initialized to 0. If you want all of the elements to be 1, you have to supply an initializer list that does this. (Or you could assign values to the array in a for loop.)

jedishrfu said:

This is a hard and implementation dependent question. My guess would be arrays would be faster from a memory caching point of view with prefetch of data for the cache.

I would say the opposite -- that accessing a scalar variable is faster than accessing an array element. With an array element, the run time has to calculate the address of the array element, based on the starting address of the array, to which is added the offset to the element in question. With a scalar variable, the run time merely needs to access the memory address to get the value.

Edit:
On an x86 Windows machine, here is what the Visual Studio C++ compiler generates. All variables are of type int.
First, a scalar variable is accessed and its value is assigned to variable, x.

Code:

// x = num;
mov eax, dword ptr[num]
mov dword ptr[x], eax

The assembly code above copies the value in num to the EAX register, and then copies it to the variable x.

Next, an element of an array is accessed, and its value is assigned to a variable, y.

Code:

// y = Arr[3]
mov eax, 4
imul ecx, eax, 3
mov edx, dword ptr Arr[ecx]
mov dword ptr [y], edx

The first two lines of assembly set the EAX register to 4 (the size of an int), and then multiply it by 3 (the index in the array), storing the value 12 in the ECX register. The number we're looking for is at an offset of 12 bytes from the beginning of the array. The last two lines copy the value at that location to the EDX register, and then copy it to the variable y.

Mark44 · Aug 17, 2020

yungman said:

Then I realize the limitations of using array. I have to use vector now in order to leave the choice for user input the number of numbers for sorting. The example of the book use a fixed array of numbers, that's too easy. I want user to be able to enter the amount of numbers as they want and sort.

It's fun to actually write a complete program.

The arrays your are using are allocated at compile time on the stack. If you want to use an array that is allocated dynamically at run time, or whatever size the user wants, you can do that in C using malloc(), or in C++ using the new operator in its array form.

These are probably a bit ahead of where you are in the book.

yungman · Aug 17, 2020

Mark44 said:

The arrays your are using are allocated at compile time on the stack. If you want to use an array that is allocated dynamically at run time, or whatever size the user wants, you can do that in C using malloc(), or in C++ using the new operator in its array form.

These are probably a bit ahead of where you are in the book.

No, I have not learn that. I'll wait.

I already experimented using vector and I can do it using vector. But I have a more pressing question now.

In normal situation when we use cin or cin.get, we enter a number, then we have to hit ENTER. If I want to enter any number of integers as I want and want to stop whenever I want, I don't know how to make it easier.

Right now, I can ask user to enter a number, hit enter. Then I ask user to hit any key to enter another number or hit ENTER to quit. But that's a lot of trouble as you enter a number then hit enter, then hit any key follow by enter to repeat. So it takes 4 steps to enter one number as shown in the program:

C++:

//Efficiency
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<int> Av;
    int num;
    char key = 'A';
    while (key != '\n')
    {
        cout << " Enter an integer to be store in the dynamic array Av "; cin >> num;
        Av.push_back(num);
        cout << "\n\n Hit any key to enter another integer, hit ENTER to quit ";
        cin.ignore(255, '\n');
        cin.get(key);
    }
    int x = Av.size();
    cout << " You enter these number into dynamic array Av\n\n";
    cout << " {";
    for (int i = 0; i < x; i++)
        cout << Av[i] << " ";
    cout << "} \n\n";
    return 0;
}

Is there any way to simplify that? Something like number1-spacebar-number2-spacebar...until hit enter to quit? I just want to make it easier to enter numbers. I am sure the chapters I studied so far don't have anything like this.

Thanks

Jarvis323 · Aug 17, 2020

yungman said:
I have two questions here. In the program, I have two different boolean expressions, one using array A[10], the other just using individual variablesB(0..9). They both do the same thing as in the program. My question which one is faster in real life hardware? My guess is B(0..9) is faster even though it looks stupid. I think each variable use individual address, so it's one step access. The one using array has to take the starting address, add the index to get the address before accessing the value. I put a lot more emphasis on efficiency than make the code looks "nice".

Also, in the process of declaring the bool A[10], I found out I had to initialize with {1, 1, 1, 1, 1, 1, 1, 1, 1, 1} INSTEAD of {1} for all 10 elements. Why?
C++:
//Efficiency
#include <iostream>
using namespace std;

int main()
{
    bool A[10] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
    bool B0 = true, B1 = true, B2 = true, B3 = true, B4 = true, B5 = true,
        B6 = true, B7 = true, B8 = true, B9 = true;
    bool An, Bn; // result of AND

    An = A[0] && A[1] && A[2] && A[3] && A[4] && A[5] && A[6] && A[7] && A[8] && A[9];
    Bn = B0 && B1 && B2 && B3 && B4 && B5 && B6 && B7 && B8 && B9;
    cout << " An = " << An << "    Bn = " << Bn << "\n\n";
    return 0;
}
Thanks

Hi Yungman, I am writing you from the future, to warn you to watch out for std::vector< bool >. It isn't what it appears to be.

Mark44 · Aug 18, 2020

yungman said:

Right now, I can ask user to enter a number, hit enter. Then I ask user to hit any key to enter another number or hit ENTER to quit. But that's a lot of trouble as you enter a number then hit enter, then hit any key follow by enter to repeat. So it takes 4 steps to enter one number as shown in the program:

It's a problem when you mix different types of input; i.e., numbers for the vector, and characters to signal the loop to stop. Perhaps a better way is to use a sentinel value -- a value that wouldn't be expected for valid input. In the loop below, I'm assuming that valid numbers are positive, so if the user enters a negative number, the input loop exits.

If that's not feasible, you could ask the user how many numbers will be input, and then run the loop that many times.

Some comments:

The user doesn't care that the vector is named Av, or even that you are using a dynamic array. Don't include that extraneous information in the user prompt.
Try to come up with better variable names. Av and x are just about useless, as these names don't convey to the reader (another programmer who is reviewing or maintaining your program).
It's better to use separate using std::xxx statements, rather than bringing in the whole very large std namespace. I gave an example in another thread of what can go wrong. It's especially bad practice to put such a statement into a header file.

Here's my version of your code.

C++:

#include <iostream>   
#include <vector>       
using std::cout;
using std::cin;
using std::vector;

int main()
{
    vector<int> List;
    int num;
    
    while (true)
    {
        cout << " Enter a nonnegative integer, or a negative number to quit "; 
        cin >> num;
        if (num < 0) break;
        List.push_back(num);        
    }

    int listSize = List.size();
    cout << " You entered these numbers\n\n";
    cout << " {";
    for (int i = 0; i < listSize; i++)
        cout << List[i] << " ";
    cout << "} \n\n";    
}

yungman · Aug 18, 2020

Jarvis323 said:

Hi Yungman, I am writing you from the future, to warn you to watch out for std::vector< bool >. It isn't what it appears to be.

Can you explain, it's not in the book. Doing a little guessing here, it works! Maybe, I am too much ahead of myself, wanting to do too much for what I know.
thanks

Jarvis323 · Aug 18, 2020

yungman said:

Can you explain, it's not in the book. Doing a little guessing here, it works! Maybe, I am too much ahead of myself, wanting to do too much for what I know.
thanks

In the early days of C++, they wanted to make vector< bool > more memory efficient. A bool is just true or false, so it can be represented with a single bit. So they specialzed vector< bool > so that it doesn't contain actual bools, it contains bits. This violates some rules about what a vector is supposed to be. And note that memory is addressed and operated on by the machine at the resolution of bytes.

Normally, it's safe to write to different vector elements from different threads. But not in this case.

So a parallel for loop like below,

C:

const size_t END = vb.size();
#pragma omp parallel for
for( size_t i = 0; i < END; ++i )
{
    vb[ i ] = something(i);
}

is a bug if vb is a vector<bool>, whereas it would be fine for any other type of vector. I once spent a few days perplexed why my code was giving wrong results, and then learned this was why.

There are some other issues with vector< bool > as well, and you can find a lot of blog posts about them. It's considered by most to have been a mistake to add this kind of special case for vector< bool >, but for backwards compatibility it's a mistake the C++ committee needs to live with. Nowdays, there is std::bitset as well. If you can help it, I think it's best to just treat vector< bool > as depreciated.

yungman · Aug 18, 2020

Jarvis323 said:
In the early days of C++, they wanted to make vector< bool > more memory efficient. A bool is just true or false, so it can be represented with a single bit. So they specialzed vector< bool > so that it doesn't contain actual bools, it contains bits. This violates some rules about what a vector is supposed to be. And note that memory is addressed and operated on by the machine at the resolution of bytes.

Normally, it's safe to write to different vector elements from different threads. But not in this case.

So a parallel for loop like below,
C:
const size_t END = vb.size();
#pragma omp parallel for
for( size_t i = 0; i < END; ++i )
{
    vb[ i ] = something(i);
}
is a bug if vb is a bool, whereas it would be fine for any other type of vector. I once spent a few days perplexed why my code was giving wrong results, and then learned this was why.

There are some other issues with vector< bool > as well, and you can find a lot of blog posts about them. It's considered by most to have been a mistake to add this kind of special case for vector< bool >, but for backwards compatibility it's a mistake the C++ committee needs to live with. Nowdays, there is std::bitset as well. If you can help it, I think it's best to just treat vector< bool > as depreciated.

Thanks for taking the time to explain, bottom line is one cannot write one bit at a time......BUT, you might give me a new idea.....If I just use char 'T' for true, 'F' for false or something similar. Something along with this idea:

C++:

//Bytewise logic
#include <iostream>
using namespace std;

int main()
{
    short int a, b, c, d, e, f;
    a = 0x00FF;
    b = 0x0000;
    d = 0x00AA;
    e = 0x0055;

    c = a | b;
    cout << " c = " << c << "\n\n";
    c = a & b;
    cout << " c = " << c << "\n\n";

    f = d ^ e;
    cout << " f = " << f << "\n\n";

    return 0;
}

Idea is if I can find a 1byte char or int and set True=0xFF and False=0x00. I can do AND and OR with that. That, I can simulate a vector of boolean, each 1byte...Hell, if I have to use 2bytes short int, so be it. I can create my "boolean" vector or array.

I need to think about this more.

The reason I need this is because I am writing the bubble sort, I need to keep track each pair whether I have to flip them or not, so I need a parallel vector so I can use the same index to access, and check whether all are true, then I know the sort is complete.

I am doing it my own way, I want to write my own bubble sort BEFORE I even look at the book how they do it. If I keep looking at the book first, I'd never learn.

Jarvis323 · Aug 18, 2020

yungman said:
Thanks for taking the time to explain, bottom line is one cannot write one bit at a time......BUT, you might give me a new idea.....If I just use char 'T' for true, 'F' for false or something similar. Something along with this idea:
C++:
//Bytewise logic
#include <iostream>
using namespace std;

int main()
{
    short int a, b, c, d, e, f;
    a = 0x00FF;
    b = 0x0000;
    d = 0x00AA;
    e = 0x0055;

    c = a | b;
    cout << " c = " << c << "\n\n";
    c = a & b;
    cout << " c = " << c << "\n\n";

    f = d ^ e;
    cout << " f = " << f << "\n\n";

    return 0;
}
Idea is if I can find a 1byte char or int and set True=0xFF and False=0x00. I can do AND and OR with that. That, I can simulate a vector of boolean, each 1byte...Hell, if I have to use 2bytes short int, so be it. I can create my "boolean" vector or array.

I need to think about this more.

The reason I need this is because I am writing the bubble sort, I need to keep track each pair whether I have to flip them or not, so I need a parallel vector so I can use the same index to access, and check whether all are true, then I know the sort is complete.

I am doing it my own way, I want to write my own bubble sort BEFORE I even look at the book how they do it. If I keep looking at the book first, I'd never learn.

Yeah, for writing in parallel, you might try using a vector of char or uchar instead of bool. Or maybe you could just be careful that more than one thread cannot write to the same byte and use a bitset?

You'll probably be doing sequential programming for quite a while before you have to worry about it. It's only when you have multiple threads where it becomes unsafe.

willem2 · Aug 18, 2020

Mark44 said:
Edit:
On an x86 Windows machine, here is what the Visual Studio C++ compiler generates. All variables are of type int.
Next, an element of an array is accessed, and its value is assigned to a variable, y.
Code:
// y = Arr[3]
mov eax, 4
imul ecx, eax, 3
mov edx, dword ptr Arr[ecx]
mov dword ptr [y], edx
The first two lines of assembly set the EAX register to 4 (the size of an int), and then multiply it by 3 (the index in the array), storing the value 12 in the ECX register. The number we're looking for is at an offset of 12 bytes from the beginning of the array. The last two lines copy the value at that location to the EDX register, and then copy it to the variable y.

This is in debug mode. in release mode, you get just:

Code:

//static array, the compiler can compute the adress of Arr[3] at compile time.
 move edx, [Arr + 12] for a static array
move [y], edx 
//array on the stack. The data will be at a known offset from ebp 
//ebp = previous value of the stack pointer, used for all local variables and parameters.
move edx, [ebp + offset] 
move [y], edx

And there is no difference between an array, and ordinary local variables

Mark44 · Aug 18, 2020

willem2 said:

This is in debug mode.

willem2 said:

And there is no difference between an array, and ordinary local variables

I'll have to take your word for this. If I compile in Release mode in VS, the debugger shows only some of the assembly code that is generated.

willem2 · Aug 18, 2020

Mark44 said:

I'll have to take your word for this. If I compile in Release mode in VS, the debugger shows only some of the assembly code that is generated.

The code was probably optimized away. It's not uncommon to see your entire test program optimized into

Code:

mov eax, 0
ret

you could make the declaration of the array volatile
volatile int arr[10];
that should keep the compiler from optimizing code that uses the array.

You should be able to see all the disassembly that's actually generated and let the debugger step through it.

yungman · Aug 18, 2020

Jarvis323 said:
In the early days of C++, they wanted to make vector< bool > more memory efficient. A bool is just true or false, so it can be represented with a single bit. So they specialzed vector< bool > so that it doesn't contain actual bools, it contains bits. This violates some rules about what a vector is supposed to be. And note that memory is addressed and operated on by the machine at the resolution of bytes.

Normally, it's safe to write to different vector elements from different threads. But not in this case.

So a parallel for loop like below,
C:
const size_t END = vb.size();
#pragma omp parallel for
for( size_t i = 0; i < END; ++i )
{
    vb[ i ] = something(i);
}
is a bug if vb is a vector<bool>, whereas it would be fine for any other type of vector. I once spent a few days perplexed why my code was giving wrong results, and then learned this was why.

There are some other issues with vector< bool > as well, and you can find a lot of blog posts about them. It's considered by most to have been a mistake to add this kind of special case for vector< bool >, but for backwards compatibility it's a mistake the C++ committee needs to live with. Nowdays, there is std::bitset as well. If you can help it, I think it's best to just treat vector< bool > as depreciated.

Actually from your response, I came up with this to achieve taking a boolean input and store in a vector of programmable length and perform AND function ( I can do others also, but that's what I need). basically I use a vector <short int>, store 0x00FF for true, 0x0000 for false. I can perform boolean as shown:

C++:

//Bytewise logic
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<short int> Bo;
    int i = 0, index=0, andF; 
    char temp, temp1 ;
    cout << " This program creates a vector of boolean and AND them together\n\n";
    do
    {
        cout << " Enter t for True, f for False and any other key to quit "; cin >> temp;
        if (temp == 't')        Bo.push_back(0x00FF);
        else if (temp == 'f')    Bo.push_back(0x0000);
    } while ((temp == 't') || (temp == 'f'));

    cout << " You entered " << Bo.size() << " elements\n\n";
    index = Bo.size() - 1;
    cout << " You entered {";
    for (int i = 0; i <= index; i++)
    {
        if (temp1 = 't') Bo[i] == 0x00FF;
        else if (temp1 = 'f') Bo[i] == 0x0000;
        cout << temp1 << " ";
    }
    cout << "}\n\n";
    for (int j = 0; j <= index; j++)
        cout << Bo[j] << " ";
    cout << "}\n\n";
    cout << " Index = " << index << "\n\n";
    i = 0;
    do //AND all the elements together
    {
        andF = Bo[i] & Bo[i+1];
        Bo[i + 1] = andF;
        i++;
        cout << " i = " << i << ",  Bo[" << i << "] = " << Bo[i] << ",  andF = " << andF << "\n\n";
    } while (i < index);
    cout << " The result of all the elements of Bo AND together is " << andF << "\n\n";
    return 0;
}

Jarvis323 · Aug 18, 2020

yungman said:

Actually from your response, I came up with this to achieve taking a boolean input and store in a vector of programmable length and perform AND function ( I can do others also, but that's what I need). basically I use a vector <short int>, store 0x00FF for true, 0x0000 for false. I can perform boolean as shown:

C++:

//Bytewise logic
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<short int> Bo;
    int i = 0, index=0, andF;
    char temp, temp1 ;
    cout << " This program creates a vector of boolean and AND them together\n\n";
    do
    {
        cout << " Enter t for True, f for False and any other key to quit "; cin >> temp;
        if (temp == 't')        Bo.push_back(0x00FF);
        else if (temp == 'f')    Bo.push_back(0x0000);
    } while ((temp == 't') || (temp == 'f'));

    cout << " You entered " << Bo.size() << " elements\n\n";
    index = Bo.size() - 1;
    cout << " You entered {";
    for (int i = 0; i <= index; i++)
    {
        if (temp1 = 't') Bo[i] == 0x00FF;
        else if (temp1 = 'f') Bo[i] == 0x0000;
        cout << temp1 << " ";
    }
    cout << "}\n\n";
    for (int j = 0; j <= index; j++)
        cout << Bo[j] << " ";
    cout << "}\n\n";
    cout << " Index = " << index << "\n\n";
    i = 0;
    do //AND all the elements together
    {
        andF = Bo[i] & Bo[i+1];
        Bo[i + 1] = andF;
        i++;
        cout << " i = " << i << ",  Bo[" << i << "] = " << Bo[i] << ",  andF = " << andF << "\n\n";
    } while (i < index);
    cout << " The result of all the elements of Bo AND together is " << andF << "\n\n";
    return 0;
}

Why use two bytes instead of one? Also, in C and C++, 0 treated as false, and not zero is treated as true. You can do this for example:

C:

std::vector<uchar> vb;
while( 1 )
{
    char input = getUserInput();
    if(      input == 't' ) vb.push_back( true );
    else if( input == 'f' ) vb.push_back( false );
    else break;
}

if( vb.size() > 1 )
{
    for( size_t i = 1, end = vb.size(); i < end; ++i )
    {
        vb[ i ] = vb[ i ] && vb[ i - 1 ];
    }
    vb[ 0 ] = vb[ 1 ];
}

Also, be careful with code like index = v.size() - 1 You can print out v.size() -1 when v is empty to see what I mean.

pbuk · Aug 18, 2020

I have quite a lot to say, so here goes:

yungman said:

I have two questions here. In the program, I have two different boolean expressions, one using array A[10], the other just using individual variablesB(0..9). They both do the same thing as in the program. My question which one is faster in real life hardware?

Any sensible compiler on any hardware will pre-compute the address of a[10] and access it just as fast as b10. Actually most modern processors (and certainly any 32+ bit x86) have an instruction that makes accessing a[k] (where k is an integer variable) just as fast as b10.

You should learn about Premature Optimization.

You also need to know (but not now) that if you did need to write faster code you need to stay away from std::vector (and for that matter std::string and the other STL classes) because these do a lot of work behind the scenes managing memory allocation for you. With an array you have to do this yourself.

In summary at this stage in your learning about C++ you need to forget about making your code faster, it's like trying to improve your lap times in a car around a race circuit when you haven't even learned how to change gear (C++ is a race car so it has a clutch and a stick shift).

jedishrfu said:

This is a hard and implementation dependent question. My guess would be arrays would be faster from a memory caching point of view with prefetch of data for the cache.

As was pointed out later in the thread a[10] and b10 compile to the same code so even if there was a hardware level user memory prefetch it wouldn't differentiate between them.

yungman said:

In normal situation when we use cin or cin.get, we enter a number, then we have to hit ENTER. If I want to enter any number of integers as I want and want to stop whenever I want, I don't know how to make it easier.
...
Is there any way to simplify that? Something like number1-spacebar-number2-spacebar...until hit enter to quit? I just want to make it easier to enter numbers. I am sure the chapters I studied so far don't have anything like this.

Use std::getline, or if you don't want to learn something new use a sentinel value as @Mark44 suggested.

yungman said:

The reason I need this is because I am writing the bubble sort, I need to keep track each pair whether I have to flip them or not, so I need a parallel vector so I can use the same index to access, and check whether all are true, then I know the sort is complete.

I am doing it my own way, I want to write my own bubble sort BEFORE I even look at the book how they do it. If I keep looking at the book first, I'd never learn.

A bubble sort doesn't need a vector to keep track of anything, just a single boolean.

yungman said:

...

I think you can write better code than in this post, and it will make your life (and ours) much easier if you get into the habit of writing code that is easier to read.

C++:

    // Don't use PascalCase for variables, and pick better names (unless 'bo' really
    // does mean something you will remember in 2 years time).
    vector<short int> Bo;

    // Don't leave variables unitialized. Be consistent with your spacing around '='.
    int i = 0, index=0, andF;

    // You must be able to thing of better names than these - and initialize them.
    char temp, temp1 ;

        // Use constants or even macros  something like TRUE_VALUE and FALSE_VALUE
        // instead of typing literal values each time; you should be able to think of two
        // obvious reasons why, but there are some non-obvious ones too.
        if (temp == 't')        Bo.push_back(0x00FF);
        else if (temp == 'f')    Bo.push_back(0x0000);

    // The name index is usually used to loop over the elements of an array, we would
    // usually call a variable representing the number of elements 'length' or 'size',
    // and we would not reduce it by 1.
    index = Bo.size() - 1;

    // So this line would be for (int i = 0; i < length; i++) - notice we use < instead of <=
    // rather than subtracting 1 from the number of elements.
    for (int i = 0; i <= index; i++)

        // I think you need to revise what '=' and '==' do.
        if (temp1 = 't') Bo[i] == 0x00FF;
        else if (temp1 = 'f') Bo[i] == 0x0000;

    // Why a new variable j, this makes it look like an inner loop - what's wrong with i?
    for (int j = 0; j <= index; j++)

        // Omitting { } around this line is a fast route to a maintenance and debugging nightmare.
        cout << Bo[j] << " ";

    // This time you _have_ resused i!
    i = 0;

        // Why are you changing the values of the array?
        Bo[i + 1] = andF;

yungman · Aug 18, 2020

Jarvis323 said:

C:

std::vector<uchar> vb;
while( 1 )
{
    char input = getUserInput();
    if(      input == 't' ) vb.push_back( true );
    else if( input == 'f' ) vb.push_back( false );
    else break;
}

I don't recall I've seen uchar in the book. It can take true or false? You can perform Boolean operation on each element?

Also, what is while( 1 )?

Jarvis323 said:
C++:
if( vb.size() )
{
    for( size_t i = 1, end = vb.size(); i < end; ++i )
    {
        vb[ i ] = vb[ i ] && vb[ i - 1 ];
    }
    vb[ 0 ] = vb[ 1 ];
}
Also, be careful with code like index = v.size() - 1 You can print out v.size() -1 when v is empty to see what I mean.

I don't follow this part:
What is if( vb.size() ) ? Is it if vb.size() = 0, it's false, or else it's true?

Also: for( size_t i = 1, end = vb.size(); i < end; ++i ) , what is size_t? Is it something I have not learn? I only know for(i=0; i<vb.size(); i++) {} type expression.

I am only on chapter 8 of Gaddis' book.

thanks

Jarvis323 · Aug 18, 2020

yungman said:

I don't recall I've seen uchar in the book. It can take true or false? You can perform Boolean operation on each element?
Also, what is while( 1 )?

uchar is an 8 bit unsigned int. Like I said, in C and C++, 0 is treated as false and not 0 is treated as true.

size_t is an unsigned integer that represents the maximum size that a container can have. This is the type of the value that .size() returns. You might have seen compiler warnings when you assigned .size() to an int. It warns you because size() may be larger than an int can hold without overflowing. If your container had more than 2,147,483,647 elements, then the int would wrap around, for example becoming negative. It's critical to understand the implications of the types you are using. You can play around and try to see what happens when you exceed their limitations, when you use multiple types in a single expression, etc.

https://en.cppreference.com/w/c/types/size_t

yungman · Aug 18, 2020

Jarvis323 said:

uchar is an 8 bit unsigned int. Like I said, in C and C++ 0 is treated as false and not 0 is treated as true.

I see. thanks

Jarvis323 · Aug 18, 2020

pbuk said:

You also need to know (but not now) that if you did need to write faster code you need to stay away from std::vector (and for that matter std::string and the other STL classes) because these do a lot of work behind the scenes managing memory allocation for you. With an array you have to do this yourself.

That's not necessarily true. There is no performance overhead using a vector instead of a raw dynamic array. The only issue is reallocation, which vector will do if it runs out of space while you are pushing new elements. But if you made your own version of a vector, you would need to do the same anyways. Maybe for a custom use case, you can use a better reallocation strategy. That would be the only improvement I can imagine you might make. With vector you can initialize with a specific size, reserve a certain amount of space, etc. If using these features to avoid the vector reallocating memory is enough, then you will gain nothing from using your own vector implementation.

And other STL classes like map, unordered_map, and list are perfectly fine implementations of the corresponding data structures. You can implement your own hash map, but do you really need to, and why should you expect to get a performance benefit?

yungman · Aug 18, 2020

Jarvis323 said:

Also, be careful with code like index = v.size() - 1 You can print out v.size() -1 when v is empty to see what I mean.

How about
index = v.size()

Then in the later lines I use (i < (index - 1)) instead of (i < index)

Jarvis323 · Aug 18, 2020

yungman said:

How about
index = v.size()

Then in the later lines I use (i < (index - 1)) instead?

The point I was making is that v.size() has type size_t, which is an unsigned integer type. If you subtract 1 from an unsigned integer 0, then what happens is that it wraps around to the maximum value. So if subtracting from an unsigned type, you have to consider what happens if the value goes below zero, especially in loop conditions, because in this case subtraction can yield a larger (very large) value where a smaller value was intended.

What you proposed is fine, because index is a signed int, so it can go below zero to end the loop.

Jarvis323 · Aug 18, 2020

yungman said:
I have two questions here. In the program, I have two different boolean expressions, one using array A[10], the other just using individual variablesB(0..9). They both do the same thing as in the program. My question which one is faster in real life hardware? My guess is B(0..9) is faster even though it looks stupid. I think each variable use individual address, so it's one step access. The one using array has to take the starting address, add the index to get the address before accessing the value. I put a lot more emphasis on efficiency than make the code looks "nice".

Also, in the process of declaring the bool A[10], I found out I had to initialize with {1, 1, 1, 1, 1, 1, 1, 1, 1, 1} INSTEAD of {1} for all 10 elements. Why?
C++:
//Efficiency
#include <iostream>
using namespace std;

int main()
{
    bool A[10] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
    bool B0 = true, B1 = true, B2 = true, B3 = true, B4 = true, B5 = true,
        B6 = true, B7 = true, B8 = true, B9 = true;
    bool An, Bn; // result of AND

    An = A[0] && A[1] && A[2] && A[3] && A[4] && A[5] && A[6] && A[7] && A[8] && A[9];
    Bn = B0 && B1 && B2 && B3 && B4 && B5 && B6 && B7 && B8 && B9;
    cout << " An = " << An << "    Bn = " << Bn << "\n\n";
    return 0;
}
Thanks

I think that in the future, when you've learned about structs and classes, then you will probably want to implement a series of related variables like B0-10 as a struct if not an array. An example would be a mathematical vector.

C:

struct Vec3
{
    float x;
    float y;
    float z;
} vec3;
//or
float vec3[3];

In that case, they should be basically equivalent under the hood. Both are guaranteed to be contiguous in memory. There isn't any reason I can think of why a compiler should generate different code when either of them are used. Being contiguous can be good for the performance, because a single machine instruction can load multiple at once, or even operate on multiple at once, or cache a large chunk of them.In most cases, don't worry about these kind issues you brought up when it comes to performance. Worry about the algorithms and data structures, worry about making unnecessary copies of large objects (pass them by reference instead of by value), then worry about parallelism and vector instructions or maybe using the GPU, then worry about caching issues and so forth. Focus on the slow parts. After you've gone through all of the serious optimization strategies, then you can worry about these type of issues you brought up. When you do that, you need to do trial and error, maybe look at the assembly, and don't expect it to be the same for a different compiler or machine, or when the whole program has changed.

jtbell · Aug 18, 2020

yungman said:

I don't recall I've seen uchar in the book.

Jarvis323 said:

uchar is an 8 bit unsigned int. Like I said, in C and C++, 0 is treated as false and not 0 is treated as true.

uchar appears to be a Microsoft thing. I didn't remember seeing it before, so I did a Google search for "c++ uchar" and came up with this from Stackoverflow:

If you try to find definition of uchar (which is pressing F12 if you are using Visual Studio), then you'll end up in OpenCV's core/types_c.h:

#ifndef HAVE_IPL
typedef unsigned char uchar;
typedef unsigned short ushort;
#endif

When I tried to compile a simple test program that declares a uchar variable, using g++, I got this:

Code:

% g++ -std=c++11 uchartest.cpp
uchartest.cpp:5:5: error: unknown type name 'uchar'
    uchar foo;
    ^
1 error generated.

yungman's textbook probably sticks to standard C++.

Jarvis323 · Aug 18, 2020

jtbell said:
uchar appears to be a Microsoft thing. I didn't remember seeing it before, so I did a Google search for "c++ uchar" and came up with this from Stackoverflow:
When I tried to compile a simple test program that declares a uchar variable, using g++, I got this:
Code:
% g++ -std=c++11 uchartest.cpp
uchartest.cpp:5:5: error: unknown type name 'uchar'
    uchar foo;
    ^
1 error generated.
yungman's textbook probably sticks to standard C++.

Oh, thanks for pointing that out. I usually use uint8_t so I didn't realize this. I only used uchar when using OpenCV since it's the type OpenCV loads an image as. I just naively thought it was standard, and it compiles fine for me with g++ when using OpenCV, but it turns out that's only because OpenCV defines it in its header.

jedishrfu · Aug 18, 2020

With respect to the array vs single variable issue, there’s a scheme called cache prefetch where sequential memory locations as in instructions or an array can be loaded ie prefetched for use when the processor gets to those steps.

https://en.wikipedia.org/wiki/Cache_prefetching

it can be either hardware, software or a mix of the two schemes to improve program performance. In this instance, arrays in loops can be processed faster than via memory addressing.

yungman · Aug 18, 2020

I just want to update everyone, It is correct I DO NOT need the boolean array, just one boolean variable is good enough. I am on the bubble sort now. I am stuck on some other things that the program got hung up. I want to spend some time before I cry for help. I need to learn to solve my own problem...at least give it a real try.

Thanks.

jedishrfu · Aug 18, 2020

Always remember the print statement is your friend. It can always help you to find most of your bugs unless you introduce one in it too.

yungman · Aug 18, 2020

jedishrfu said:

Always remember the print statement is your friend. It can always help you to find most of your bugs unless you introduce one in it too.

If I can give you a double like, I would! I did NOT know I can print the .cpp! I always had to copy onto word.

Yes, it's much easier to read on paper, so far I am stuck, hopefully reading from the paper can help. I still have not taken a peek at the the program in the book, I want to do it on my own first. But it's getting very short already. I must have put something in an infinite loop somewhere. But when I put in cout<<"..." to print out where I am in the program at different spot, it gave me correct answer.

Thanks

BTW, I only follow the uchar discussion on and off, too busy. So should I put in my notes to use it if it only works in VS? But in reality, even if I use short int that is 2bytes, it's not a big deal as memory is cheap now a days. At least I don't need it now, but spending a day looking into this sure learn a lot.

yungman · Aug 18, 2020

I got it. With the print out, it's easier, I spotted one error from the print, but the most important and stupid one is I put a ';' at the end of the function definition. It skip the function all together!

C++:

// 8.4 Alan bubble sort
#include <iostream>
#include <vector>
using namespace std;
void sortAr(vector <int>&, int&);//sorting function with reference parammeters of vector SA and size of the vector
void InputAr(vector <int>&, int&);// to get numbers from user.
int main()
{
    int sz;
    vector <int> SA;// vector to store the numbers from user for sorting
    InputAr(SA, sz);// For user to input numbers for sorting
    cout << " The size of vector SA is " << SA.size();
    cout << "     SA = {";
    for (int i = 0; i < sz; i++)
        cout << SA[i] << " ";
    cout << "} \n\n";

    sortAr(SA, sz);
    cout << " The result after sorting = {";
    for (int i = 0; i < sz; i++)
        cout << SA[i] << " ";
    cout << "} \n\n";
    return 0;
}
void InputAr(vector<int>&Av, int &size)// to get numbers from user.
{
    int num;
    do
    {
        cout << " Enter any +ve integer to be sorted, enter -1 to quit "; cin >> num;
        if (num >= 0)
            Av.push_back(num);// Put new number into the last element starting with Av[0].
    } while (num >= 0);// only take numbers >=0, -ve number to terminate.
    cout << "\n\n";
    size = Av.size();
}

void sortAr(vector <int>& AR, int &size)
{       
    bool nSwap;
    int h = 1, temp;
    do
    {
       
        nSwap = true;// start the do-while with nSwap = true, let the inner while loop to change to false.
        h = 1; //Reset h = 1 and start over again.
        while (h < size)
        {
            if (AR[h-1] > AR[h])
            {
                nSwap = false;// If there is a swap, it is change to false
                temp = AR[h-1];
                AR[h-1] = AR[h];
                AR[h] = temp;//swap AR[h-1] with AR[h]
            }
            h = h + 1;
        }
    } while (nSwap == false);// repeat as long as one pair of elements need to be swap.
}

The ';' I put was at the end of line 38! That screwed up everything.

I looked at the program in the book, the sorting is very similar, but I made it fancier by letting user to input any number and any amount of numbers to sort. The book only use a fixed 6 elements array to sort. That's too easy. I am stoked.BUT, I still have an issue. I am supposed to put -1 to signal I want to quit after entering the numbers. BUT if I hit other key like 'q', it will run non stop and I can see the cmd screen keep rolling and rolling until I close the cmd window. Why?

Thanks

jedishrfu · Aug 19, 2020

I got burned by the semicolon a few times when writing code in a hurry:

C:

while(x<3);
{
  ...do something...
}

vs

C:

while(x<3) { ;
   ...do something...
}

The semicolon error of the first one didn't execute the { } block repeatedly while x<3 so I changed my programming style to use the { on the same line where now any errant semicolon is inside the block and so becomes a null statement.

I think modern IDE tools may have options to spot the "mistake" and flag it.

Question about the efficiency of using an array or individual variables

Similar threads

Hot Threads

Recent Insights