Are Binary Files and Text Files the Same?

  • Thread starter yungman
  • Tags
    Binary files
Binary files and text files are fundamentally different in how they are interpreted by programs, despite both being sequences of bytes. Binary files contain data in a format specific to their type, while text files represent characters using encodings like ASCII or Unicode. The distinction between the two is crucial for understanding how data is stored and accessed, as programs treat them differently, particularly in file operations. Reading a file in binary mode retrieves the raw bytes, while reading in text mode interprets those bytes as characters, which can lead to the same visible output if the content is purely text. Understanding these differences is essential for effective file handling in programming.
  • #31
yungman said:
See, that's the problem in the book, I red lined on that page, the first line is dataType is the data type that you are converting to, and the value is the value that you are converting. You tell me.
Here's what it says on that page, including part that you underlined, and part that you didn't.
where dataType is the data type that you are converting to, and value is the value that you are converting. For example, the following code uses the type cast to store the address of an int in a char pointer variable.
Here's the line in question:
C++:
 ptr=reinterpret_cast<char*>(&x);
In what I quoted, dataType is char *, or char pointer, and value is &x, which is int pointer. The sentence that starts with "For example, ..." explains that an address of one type is being cast as an address of another type. IOW, the cast is changing the type of pointer.
yungman said:
I know for expert like you, you know exactly what's going on. But put yourself in the shoe of a student trying to learn, how deceiving can this be? One side of the mouth said it's data conversion, the other side said it's a pointer!!...in two sentences one after the other! I know they are character pointers, converting from one to another. But without finding more info, I have to assume the reinterpret_cast TRANSLATE the value to characters and put it in memory and giving back the pointer char* to point to that to be used! What else can I assume?
It's not deceiving at all if you read it carefully. And no, reinterpret_cast does not translate a value to characters. It converts a pointer of one type to a pointer of another type.
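A minimal sketch of that point, using only the standard library. The helper names same_address, rebuild_through_char_pointer and demo_pointer_cast are made up for illustration and are not from the book. It checks that the char pointer produced by the cast holds the same address as &x, and that the bytes reachable through it are the untouched bytes of x.

```cpp
#include <cassert>
#include <cstring>
#include <iostream>

// Hypothetical helper: true if the char* produced by the cast holds the
// same address as &x.
bool same_address(int& x)
{
    char* ptr = reinterpret_cast<char*>(&x);
    return static_cast<void*>(ptr) == static_cast<void*>(&x);
}

// Hypothetical helper: copies the bytes of x out through a char pointer and
// rebuilds an int from them. If the cast had translated the value into
// characters, the rebuilt int would differ from the original.
int rebuild_through_char_pointer(int x)
{
    const char* ptr = reinterpret_cast<const char*>(&x);
    int copy = 0;
    std::memcpy(&copy, ptr, sizeof copy);
    return copy;
}

// Call this from main() to run both checks.
void demo_pointer_cast()
{
    int x = 1234;
    assert(same_address(x));
    assert(rebuild_through_char_pointer(x) == 1234);
    std::cout << "same address, same bytes\n";
}
```

Only the type of the pointer changes; nothing is written to memory by the cast itself.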
 
  • #32
yungman said:
I kind of give up on this, the book make a big sting about file written in .txt mode and binary mode. The program in post #4 shows if I wrote and store in binary file, I can read back the EXACT same thing just reading back in text mode by using getline!

Post #16 is very similar also. I quote the page and show what exactly the book said. But there is inconsistency again.

Normally, when I study a subject, I get like 3 or 4 books and I am going to get a clear answer for all my questions. Not in this C++. C++ is NOT the most difficult subject, with all due respect, it's just there is no straight answers. It really does NOT help going online. This is an example of looking for reinterpret_cast:
https://en.cppreference.com/w/cpp/language/reinterpret_cast
This is anything BUT showing how to use reinterpret_cast. This is NOT that hard if you just read from the page of the book that I copy out, just a simple translation from one type to the other! But then it's NOT.

I studied advanced calculus, electromagnetics and microwave RF on my own and used them on the jobs for years, I don't think I am slow. You cannot put C++ in the same league as those. It's the inconsistency, it's almost like they make it up as they go. Like I got tripped by the "*" and "&" in the pointer and address. I never seen other scientific subject have the same symbol meaning totally different things.

It's like in post 16, the book clearly it is transforming from integer to char. Try use that as if it's char. It doesn't work. Unless I am so totally wrong reading the few lines in the page, it has problem. 4 books, no answer, go on line, you get those pieces like that.

If only I can be happy to work out problems like the book, no more and no less, I would not have nearly as much question. In fact I don't have any question if I just follow the procedure in the book! But as soon as I walk out of the line a little, then I find all the holes that the book missed, that never explain.

https://en.cppreference.com/w/cpp/language/reinterpret_cast

That seems like a pretty precise and complete explanation to me. And it does have examples. You can even edit the example, and recompile and run it right there on the web page.

The explanation gives more and more detail as it goes on, but I think this is all of it that you need to read to get the idea.

[Image: Screenshot_2020-09-19_11-10-39.png]

In fact, you can understand it pretty well if you just carefully read this sentence alone:

"Converts between types by reinterpreting the underlying bit pattern."

It might be important to know that "reinterpret the underlying bit pattern" implies that the underlying bit pattern is not changed, just interpreted differently.

All of that extra detail, in the list and so forth, is difficult to understand, even for intermediate programmers. In C++ there is a lot of detail, and you haven't even gotten to (and probably never will get to) the stuff that is most difficult to understand or memorize. I feel that C++ is almost like a bottomless pit. Most people just stop at some point and work with the knowledge they have. It can take many many years and dedication to become a true expert in C++. I've been using C++ for about 10 years, and I wouldn't consider myself even close to being a C++ expert. There is a lot in the language that I don't use and haven't bothered to memorize. My philosophy is to keep things relatively simple anyways. Whenever I feel the need to use features that I haven't studied in depth, or don't trust my memory, I have to check documentation. This is how programming is in general though, you get used to reading documentation efficiently and carefully.

Binary files become more useful when writing and reading non-text data (e.g. numeric data). For example, the float 2.984938984398489 written as text uses an 8-bit ASCII code for each digit as well as the period. That's 17 bytes. Meanwhile, its binary representation is 4 bytes (a 4-byte float keeps only about 7 significant digits, but the size comparison still holds). Also, its binary representation is exactly what is stored in memory, so after writing it you can expect to read it back and get exactly the same thing. If you were to read it as text, you would then have to use an algorithm to convert the string to a float.

Also binary file IO is much faster than text IO, because it's just literally copying the bits into memory and not doing any parsing/checking/encoding. For example, if you are using getline, it is scanning through the text for the next '\n'. That will be slow compared to just directly copying some fixed amount of it directly into memory.
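Here is a rough sketch of the size and round-trip points, assuming an in-memory std::stringstream as a stand-in for a binary file. The helper names text_size, binary_round_trip and same_bits are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// Bytes needed to store a number as text: one byte per character,
// including the decimal point.
std::size_t text_size(const std::string& digits)
{
    return digits.size();
}

// Writes the raw bytes of a float to an in-memory stream opened in binary
// mode and reads them back. No parsing or formatting is involved.
float binary_round_trip(float value)
{
    std::stringstream stream(std::ios::in | std::ios::out | std::ios::binary);
    stream.write(reinterpret_cast<const char*>(&value), sizeof value);

    float result = 0.0f;
    stream.read(reinterpret_cast<char*>(&result), sizeof result);
    assert(stream && "the in-memory read should not fail");
    return result;
}

// True if the two floats have identical bit patterns.
bool same_bits(float a, float b)
{
    return std::memcmp(&a, &b, sizeof a) == 0;
}
```

text_size("2.984938984398489") is 17, while the binary form takes sizeof(float) bytes, which is 4 on common platforms. same_bits(binary_round_trip(f), f) holds for any float f because the bytes are copied without conversion.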
 
Last edited:
  • #33
Mark44 said:
Here's what it says on that page, including part that you underlined, and part that you didn't.

Here's the line in question:
C++:
 ptr=reinterpret_cast<char*>(&x);
In what I quoted, dataType is char *, or char pointer, and value is &x, which is int pointer. The sentence that starts with "For example, ..." explains that an address of one type is being cast as an address of another type. IOW, the cast is changing the type of pointer.
It's not deceiving at all if you read it carefully. And no, reinterpret_cast does not translate a value to characters. It converts a pointer of one type to a pointer of another type.

Thanks for the explanation.

So there's no translation, just forcing the int* pointer to become char* pointer that point to the same address &x? That's it? Would it kill them to put in two extra words <datatype> is a pointer to data type it's converting to? I would not have wasted two days.

BTW, I tried your suggestion of &Iw[index] instead Iw[index] in post 24. This is what is written in the file:

[Image: Compile error Listing 7.2.jpg]

Iw[] = {1,2,3,4,5};
 
  • #34
yungman said:
So there's no translation, just forcing the int* pointer to become char* pointer that point to the same address &x? That's it? Would it kill them to put in two extra words <datatype> is a pointer to data type it's converting to? I would not have wasted two days.
It is just treating it as if it's a different type, but not changing what is stored in memory.

datatype in this example is just a type, not necessarily a pointer.

You could have saved those days just reading this sentence from en.cppreference carefully:

"Converts between types by reinterpreting the underlying bit pattern."
 
  • Like
Likes sysprog
  • #35
Jarvis323 said:
It is just treating it as if it's a different type, but not changing what is stored in memory.

datatype in this example is just a type, not necessarily a pointer.

You could have saved those days just reading this sentence from en.cppreference carefully:

"Converts between types by reinterpreting the underlying bit pattern."
Ha ha, my English is bad, this sounds even worst.
 
  • Like
Likes sysprog
  • #36
yungman said:
Ha ha, my English is bad, this sounds even worst.
Nevertheless, it's just one sentence you need to understand carefully. Reading several books and searching for different explanations is not going to help much. I'm just saying it might help to just zero in on the explanation at hand and try to understand it very carefully.
 
Last edited:
  • Like
Likes sysprog
  • #37
Jarvis323 said:
Nevertheless, it's just one sentence you need to understand carefully. Reading several books and searching for different explanations is not going to help much. I'm just saying it might help to just zero in on the explanation at hand and try to understand it very carefully.
I think that it might be helpful to @yungman if he were to read and carefully think over this: https://www.fluentcpp.com/2017/03/31/how-typed-cpp-is-and-why-it-matters/
 
  • #38
yungman said:
Would it kill them to put in two extra words <datatype> is a pointer to data type it's converting to? I would not have wasted two days.
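Maybe it wouldn't kill them, but the extra words make the explanation incorrect. <datatype> doesn't have to be a pointer type; with reinterpret_cast it could also be a reference type or an integer type wide enough to hold an address, for example.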

Jarvis323 said:
I'm just saying it might help to just zero in on the explanation at hand and try to understand it very carefully.
+1

yungman said:
BTW, I tried your suggestion of &Iw[index] instead Iw[index] in post 24. This is what is written in the file:

[Image: compile-error-listing-7-2-jpg.jpg]

Iw[] = {1,2,3,4,5};
The last code you posted was in post #24. That code couldn't have produced this output. Also, if you write to a binary file, to see what's in it, you need to use a binary file viewer. You can't just use "cout << ...".
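If no binary file viewer is at hand, a few lines of C++ can stand in for one. This is only a sketch; the function name hex_dump_file is made up for illustration. It reads the file in binary mode and formats each byte as two hex digits.

```cpp
#include <cassert>
#include <fstream>
#include <iomanip>
#include <sstream>
#include <string>

// Returns the contents of the file as space-separated two-digit hex bytes,
// e.g. "41 00 FF". Returns an empty string if the file cannot be opened.
std::string hex_dump_file(const std::string& path)
{
    assert(!path.empty());
    std::ifstream in(path, std::ios::in | std::ios::binary);

    std::ostringstream out;
    out << std::hex << std::uppercase << std::setfill('0');

    char c;
    bool first = true;
    while (in.get(c)) {
        if (!first) {
            out << ' ';
        }
        // Convert through unsigned char so bytes >= 0x80 are not sign-extended.
        out << std::setw(2)
            << static_cast<unsigned>(static_cast<unsigned char>(c));
        first = false;
    }
    return out.str();
}
```

Printing the returned string with cout shows the byte values themselves, which is different from printing the file's bytes as characters.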
 
  • Like
Likes sysprog
  • #39
@yungman, you have exhibited a lot of confusion in this thread, as has also been the case in several other threads.
Here's the short version of the part of this thread, from post #16 all the way to post #33.
In post #16, you were confused about an explanation of reinterpret_cast.
I explained what was happening in posts 18 and 23.

In post #24, you said "this is converting from integer value of x to a character."
This indicates that you didn't understand the previous two explanations.
I gave another explanation for what is happening in the cast in post #25.

In post #30, you were still confused, saying that the book is deceiving you.
I replied in post #31 with another explanation.

Finally in post #33, you got it.

I see several reasons for your frequent confusion.
  1. Either you don't read the explanations, and press on anyway, or you read them, but don't understand the terms being used. If you don't understand what someone is saying, ask for an explanation. I know that English is not your native language, and some of the others here do, too, so we don't mind rephrasing things in different words.
  2. Sometimes you are too loose in your use of words when you are explaining something. For example, saying several times that an int was being cast as a character, or that an int was being cast as a char pointer. You really need to pay more attention to the details.
  3. You seem to think that because you used to design hardware, or you wrote code in Pascal, Fortran, and some assembly 40 or however many years back, that learning C++ should be easy. You've managed to get through 700 pages, of which the first 400 or 500 were pretty straightforward, but once you get to pointers, things have gotten a lot more tricky. When you ask about a problem you're having, and an explanation has been given, go back and check your assumptions, and see if what you're assuming might be false. Also see item 1 in this list.
  4. Get more comfortable with the debugger.
 
  • Like
Likes sysprog, Vanadium 50 and PeterDonis
  • #40
I have to spend time to read the responses, I have been reviewing files from the beginning and run into a strange thing I can't explain:
Code:
    ofstream outF;
    ifstream inF;
    outF.open("data.txt", 51);
    outF << Cw; 
    outF.close();
    inF.open("data.txt", 51);    //outF.close();
    //inF >> Cr;
    inF.getline(Cr, 51);
    cout << " read from file into Cr: " << Cr << "\n\n";
    inF.close();
When I put outF.close() in line 5, it will NOT write the content of char Cw[] into "data.txt".

BUT when I put outF.close() in line 6 as shown in the comment, it works. I can read the data.txt and it's there. It works if I put anywhere before before return 0;
I did rebuild solution, clean solution, getting out of the program and even restart the computer. Seems to be something I did wrong here in the program.
Why?

Thanks
 
  • #41
I think that if you use the debugger facility, as repeatedly suggested by @Mark44, then you will probably understand better how and why things are working as they are . . .
 
  • #42
I start reading the debug instruction and run VS, I hit F5, nothing happen, I went to the debug menu and start debug, it just ran the program. I started the debug, then I try to set break point by putting the cursor on the line of code and hit F9, nothing happened. Basically, nothing works.

Is it when I set the break point at certain line, it will stop and display the value of the variables? The article use C# as example, I really don't follow the program.

Did I miss something to get start, seems like none of the F keys works. Can anyone advice?

Thanks
 
  • Like
Likes sysprog
  • #43
On many keyboards the function keys default to 'multimedia keys' so that they control the volume, screen brightness etc - there may be different-coloured symbols on them illustrating this. You can get the function keys to perform their 'normal' function by holding down the 'Fn' key (usually near the bottom left of the keyboard).

Some keyboards have a 'function lock' key (often Fn-Esc) so you don't have to hold down Fn, or you can change the default setting in the BIOS.

Check the documentation/manufacturers site for your laptop.
 
  • Like
Likes yungman and sysprog
  • #44
Did you compile for debug? That is the first thing to check. Setting a breakpoint is as easy as right-clicking on the very left-hand side of the line.
 
  • Like
Likes yungman and sysprog
  • #45
yungman said:
I start reading the debug instruction and run VS, I hit F5, nothing happen
If you don't have a breakpoint set, VS will run the program and exit the debugger.
yungman said:
then I try to set break point by putting the cursor on the line of code and hit F9, nothing happened. Basically, nothing works.
F9 toggles a breakpoint - Use F10 to single-step through a program.
If you're working on a laptop, you might have to hit another key in addition to F5 or F10, as pbuk mentioned.
yungman said:
Is it when I set the break point at certain line, it will stop and display the value of the variables? The article use C# as example, I really don't follow the program.
Things work the same in C# or C++.
In the Debug menu, it gives all the function key shortcuts.
What I do is put a breakpoint somewhere near the start of the program, then hit F5 to execute to the breakpoint, then hit F10 to single-step from there.

I use F11 to "step into" a function if I want to see what it's doing. F10 will "step across" a function, so won't go into the body of the function.
 
  • Like
Likes yungman and sysprog
  • #46
yungman said:
I have to spend time to read the responses,
Please do this before making any new posts.
Part of the problems you've been having are because you haven't read the responses.
yungman said:
I have been reviewing files from the beginning and run into a strange thing I can't explain:
Code:
    ofstream outF;
    ifstream inF;
    outF.open("data.txt", 51);
    outF << Cw;
    outF.close();
    inF.open("data.txt", 51);    //outF.close();
    //inF >> Cr;
    inF.getline(Cr, 51);
    cout << " read from file into Cr: " << Cr << "\n\n";
    inF.close();
Is this from one of the books you've been reading? If so, using a "magic number", 51, for the open mode is very bad style. It took me about 20 minutes to find the definitions of the flag values that make up 51. When you open the input stream, inF, you are opening it in binary and input modes (which is good), but you are also opening it in truncate mode, so the contents are getting erased immediately before the call to getline(). This means that the output statement at the end will not display anything for the char array or string.
yungman said:
When I put outF.close() in line 5, it will NOT write the content of char Cw[] into "data.txt".

BUT when I put outF.close() in line 6 as shown in the comment, it works. I can read the data.txt and it's there. It works if I put anywhere before before return 0;
I did rebuild solution, clean solution, getting out of the program and even restart the computer. Seems to be something I did wrong here in the program.
Why?
I made one small change in your code, and it worked just fine, so either you were looking at the wrong thing as the cause of what you were seeing, or I don't understand what you're asking.
The change I made was where you opened the input stream inF.
C++:
inF.open("data.txt", std::ios::in | std::ios::binary);
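
To see the effect of the truncate flag in isolation, here is a minimal sketch. The helper names write_line and read_line are made up for illustration. It assumes, as described above, that the value 51 includes the truncate flag on this implementation; the numeric values of the open-mode flags are not fixed by the standard, which is one more reason to use the named flags.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Writes text to the file, replacing any previous contents.
void write_line(const std::string& path, const std::string& text)
{
    assert(!path.empty());
    std::ofstream out(path, std::ios::out | std::ios::binary);
    out << text;
}   // the ofstream destructor flushes and closes the file

// Opens the file for reading with the given mode and returns its first line.
std::string read_line(const std::string& path, std::ios_base::openmode mode)
{
    assert(!path.empty());
    std::ifstream in(path, mode);
    std::string line;
    std::getline(in, line);   // leaves line empty if nothing can be read
    return line;
}
```

After write_line(path, "hello"), calling read_line(path, std::ios::in | std::ios::binary) returns hello. Calling it with std::ios::in | std::ios::out | std::ios::trunc | std::ios::binary empties the file as it is opened, so the returned line is empty and the file stays empty afterwards.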

You're apparently still confused about binary files. If you write a character string to a file in binary mode, you'll get the same string back if you reopen the file in text mode.

Here's a different example that shows the difference between two files that have the same value written to them, but one file is opened in binary mode and the other in text mode. Even though both were written with the same value, the contents of the two files are different.
C++:
#include <iostream>
#include <fstream>
using namespace std;

int main()
{
    fstream dataFile1, dataFile2;
    dataFile1.open("demofile1.txt", ios::binary|ios::out);
    dataFile2.open("demofile2.txt", ios::out);
    unsigned x = 0xDEADBEEF;
    dataFile1.write((char *)(&x), sizeof(x));
    
    dataFile2 << x;
    dataFile1.close();
    dataFile2.close();    
}
Using a binary file viewer, the files contain the following bytes:
demofile1.txt - EF BE AD DE (4 bytes)
These four bytes are the little-endian representation of the hex number 0xDEADBEEF: the least significant byte, EF, is stored first.

demofile2.txt - 33 37 33 35 39 32 38 35 35 39 (10 bytes)
These ten bytes are the ASCII codes for the digits of the decimal number 3735928559, which is the base-10 representation of the hex number 0xDEADBEEF.
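The same comparison can be made without files. The sketch below builds, in memory, the bytes that write() and operator<< would each produce; the helper names binary_bytes, text_bytes and is_little_endian are made up for illustration. Which byte comes first in the binary form depends on the machine: on a little-endian machine such as an x86 PC it is EF, the least significant byte.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// The bytes a binary write() of x would put in the file: a copy of the
// object representation of x, in whatever byte order this machine uses.
std::string binary_bytes(unsigned x)
{
    std::string bytes(sizeof x, '\0');
    std::memcpy(&bytes[0], &x, sizeof x);
    assert(bytes.size() == sizeof x);
    return bytes;
}

// The bytes operator<< would put in the file: the decimal digits of x,
// one character per digit.
std::string text_bytes(unsigned x)
{
    return std::to_string(x);
}

// True on a little-endian machine (least significant byte stored first).
bool is_little_endian()
{
    unsigned one = 1;
    unsigned char first = 0;
    std::memcpy(&first, &one, 1);
    return first == 1;
}
```

For 0xDEADBEEF, text_bytes returns "3735928559" (one character per decimal digit), while binary_bytes returns sizeof(unsigned) bytes, typically 4.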
 
  • Like
Likes yungman
  • #47
Mark44 said:
Please do this before making any new posts.
Part of the problems you've been having are because you haven't read the responses.
Is this from one of the books you've been reading? If so, using a "magic number", 51, for the open mode is very bad style. It took me about 20 minutes to find the definitions of the flag values that make up 51. When you open the input stream, inF, you are opening it in binary and input modes (which is good), but you are also opening it in truncate mode, so the contents are getting erased immediately before the call to getline(). This means that the output statement at the end will not display anything for the char array or string.
I am so sorry!
That was wrong, I was playing with getline() that require the size that I set to 51. I don't need that for file.open. I was just modifying the file a step at a time from getline to file and I just copy over without thinking.

Yes, I deleted the 51 and it works. But I stop the C++ at the moment to learn the debugger. I have a heavy moment of inertia, takes a lot to get me off on what I am doing, now that I am off and onto debugger, I am going to try to learn the debugger before I come back to C++.

Thanks
 
Last edited:
  • #48
I was testing my laptop, the function key is NOT working as F keys. It's for volume up and down and all that. I confirmed by using LTSpice simulation and the F keys don't work at all. No wonder. I have been on HP site, I can't find my computer, but the general instruction is turn off the computer and power on and hit the F10 key to get into BIOs etc. I am going to have to do this first.

Finally I have to stop and learn the debugger, no choice, sounds quite useful if it can stop at a line and read the variables.

I never even stop and look at the F keys since I got this computer about 3 or 4 months ago. I have not been doing LTSpice lately as I am busy with C++. Never realize the F key is not working. Who needs the sound volume control and all that. I still use the volume bar on all the applications to turn the volume up and down. I know my other laptop works. Also, this stupid laptop comes with touch screen, I hate that, I have the mouse and keyboard, why do I need touch screen other than gave me trouble when I accidentally touch the screen. I had to turn that off also.

Thanks
 
  • #49
yungman said:
I was testing my laptop, the function key is NOT working as F keys.
Does your computer have function keys? I looked at one of the HP laptops on their site, and they have a row of what they call action keys along the top. The picture I saw wasn't very clear, but it looks like these keys have something else written on them, like f1, f2, and so on. Possibly they work by pressing the fn key at the lower left. Or maybe you have to disable the action key feature.

In any case, the function keys are useful shortcuts, but they're not necessary. All the commands under the Debug menu can be used by selecting them with the mouse cursor.
 
  • Like
Likes yungman and sysprog
  • #50
yungman said:
I have been on HP site, I can't find my computer.
If you go to https://support.hp.com/ there is a form to enter the serial number which will be on a label on the underside of your laptop. This will identify it unambiguously.

yungman said:
the general instruction is turn off the computer and power on and hit the F10 key to get into BIOs etc. I am going to have to do this first.
That sounds about right, the option to change the keys default mode should be fairly easy to find.
 
  • Like
Likes yungman and sysprog
  • #51
Thanks guys, I got it, after I hit F10 and got into the BIOS, I just search around and found the Action Key Mode and disable it, and then restarted the computer. It works. I can use the F keys like normal. I got into VS and manage to create breakpoints at will.

Also, I learn how to step line by line. But I can only start the stepping after I run to the first break point, then I can step one step at a time using the mouse arrow.

But I cannot manage to get the window that show the value of the variables now. I know I've seen it before, but now I cannot make it appear. How do you bring up that window?

Thanks
 
  • #52
yungman said:
But I cannot manage to get the window that show the value of the variables now. I know I've seen it before, but now I cannot make it appear. How do you bring up that window?
Debug menu --> Windows --> Locals
 
  • Like
Likes sysprog and yungman
  • #53
I stepped through the program and I can see step by step the value/string of variables. That works. My question is how do I determine whether inF.open(), outF.open() working or not? Also whether the stuffs are written into file or read back looking at the window below?
[Image: Compile error Listing 7.2.jpg]

The program is working, just want to know how to look at it if it fails.

Also, what else I should learn on debugger other than reading these. I want to get back to the C++!

Thanks
 
Last edited:
  • #54
yungman said:
I stepped through the program
In my view that's good. The 'diagnostic techniques' set of topics constitute an eventually-advanced topic set that pretty much begins with step-by-step analysis. I recommend that you please continue with your use of the debugger tool-set.
 
Last edited:
  • Like
Likes yungman
  • #55
yungman said:
The program is working, just want to know how to look at it if it fails.
Most of the I/O routines return a value or have a way to check whether they succeeded. In the code that I showed in post #46, I added a line to see if the file opened successfully.
C++:
dataFile1.open("demofile1.txt", ios::binary|ios::out);
if (dataFile1.fail()) std::cout << "Failed to open demofile1.txt" << "\n";
If the call to dataFile1.fail() is true, the file couldn't be opened for some reason, and my code prints a message to that effect.

I added a line after the call to write().
C++:
dataFile1.write((char *)(&x), sizeof(x));
if (!dataFile1) std::cout << "Failed to write to demofile1.txt" << "\n";
The write() function returns a reference to the stream it's writing to. If the call to write() fails, the stream's failbit or badbit is set, so !dataFile1 evaluates to true and my code prints a message.
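Gathered into one function, the checks might look like the sketch below. The name write_binary is made up for illustration; the stream calls (is_open, write, close, fail) are the standard ones.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: writes the bytes of value to path in binary mode.
// Returns false if the file could not be opened or the write failed.
bool write_binary(const std::string& path, unsigned value)
{
    assert(!path.empty());
    std::ofstream out(path, std::ios::out | std::ios::binary);
    if (!out.is_open()) {
        return false;        // open failed (failbit is set as well)
    }

    // write() returns the stream itself. Used in a condition, the stream
    // converts to false when failbit or badbit is set.
    if (!out.write(reinterpret_cast<const char*>(&value), sizeof value)) {
        return false;
    }

    out.close();             // flushes buffered bytes to the file
    return !out.fail();      // close() sets failbit if the flush failed
}
```

In the debugger you can step over each of these calls and watch which branch is taken, which answers the question of how to tell whether open() or write() worked.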
sysprog said:
I recommend that you please continue with your use of the debugger tool-set.
+1
 
  • Like
Likes yungman and sysprog
  • #56
I am back to char* ptr; ptr = reinterpret_cast<char*>(&x) where int x; I have been reading back old posts. I know now this is just changing an integer pointer(&x) to char pointer. So now it's char ptr = &x? Meaning it's still the address of x.
 
  • #57
yungman said:
I am back to char* ptr; ptr = reinterpret_cast<char*>(&x) where int x; I have been reading back old posts. I know now this is just changing an integer pointer(&x) to char pointer.
Yes. The bit pattern that represents an address doesn't change in the slightest. All that changes is that the address is now considered to be the address of a char instead of an int.
yungman said:
So now it's char ptr = &x? Meaning it's still the address of x.
No, not quite. The left side of what you wrote is wrong.
This says that ptr is a character, not the address of a character.
Here is a revision of the code you've been asking about that might clarify things for you. Each line is commented to explain what is happening.
C++:
int x = 0x1234;   // Declaration + initialization of x
int * pInt = &x;  // pInt holds the address of x
char * pChar;     // Declaration --pChar can hold the address of a character.
                  // Since pChar isn't initialized, it contains only a garbage value.
pChar = reinterpret_cast<char *>(pInt);     // The address in pInt is copied to pChar

After the last line executes, pInt and pChar contain exactly the same bit pattern; the address of x. The only difference is how this address is interpreted -- as the address of an int or the address of a byte.
Because of this difference, if we dereference these two pointers, we'll get different results.
cout << *pInt displays 4660, the decimal representation of 0x1234.
cout << *pChar displays 4, because the first byte in memory of the four-byte representation of 0x1234 is 0x34. As a char, this is the ASCII code for the character '4'. This value is due to the way bytes are stored in little-endian form (least significant byte first) on x86 Windows machines.
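A small sketch that can be stepped through in the debugger to see both results. The helper names read_as_int, read_as_char and is_little_endian are made up for illustration, and the '4' result assumes a little-endian machine, which x86 PCs are.

```cpp
#include <cassert>
#include <cstring>
#include <iostream>

// Value seen when the address of x is treated as the address of an int.
int read_as_int(const int& x)
{
    const int* pInt = &x;
    return *pInt;
}

// Value seen when the same address is treated as the address of a char:
// only the first byte in memory is read.
char read_as_char(const int& x)
{
    const char* pChar = reinterpret_cast<const char*>(&x);
    return *pChar;
}

// True on a little-endian machine (least significant byte stored first).
bool is_little_endian()
{
    unsigned one = 1;
    unsigned char first = 0;
    std::memcpy(&first, &one, 1);
    return first == 1;
}

// Call this from main() to print both interpretations.
void demo_dereference()
{
    int x = 0x1234;
    assert(read_as_int(x) == 4660);
    std::cout << read_as_int(x) << ' ' << read_as_char(x) << '\n';
}
```

On a little-endian machine read_as_char(x) is '4' (byte 0x34). On a big-endian machine with a 4-byte int the first byte would be 0x00 instead, while read_as_int(x) is 4660 either way.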
 
Last edited:
  • Like
Likes sysprog
  • #58
We're up to 57 posts now, and I believe the two major questions of this thread have been exhaustively answered, so I will close the thread. @yungman, if you have any lingering questions, you can send me PM. For new questions, please start a new thread.
 
  • Like
Likes Tom.G
