Hex editor

  • Thread starter kdinser
  • #1
337
0

Main Question or Discussion Point

Hi all, I want to make a very basic text editor for editing save game files. I already know the locations in the file I need to edit, but I'm not sure how to go about finding them in a bit file (right term???). I'm pretty familiar with general Visual Basic .NET programming and I can get along well enough with C and Java when I have to.

If someone can just get me started on how to open up a non-ASCII file (Visual Basic preferred), find byte locations (addresses???), and edit them, I think I can figure out the rest from there, as far as writing functions to edit the values I want goes. I want to test some very specific scenarios in the game and it would take me many hours to edit the files using a regular hex editor. Thanks for any help.
 

Answers and Replies

  • #2
Sane
221
0
It's the same as reading from an ASCII file, except that the mode you pass when opening the file (for example, to C's fopen) is "rb", as opposed to just "r". The "b" is for binary.

The buffer will receive the ordinal values of each byte consecutively, exactly as it would for an ASCII file. The reason you use the mode "rb" instead of "r" is that text mode may alter the data on some platforms: on Windows, for instance, it translates line endings and treats a 0x1A byte as the end of the file. Binary mode hands you every byte untouched.

If the file doesn't contain any bytes that text mode treats specially, you could even just read it as if it were an ASCII file. It would make no difference. Just because an ASCII file is humanly readable doesn't mean it's not stored in binary.
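
For the original question (open a binary file, go to a known byte location, change it), a minimal C sketch might look like the one below. The function name patch_byte is made up for illustration, and the sketch assumes the offset lies inside an existing file; it does not check that.

```c
#include <stdio.h>

/* Overwrite a single byte at `offset` in the file at `path`.
   Returns 0 on success, -1 on any failure.
   patch_byte is a hypothetical helper, not part of any library. */
int patch_byte(const char *path, long offset, unsigned char value) {
    FILE *fp = fopen(path, "r+b");   /* read/write, binary; file must exist */
    if (fp == NULL)
        return -1;
    /* Jump to the byte location, then write the new value over it */
    if (fseek(fp, offset, SEEK_SET) != 0 || fputc(value, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}
```

Calling patch_byte("save.dat", 3, 0xae) would replace the fourth byte of a file named save.dat (a placeholder name) with 0xae and leave everything else alone.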
 
  • #3
kdinser
337
0
Thanks Sane, I think I get what you're saying. I'll try it tomorrow and ask some more questions if I can't figure it out.

BTW, what kind of college class would something like this be covered in? I've taken 2 levels of Java programming, but never took the level 3 class called data structures. Is that where you learn about stuff like this?
 
  • #4
-Job-
Science Advisor
1,146
1
Not really, the Data Structures course usually covers stuff like Hashtables, Heaps, Linked Lists, Vectors, Trees, etc.
Anyway, your file is probably not ASCII, but just a sequence of arbitrary bytes. The idea is to read the bytes from the file and convert them into Hex digits. There will be two Hex digits per byte (since 1 Hex digit = 4 bits and 1 byte = 8 bits). Using Java this is straightforward. Here's a Java function i used recently, it receives an array of bytes and returns a Hex string.
Code:
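// Converts each byte to two lowercase hex digits
public String toHex(byte[] data) {
	StringBuffer sb = new StringBuffer();
	for (int i = 0; i < data.length; i++) {
		// Mask with 0xFF so a byte with its high bit set is not sign-extended
		String s = Integer.toHexString(((int) data[i]) & 0xFF);
		// toHexString drops leading zeros, so pad single digits
		if (s.length() == 1) {
			sb.append('0');
		}
		sb.append(s);
	}
	return sb.toString();
}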
Then you can open the file, read the bytes and convert them into Hex format.
Code:
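// needs: import java.io.File; import java.io.FileInputStream;
FileInputStream fis = new FileInputStream(new File("myfile.file"));
byte[] buffer = new byte[1024];
StringBuffer str = new StringBuffer();
int n;
// read() returns how many bytes were actually read, or -1 at end of file
while ((n = fis.read(buffer)) > 0) {
	byte[] chunk = new byte[n];
	System.arraycopy(buffer, 0, chunk, 0, n);	// convert only the bytes just read
	str.append(toHex(chunk));
}
fis.close();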
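
The "two hex digits per byte" idea can also be sketched in C by splitting each byte into its high and low 4 bits. This is only an illustration under stated assumptions: to_hex is a made-up name, and the caller is assumed to supply an output buffer of at least 2*len + 1 chars.

```c
#include <stddef.h>

/* Write 2*len lowercase hex digits plus a terminating '\0' into out.
   out must have room for at least 2*len + 1 chars (caller's responsibility).
   to_hex is a hypothetical helper name used for illustration. */
void to_hex(const unsigned char *data, size_t len, char *out) {
    static const char digits[] = "0123456789abcdef";
    size_t i;
    for (i = 0; i < len; i++) {
        out[2 * i]     = digits[data[i] >> 4];    /* high 4 bits */
        out[2 * i + 1] = digits[data[i] & 0x0F];  /* low 4 bits  */
    }
    out[2 * len] = '\0';
}
```

Using unsigned char for the input sidesteps the sign problem entirely, since its values always lie in 0..255.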
 
  • #5
Sane
221
0
kdinser said:
BTW, what kind of college class would something like this be covered in?
I've never learned any of this in a "class", per se. I guess inheritance just took care of it for me. Hahaha. :uhh: But in all seriousness, I assume you'd learn about this in any computer science course?

-Job- said:
Here's a Java function i used recently, it receives an array of bytes and returns a Hex string.
For your program, you shouldn't ever need to convert the binary into ASCII hex in order to use it. I'm not quite sure the motivation behind such, nor why Job would suggest the tedious feat. Everything should be kept in its native binary form, as it can be manipulated and represented most readily.

This is assuming the language you're using has an output format specifier that automatically, in constant time, converts to hexadecimal. Such exists in C, C++, Python, Java, to name a few.
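
As a rough illustration of what such a format specifier looks like in C, the sketch below formats one byte with "%02x". The helper name byte_to_hex is invented for this example.

```c
#include <stdio.h>

/* Format one byte as exactly two lowercase hex digits.
   out needs room for 3 chars: two digits and the terminating '\0'.
   byte_to_hex is a hypothetical name used for illustration. */
void byte_to_hex(unsigned char byte, char out[3]) {
    /* %x prints an unsigned int in hex; 02 pads to two digits with zeros */
    snprintf(out, 3, "%02x", (unsigned int) byte);
}
```

The byte itself stays in binary the whole time; the hex text only exists at the moment of output.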

In other recent news, Sun has just released Java 6.
 
  • #6
-Job-
Science Advisor
1,146
1
Sane said:
For your program, you shouldn't ever need to convert the binary into ASCII hex in order to use it. I'm not quite sure the motivation behind such, nor why Job would suggest the tedious feat. Everything should be kept in its native binary form, as it can be manipulated and represented most readily.
The motivation is to be able to view/edit the binary data in the file in HEX format, with HEX digits. Or is that not the idea and i'm misunderstanding?
 
  • #7
chroot
Staff Emeritus
Science Advisor
Gold Member
10,226
34
If all you're looking for is an editor which can be used to edit files in hexadecimal, consider UltraEdit (for Windows).

- Warren
 
  • #8
Sane
221
0
-Job- said:
The motivation is to be able to view/edit the binary data in the file in HEX format, with HEX digits. Or is that not the idea and i'm misunderstanding?
I see no advantage to editing the hex representation over editing the binary representation. Conversely, using the hex in ASCII form requires extra memory, extra functions, and more work for the programmer. It seems, all in all, ignorant not to take advantage of the fact that a hex digit is half of a byte in binary (although that is a severe abuse of technicalities).

Here is some C code, for purposes of demonstration. Note how everything is manipulated in its binary form, no need for any alternate representations. The same can be done with Java.

Output :
Code:
d5 17 62 4c
4e 18 24 a6
90 37 38 39

Index: 3
Change (Enter In HEX): ae

d5 17 62 ae
4e 18 24 a6
90 37 38 39
Code :
Code:
#include <stdio.h>

#define SENTINEL -1
#define WIDTH 4

// Print each value as two hex digits, WIDTH values per row
void showFile (const int *contents) {
        int i;
        for (i=0; contents[i] != SENTINEL; i++)
                printf ("%02x%c", (unsigned) contents[i], (i+1)%WIDTH ? ' ' : '\n');
}

int main (void) {
        // Stand-in for bytes loaded from a binary stream; SENTINEL marks the end
        int file[] = {213, 23, 98, 76, 78, 24, 36, 166, 144, 55, 56, 57, SENTINEL};
        int count = (int) (sizeof file / sizeof file[0]) - 1;
        int index;
        unsigned int change;
        
        // Display the file's contents in HEX
        showFile (file);
        
        // Prompt the user for a change, rejecting bad input and out-of-range values
        printf ("\nIndex: ");
        if (scanf ("%d", &index) != 1 || index < 0 || index >= count)
                return 1;
        printf ("Change (Enter In HEX): ");
        if (scanf ("%x", &change) != 1 || change > 0xFF)
                return 1;
        printf ("\n");
        
        // Change the appropriate index
        file[index] = (int) change;
        
        // Show the changes
        showFile (file);

        return 0;
}
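
The demonstration above uses a hard-coded array in place of a real file. One way to fill such a buffer from disk is sketched below; load_file is a hypothetical helper, and the sketch assumes the file is small enough to fit in memory and that seeking to the end of a binary stream works on the platform (it does on common desktop systems).

```c
#include <stdio.h>
#include <stdlib.h>

/* Read the whole file at `path` into a malloc'd buffer.
   On success returns the buffer and stores its size in *len;
   on failure returns NULL. The caller frees the buffer.
   load_file is a hypothetical helper name. */
unsigned char *load_file(const char *path, size_t *len) {
    FILE *fp = fopen(path, "rb");    /* binary mode: bytes arrive untouched */
    unsigned char *buf = NULL;
    long size;

    if (fp == NULL)
        return NULL;
    /* Find the file size by seeking to the end, then rewind */
    if (fseek(fp, 0, SEEK_END) == 0 && (size = ftell(fp)) >= 0 &&
        fseek(fp, 0, SEEK_SET) == 0) {
        buf = malloc(size > 0 ? (size_t) size : 1);
        if (buf != NULL && fread(buf, 1, (size_t) size, fp) != (size_t) size) {
            free(buf);               /* short read: treat as failure */
            buf = NULL;
        }
        if (buf != NULL)
            *len = (size_t) size;
    }
    fclose(fp);
    return buf;
}
```

Because the length comes back separately, no sentinel value is needed, so every byte value from 0x00 to 0xff can appear in the data.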
 
  • #9
-Job-
Science Advisor
1,146
1
So exactly where is the difference between your code and mine? You read in a byte array, then for each byte you print its hexadecimal representation. I read a byte array and convert to the corresponding hexadecimal representation, except i'm using my own function instead of using C's printf formatting to accomplish it.
The reason for that is that Java doesn't have unsigned integers, so if you read in a byte whose most significant bit is 1 and pass it straight to Java's conversion functions, it is sign-extended and treated as a negative number: Integer.toHexString gives ffffffXX and Integer.toString(value, 16) gives a negative value (you'll probably want to fix that in your code by ensuring you are using only unsigned integers).
All the function toHex does is convert each byte to an int (4 bytes in Java) and perform the bitwise AND with 0xFF. This clears the upper 24 bits that sign extension filled with 1s, and so each byte is correctly converted to its hexadecimal representation.
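
The sign issue can be reproduced in C with a signed char, which behaves like Java's byte on ordinary two's-complement machines. The two function names below are made up for illustration.

```c
/* Widen a signed 8-bit value to int, as happens to a Java byte.
   Hypothetical helper names; assumes a two's-complement machine. */
int widen(signed char b) {
    return b;           /* sign-extends: bit pattern 0xD5 becomes -43 */
}

/* Widen, then keep only the low 8 bits, as toHex does with & 0xFF */
int widen_masked(signed char b) {
    return b & 0xFF;    /* bit pattern 0xD5 becomes 213 */
}
```

Both functions start from the same 8 bits; only the masked version yields the value a hex display should show.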
 
  • #10
Sane
221
0
You don't get it. Your method converts manually. My method does this implicitly. In fact, if you compare the times of printing out ASCII hexadecimal against implicitly converting a byte to HEX with '%x', they take exactly the same amount of time.

This is because the machine does just as much work buffering your binary representation of ASCII hex onto the screen as it does implicitly converting my binary representation to real HEX on the fly.

Furthermore, conversions to and from ASCII hexadecimal require more code, more work, more memory, more resources, and more time. Binary is most readily manipulatable and readable.

Keeping it in binary is the best way to get things done. Thinking otherwise is just silly.
 
  • #11
-Job-
Science Advisor
1,146
1
For the last time, i'm using Java without a %x equivalent and without unsigned integers. Why don't you code that in Java first and then get back to me.
 
  • #12
Sane
221
0
I'm not sure what you're getting at about not using '%x', as it exists in Java. Lack of support for unsigned integers is no problem. Do your homework. There are a couple of ways around this:

Explicit Pointer Conversion

You can still use a signed char. Unsigned and signed are analogous to each other. Changing from one to the other does not change the binary format, only the way it is interpreted.

Even if 213 is a signed char or an unsigned char, it is still going to be 1101 0101.

The only difference is: now you need to prevent the leading 1 from denoting a negative value. You can do this quite easily, and extremely efficiently, by explicitly converting the pointer to a larger size, then filtering out the garbage.

Code:
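void showFile (signed char *contents) {
        int i;
        signed int *expP;
        // Caution: in a signed char, SENTINEL (-1) has the same bit pattern as a 0xFF data byte
        for (i=0; contents[i] != SENTINEL; i++) {
                // Reinterpret the address as an int; only the low byte is wanted.
                // Caution: this reads sizeof(int) bytes starting at contents + i, so it
                // assumes little-endian byte order and can read past the end of the array.
                expP = (signed int *)(contents + i);
                printf ("%02x%c", (unsigned) ((* expP) & 0x000000FF), (i+1)%WIDTH ? ' ' : '\n');
        }
}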
Unsurprisingly, printing this out 10,000 times still takes the same amount of time as printing it out 10,000 times without the explicit pointer conversion. Why? Because this is very fast. It's a direct copy of memory. Your machine can handle this at negligible cost.
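
A more conservative variant converts the value rather than the pointer, and passes an explicit length instead of relying on a sentinel. The sketch below is an alternative to the code above, not the poster's method; format_rows is a hypothetical name, and it writes into a caller-supplied buffer so the result is easy to inspect.

```c
#include <stdio.h>
#include <stddef.h>

#define WIDTH 4

/* Format len bytes as rows of WIDTH two-digit hex values into out.
   Returns the number of characters written, or -1 if out is too small.
   format_rows is a hypothetical helper name. */
int format_rows(const signed char *data, size_t len, char *out, size_t outsz) {
    size_t i, used = 0;
    if (outsz > 0)
        out[0] = '\0';
    for (i = 0; i < len; i++) {
        /* Convert the value, not the pointer: unsigned char is always 0..255 */
        unsigned int value = (unsigned char) data[i];
        char sep = ((i + 1) % WIDTH) ? ' ' : '\n';
        int n = snprintf(out + used, outsz - used, "%02x%c", value, sep);
        if (n < 0 || (size_t) n >= outsz - used)
            return -1;
        used += (size_t) n;
    }
    return (int) used;
}
```

This touches only the bytes that belong to the array and gives the same result regardless of byte order.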

Use A Larger Size

Why even bother using signed chars in the first place? Use a 16-bit signed integer (a short). −32,768 to 32,767 is plenty. Your RAM will laugh at the extra 8 bits per byte.

And Here's The Clincher

If you store it in ASCII HEX, you will be using 8 bits for each HEX digit, or 16 bits for each byte. That is exactly the same size as if you were to simply use 16-bit signed integers to solve your problem of a limited range. Not only that, but you're using more space than explicit pointer conversion would require you to allocate.
 
  • #13
-Job-
Science Advisor
1,146
1
The pointer conversion technique is what my function toHex does. It casts a byte as an int (4 bytes in Java) and performs the bitwise AND with 0xFF. This ensures the sign bit and the other sign-extended upper bits are 0.
I'm not aware of a Java equivalent to %x. And to clarify, the only reason i'm converting to ASCII Hex is to send that to the screen and not so that they can be manipulated in ASCII.
You would want to convert each byte to ASCII Hex and send it to the screen directly. The reason why the code keeps everything in a string is because the project i used this in wasn't a hex editor and had to send ASCII Hex strings in chunks.

EDIT: and by the way, are you telling me to do my research about signed/unsigned integers, etc, then post the very same approach/code to solve that issue as i did? Why don't you research my posts before telling me to research something which i already know.
 
  • #14
Sane
221
0
-Job- said:
You would want to convert each byte to ASCII Hex and send it to the screen directly.
And why? I already said that the implicit conversion provided by %x is just as fast as printing out the ASCII for the HEX.

Since no one else has posted in response to this discussion, I felt it necessary to see what a pal of mine had to say about this.

I wasn't sure if I was missing a point of yours, so I asked him to help clarify. He agrees:

Qifan said:
I believe that you are right (...) His binary-to-hex conversion takes more effort and ignores printf's ability to format the output to the desired base representation.
Am I missing something?

My remark about research was with regards to: "lack of unsigned integers being no problem", and "%x exists in Java". Besides, it seems even more silly, seeing as I already pointed this out twice:

Sane said:
Such exists in C, C++, Python, Java, to name a few.
Sane said:
Here is some C code, for purposes of demonstration (...) The same can be done with Java.
 
  • #15
-Job-
Science Advisor
1,146
1
If you show me how to use %x with Java i'll agree with you.
 
  • #16
Sane
221
0
Code:
        int integer = 213;  // sample value, assumed here for illustration
        System.out.printf("Integer output tests:\n");
        System.out.printf("An integer-----------------------------|%d|\n", integer);
        System.out.printf("An integer as an octal-----------------|%o|\n", integer);
        System.out.printf("An integer as hex----------------------|%x|\n", integer);
Useful, no? :smile:
 
  • #17
-Job-
Science Advisor
1,146
1
I guess that's what i get for still using 1.4.2. I agree that's the easier way then if you use 1.5+.
 
