C/C++ How to Parse Two Characters into a Single Hexadecimal Number in C++?

  • Thread starter: Labyrinth
AI Thread Summary
The discussion focuses on parsing two characters from a string in C++ into a single hexadecimal number, specifically converting 'A' to 8 and 'B' to 0xC to form 0x8C. Participants suggest using functions like strtol or sscanf for conversion, followed by bit manipulation techniques such as bit shifting and masking to combine the values. A more efficient approach involves creating a lookup table (array) for character translations, which allows for quick retrieval and combination of values. Concerns about the necessity of this optimization for the application are raised, alongside suggestions for alternative methods like nested if-else statements for character translation. The conversation emphasizes the importance of viewing the problem through an arithmetic lens rather than purely as string manipulation.
Labyrinth
I am using C++ and trying to parse two characters from a string into a single hexadecimal number.

For example, part of the string goes 'AB'. I want to state that A is 8 and B is 0xc, and then combine both into a single value to form a unique number: 0x8c

This is because I am trying to save space with a 1-byte char (8 bits), having the parser effectively write to each 4-bit block (nibble) so that the two can be considered separately later.

If I cannot do this using C++, can it be done using assembly?

Thanks in advance for any help.
 
Use strtol or sscanf to convert the hex characters to a byte, then split the byte using bitmasks and shifts.
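
A minimal sketch of that suggestion, assuming the input is already a two-digit hex string such as "8C". The function names parse_hex_byte, high_nibble and low_nibble are made up for illustration. Note that strtol applies the standard hex digits, so 'A' means 10 here; it does not apply the custom A => 8 mapping asked about in the first post.

```cpp
#include <cstdlib>

// Hypothetical helper: parse a hex string such as "8C" into one byte.
// strtol uses the standard hex digits ('A' is 10), not a custom mapping.
unsigned char parse_hex_byte(const char* text) {
    long value = std::strtol(text, nullptr, 16);
    return static_cast<unsigned char>(value & 0xFF);  // keep the low 8 bits
}

// Shift the upper four bits down, then mask off everything else.
unsigned char high_nibble(unsigned char b) {
    return static_cast<unsigned char>((b >> 4) & 0x0F);
}

// Mask off everything except the lower four bits.
unsigned char low_nibble(unsigned char b) {
    return static_cast<unsigned char>(b & 0x0F);
}
```

With these, parse_hex_byte("8C") gives 0x8C, and high_nibble and low_nibble of that byte give 0x8 and 0xC.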
 
I'm curious: what does the title of this thread, "wilds for each digit", mean?
 
My first reaction to reading this was that posted already by mgb_phys. But then I realized that I'm totally lost here:

Labyrinth said:
For example, part of the string goes 'AB'. I want to state that A is 8 and B is 0xc, and then combine both into a single value to form a unique number: 0x8c

You've got a string with two bytes interpreted as chars:

01000001 (65, 'A')
01000010 (66, 'B')

And now, you want to somehow associate "A" with 8?

01000001 => 00001000 ?

And then assign B to 0xC, AKA 12?

01000010 => 00001100 ?

I would understand if you wanted to make "A" => 4, or "A" => 1 (reading only the first half of the byte or the last half of the byte), or if you wanted "A" => 10 (since that's the human-readable definition of "A"), but 8? Similarly for B, I would understand "B" => 4, or "B" => 2, or B => 11, but I don't understand making it 12?

So, all told, it's:

01000001 01000010 => 10001100 ?

I guess I'm a little lost on the example.

DaveE
 
davee123 said:
You've got a string with two bytes interpreted as chars:

01000001 (65, 'A')
01000010 (66, 'B')

And now, you want to somehow associate "A" with 8?

Yes, I want to "decipher" them into numbers I have decided upon.

Pseudocode:

When you read an A, interpret it as an '8'. // Parse 1
When you read B, interpret it as 'c' // Parse 2

char My_new_number = 0x$$ // each $ is what I call a 'wild' for lack of a better term.

Take the answer from Parse1 and replace the first wild with it.
Take the answer from Parse2 and replace the second wild with it.

Result:

My_new_number = 0x8c
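
The "replace each wild" step in the pseudocode above could be sketched like this, assuming the two parse results are already available as 4-bit values. The name pack_nibbles is hypothetical. Shifting left by 4 is the same as multiplying by 16, so the result is equivalent to first * 16 + second.

```cpp
#include <cstdint>

// Hypothetical helper: put 'first' in the high nibble and 'second' in the
// low nibble of one byte. Both inputs are masked to 4 bits as a precaution.
std::uint8_t pack_nibbles(std::uint8_t first, std::uint8_t second) {
    return static_cast<std::uint8_t>(((first & 0x0F) << 4) | (second & 0x0F));
}
```

For the example in this post, pack_nibbles(0x8, 0xC) yields 0x8C.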

What mgb_phys mentioned seems promising, I am still investigating.
 
A sanity check:
This is because I am trying to save space
Are you sure this really is something important for your application, and worth devoting your time to?


Anyways, the main thing, I think, is to stop thinking of the entire thing as string manipulation -- you should start thinking of some of it in terms of arithmetic. (Well, I suppose bit shifts and masking can be reasonably thought of as string manipulation as well -- just with a different alphabet)
 
Labyrinth said:
char My_new_number = 0x$$ // each $ is what I call a 'wild' for lack of a better term.

"wild card" is a better term.
 
Labyrinth said:
What mgb_phys mentioned seems promising, I am still investigating.

Depending on how fast you need it to be, and how many resources you can take, there are a few ways to do it.

The best way would be arithmetic, assuming you can find a pattern that does the trick. That would probably minimize the computing power and time needed; you just need to find a pattern that works.

A fast way to do it (if you have to repeat this a LOT of times in a row), at the cost of more memory, is to build a lookup table: an array of 256 single bytes indexed by character code, so element 65 ('A') is set to 8, element 66 ('B') is set to 0xC, etc. For each character you'd like to translate, set element N of the array to the translated hex digit, where N is the ASCII value of the character. Then read in the 1st character, fetch its translation from the array, and bit-shift the result left by 4. Then read in the 2nd character, fetch its translation, and OR it with the shifted result of the 1st character, yielding the full byte.

Downside is that you're using a 256-byte array, which may be more than you want to take, especially if this is only happening a couple times in your program.
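
A rough sketch of the table approach described above. Only the 'A' => 0x8 and 'B' => 0xC entries come from this thread; leaving every other entry at 0 is an assumption, and make_table and translate_pair are hypothetical names.

```cpp
#include <array>
#include <cstdint>

// Build the 256-entry table once, indexed by character code.
// Entries other than 'A' and 'B' stay 0 (a placeholder assumption).
std::array<std::uint8_t, 256> make_table() {
    std::array<std::uint8_t, 256> table{};  // value-initialized to all zeros
    table['A'] = 0x8;
    table['B'] = 0xC;
    return table;
}

// Look up both characters, shift the first into the high nibble,
// and OR in the second as the low nibble.
std::uint8_t translate_pair(const std::array<std::uint8_t, 256>& table,
                            char first, char second) {
    std::uint8_t high = table[static_cast<unsigned char>(first)];
    std::uint8_t low = table[static_cast<unsigned char>(second)];
    return static_cast<std::uint8_t>((high << 4) | low);
}
```

Building the table once and reusing it, translate_pair(table, 'A', 'B') gives 0x8C. The cast to unsigned char keeps the index non-negative on platforms where char is signed.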

Otherwise, I'd probably set up an if...else chain (or a switch, which modern compilers typically compile to comparable code), test each character to get its translation, and again OR the results together to get the full translated byte. Makes for an ugly bunch of code, but it should be reasonably quick.
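
The branching version might look something like this. As before, only the 'A' and 'B' cases are from the thread; returning 0 for any other character is an assumption, and the names translate_char and translate_pair_branching are made up for the sketch.

```cpp
#include <cstdint>

// Translate one character to its 4-bit value by testing each case in turn.
// The fallback of 0 for unknown characters is an assumption.
std::uint8_t translate_char(char c) {
    if (c == 'A') {
        return 0x8;
    } else if (c == 'B') {
        return 0xC;
    } else {
        return 0x0;
    }
}

// First character becomes the high nibble, second the low nibble.
std::uint8_t translate_pair_branching(char first, char second) {
    return static_cast<std::uint8_t>((translate_char(first) << 4) |
                                     translate_char(second));
}
```

This needs no table in memory, but every additional character in the mapping adds another branch to translate_char.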

DaveE
 