Tuesday, February 23, 2010

Convert long to byte array in C++ or C

This post explains how to convert an unsigned long int into a byte array in C/C++. It assumes that unsigned long int occupies 4 bytes of storage; however, the examples can easily be adapted to other built-in integer types (unsigned short int, unsigned long long int, etc.).

The code to convert a 4-byte unsigned long int into a 4-byte array:


unsigned long int longInt = 1234567890;
unsigned char byteArray[4];

// convert from an unsigned long int to a 4-byte array
byteArray[0] = (int)((longInt >> 24) & 0xFF);
byteArray[1] = (int)((longInt >> 16) & 0xFF);
byteArray[2] = (int)((longInt >> 8) & 0xFF);
byteArray[3] = (int)(longInt & 0xFF);


So what's happening in the above code? We're basically using a combination of bit shifting and bit masking in order to chop up the unsigned long into 4 pieces. Each of these pieces ends up being a value small enough to be stored in the unsigned char array (remember an unsigned char is 1 byte, and capable of holding values 0-255).

The bit shifting moves the byte we want into the right-most position, "dropping" the bytes to its right. The bit mask 0xFF then keeps only those lowest 8 bits, leaving a value between 0 and 255.

Note that in the last line of code we didn't need to do any bit shifting; here we're converting the right-most byte of the unsigned long, and therefore don't want to throw it away.
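As a worked example of the shift-and-mask steps described above: 1234567890 is 0x499602D2 in hex, so the four bytes should come out as 0x49, 0x96, 0x02 and 0xD2, most significant first. Here is a minimal sketch that wraps the same statements in a helper. The name long_to_bytes is only illustrative and is not part of the original code, and the sketch assumes that only the low 32 bits of the value matter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: split the low 32 bits of `value` into 4 bytes,
   most significant byte first. */
void long_to_bytes(unsigned long value, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)((value >> 24) & 0xFF); /* bits 31..24 */
    out[1] = (unsigned char)((value >> 16) & 0xFF); /* bits 23..16 */
    out[2] = (unsigned char)((value >> 8) & 0xFF);  /* bits 15..8  */
    out[3] = (unsigned char)(value & 0xFF);         /* bits 7..0   */
}
```

Calling long_to_bytes(1234567890UL, byteArray) fills the array with the four values listed above.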

An alternate solution would be to first apply the mask, and then shift:

byteArray[0] = (int)((longInt & 0xFF000000) >> 24);
byteArray[1] = (int)((longInt & 0x00FF0000) >> 16);
byteArray[2] = (int)((longInt & 0x0000FF00) >> 8);
byteArray[3] = (int)(longInt & 0x000000FF);


Next, let's convert the 4-byte array back into an unsigned long int:


unsigned long int anotherLongInt;

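// cast each byte first: an unsigned char is promoted to a signed int before
// the shift, and shifting a byte >= 128 left by 24 would overflow that int
anotherLongInt = ( ((unsigned long int) byteArray[0] << 24)
+ ((unsigned long int) byteArray[1] << 16)
+ ((unsigned long int) byteArray[2] << 8)
+ ((unsigned long int) byteArray[3] ) );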


Here we're taking each byte of the array, shifting its bits to the left, and adding the results. In essence, each value between 0 and 255 is padded on the right with the number of zero bits that matches its position, which restores the significance it had in the original number before the values are summed.
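The reassembly can also be sketched as a small helper. The name bytes_to_long is hypothetical, and the sketch makes two choices of its own. It widens each byte to unsigned long before shifting, so the shift never happens in a signed int. It also combines the pieces with | instead of +, which gives the same result because the shifted bytes never overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: rebuild a 32-bit value from 4 bytes,
   most significant byte first. */
unsigned long bytes_to_long(const unsigned char in[4])
{
    assert(in != NULL);
    return ((unsigned long)in[0] << 24)   /* bits 31..24 */
         | ((unsigned long)in[1] << 16)   /* bits 23..16 */
         | ((unsigned long)in[2] << 8)    /* bits 15..8  */
         |  (unsigned long)in[3];         /* bits 7..0   */
}
```

Passing in the bytes 0x49, 0x96, 0x02, 0xD2 gives back 1234567890.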

And an alternate solution to accomplish the same:

anotherLongInt = ((unsigned long int) byteArray[0]) << 24;
anotherLongInt |= ((unsigned long int) byteArray[1]) << 16;
anotherLongInt |= ((unsigned long int) byteArray[2]) << 8;
anotherLongInt |= ((unsigned long int) byteArray[3]);


And that's it!

Note that additional safeguards are needed when these operations have to work in portable code. In that case you shouldn't make assumptions about the size of the data types (on many 64-bit platforms, for example, unsigned long int is 8 bytes). Instead, check the sizes with sizeof, or use fixed-width types such as uint32_t from <stdint.h>. Otherwise, the above should be fine if you have a homogeneous and controlled environment in which your code will run.
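One possible way to drop the 4-byte assumption is sketched below. It uses the fixed-width type uint32_t from <stdint.h> (C99) and derives the loop bound from sizeof rather than hard-coding the shifts. The helper names u32_to_bytes and bytes_to_u32 are made up for illustration, and the sketch assumes 8-bit bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write `value` into `out`, most significant byte first. */
void u32_to_bytes(uint32_t value, unsigned char out[4])
{
    size_t i;
    assert(out != NULL);
    for (i = 0; i < sizeof value; ++i) {
        /* i == 0 shifts by 24 bits, i == 3 shifts by 0 bits */
        out[i] = (unsigned char)((value >> (8 * (sizeof value - 1 - i))) & 0xFFu);
    }
}

/* Hypothetical helper: rebuild the value from bytes stored as above. */
uint32_t bytes_to_u32(const unsigned char in[4])
{
    uint32_t value = 0;
    size_t i;
    assert(in != NULL);
    for (i = 0; i < sizeof value; ++i) {
        /* make room for the next byte, then append it */
        value = (uint32_t)((value << 8) | in[i]);
    }
    return value;
}
```

The same loops should carry over to uint64_t and an 8-byte array by changing only the types and the array size.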

5 comments:

Jesper Melin said...

I have a question when you convert from an unsigned long int to a 4-byte array. Why do you typecast using int and not char?
Your code:
byteArray[0] = (int)((longInt >> 24) & 0xFF);

Why not?
byteArray[0] = (char)((longInt >> 24) & 0xFF) ;

Anonymous said...

Thanks for this article! After trying different stuff with memcpy etc., this is the simplest way!

Some Guy said...

And for the magic unsigned long long data type, is there a solution?
Some compilers, like CodeWarrior 68k, don't know how to shift long long values.
Any idea is good.

HighTemplar999 said...

#include <stdio.h>

union {
unsigned int myInt;
unsigned char myByteArray[4];
} TheUnion;

int main(void)
{
TheUnion.myInt=355;
printf("INT: %u\n",TheUnion.myInt);
printf("CHAR[0] %d\n",TheUnion.myByteArray[0]);
printf("CHAR[1] %d\n",TheUnion.myByteArray[1]);
printf("CHAR[2] %d\n",TheUnion.myByteArray[2]);
printf("CHAR[3] %d\n",TheUnion.myByteArray[3]);
return 0;
}

//A union is like a structure but all members occupy the same memory location.
