Exposing bytes for what they really are
Posted
#1
(In Topic #946)
Regular

ASCII (American Standard Code for Information Interchange) maps the byte values 0 to 127 to control codes and the common printable characters. The values from 0 to 31 are control characters, originally used to coordinate transmission on a teletype. The value 32 is the space. The values from 48 to 57 are the decimal digit characters "0" to "9". The values from 65 to 90 are "A" to "Z", and 97 to 122 are "a" to "z".
Gambas makes dealing with byte values fairly simple. The sample program and its output demonstrate some of the concepts and syntax.
Code (gambas)
'=============================================================================
'---- Sample string of ordinary characters
DisplayByteValue(theByteValue)
'---- Some ASCII characters
'---- Some special characters
'=============================================================================
'=============================================================================
'=============================================================================
' theMaskValue = Shr(theMaskValue, 1)
theMaskValue /= 2
'=============================================================================
Code
0  00110000  &30&   48  3 * 16 + 0   32 + 16
1  00110001  &31&   49  3 * 16 + 1   32 + 16 + 1
2  00110010  &32&   50  3 * 16 + 2   32 + 16 + 2
3  00110011  &33&   51  3 * 16 + 3   32 + 16 + 2 + 1
   00100000  &20&   32  2 * 16 + 0   32
A  01000001  &41&   65  4 * 16 + 1   64 + 1
B  01000010  &42&   66  4 * 16 + 2   64 + 2
C  01000011  &43&   67  4 * 16 + 3   64 + 2 + 1
D  01000100  &44&   68  4 * 16 + 4   64 + 4
   00100000  &20&   32  2 * 16 + 0   32
G  01000111  &47&   71  4 * 16 + 7   64 + 4 + 2 + 1
a  01100001  &61&   97  6 * 16 + 1   64 + 32 + 1
m  01101101  &6D&  109  6 * 16 + 13  64 + 32 + 8 + 4 + 1
b  01100010  &62&   98  6 * 16 + 2   64 + 32 + 2
a  01100001  &61&   97  6 * 16 + 1   64 + 32 + 1
s  01110011  &73&  115  7 * 16 + 3   64 + 32 + 16 + 2 + 1
   00100000  &20&   32  2 * 16 + 0   32
Z  01011010  &5A&   90  5 * 16 + 10  64 + 16 + 8 + 2
z  01111010  &7A&  122  7 * 16 + 10  64 + 32 + 16 + 8 + 2
0 0 @ `
1 1 A a
2 2 B b
3 3 C c
4 4 D d
5 5 E e
Null aka \0 = 0
Tab aka \t = 9 9
LF aka \n = 10 10
CR aka \r = 13 13
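For anyone reading along without the full listing, here is a minimal sketch of how one row of the first table could be produced. The names DisplayByteValue, theByteValue, and theMaskValue appear in the listing above; the sample string is inferred from the output rows, and everything else is an assumption rather than the original program.

```gambas
Public Sub Main()

  ' Sample string inferred from the rows of the first output table
  Dim theText As String = "0123 ABCD Gambas Zz"
  Dim theIndex As Integer

  For theIndex = 1 To Len(theText)
    ' Asc returns the byte value of the character at the given position
    DisplayByteValue(Asc(theText, theIndex))
  Next

End

Private Sub DisplayByteValue(theByteValue As Integer)

  ' Start with the highest bit of a byte (128) and halve the mask each pass
  Dim theMaskValue As Integer = 128

  Print Chr$(theByteValue); " ";

  While theMaskValue > 0
    If (theByteValue And theMaskValue) <> 0 Then
      Print "1";
    Else
      Print "0";
    Endif
    theMaskValue = theMaskValue \ 2
  Wend

  ' Hex with two zero-padded digits, then the plain decimal value
  Print " &"; Hex$(theByteValue, 2); "& "; theByteValue

End
```

The sketch uses integer division (`\`) to move the mask, which for positive values has the same effect as shifting right by one bit.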
Here is a related post from long ago:
https://forum.gambas.one/viewtopic.php?p=1553
.... and carry a big stick!
Posted
Regular

Consider the following Gambas code:
A = 10, B = 11, and E = 14 so the result should be 11*16^3 + 10*16^2 + 11*16 + 14, right?
The experts snicker.
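The positional arithmetic itself is easy to check. A minimal sketch, with a digit string and variable names that are illustrative and not from the original post:

```gambas
Public Sub Main()

  Dim theDigits As String = "BABE"
  Dim theIndex As Integer
  Dim theDigitValue As Integer
  Dim theValue As Integer

  For theIndex = 1 To Len(theDigits)
    ' Position in the digit list, minus one, is the value of the hex digit
    theDigitValue = InStr("0123456789ABCDEF", Mid$(theDigits, theIndex, 1)) - 1
    ' Shift what we have one hex place to the left and add the new digit
    theValue = theValue * 16 + theDigitValue
  Next

  Print theValue   ' 47806

End
```

So 47806 is what the sum works out to when the four digits are read as an unsigned number.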
Somewhere, stored in memory, it would look something like this:
Code
Address Hex Binary
::::::::
######9E ??
######9F ??
Varptr--> ######A0 BE 1011 1110
######A1 BA 1011 1010
######A2 ??
######A3 ??
######A4 ??
######A5 ??
::::::::
This is a little-endian representation of a two-byte integer value: the least significant byte (BE) sits at the lowest address. In Gambas this is the variable type 'Short'. The common integer type is a four-byte version, also little-endian. (BTW, serial transmissions are also little-endian bitwise: the most significant bit, often used as a parity bit, comes last.)
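One way to see the byte order without touching memory is to split the value arithmetically. This is only a sketch with made-up variable names; it computes the two bytes instead of reading them through a pointer.

```gambas
Public Sub Main()

  Dim theValue As Integer = 47806   ' BABE read as an unsigned number
  Dim theLowByte As Integer
  Dim theHighByte As Integer

  theLowByte = theValue Mod 256     ' least significant byte
  theHighByte = theValue \ 256      ' most significant byte

  ' Little endian: the low byte is stored first, at the lower address
  Print "first byte:  "; Hex$(theLowByte, 2)    ' BE
  Print "second byte: "; Hex$(theHighByte, 2)   ' BA

End
```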
In a signed integer variable, the highest order bit determines the sign. If it is set, the number is negative. The difference between signed and unsigned integers can be understood by looking at a three-bit example, where the columns are the bit pattern, the unsigned value, and the signed value.
Code
000 0 0
001 1 1
010 2 2
011 3 3
100 4 -4
101 5 -3
110 6 -2
111 7 -1
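The three-bit table can be regenerated with a few lines. A minimal sketch, with variable names of my own choosing: a pattern with the top bit set (4 and up) is read as the unsigned value minus 8.

```gambas
Public Sub Main()

  Dim theRawValue As Integer
  Dim theSignedValue As Integer

  For theRawValue = 0 To 7
    If theRawValue >= 4 Then
      ' Top bit set: wrap around by the size of the range (2 ^ 3 = 8)
      theSignedValue = theRawValue - 8
    Else
      theSignedValue = theRawValue
    Endif
    ' Bit pattern padded to three digits, unsigned value, signed value
    Print Bin$(theRawValue, 3), theRawValue, theSignedValue
  Next

End
```

The same rule scales up: for a two-byte Short the threshold is 32768 and the amount subtracted is 65536.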
Code
00000XXX
Code
000000XX Positive values
111111XX Negative values
Code
######9F ??
Varptr--> ######A0 BE 1011 1110
######A1 BA 1011 1010
######A2 FF 1111 1111
######A3 FF 1111 1111
######A4 ??
Let's check the output of the program:
Code
Priceless! Priceless!
Priceless! Priceless!
Priceless! Priceless!
Posted
Regular

Code
Binary   Hex  Dec  0 1 2 3 4 5 6 7 8 9 A B C D E F
00100000 &20&  32    ! " # $ % & ' ( ) * + , - . /
00110000 &30&  48  0 1 2 3 4 5 6 7 8 9 : ; < = > ?
01000000 &40&  64  @ A B C D E F G H I J K L M N O
01010000 &50&  80  P Q R S T U V W X Y Z [ \ ] ^ _
01100000 &60&  96  ` a b c d e f g h i j k l m n o
01110000 &70& 112  p q r s t u v w x y z { | } ~
Here is the code that produced it.
Code (gambas)
Print " ";
theHighValue = theHighNybble * 16 ' &10& 00010000b shl by 4
theByteValue = theHighValue + theLowNybble
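Since only three lines of that listing are visible here, this is a minimal sketch of how such a table could be built around them. The names theHighNybble, theLowNybble, theHighValue, and theByteValue are from the fragment; the loop bounds and the Print layout are my assumptions.

```gambas
Public Sub Main()

  Dim theHighNybble As Integer
  Dim theLowNybble As Integer
  Dim theHighValue As Integer
  Dim theByteValue As Integer

  ' Rows 2 to 7 cover the printable range, 32 to 127
  For theHighNybble = 2 To 7
    theHighValue = theHighNybble * 16
    Print Bin$(theHighValue, 8); " &"; Hex$(theHighValue, 2); "& "; theHighValue; " ";
    For theLowNybble = 0 To 15
      theByteValue = theHighValue + theLowNybble
      ' 127 is DEL, a control character, so leave it out
      If theByteValue < 127 Then
        Print " "; Chr$(theByteValue);
      Endif
    Next
    Print
  Next

End
```

The column spacing will not match the table exactly, because the decimal value is printed without padding.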
Posted
Regular

Priceless.
Code (gambas)
'=============================================================================
DisplayMemoryOfInteger(&Babe&)
DisplayMemoryOfInteger(&HBabe)
'=============================================================================
Inc theAddress
'=============================================================================
Code
FFFF92797028: BE 1011 1110
FFFF92797029: BA 1011 1010
FFFF9279702A: 00 0000 0000
FFFF9279702B: 00 0000 0000

FFFF92797028: BE 1011 1110
FFFF92797029: BA 1011 1010
FFFF9279702A: FF 1111 1111
FFFF9279702B: FF 1111 1111
Code (gambas)
Code
-256 -255
Code
255 11111111
http://gambaswiki.org/wiki/lang/type/integer
Posted
Regular

Code (gambas)
Code
F 15 15
FA 250 250
FAD 4013 4013
FADE 64222 -1314
FADED 1027565 1027565
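The FADE row is where the two readings part ways. A minimal sketch, assuming the two constant forms used earlier in the thread (`&Babe&` and `&HBabe`) behave the same way for other digits:

```gambas
Public Sub Main()

  ' Trailing "&": the four digits are taken as a plain unsigned value
  Print &FADE&

  ' No trailing "&" and four digits or fewer: read as a 16 bit signed value
  ' and sign-extended, because the top bit of F (1111) is set
  Print &HFADE

  ' Five digits do not fit in 16 bits, so nothing gets sign-extended
  Print &HFADED

End
```

If the table above is a guide, this should print 64222, -1314, and 1027565. The gap between the first two is exactly 65536, one full turn of a 16 bit odometer.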
Here is another illustration of the boundary (or the odometer rollover) for signed integers.
Code (gambas)
Print theValueAsInteger,
Code
-4 100 11111100 FC FC FFFFFFFFFFFFFFFC FFFFFFFFFFFFFFFC FFFFFFFFFFFFFFFC
-3 101 11111101 FD FD FFFFFFFFFFFFFFFD FFFFFFFFFFFFFFFD FFFFFFFFFFFFFFFD
-2 110 11111110 FE FE FFFFFFFFFFFFFFFE FFFFFFFFFFFFFFFE FFFFFFFFFFFFFFFE
-1 111 11111111 FF FF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF
0 000 00000000 00 0 0 0 0
1 001 00000001 01 1 1 1 1
2 010 00000010 02 2 2 2 2
3 011 00000011 03 3 3 3 3
Or like grouping decimal numbers with commas (U.S. style) to effectively make a base 1000 numbering system.
In summary:
Code
Number          Function         String of characters
Byte     ---->  Chr       ---->  Character
Byte     <----  Asc       <----  Character
Integer  ---->  Str       ---->  Text Decimal Representation
Integer  <----  Val       <----  Text Decimal Representation
Integer  ---->  Hex       ---->  Text Hexadecimal Representation
Integer  <----  Val(& &)  <----  Text Hexadecimal Representation
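Each row of the summary can be tried as a one-liner. A minimal sketch with arbitrary sample values; the last line assumes that "Val(& &)" means wrapping the hex text in a leading and a trailing "&".

```gambas
Public Sub Main()

  Print Chr$(65)         ' byte to character: A
  Print Asc("A")         ' character to byte: 65
  Print Str$(255)        ' integer to decimal text: 255
  Print Val("255") + 1   ' decimal text to integer: 256, so Val returned a number
  Print Hex$(255)        ' integer to hexadecimal text: FF
  Print Val("&FF&")      ' hexadecimal text to integer, per the last row of the table

End
```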
http://gambaswiki.org/wiki/lang
That's where I found that Bin and Hex can take a second argument specifying the zero-padded length. The previous code in this post would look better converted, but I'm not going to change it.
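A minimal sketch of the difference the second argument makes (the sample value is arbitrary):

```gambas
Public Sub Main()

  Dim theByteValue As Integer = 5

  Print Bin$(theByteValue)       ' 101
  Print Bin$(theByteValue, 8)    ' 00000101
  Print Hex$(theByteValue)       ' 5
  Print Hex$(theByteValue, 2)    ' 05

End
```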
The next steps are how floating-point numbers can be stored in the same bit patterns and, on the character side, how UTF-8 works. Fixed-point formats are just integers with an implied whole/fraction partition. ("Decimal point" doesn't fit, and "Binary point" just doesn't seem to apply.)
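To make the fixed-point idea concrete, here is a minimal sketch. The 8.8 split (eight whole bits, eight fraction bits) and the variable names are my assumptions, chosen only for illustration.

```gambas
Public Sub Main()

  ' Bit pattern 00000001 10000000 read as 8 whole bits and 8 fraction bits
  Dim theRawValue As Integer = 384

  ' Dividing by 2 ^ 8 puts the implied partition back: one and a half
  Print theRawValue / 256

  ' The two parts separately: whole part 1, fraction part 128 out of 256
  Print theRawValue \ 256
  Print theRawValue Mod 256

End
```

The stored value is an ordinary integer the whole time; only the reader's agreement about where the partition sits turns 384 into one and a half.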
After that comes how strings and objects are stored. Then you'll be ready to write, or at least understand, function calls to shared libraries, and even to write shared libraries of your own.