GSM Default 7-bit Alphabet
GSM Alphabet (or GSM-7)
The GSM default alphabet is a character encoding, defined in 3GPP TS 23.038, that packs the letters and symbols most commonly used in many languages into a 7-bit representation for use on GSM networks. An SMS message carries up to 140 octets of user data, that is 1120 bits, so a message encoded in the GSM default alphabet can carry up to 160 characters (1120 / 7 = 160).
Each character in the basic character set is represented in an SMS message by one septet (7 bits). Characters in the basic character set extension are represented by two septets: the ESC (0x1B) character, which selects the extension set, followed by the character's code in that set.
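The relationship between 140 octets and 160 characters follows from packing septets back to back. The following is a minimal sketch, assuming the least-significant-bit-first packing used for SMS user data; the function name `pack_septets` is hypothetical and chosen only for illustration.

```python
def pack_septets(septets):
    """Pack 7-bit values into octets, filling each octet from its low bits up."""
    packed = bytearray()
    acc = 0      # bit accumulator holding bits not yet written out
    nbits = 0    # number of valid bits currently in acc
    for s in septets:
        acc |= (s & 0x7F) << nbits
        nbits += 7
        while nbits >= 8:
            packed.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        # Remaining bits form a final, zero-padded octet.
        packed.append(acc & 0xFF)
    return bytes(packed)


# For a-z the GSM basic character set codes coincide with ASCII,
# so ord() is enough for this small demonstration.
hello = [ord(c) for c in "hello"]
print(pack_septets(hello).hex())        # e8329bfd06
print(len(pack_septets([0x61] * 160)))  # 140
```

Five characters occupy 35 bits and therefore five octets here, while 160 characters fill exactly 140 octets.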
Basic character set
Low digit | 0x0_ | 0x1_ | 0x2_ | 0x3_ | 0x4_ | 0x5_ | 0x6_ | 0x7_ |
---|---|---|---|---|---|---|---|---|
0x_0 | @ | Δ | SP | 0 | ¡ | P | ¿ | p |
0x_1 | £ | _ | ! | 1 | A | Q | a | q |
0x_2 | $ | Φ | " | 2 | B | R | b | r |
0x_3 | ¥ | Γ | # | 3 | C | S | c | s |
0x_4 | è | Λ | ¤ | 4 | D | T | d | t |
0x_5 | é | Ω | % | 5 | E | U | e | u |
0x_6 | ù | Π | & | 6 | F | V | f | v |
0x_7 | ì | Ψ | ' | 7 | G | W | g | w |
0x_8 | ò | Σ | ( | 8 | H | X | h | x |
0x_9 | Ç | Θ | ) | 9 | I | Y | i | y |
0x_A | LF | Ξ | * | : | J | Z | j | z |
0x_B | Ø | ESC | + | ; | K | Ä | k | ä |
0x_C | ø | Æ | , | < | L | Ö | l | ö |
0x_D | CR | æ | - | = | M | Ñ | m | ñ |
0x_E | Å | ß | . | > | N | Ü | n | ü |
0x_F | å | É | / | ? | O | § | o | à |
Basic character set extension
The following characters are available when the receiving device supports the 7-bit extension mechanism; each is sent as the ESC character followed by the value shown. A device that does not support the mechanism displays the ESC as a space and interprets the following character as though there were no leading ESC.
Value | Character | Sequence (7-bit) |
---|---|---|
0x0A | FF | 0x1B 0x0A |
0x0D | CR2 | 0x1B 0x0D |
0x14 | ^ | 0x1B 0x14 |
0x1B | SS2 | 0x1B 0x1B |
0x28 | { | 0x1B 0x28 |
0x29 | } | 0x1B 0x29 |
0x2F | \ | 0x1B 0x2F |
0x3C | [ | 0x1B 0x3C |
0x3D | ~ | 0x1B 0x3D |
0x3E | ] | 0x1B 0x3E |
0x40 | \| | 0x1B 0x40 |
0x65 | € | 0x1B 0x65 |
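The two tables above can be combined into a small encoder and decoder. This is a sketch under stated assumptions, not a complete implementation: the names `BASIC`, `EXTENSION`, `encode_gsm7` and `decode_gsm7` are hypothetical, the control codes CR2 and SS2 are left out, and an unknown code after ESC is assumed to fall back to the basic character set.

```python
ESC = 0x1B

# Basic character set, indexed by septet value (0x00 to 0x7F).
BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅå"
    "Δ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./"
    "0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNO"
    "PQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmno"
    "pqrstuvwxyzäöñüà"
)
BASIC_INDEX = {ch: i for i, ch in enumerate(BASIC)}

# Basic character set extension: character -> septet sent after ESC.
EXTENSION = {
    "\f": 0x0A, "^": 0x14, "{": 0x28, "}": 0x29, "\\": 0x2F,
    "[": 0x3C, "~": 0x3D, "]": 0x3E, "|": 0x40, "€": 0x65,
}
EXT_REVERSE = {code: ch for ch, code in EXTENSION.items()}


def encode_gsm7(text):
    """Return the list of septets for text, using ESC for extension characters."""
    septets = []
    for ch in text:
        if ch in BASIC_INDEX and ch != "\x1b":
            septets.append(BASIC_INDEX[ch])
        elif ch in EXTENSION:
            septets.extend((ESC, EXTENSION[ch]))
        else:
            raise ValueError(f"{ch!r} is not in the GSM default alphabet")
    return septets


def decode_gsm7(septets, extension_supported=True):
    """Decode septets; optionally mimic a device without extension support."""
    out = []
    it = iter(septets)
    for s in it:
        if s != ESC:
            out.append(BASIC[s])
        elif not extension_supported:
            # ESC is shown as a space; the next septet is decoded normally.
            out.append(" ")
        else:
            nxt = next(it, None)
            if nxt is None:
                out.append(" ")  # assumption: a trailing ESC is shown as a space
            else:
                # Assumption: unknown extension codes fall back to the basic set.
                out.append(EXT_REVERSE.get(nxt, BASIC[nxt]))
    return "".join(out)


print(encode_gsm7("€5"))                                  # [27, 101, 53]
print(repr(decode_gsm7([0x1B, 0x65, 0x35])))              # '€5'
print(repr(decode_gsm7([0x1B, 0x65, 0x35], False)))       # ' e5'
```

Because every extension character costs two septets, a message containing them holds fewer than 160 characters; `len(encode_gsm7(text))` gives the septet count to compare against the limit.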
National language shift tables
Shift tables make characters needed by other languages available in SMS messages. Shift tables are selected using the User Data Header of an SMS message. A locking shift table replaces the basic character set for the whole message, whereas a single shift table replaces the basic character set extension, so it applies only to the individual character that follows an ESC.
Shift tables exist for the following languages:
- Spanish
- Portuguese
- Turkish
- Urdu
- Hindi
- Bengali and Assamese
- Punjabi
- Gujarati
- Oriya
- Tamil
- Telugu
- Kannada
- Malayalam
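As an illustration of how a shift table is selected, the sketch below builds a User Data Header carrying the national language shift information elements. It assumes the element identifiers 0x25 (locking shift) and 0x24 (single shift) from 3GPP TS 23.040 and the language identifiers from 3GPP TS 23.038; the function names are hypothetical, and the sketch does not check whether a given language actually defines both table types.

```python
IEI_SINGLE_SHIFT = 0x24   # National Language Single Shift
IEI_LOCKING_SHIFT = 0x25  # National Language Locking Shift

# Language identifiers (assumed from 3GPP TS 23.038).
LANGUAGE_ID = {
    "Turkish": 1, "Spanish": 2, "Portuguese": 3, "Bengali": 4,
    "Gujarati": 5, "Hindi": 6, "Kannada": 7, "Malayalam": 8,
    "Oriya": 9, "Punjabi": 10, "Tamil": 11, "Telugu": 12, "Urdu": 13,
}


def build_shift_udh(locking=None, single=None):
    """Return UDHL followed by the requested shift information elements."""
    elements = bytearray()
    if locking is not None:
        # Each element is: identifier, data length (1), language identifier.
        elements += bytes([IEI_LOCKING_SHIFT, 1, LANGUAGE_ID[locking]])
    if single is not None:
        elements += bytes([IEI_SINGLE_SHIFT, 1, LANGUAGE_ID[single]])
    return bytes([len(elements)]) + bytes(elements)


def septets_left(udh, total_octets=140):
    """Septets available for text once the header is padded to a septet boundary."""
    header_septets = (len(udh) * 8 + 6) // 7  # ceiling of bits / 7
    return total_octets * 8 // 7 - header_septets


udh = build_shift_udh(locking="Turkish")
print(udh.hex())          # 03250101
print(septets_left(udh))  # 155
```

The header shares the 140 octets with the text, so selecting a shift table reduces the number of characters that fit in a single message.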