Python: Encoding
At the deepest level, computers operate exclusively using the numbers 0
and 1
. This is what's known as binary code, and the ones and zeros are called bits, from “binary digit”.
The numbers from the decimal system that we know and love are encoded using binary numbers:
- 0 ← 0
- 1 ← 1
- 2 ← 10
- 3 ← 11
- 4 ← 100
- 5 ← 101
But does it deal with text? Computers don't know about letters, punctuation, or any other text characters. All these symbols are also encoded with numbers.
We can take the English alphabet and give each letter a number, starting with one:
- a ← 1
- b ← 2
- c ← 3
- d ← 4
- ...
- z ← 26
This is the essence of encoding.
When they're working, programs use encodings to convert numbers into characters and vice versa. And the program itself has no idea about the meaning of these characters.
hello
→8
5
12
12
15
7
15
15
4
→good
These tables that match letters and numbers are called encodings. Besides letters of the alphabet, encoding tables include punctuation marks and other useful characters. You have probably encountered encodings such as ASCII and UTF-8.
Different encodings contain different numbers of characters. At first, small tables like ASCII were enough for programmers. But it has only Latin letters, a few simple characters like %
and ?
and special control characters like line feeds.
As computers spread further and further, countries needed their own comprehensive tables. This included Cyrillic letters, Chinese and Japanese characters, Arabic script, additional mathematical and typographic symbols, and later on emojis.
Today, it's usually [Unicode] variants that are used most often Unicode. It includes characters from almost all the written languages of the world.
Instructions
In Python, you can query and display any ASCII character. The function chr()
is used for this. For example:
print(chr(63))
Symbol no. 63 - the question mark ?
. You can print any character this way.
Use the ASCII code table. In this table, want to know about the decimal code (dec or decimal) with which the characters are encoded.
Using the example above and the table, display the following (each on its own line): ~
, ^
and %
.
(Of course, you could be sneaky and cheat the tests by just doing print('~')
etc., but that would be no fun at all :)
Tips
Definitions
Encoding a set of characters encoded with numbers to represent text electronically.