UTF-8 Codepage Layout Charset


 
UTF-8
Each cell gives the character (or, for start bytes, the sequence length), the hexadecimal code point, and the decimal byte value.

   │ −0 │ −1 │ −2 │ −3 │ −4 │ −5 │ −6 │ −7 │ −8 │ −9 │ −A │ −B │ −C │ −D │ −E │ −F
0− │ NUL 0000 0 │ SOH 0001 1 │ STX 0002 2 │ ETX 0003 3 │ EOT 0004 4 │ ENQ 0005 5 │ ACK 0006 6 │ BEL 0007 7 │ BS 0008 8 │ HT 0009 9 │ LF 000A 10 │ VT 000B 11 │ FF 000C 12 │ CR 000D 13 │ SO 000E 14 │ SI 000F 15
1− │ DLE 0010 16 │ DC1 0011 17 │ DC2 0012 18 │ DC3 0013 19 │ DC4 0014 20 │ NAK 0015 21 │ SYN 0016 22 │ ETB 0017 23 │ CAN 0018 24 │ EM 0019 25 │ SUB 001A 26 │ ESC 001B 27 │ FS 001C 28 │ GS 001D 29 │ RS 001E 30 │ US 001F 31
2− │ SP 0020 32 │ ! 0021 33 │ " 0022 34 │ # 0023 35 │ $ 0024 36 │ % 0025 37 │ & 0026 38 │ ' 0027 39 │ ( 0028 40 │ ) 0029 41 │ * 002A 42 │ + 002B 43 │ , 002C 44 │ - 002D 45 │ . 002E 46 │ / 002F 47
3− │ 0 0030 48 │ 1 0031 49 │ 2 0032 50 │ 3 0033 51 │ 4 0034 52 │ 5 0035 53 │ 6 0036 54 │ 7 0037 55 │ 8 0038 56 │ 9 0039 57 │ : 003A 58 │ ; 003B 59 │ < 003C 60 │ = 003D 61 │ > 003E 62 │ ? 003F 63
4− │ @ 0040 64 │ A 0041 65 │ B 0042 66 │ C 0043 67 │ D 0044 68 │ E 0045 69 │ F 0046 70 │ G 0047 71 │ H 0048 72 │ I 0049 73 │ J 004A 74 │ K 004B 75 │ L 004C 76 │ M 004D 77 │ N 004E 78 │ O 004F 79
5− │ P 0050 80 │ Q 0051 81 │ R 0052 82 │ S 0053 83 │ T 0054 84 │ U 0055 85 │ V 0056 86 │ W 0057 87 │ X 0058 88 │ Y 0059 89 │ Z 005A 90 │ [ 005B 91 │ \ 005C 92 │ ] 005D 93 │ ^ 005E 94 │ _ 005F 95
6− │ ` 0060 96 │ a 0061 97 │ b 0062 98 │ c 0063 99 │ d 0064 100 │ e 0065 101 │ f 0066 102 │ g 0067 103 │ h 0068 104 │ i 0069 105 │ j 006A 106 │ k 006B 107 │ l 006C 108 │ m 006D 109 │ n 006E 110 │ o 006F 111
7− │ p 0070 112 │ q 0071 113 │ r 0072 114 │ s 0073 115 │ t 0074 116 │ u 0075 117 │ v 0076 118 │ w 0077 119 │ x 0078 120 │ y 0079 121 │ z 007A 122 │ { 007B 123 │ | 007C 124 │ } 007D 125 │ ~ 007E 126 │ DEL 007F 127
8− │ • +00 128 │ • +01 129 │ • +02 130 │ • +03 131 │ • +04 132 │ • +05 133 │ • +06 134 │ • +07 135 │ • +08 136 │ • +09 137 │ • +0A 138 │ • +0B 139 │ • +0C 140 │ • +0D 141 │ • +0E 142 │ • +0F 143
9− │ • +10 144 │ • +11 145 │ • +12 146 │ • +13 147 │ • +14 148 │ • +15 149 │ • +16 150 │ • +17 151 │ • +18 152 │ • +19 153 │ • +1A 154 │ • +1B 155 │ • +1C 156 │ • +1D 157 │ • +1E 158 │ • +1F 159
A− │ • +20 160 │ • +21 161 │ • +22 162 │ • +23 163 │ • +24 164 │ • +25 165 │ • +26 166 │ • +27 167 │ • +28 168 │ • +29 169 │ • +2A 170 │ • +2B 171 │ • +2C 172 │ • +2D 173 │ • +2E 174 │ • +2F 175
B− │ • +30 176 │ • +31 177 │ • +32 178 │ • +33 179 │ • +34 180 │ • +35 181 │ • +36 182 │ • +37 183 │ • +38 184 │ • +39 185 │ • +3A 186 │ • +3B 187 │ • +3C 188 │ • +3D 189 │ • +3E 190 │ • +3F 191
C− │ 2 192 │ 2 193 │ 2 0080 194 │ 2 00C0 195 │ 2 0100 196 │ 2 0140 197 │ 2 0180 198 │ 2 01C0 199 │ 2 0200 200 │ 2 0240 201 │ 2 0280 202 │ 2 02C0 203 │ 2 0300 204 │ 2 0340 205 │ 2 0380 206 │ 2 03C0 207
D− │ 2 0400 208 │ 2 0440 209 │ 2 0480 210 │ 2 04C0 211 │ 2 0500 212 │ 2 0540 213 │ 2 0580 214 │ 2 05C0 215 │ 2 0600 216 │ 2 0640 217 │ 2 0680 218 │ 2 06C0 219 │ 2 0700 220 │ 2 0740 221 │ 2 0780 222 │ 2 07C0 223
E− │ 3 0800 224 │ 3 1000 225 │ 3 2000 226 │ 3 3000 227 │ 3 4000 228 │ 3 5000 229 │ 3 6000 230 │ 3 7000 231 │ 3 8000 232 │ 3 9000 233 │ 3 A000 234 │ 3 B000 235 │ 3 C000 236 │ 3 D000 237 │ 3 E000 238 │ 3 F000 239
F− │ 4 10000 240 │ 4 40000 241 │ 4 80000 242 │ 4 C0000 243 │ 4 100000 244 │ 4 140000 245 │ 4 180000 246 │ 4 1C0000 247 │ 5 200000 248 │ 5 1000000 249 │ 5 2000000 250 │ 5 3000000 251 │ 6 4000000 252 │ 6 40000000 253 │ 254 │ 255

Legend: yellow cells are control characters (0 to 31 and 127), blue cells are punctuation, purple cells are digits (48 to 57) and green cells are ASCII letters (65 to 90 and 97 to 122).

Orange cells with a large dot are continuation bytes (128 to 191, hex 0x80 to 0xBF). The hexadecimal number shown after the "+" sign is the value of the 6 bits they add.
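As a minimal sketch of what "the 6 bits they add" means, the Python snippet below masks a continuation byte to recover its payload and then combines it with a two-byte start byte. The function name `continuation_payload` is a hypothetical helper chosen for this illustration; it is not part of any library.

```python
def continuation_payload(byte: int) -> int:
    """Return the 6 payload bits carried by a continuation byte (0x80-0xBF)."""
    if not 0x80 <= byte <= 0xBF:
        raise ValueError(f"0x{byte:02X} is not a continuation byte")
    return byte & 0x3F  # drop the leading '10' marker bits


# "é" (U+00E9) is encoded as the two bytes C3 A9.
lead, cont = "é".encode("utf-8")

# A two-byte start byte carries 5 payload bits; the continuation byte adds 6 more.
code_point = ((lead & 0x1F) << 6) | continuation_payload(cont)
print(f"{code_point:04X}")  # 00E9
```

In the chart, byte 169 (hex A9) is listed as "+29", which is the value the helper returns for it.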

White cells containing a large single-digit number are the start bytes for a sequence of that many bytes. The unbolded hexadecimal code point number shown in the cell is the lowest character value encoded using that start byte. Where following the start byte with continuation bytes that are all 128 (hex 0x80) would produce an invalid overlong form, the value shown is the lowest valid code point instead, which is greater. For example, start byte 224 (hex 0xE0) shows 0800 rather than 0000.
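The lowest code point for each start byte can be derived from its payload bits and the overlong rule. The following Python sketch does this for bytes 0xC2 to 0xFD; the table `LEAD_RANGES` and the function `lowest_code_point` are names invented for this example. It covers the 5-byte and 6-byte start bytes only because the chart lists them, even though they cannot occur in valid UTF-8.

```python
# (first start byte, payload mask, sequence length, smallest non-overlong code point)
LEAD_RANGES = [
    (0xC0, 0x1F, 2, 0x80),
    (0xE0, 0x0F, 3, 0x800),
    (0xF0, 0x07, 4, 0x10000),
    (0xF8, 0x03, 5, 0x200000),
    (0xFC, 0x01, 6, 0x4000000),
]


def lowest_code_point(lead):
    """Return (sequence_length, lowest_code_point) for a start byte 0xC2-0xFD."""
    if not 0xC2 <= lead <= 0xFD:
        raise ValueError(f"0x{lead:02X} has no code point entry in the chart")
    for first, mask, length, minimum in reversed(LEAD_RANGES):
        if lead >= first:
            # Value obtained when every continuation byte is 0x80 (payload 0).
            naive = (lead & mask) << (6 * (length - 1))
            # Anything below `minimum` would be an overlong form.
            return length, max(naive, minimum)


for byte in (0xC2, 0xE0, 0xE1, 0xF0, 0xF4):
    length, low = lowest_code_point(byte)
    print(f"{byte:02X}: {length} bytes, lowest U+{low:04X}")
```

The `max(naive, minimum)` step is where the overlong rule takes effect: for 0xE0 and 0xF0 the payload bits are all zero, so the naive value 0 is replaced by 0x800 and 0x10000 respectively.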

Red cells (bytes 192, 193 and 245 to 255; hex 0xC0, 0xC1 and 0xF5 to 0xFF) must never appear in a valid UTF-8 sequence. The first two could only be used for overlong encoding of basic ASCII characters. The remaining red cells indicate start bytes of sequences that could only encode code points larger than the 0x10FFFF limit of Unicode. The byte 244 (hex 0xF4) can also begin sequences that encode values greater than 0x10FFFF; such sequences are invalid as well.
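These rules can be summarised in a small byte classifier. The Python sketch below is illustrative only: `classify_byte` and `is_valid_utf8` are hypothetical helper names, and the validity check simply relies on Python's built-in strict UTF-8 decoder rather than reimplementing the rules.

```python
def classify_byte(b):
    """Classify a single byte value (0-255) by its role in UTF-8."""
    if b <= 0x7F:
        return "ascii"
    if b <= 0xBF:
        return "continuation"
    if b in (0xC0, 0xC1) or b >= 0xF5:
        return "invalid"  # never appears in valid UTF-8
    if b <= 0xDF:
        return "start of 2-byte sequence"
    if b <= 0xEF:
        return "start of 3-byte sequence"
    return "start of 4-byte sequence"  # 0xF0-0xF4


def is_valid_utf8(data):
    """Return True if `data` decodes under Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(classify_byte(0xC0))                  # invalid
print(is_valid_utf8(b"\xc0\x80"))           # False: overlong encoding of NUL
print(is_valid_utf8(b"\xf4\x8f\xbf\xbf"))   # True: U+10FFFF
print(is_valid_utf8(b"\xf4\x90\x80\x80"))   # False: would encode 0x110000
```

The last two calls show the special case of byte 0xF4: the start byte itself is allowed, but the sequence is rejected once the continuation bytes push the value past 0x10FFFF.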

