1001 Webs provides website translation and localization. This covers the visible content, the text embedded in HTML and XML code (title tags, meta tags, alt attributes, etc.) and the adaptation of all the scripting components (databases, Active Server Pages, JavaScript, PHP, Perl, etc.).
And now 1001webs is going truly global, adding Hindi, Chinese, Japanese, Russian and Arabic versions of our website.
We came across this excellent article in the Technology section of
The Guardian Unlimited:
http://www.guardian.co.uk/technology/2006/jul/27/guardianweeklytechnologysection5
Below are some excerpts. For starters:
"Despite everything you may have heard, the global resource we all know as the internet is not global at all. Since you are reading this article in English you probably won't have noticed, but if your first language was Chinese, Arabic, Hindi or Tamil, you would know very different."
Ever wondered how ASCII codes cope with non-Latin languages?
"the term ASCII itself. It stands for American Standard Code for Information Interchange and it is the code devised to enable computers to represent and process all the characters in the English alphabet (a through to z, plus 0 to 9 and the various symbols you get on your keyboard such as % and &).
It was first developed in 1967 and written into the internet's foundations by American scientists. It is now so hardwired into the net that the only way to include other characters such as accents on letters, or Chinese or Arabic script, is to use complex combinations of letters that don't exist in English words in order to represent them.
Linguists have created long tables to represent all the possible combinations and permutations of different languages. In the case of internet domain names, the address is preceded by "xn--" and then an agreed code. For example "www.rémax.com" is represented as "www.xn--rmax-bpa.com". Using this method, it suddenly becomes possible to have internet domain names containing foreign characters, and hence foreign language domain names."
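As a rough illustration of the "xn--" encoding described in the excerpt, Python's standard library can perform the conversion. This is only a sketch under a few assumptions: the helper names to_ascii_hostname and to_unicode_hostname are ours, not the article's, and the built-in "idna" codec follows the older IDNA 2003 rules, so other tools may treat some names differently.

```python
# Sketch: convert a hostname with non-ASCII letters to its "xn--" form and back.
# The helper names are illustrative; the "idna" and "punycode" codecs ship with Python.

def to_ascii_hostname(hostname: str) -> str:
    """Encode each label of the hostname; non-ASCII labels get the 'xn--' prefix."""
    return hostname.encode("idna").decode("ascii")


def to_unicode_hostname(hostname: str) -> str:
    """Reverse the conversion, e.g. for display purposes."""
    return hostname.encode("ascii").decode("idna")


name = "www.r\u00e9max.com"  # the accented example from the excerpt, written with an escape
encoded = to_ascii_hostname(name)
print(encoded)                               # www.xn--rmax-bpa.com
print(to_unicode_hostname(encoded) == name)  # True

# The part after "xn--" is the Punycode form of the label on its own.
print("r\u00e9max".encode("punycode"))       # b'rmax-bpa'
```

Plain ASCII labels such as "www" and "com" pass through unchanged, which is why only the middle label gains the prefix.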
But:
"From the western perspective this approach was sufficient for the rest of the world to use the internet. But the problem is that each of these domains still has to use the existing domain system with ".com" or ".net" - suffixes that are virtually incomprehensible to non-Latin-derived language users."
And the conclusion is:
"... with non-Latin-language networks becoming increasingly advanced, China making it clear it is prepared to break away from the internet, MINC touting a solution that could bypass its processes altogether and, perhaps most crucially, Microsoft deciding to include IDN10 technology in the new version of Internet Explorer, out later this year, Icann has been left with no choice but to speed up the technical side of internationalised domain names in a bid to keep the net together."
We strongly recommend that all of our international partners, especially those managing the Hindi, Chinese, Japanese, Russian and Arabic versions of 1001webs, read the full article in the Technology section of
The Guardian Unlimited:
http://www.guardian.co.uk/technology/2006/jul/27/guardianweeklytechnologysection5
It gives a clearer idea of the difficulties of working with those languages.
Some Links of Interest:
ICANN
ICANN is responsible for the global coordination of the Internet's system of unique identifiers. These include domain names (like .org, .museum and country codes like .uk, .fr, .pt, .de, .es, .jp, .cn, etc.), as well as the addresses used in a variety of Internet protocols. Computers use these identifiers to reach each other over the Internet. Careful management of these resources is vital to the Internet's operation.
American Standard Code for Information Interchange (ASCII)
ASCII codes represent text in computers, communications equipment, and other devices that work with text. Most modern character encodings — which support many more characters than did the original — have a historical basis in ASCII.
China gives itself its own top-level domains
China has decided to bypass ICANN altogether and set up its own set of TLDs and domain name servers. In addition to the .cn TLD, China will have three new Chinese-character TLDs equating to "dot China," "dot com," and "dot net."
See also
ASCII extensions
(where all ASCII printable characters are identical to ASCII)
ASCII codes table - Format of standard characters
ASCII | Hex | Symbol
0 | 0 | NUL
1 | 1 | SOH
2 | 2 | STX
3 | 3 | ETX
4 | 4 | EOT
5 | 5 | ENQ
6 | 6 | ACK
7 | 7 | BEL
8 | 8 | BS
9 | 9 | TAB
10 | A | LF
11 | B | VT
12 | C | FF
13 | D | CR
14 | E | SO
15 | F | SI
16 | 10 | DLE
17 | 11 | DC1
18 | 12 | DC2
19 | 13 | DC3
20 | 14 | DC4
21 | 15 | NAK
22 | 16 | SYN
23 | 17 | ETB
24 | 18 | CAN
25 | 19 | EM
26 | 1A | SUB
27 | 1B | ESC
28 | 1C | FS
29 | 1D | GS
30 | 1E | RS
31 | 1F | US
32 | 20 | (space)
33 | 21 | !
34 | 22 | "
35 | 23 | #
36 | 24 | $
37 | 25 | %
38 | 26 | &
39 | 27 | '
40 | 28 | (
41 | 29 | )
42 | 2A | *
43 | 2B | +
44 | 2C | ,
45 | 2D | -
46 | 2E | .
47 | 2F | /
48 | 30 | 0
49 | 31 | 1
50 | 32 | 2
51 | 33 | 3
52 | 34 | 4
53 | 35 | 5
54 | 36 | 6
55 | 37 | 7
56 | 38 | 8
57 | 39 | 9
58 | 3A | :
59 | 3B | ;
60 | 3C | <
61 | 3D | =
62 | 3E | >
63 | 3F | ?
64 | 40 | @
65 | 41 | A
66 | 42 | B
67 | 43 | C
68 | 44 | D
69 | 45 | E
70 | 46 | F
71 | 47 | G
72 | 48 | H
73 | 49 | I
74 | 4A | J
75 | 4B | K
76 | 4C | L
77 | 4D | M
78 | 4E | N
79 | 4F | O
80 | 50 | P
81 | 51 | Q
82 | 52 | R
83 | 53 | S
84 | 54 | T
85 | 55 | U
86 | 56 | V
87 | 57 | W
88 | 58 | X
89 | 59 | Y
90 | 5A | Z
91 | 5B | [
92 | 5C | \
93 | 5D | ]
94 | 5E | ^
95 | 5F | _
96 | 60 | `
97 | 61 | a
98 | 62 | b
99 | 63 | c
100 | 64 | d
101 | 65 | e
102 | 66 | f
103 | 67 | g
104 | 68 | h
105 | 69 | i
106 | 6A | j
107 | 6B | k
108 | 6C | l
109 | 6D | m
110 | 6E | n
111 | 6F | o
112 | 70 | p
113 | 71 | q
114 | 72 | r
115 | 73 | s
116 | 74 | t
117 | 75 | u
118 | 76 | v
119 | 77 | w
120 | 78 | x
121 | 79 | y
122 | 7A | z
123 | 7B | {
124 | 7C | |
125 | 7D | }
126 | 7E | ~
127 | 7F |
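The decimal and hexadecimal columns of the ASCII table are two spellings of the same number, so the rows can be regenerated in a few lines. The sketch below is our own illustration, not part of the ascii.cl reference: the names CONTROL_NAMES, ascii_row and is_ascii are hypothetical, and the label "DEL" for code 127 is an addition, since the table leaves that symbol blank.

```python
# Sketch: rebuild "ASCII | Hex | Symbol" rows and check whether text fits in ASCII.
# All names here are illustrative helpers, not an existing library API.

# Abbreviations for the 32 control characters (codes 0-31), as listed in the table.
CONTROL_NAMES = (
    "NUL SOH STX ETX EOT ENQ ACK BEL BS TAB LF VT FF CR SO SI "
    "DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US"
).split()


def ascii_row(code: int) -> str:
    """Return one table row for a code in the 7-bit ASCII range 0-127."""
    if not 0 <= code <= 127:
        raise ValueError("ASCII only covers codes 0 to 127")
    if code < 32:
        symbol = CONTROL_NAMES[code]
    elif code == 32:
        symbol = "(space)"
    elif code == 127:
        symbol = "DEL"  # assumption: the table leaves this cell blank
    else:
        symbol = chr(code)
    return f"{code} | {code:X} | {symbol}"


def is_ascii(text: str) -> bool:
    """True if every character of the text has an ASCII code."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


for code in (10, 65, 97):
    print(ascii_row(code))
print(is_ascii("remax"), is_ascii("r\u00e9max"))  # True False
```

The second check shows the limitation discussed in the article: an accented letter simply has no ASCII code, so the text cannot be encoded at all.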
HTML Codes - Characters and symbols
Standard ASCII set, HTML Entity names, ISO 10646, ISO 8879, ISO 8859-1 Latin alphabet No. 1
Browser support: All browsers
Dec | Hex | Symbol | HTML Number | HTML Name | Description
32 | 20 | (space) | &#32; | | space
33 | 21 | ! | &#33; | | exclamation point
34 | 22 | " | &#34; | &quot; | double quotes
35 | 23 | # | &#35; | | number sign
36 | 24 | $ | &#36; | | dollar sign
37 | 25 | % | &#37; | | percent sign
38 | 26 | & | &#38; | &amp; | ampersand
39 | 27 | ' | &#39; | | single quote
40 | 28 | ( | &#40; | | opening parenthesis
41 | 29 | ) | &#41; | | closing parenthesis
42 | 2A | * | &#42; | | asterisk
43 | 2B | + | &#43; | | plus sign
44 | 2C | , | &#44; | | comma
45 | 2D | - | &#45; | | minus sign - hyphen
46 | 2E | . | &#46; | | period
47 | 2F | / | &#47; | | slash
48 | 30 | 0 | &#48; | | zero
49 | 31 | 1 | &#49; | | one
50 | 32 | 2 | &#50; | | two
51 | 33 | 3 | &#51; | | three
52 | 34 | 4 | &#52; | | four
53 | 35 | 5 | &#53; | | five
54 | 36 | 6 | &#54; | | six
55 | 37 | 7 | &#55; | | seven
56 | 38 | 8 | &#56; | | eight
57 | 39 | 9 | &#57; | | nine
58 | 3A | : | &#58; | | colon
59 | 3B | ; | &#59; | | semicolon
60 | 3C | < | &#60; | &lt; | less than sign
61 | 3D | = | &#61; | | equal sign
62 | 3E | > | &#62; | &gt; | greater than sign
63 | 3F | ? | &#63; | | question mark
64 | 40 | @ | &#64; | | at symbol
65 | 41 | A | &#65; | |
66 | 42 | B | &#66; | |
67 | 43 | C | &#67; | |
68 | 44 | D | &#68; | |
69 | 45 | E | &#69; | |
70 | 46 | F | &#70; | |
71 | 47 | G | &#71; | |
72 | 48 | H | &#72; | |
73 | 49 | I | &#73; | |
74 | 4A | J | &#74; | |
75 | 4B | K | &#75; | |
76 | 4C | L | &#76; | |
77 | 4D | M | &#77; | |
78 | 4E | N | &#78; | |
79 | 4F | O | &#79; | |
80 | 50 | P | &#80; | |
81 | 51 | Q | &#81; | |
82 | 52 | R | &#82; | |
83 | 53 | S | &#83; | |
84 | 54 | T | &#84; | |
85 | 55 | U | &#85; | |
86 | 56 | V | &#86; | |
87 | 57 | W | &#87; | |
88 | 58 | X | &#88; | |
89 | 59 | Y | &#89; | |
90 | 5A | Z | &#90; | |
91 | 5B | [ | &#91; | | opening bracket
92 | 5C | \ | &#92; | | backslash
93 | 5D | ] | &#93; | | closing bracket
94 | 5E | ^ | &#94; | | caret - circumflex
95 | 5F | _ | &#95; | | underscore
96 | 60 | ` | &#96; | | grave accent
97 | 61 | a | &#97; | |
98 | 62 | b | &#98; | |
99 | 63 | c | &#99; | |
100 | 64 | d | &#100; | |
101 | 65 | e | &#101; | |
102 | 66 | f | &#102; | |
103 | 67 | g | &#103; | |
104 | 68 | h | &#104; | |
105 | 69 | i | &#105; | |
106 | 6A | j | &#106; | |
107 | 6B | k | &#107; | |
108 | 6C | l | &#108; | |
109 | 6D | m | &#109; | |
110 | 6E | n | &#110; | |
111 | 6F | o | &#111; | |
112 | 70 | p | &#112; | |
113 | 71 | q | &#113; | |
114 | 72 | r | &#114; | |
115 | 73 | s | &#115; | |
116 | 74 | t | &#116; | |
117 | 75 | u | &#117; | |
118 | 76 | v | &#118; | |
119 | 77 | w | &#119; | |
120 | 78 | x | &#120; | |
121 | 79 | y | &#121; | |
122 | 7A | z | &#122; | |
123 | 7B | { | &#123; | | opening brace
124 | 7C | | | &#124; | | vertical bar
125 | 7D | } | &#125; | | closing brace
126 | 7E | ~ | &#126; | | equivalency sign - tilde
127 | 7F | | | | (not defined in HTML 4 standard)
128 | 80 | | | | (not defined in HTML 4 standard)
129 | 81 | | | | (not defined in HTML 4 standard)
130 | 82 | | | | (not defined in HTML 4 standard)
131 | 83 | | | | (not defined in HTML 4 standard)
132 | 84 | | | | (not defined in HTML 4 standard)
133 | 85 | | | | (not defined in HTML 4 standard)
134 | 86 | | | | (not defined in HTML 4 standard)
135 | 87 | | | | (not defined in HTML 4 standard)
136 | 88 | | | | (not defined in HTML 4 standard)
137 | 89 | | | | (not defined in HTML 4 standard)
138 | 8A | | | | (not defined in HTML 4 standard)
139 | 8B | | | | (not defined in HTML 4 standard)
140 | 8C | | | | (not defined in HTML 4 standard)
141 | 8D | | | | (not defined in HTML 4 standard)
142 | 8E | | | | (not defined in HTML 4 standard)
143 | 8F | | | | (not defined in HTML 4 standard)
144 | 90 | | | | (not defined in HTML 4 standard)
145 | 91 | | | | (not defined in HTML 4 standard)
146 | 92 | | | | (not defined in HTML 4 standard)
147 | 93 | | | | (not defined in HTML 4 standard)
148 | 94 | | | | (not defined in HTML 4 standard)
149 | 95 | | | | (not defined in HTML 4 standard)
150 | 96 | | | | (not defined in HTML 4 standard)
151 | 97 | | | | (not defined in HTML 4 standard)
152 | 98 | | | | (not defined in HTML 4 standard)
153 | 99 | | | | (not defined in HTML 4 standard)
154 | 9A | | | | (not defined in HTML 4 standard)
155 | 9B | | | | (not defined in HTML 4 standard)
156 | 9C | | | | (not defined in HTML 4 standard)
157 | 9D | | | | (not defined in HTML 4 standard)
158 | 9E | | | | (not defined in HTML 4 standard)
159 | 9F | | | | (not defined in HTML 4 standard)
160 | A0 | | &#160; | &nbsp; | non-breaking space
161 | A1 | ¡ | &#161; | &iexcl; | inverted exclamation mark
162 | A2 | ¢ | &#162; | &cent; | cent sign
163 | A3 | £ | &#163; | &pound; | pound sign
164 | A4 | ¤ | &#164; | &curren; | currency sign
165 | A5 | ¥ | &#165; | &yen; | yen sign
166 | A6 | ¦ | &#166; | &brvbar; | broken vertical bar
167 | A7 | § | &#167; | &sect; | section sign
168 | A8 | ¨ | &#168; | &uml; | spacing diaeresis - umlaut
169 | A9 | © | &#169; | &copy; | copyright sign
170 | AA | ª | &#170; | &ordf; | feminine ordinal indicator
171 | AB | « | &#171; | &laquo; | left double angle quotes
172 | AC | ¬ | &#172; | &not; | not sign
173 | AD | | &#173; | &shy; | soft hyphen
174 | AE | ® | &#174; | &reg; | registered trade mark sign
175 | AF | ¯ | &#175; | &macr; | spacing macron - overline
176 | B0 | ° | &#176; | &deg; | degree sign
177 | B1 | ± | &#177; | &plusmn; | plus-or-minus sign
178 | B2 | ² | &#178; | &sup2; | superscript two - squared
179 | B3 | ³ | &#179; | &sup3; | superscript three - cubed
180 | B4 | ´ | &#180; | &acute; | acute accent - spacing acute
181 | B5 | µ | &#181; | &micro; | micro sign
182 | B6 | ¶ | &#182; | &para; | pilcrow sign - paragraph sign
183 | B7 | · | &#183; | &middot; | middle dot - Georgian comma
184 | B8 | ¸ | &#184; | &cedil; | spacing cedilla
185 | B9 | ¹ | &#185; | &sup1; | superscript one
186 | BA | º | &#186; | &ordm; | masculine ordinal indicator
187 | BB | » | &#187; | &raquo; | right double angle quotes
188 | BC | ¼ | &#188; | &frac14; | fraction one quarter
189 | BD | ½ | &#189; | &frac12; | fraction one half
190 | BE | ¾ | &#190; | &frac34; | fraction three quarters
191 | BF | ¿ | &#191; | &iquest; | inverted question mark
192 | C0 | À | &#192; | &Agrave; | latin capital letter A with grave
193 | C1 | Á | &#193; | &Aacute; | latin capital letter A with acute
194 | C2 | Â | &#194; | &Acirc; | latin capital letter A with circumflex
195 | C3 | Ã | &#195; | &Atilde; | latin capital letter A with tilde
196 | C4 | Ä | &#196; | &Auml; | latin capital letter A with diaeresis
197 | C5 | Å | &#197; | &Aring; | latin capital letter A with ring above
198 | C6 | Æ | &#198; | &AElig; | latin capital letter AE
199 | C7 | Ç | &#199; | &Ccedil; | latin capital letter C with cedilla
200 | C8 | È | &#200; | &Egrave; | latin capital letter E with grave
201 | C9 | É | &#201; | &Eacute; | latin capital letter E with acute
202 | CA | Ê | &#202; | &Ecirc; | latin capital letter E with circumflex
203 | CB | Ë | &#203; | &Euml; | latin capital letter E with diaeresis
204 | CC | Ì | &#204; | &Igrave; | latin capital letter I with grave
205 | CD | Í | &#205; | &Iacute; | latin capital letter I with acute
206 | CE | Î | &#206; | &Icirc; | latin capital letter I with circumflex
207 | CF | Ï | &#207; | &Iuml; | latin capital letter I with diaeresis
208 | D0 | Ð | &#208; | &ETH; | latin capital letter ETH
209 | D1 | Ñ | &#209; | &Ntilde; | latin capital letter N with tilde
210 | D2 | Ò | &#210; | &Ograve; | latin capital letter O with grave
211 | D3 | Ó | &#211; | &Oacute; | latin capital letter O with acute
212 | D4 | Ô | &#212; | &Ocirc; | latin capital letter O with circumflex
213 | D5 | Õ | &#213; | &Otilde; | latin capital letter O with tilde
214 | D6 | Ö | &#214; | &Ouml; | latin capital letter O with diaeresis
215 | D7 | × | &#215; | &times; | multiplication sign
216 | D8 | Ø | &#216; | &Oslash; | latin capital letter O with slash
217 | D9 | Ù | &#217; | &Ugrave; | latin capital letter U with grave
218 | DA | Ú | &#218; | &Uacute; | latin capital letter U with acute
219 | DB | Û | &#219; | &Ucirc; | latin capital letter U with circumflex
220 | DC | Ü | &#220; | &Uuml; | latin capital letter U with diaeresis
221 | DD | Ý | &#221; | &Yacute; | latin capital letter Y with acute
222 | DE | Þ | &#222; | &THORN; | latin capital letter THORN
223 | DF | ß | &#223; | &szlig; | latin small letter sharp s - ess-zed
224 | E0 | à | &#224; | &agrave; | latin small letter a with grave
225 | E1 | á | &#225; | &aacute; | latin small letter a with acute
226 | E2 | â | &#226; | &acirc; | latin small letter a with circumflex
227 | E3 | ã | &#227; | &atilde; | latin small letter a with tilde
228 | E4 | ä | &#228; | &auml; | latin small letter a with diaeresis
229 | E5 | å | &#229; | &aring; | latin small letter a with ring above
230 | E6 | æ | &#230; | &aelig; | latin small letter ae
231 | E7 | ç | &#231; | &ccedil; | latin small letter c with cedilla
232 | E8 | è | &#232; | &egrave; | latin small letter e with grave
233 | E9 | é | &#233; | &eacute; | latin small letter e with acute
234 | EA | ê | &#234; | &ecirc; | latin small letter e with circumflex
235 | EB | ë | &#235; | &euml; | latin small letter e with diaeresis
236 | EC | ì | &#236; | &igrave; | latin small letter i with grave
237 | ED | í | &#237; | &iacute; | latin small letter i with acute
238 | EE | î | &#238; | &icirc; | latin small letter i with circumflex
239 | EF | ï | &#239; | &iuml; | latin small letter i with diaeresis
240 | F0 | ð | &#240; | &eth; | latin small letter eth
241 | F1 | ñ | &#241; | &ntilde; | latin small letter n with tilde
242 | F2 | ò | &#242; | &ograve; | latin small letter o with grave
243 | F3 | ó | &#243; | &oacute; | latin small letter o with acute
244 | F4 | ô | &#244; | &ocirc; | latin small letter o with circumflex
245 | F5 | õ | &#245; | &otilde; | latin small letter o with tilde
246 | F6 | ö | &#246; | &ouml; | latin small letter o with diaeresis
247 | F7 | ÷ | &#247; | &divide; | division sign
248 | F8 | ø | &#248; | &oslash; | latin small letter o with slash
249 | F9 | ù | &#249; | &ugrave; | latin small letter u with grave
250 | FA | ú | &#250; | &uacute; | latin small letter u with acute
251 | FB | û | &#251; | &ucirc; | latin small letter u with circumflex
252 | FC | ü | &#252; | &uuml; | latin small letter u with diaeresis
253 | FD | ý | &#253; | &yacute; | latin small letter y with acute
254 | FE | þ | &#254; | &thorn; | latin small letter thorn
255 | FF | ÿ | &#255; | &yuml; | latin small letter y with diaeresis
HTML 4.01, ISO 10646, ISO 8879, Latin extended A and B
Browser support: Internet Explorer > 4, Netscape > 4
Dec | Hex | Symbol | HTML Number | HTML Name | Description
338 | 152 | Œ | &#338; | | latin capital letter OE
339 | 153 | œ | &#339; | | latin small letter oe
352 | 160 | Š | &#352; | | latin capital letter S with caron
353 | 161 | š | &#353; | | latin small letter s with caron
376 | 178 | Ÿ | &#376; | | latin capital letter Y with diaeresis
402 | 192 | ƒ | &#402; | | latin small f with hook - function
8211 | 2013 | – | &#8211; | | en dash
8212 | 2014 | — | &#8212; | | em dash
8216 | 2018 | ‘ | &#8216; | | left single quotation mark
8217 | 2019 | ’ | &#8217; | | right single quotation mark
8218 | 201A | ‚ | &#8218; | | single low-9 quotation mark
8220 | 201C | “ | &#8220; | | left double quotation mark
8221 | 201D | ” | &#8221; | | right double quotation mark
8222 | 201E | „ | &#8222; | | double low-9 quotation mark
8224 | 2020 | † | &#8224; | | dagger
8225 | 2021 | ‡ | &#8225; | | double dagger
8226 | 2022 | • | &#8226; | | bullet
8230 | 2026 | … | &#8230; | | horizontal ellipsis
8240 | 2030 | ‰ | &#8240; | | per thousand sign
8364 | 20AC | € | &#8364; | &euro; | euro sign
8482 | 2122 | ™ | &#8482; | | trade mark sign
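The HTML Number column is simply the decimal code wrapped in "&#" and ";", and the HTML Name column is a fixed lookup. A minimal sketch of both conversions using Python's standard html module follows. The function name entity_forms is our own invention, and Python's lookup table may know names that the table above leaves blank.

```python
# Sketch: derive the "HTML Number" and "HTML Name" forms of a character,
# and convert both forms back. entity_forms is an illustrative helper name.
import html
from html.entities import codepoint2name


def entity_forms(char: str):
    """Return (numeric reference, named reference or None) for one character."""
    code = ord(char)
    name = codepoint2name.get(code)  # e.g. 233 -> "eacute"
    return f"&#{code};", (f"&{name};" if name else None)


print(entity_forms("\u00e9"))  # ('&#233;', '&eacute;')
print(entity_forms("A"))       # ('&#65;', None)

# Both spellings decode to the same character.
print(html.unescape("&#233;") == html.unescape("&eacute;") == "\u00e9")  # True

# Escaping the characters that are special in markup (34, 38, 60 and 62 in the table).
print(html.escape('<a href="x">Tom & Jerry</a>'))
# &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```

Numeric references work for any code, while named references exist only for a limited set of characters, which is why many cells in the Name column are empty.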
More Info:
http://ascii.cl