Conv$
ConvertedString = Conv$ ( String AS String , SourceCharset AS String , DestinationCharset AS String ) AS String
ConvertedString = Conv ( String AS String , SourceCharset AS String , DestinationCharset AS String ) AS String
Converts a string from one charset to another charset. A charset is represented by a string like "ASCII", "ISO-8859-1", or "UTF-8".
The Gambas interpreter internally uses the UTF-8 charset.
The charset used by the system is returned by System.Charset. It was ISO-8859-15 on Mandrake 10.2, but most current Linux systems are UTF-8 based.
The charset used by the graphical user interface is returned by Desktop.Charset. It should always be UTF-8.
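As a minimal sketch of the points above, the following module prints the system charset and brings a non-UTF-8 string into the interpreter's internal UTF-8 representation. The variable name sLatin and its sample bytes are hypothetical, chosen only for illustration.

```gambas
' Sketch only: sLatin and its contents are made-up sample data.
Public Sub Main()

  Dim sLatin As String

  ' The word "été" encoded in ISO-8859-1 (one byte per accented letter)
  sLatin = "\xE9t\xE9"

  ' Show which charset the system reports
  Print "System charset: "; System.Charset

  ' Convert to UTF-8, the charset the interpreter uses internally
  Print Conv$(sLatin, "ISO-8859-1", "UTF-8")

End
```

Data that is already UTF-8 needs no conversion before being handled by Gambas string functions that expect UTF-8.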
Note that not all combinations of encoding names can be used for the SourceCharset and DestinationCharset parameters, and that a coded character set can have several aliases.
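Because some charset combinations are rejected, it can help to guard a conversion. The sketch below assumes that an unsupported combination raises an ordinary Gambas error that Try can intercept. The charset name "NO-SUCH-CHARSET" is deliberately invalid and purely hypothetical.

```gambas
' Sketch only: assumes a failed conversion raises a catchable error.
Public Sub Main()

  Dim sResult As String

  ' "NO-SUCH-CHARSET" is a made-up name used to provoke a failure
  Try sResult = Conv$("Gambas", "UTF-8", "NO-SUCH-CHARSET")

  If Error Then
    ' Error.Text holds the message of the last intercepted error
    Print "Conversion failed: "; Error.Text
  Else
    Print sResult
  Endif

End
```

Checking Error immediately after Try keeps the failure local to the conversion instead of aborting the whole procedure.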
The conversion uses the GNU iconv() library function and can convert, among many other encodings, Turkish (ISO-8859-9), Korean (EUC-KR), Simplified Chinese (GB2312), Arabic (WINDOWS-1256), Cyrillic (KOI8-R), and Japanese (ISO-2022-JP) text into human-readable UTF-8.
For a full list of supported conversions, type iconv -l in a terminal.
The returned string is internally terminated by a single null byte. That is not enough if you pass the string to an extern function expecting, for example, a null-terminated wide-character string (wchar_t in C), which must end with four null bytes where wchar_t is four bytes wide, as it is on Linux.
To work around that problem, append a null byte to the string before converting it:
Print Conv(MyString & Chr$(0), "UTF-8", "WCHAR_T")
That way, you will be sure that the string is null-terminated in the target character set.
Errors
Examples
DIM sStr AS String
DIM iInd AS Integer
sStr = Conv$("Gambas", "ASCII", "EBCDIC-US")
FOR iInd = 1 TO Len(sStr)
  PRINT Hex$(Asc(Mid$(sStr, iInd, 1)), 2); " ";
NEXT
PRINT Conv$("\xE7\xC1\xCD\xC2\xC1\xD3\x20\xD0\xCF\xDE\xD4\xC9\x20\xCF\xDA\xCE\xC1\xDE\xC1\xC5\xD4\x20\xC2\xC5\xCA\xD3\xC9\xCB","KOI8-R","UTF-8")
Гамбас почти означает бейсик
See also