Localisation and Internationalization

How Gambas deals with localisation and internationalization does not seem to be very clear, so here is a short article about it.

If you have never heard of Localization and Internationalization, I suggest that you read the Wikipedia article on the subject first.

The Language Environment Variables

The behaviour of an international program must be adapted according to:
  • The country of the user.

  • The language of the user.

In Gambas, these two pieces of information are extracted from one of the two following system environment variables: $LC_ALL or $LANG.

The $LANG environment variable is used only if $LC_ALL is not defined. We will only use $LANG from now on in this document.

The syntax of $LANG is the following:
xx_YY.ZZZZ

  • xx is a two-character string in lower case, and represents the user language.

  • YY is a two-character string in upper case, and represents the user country.

  • ZZZZ is usually the system charset.

All characters that follow the xx_YY string are ignored by Gambas.
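
As a rough illustration of the rules above, the following sketch reads the locale from the environment and splits it into its language and country parts. The variable names are illustrative only, and the parsing merely imitates the behaviour described here; it is not the interpreter's actual code.

```gambas
' Minimal sketch: find the active locale and split it into xx and YY.
' Variable names are illustrative, not part of any Gambas API.
Public Sub Main()

  Dim sLocale As String
  Dim iPos As Integer

  ' $LC_ALL takes precedence; $LANG is used only if $LC_ALL is not defined
  sLocale = Env["LC_ALL"]
  If Not sLocale Then sLocale = Env["LANG"]

  ' Drop everything after xx_YY, e.g. the ".UTF-8" charset suffix
  iPos = InStr(sLocale, ".")
  If iPos > 0 Then sLocale = Left$(sLocale, iPos - 1)

  Print "Language: "; Left$(sLocale, 2)
  Print "Country:  "; Mid$(sLocale, 4, 2)

End
```

With LANG=fr_FR.UTF-8 and no LC_ALL, this should report "fr" as the language and "FR" as the country.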

To run a program under a different localization, you can do:
$ cd /path/to/my/project
$ LANG=fr_FR gbx3

Alternatively, you can define the LANG variable in the Environment tab of the project property dialog.

On some systems, like Mandriva, you must set both the $LANG variable and the $LANGUAGE variable to the desired localization. Otherwise strings won't be translated.

$ cd /path/to/my/project
$ LANGUAGE=fr_FR LANG=fr_FR gbx3

String Translations

In your project, you have two different kinds of strings:
  • Strings that are seen by the user (a label text, a message...), and that must therefore be translated.

  • Strings that are internal (a collection key, a field name, a program setting...), and that must not be translated.

You must explicitly tell Gambas which strings must be translated, and which strings must not.

Strings that are to be translated must be enclosed in parentheses:
PRINT "A string that must not be translated"
PRINT ("A string that must be translated")

Note that all Text control properties are automatically marked as translatable.
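
The distinction can be sketched as follows. This is only an illustration: the collection key and the messages are made up, and the use of Subst() to keep a whole sentence in one translatable string is a suggestion, not something this article prescribes.

```gambas
' Minimal sketch: internal strings versus user-visible strings.
' The key name and the messages are hypothetical.
Public Sub Main()

  Dim iCount As Integer = 3
  Dim hSettings As New Collection

  ' Internal key: no parentheses, never translated
  hSettings["last-count"] = iCount

  ' User-visible text: the parentheses mark it as translatable
  Print ("Copy finished.")

  ' Keep the whole sentence in one translatable string and let
  ' Subst() fill in the placeholder, so translators can reorder words
  Print Subst(("&1 files were copied."), Str$(iCount))

End
```

Building the sentence by concatenating several small translated fragments would force every language into the English word order, which is why a single string with a placeholder is usually easier to translate.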

Gambas internally uses the GNU translation system: the translated strings are stored in xx_YY.mo files, where xx and YY have the same definition as above.

If a specific translation file for the xx_YY language cannot be found, then a more general translation file xx.mo is tried.

For example, if the fr_CA (Quebec French) translation file does not exist, then the fr (Common French) language will be used.
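
The fallback order described above can be sketched like this. GetCandidates is a hypothetical helper written for this article; it only builds the list of file names in the order they would be tried and does not reflect how the interpreter locates the files on disk.

```gambas
' Hypothetical helper: list the translation files to try, most specific first.
Private Function GetCandidates(sLocale As String) As String[]

  Dim aResult As New String[]
  Dim iPos As Integer

  ' First choice: the exact xx_YY translation
  aResult.Add(sLocale & ".mo")

  ' Fall back from xx_YY to the more general xx
  iPos = InStr(sLocale, "_")
  If iPos > 0 Then aResult.Add(Left$(sLocale, iPos - 1) & ".mo")

  Return aResult

End

Public Sub Main()

  Print GetCandidates("fr_CA").Join(", ")   ' prints: fr_CA.mo, fr.mo

End
```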

If the GNU translation functions cannot switch to the language stored in the environment variables, for any reason, then the strings won't be translated and the date and numeric localization settings may not be honored.

In that case, the Gambas interpreter will print a warning on the console:

gbx3: warning: cannot switch to language 'en_EN.UTF-8': No such file or directory. Did you install the corresponding locale packages?

String Manipulations

Gambas has two sets of string manipulation functions: those that deal with ASCII only, and those that can deal with UTF-8 strings.

Each time you have to manipulate a string that must be translated, you must use the UTF-8 functions. These functions are all members of the static String class.

Otherwise, I suggest you always use ASCII characters and ASCII native functions, because they are faster.

The ASCII native functions are: Asc, Chr$, InStr, RInStr, LCase$, Left$, Right$, Mid$, Len, LTrim$, RTrim$, Trim$... See String Functions for all of them.
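
The difference shows up as soon as a string contains a non-ASCII character. The sketch below assumes the text is encoded in UTF-8, where "É" and "é" each take two bytes.

```gambas
' Minimal sketch: byte-based functions versus the UTF-8 aware String class.
Public Sub Main()

  Dim sText As String = "Été"

  Print Len(sText)              ' 5: the native function counts bytes
  Print String.Len(sText)       ' 3: the String class counts characters

  ' Left$(sText, 1) would return only the first byte of "É",
  ' which is not a valid character on its own.
  Print String.Left(sText, 1)   ' the whole first character
  Print String.Mid(sText, 2, 1) ' the second character

End
```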

String Conversions

You often have to convert a Gambas number or date to a string, and vice versa.

You must then know whether the string that represents the numeric value or the date will be seen by the user. If it will, the numeric value or the date must be formatted according to the user's country and language localization settings, so you must use the Gambas functions that use localization settings.

These functions are: Str$, and Val.

Str$() converts any value to a localized string, whereas Val() tries to convert any string to a specific value by guessing the value datatype from the string contents: for example, if Val() sees date separators or time separators, according to the localization settings, it will assume that the string must be converted to a date.
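
The guessing done by Val() can be observed with a small experiment. This is only a sketch: the sample strings are arbitrary, and the exact result depends on the current localization settings.

```gambas
' Minimal sketch: let Val() guess the datatype of a few localized strings.
Public Sub Main()

  Dim vValue As Variant
  Dim sInput As String

  ' Str$() is used to build inputs that match the current locale
  For Each sInput In ["42", Str$(3.14), Str$(Now), "not a value"]

    vValue = Val(sInput)

    If IsNull(vValue) Then
      Print sInput; " -> not recognized"
    Else If TypeOf(vValue) = gb.Date Then
      Print sInput; " -> date"
    Else
      Print sInput; " -> number"
    Endif

  Next

End
```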

Otherwise, if the string representation of the numeric value or the date will not be seen by the user (an application setting stored in a text file, for example), then you must use the Gambas functions that do not use localization settings.

These functions are: CBool, CByte, CDate, CFloat, CInt, CLong, CSingle, CShort, CStr.

Actually, these functions use a default localization setting named "C", which behaves essentially like "en_US", i.e. the localization of people living in the United States of America. That is the main reason for the confusion, I think!

You will use CStr() to convert to a string, and all other functions to convert from a string to a specific datatype. The conversion should be fully bijective, i.e. CStr(CXXX(anyString)) = anyString, and CXXX(CStr(anyValue)) = anyValue. Beware, there is no full guarantee!
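
The two families of functions can be compared side by side. The sketch below is illustrative only; the displayed string depends on the locale the program runs under (for example, a decimal comma under fr_FR).

```gambas
' Minimal sketch: localized conversion for the user,
' locale-independent conversion for storage.
Public Sub Main()

  Dim fValue As Float = 1234.5
  Dim sShown As String
  Dim sStored As String

  sShown = Str$(fValue)    ' follows the current localization settings
  sStored = CStr(fValue)   ' always uses the "C" settings

  Print "Shown to the user: "; sShown
  Print "Written to a file: "; sStored

  ' Read each string back with the matching function
  Print Val(sShown) = fValue
  Print CFloat(sStored) = fValue

End
```

Mixing the two families, for instance reading a Str$() result with CFloat(), is the usual cause of settings files that work on one computer and fail on another.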

Date Conversions

To print a date/time using the current localization, you have to use the Format$ function with one of the date/time pre-defined formats.
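
For instance, the following sketch prints the current date and time with several of the pre-defined formats. Running it under different $LANG values shows how the output follows the localization.

```gambas
' Minimal sketch: the same date printed with pre-defined localized formats.
Public Sub Main()

  Dim dNow As Date = Now

  Print Format$(dNow, gb.ShortDate)
  Print Format$(dNow, gb.MediumDate)
  Print Format$(dNow, gb.LongDate)
  Print Format$(dNow, gb.ShortTime)
  Print Format$(dNow, gb.LongTime)
  Print Format$(dNow, gb.GeneralDate)

End
```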

If none of the pre-defined formats fit your needs, then you have to create your own by:

1) Determining the order of date and time components by analysing the formatting of a cleverly chosen date with gb.ShortDate and gb.ShortTime.

2) Using the "/" and ":" formatting characters in the format string.

Example

' Return a custom localized date format
' Doing the same thing for a time format is similar

Private Sub GetMyDateFormat() As String

  Dim aFormat As String[] = ["dd", "mmmm", "yyyy"]
  Dim sFormat As String
  Dim sDate As String
  Dim sChar As String
  Dim I, J As Integer

  ' Year 3333, month 2, day 11: each component prints with its own digit (1, 2 or 3)
  sDate = Format(Date(3333, 2, 11), gb.ShortDate)

  For I = 1 To Len(sDate)
  
    sChar = Mid$(sDate, I, 1)
    If IsDigit(sChar) Then
      J = Asc(sChar) - Asc("1")
      If J < 0 Or If J > 2 Or If Not aFormat[J] Then Continue
      If sFormat Then sFormat &= "/"
      sFormat &= aFormat[J]
      aFormat[J] = ""
    Endif
  
  Next

  Return sFormat

End
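
A possible way to use the function, assuming GetMyDateFormat() is declared in the same module as this Main procedure:

```gambas
' Usage sketch: relies on GetMyDateFormat() being defined in the same module.
Public Sub Main()

  ' The "/" characters in the returned format are replaced
  ' by the date separator of the current localization
  Print Format$(Now, GetMyDateFormat())

End
```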

GUI Localization

A localization difficulty comes from right-to-left written languages: Arabic, Farsi, Hebrew. The user interface must reverse some of its components to follow the writing direction.

That should be automatic, as soon as you always use layout containers like HBox, VBox, HPanel and VPanel, or containers that have their Arrangement property set.

The other parts of the GUI (menus, labels, widget scrollbars...) should adapt, provided that the underlying toolkit (Qt or GTK+) behaves correctly.

To try a reversed GUI, you can set the GB_REVERSE environment variable before starting your program. Please report anything that does not display correctly.
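
If your own code needs to know the writing direction, for example to position something by hand, a check along these lines may help. It is a sketch that assumes the System.RightToLeft property is available in your Gambas version.

```gambas
' Minimal sketch: query the writing direction of the current language.
' Assumes System.RightToLeft exists in the Gambas version in use.
Public Sub Main()

  If System.RightToLeft Then
    Print "Right-to-left layout is active"
  Else
    Print "Left-to-right layout is active"
  Endif

End
```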

Example

Here is what you see if you run LANG=fa LANGUAGE=fa gambas2 on my Mandriva:
