Gambas Documentation

Tokenize

Tokens = Tokenize ( String [ , Identifiers , Strings , Operators , KeepSpace ] )

Since 3.21

Split a string into tokens and return them.

Arguments

  • String : the string to split.

  • Identifiers : a string of extra characters allowed in identifier tokens.

  • Strings : an array of strings, each string describing the limits of a string token.

  • Operators : an array of strings, each string representing an operator token.

  • KeepSpace : whether space tokens are returned.

Return value

The tokens are returned as a string array.

Description

This function is a simple lexical parser that splits a string into tokens and returns them as a string array. The following kinds of tokens are recognized:

  • Space tokens

    A space token is made of successive space or tab characters.

  • Newline tokens

    A newline token is made of one newline character.

  • Number tokens

    A number token is made of successive digit characters.

  • Identifier tokens

    An identifier token starts with a letter, followed by any run of letters, digits, or extra characters specified in the Identifiers argument.

    If Identifiers is not specified, only letters and digits are allowed.

  • String tokens

    Each string of the Strings array describes the delimiters of one kind of string token.

    • If the description is made of one character, then both the initial and the final delimiter are that character. If two successive delimiter characters are encountered inside the string, only one of them is kept, and it does not end the string.

    • If the description is made of two characters, then the first one is the initial delimiter and the second one is the final delimiter. The final delimiter cannot be escaped.

    • If the description is made of three characters, then the first one is the initial delimiter and the second one is the final delimiter. The final delimiter can be escaped by preceding it with the third character.

    If Strings is not specified, then no string token is parsed.

    For example, ["\"", "''\\", "[]"] parses as string tokens everything enclosed in double quotes, single quotes, or square brackets. Strings enclosed in double quotes allow "double quoting" (a doubled delimiter stands for one literal double quote), those enclosed in single quotes allow the ' character to be escaped with a backslash, and those enclosed in square brackets allow no escape.

  • Operator tokens

    The Operators argument is an array of strings, each of which is parsed as a single token.

    Every character that is not parsed as part of a space, newline, number, identifier or string token is returned as a single-character token, so Operators usually only needs to contain operators made of several characters. For example, <=, >=, &&, and so on.

Token kinds are tried in the order in which they are listed above.

So text that is parsed as an identifier cannot be parsed as an operator. In other words, if you specify something like "X->" in the Operators argument, it will never match, because "X" is recognized as an identifier first.
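
The effect of the Operators argument can be sketched with a small program. The expression, the operator list and the Main procedure are made up for illustration, and no output is shown because it has not been verified against a running interpreter. Per the rules above, <=, && and -> should each come back as one token instead of two single-character tokens.

```gambas
' Illustrative sketch: the input string and operator list are assumptions.
Public Sub Main()

  Dim sExpr As String = "a <= b && c->d"
  Dim aTokens As String[]

  ' Identifiers and Strings are skipped, so only Operators is set.
  aTokens = Tokenize(sExpr, , , ["<=", "&&", "->"])
  Print aTokens.Join(" _ ")

  ' "c->" would be useless as an operator here: "c" starts with a letter,
  ' so it is taken as an identifier before operators are considered.

End
```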

Because all tokens are returned as strings, the result does not record the kind of each token. In practice this should rarely matter.
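
If the kind of a token is needed anyway, it can usually be guessed from the first character of the token. The following is a minimal sketch, not part of Gambas: GuessKind is a hypothetical helper, and it assumes that Strings was ["\""], so that a string token is recognizable by its leading double quote.

```gambas
' Hypothetical helper: guesses the kind of a token from its first character.
' Assumes the only string delimiter in use is the double quote.
Private Function GuessKind(sToken As String) As String

  Dim sFirst As String = Left$(sToken, 1)

  ' Test for newline first, since a newline may also count as a space character.
  If sFirst = "\n" Then Return "newline"
  If IsSpace(sFirst) Then Return "space"
  If IsDigit(sFirst) Then Return "number"
  If IsLetter(sFirst) Then Return "identifier"
  ' A lone double quote is just a single-character token, not a string token.
  If sFirst = "\"" And Len(sToken) > 1 Then Return "string"
  Return "operator"

End

Public Sub Main()

  Dim sToken As String

  ' The input expression is made up for illustration.
  For Each sToken In Tokenize("Total = Price * 2 + \"tax\"", , ["\""])
    Print sToken; " : "; GuessKind(sToken)
  Next

End
```

Single-character tokens and multi-character operators both fall through to "operator" in this sketch, since nothing in the returned string distinguishes them.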

Examples

Print Tokenize("Return Subst((\"&1 MiB\"), FormatNumber(Size / 1048576))").Join(" _ ")
Return _ Subst _ ( _ ( _ " _ & _ 1 _ MiB _ " _ ) _ , _ FormatNumber _ ( _ Size _ / _ 1048576 _ ) _ )

Print Tokenize("Return Subst((\"&1 MiB\"), FormatNumber(Size / 1048576))",, ["\""]).Join(" _ ")
Return _ Subst _ ( _ ( _ "&1 MiB" _ ) _ , _ FormatNumber _ ( _ Size _ / _ 1048576 _ ) _ )
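
The remaining arguments can be exercised in the same way. This is a sketch under assumptions: the input strings are invented, " | " is used as the join separator only so that it does not get confused with underscores in the tokens, and the output is left out because it has not been checked against a running interpreter.

```gambas
' Illustrative sketch: all input strings are assumptions.
Public Sub Main()

  ' Identifiers = "_" allows underscores inside identifier tokens.
  ' An identifier must still start with a letter.
  Print Tokenize("my_var = other_var2", "_").Join(" | ")

  ' Two string descriptions: single quotes with backslash escape (three characters),
  ' and square brackets with no escape (two characters).
  Print Tokenize("say 'it\\'s' [a b]", , ["''\\", "[]"]).Join(" | ")

  ' KeepSpace = True asks for space tokens to be returned as well,
  ' so the token count should be larger than without it.
  Print Tokenize("a  b", , , , True).Count
  Print Tokenize("a  b").Count

End
```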
