Tokenize
Tokens = Tokenize ( String [ , Identifiers , Strings , Operators , KeepSpace ] )
Split a string into tokens and return them.
- String is the string to split.
- Identifiers is a string of extra characters allowed in identifier tokens.
- Strings is an array of strings, each string describing the delimiters of a string token.
- Operators is an array of strings, each string representing an operator token.
- KeepSpace tells whether space tokens are returned.
This subroutine splits the string and returns the following kinds of tokens: space, newline, number, identifier, string and operator tokens.
They are parsed in that order of priority.
Space tokens
A space token is made of successive space or tab characters.
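As a minimal sketch of how the KeepSpace argument affects space tokens, the calls below tokenize a made-up input string with and without it. The three optional arguments before KeepSpace are skipped by leaving them empty. Going by the description above, the first call would be expected to return two tokens and the second three, since the run of two spaces and a tab forms a single space token; this has not been checked against an actual run.

```gambas
' Hypothetical input: two spaces, then a tab, between the two words.
' By default, space tokens are not returned.
Print Tokenize("Dim  \tCount").Count

' Identifiers, Strings and Operators are left empty; KeepSpace is set.
Print Tokenize("Dim  \tCount",,,, True).Count
```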
Newline tokens
A newline token is made of one newline character.
Number tokens
A number token is made of successive digit characters.
Identifier tokens
An identifier starts with a letter, and is made of any successive letters, digits, or extra characters specified in the Identifiers argument.
If Identifiers is not specified, only letters and digits are allowed.
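The effect of the Identifiers argument can be sketched with a made-up input containing an underscore. By the rules above, the first call would be expected to split max_size at the underscore (the underscore being returned as a single-character token), while the second should keep max_size as one identifier token. Treat this as an illustration of the documented rules and not as verified output.

```gambas
' Hypothetical input; without Identifiers, "_" is not part of an identifier.
Print Tokenize("max_size = 10").Join(" | ")

' "_" is passed as an extra character allowed in identifier tokens.
Print Tokenize("max_size = 10", "_").Join(" | ")
```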
String tokens
Each string of the Strings array describes the delimiters of a string token.
The first character of the description string is the starting character of the string, for example ' or ".
The second character of the description string is the ending character of the string.
If the second character is not specified, then:
- The string token ends with the same character as the starting one.
- The backslash character \ escapes the character that follows it, so that this character is not interpreted as the ending character.
If Strings is not specified, then no string token is parsed.
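A short sketch of both forms of description string, using a made-up input: "<>" declares a string token that starts with < and ends with >, while "'" declares one that starts and ends with a single quote. Under the rules above, <a b> and 'c d' would each be expected to come back as one token, with the space inside them preserved. The input and the choice of delimiters are assumptions made for illustration only.

```gambas
' Hypothetical input with two kinds of string delimiters.
' "<>" : starts with "<" and ends with ">".
' "'"  : starts and ends with "'".
Print Tokenize("Concat(<a b>, 'c d')",, ["<>", "'"]).Join(" | ")
```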
Operator tokens
The Operators argument is an array of strings, each of which is parsed as a single token.
Since any character that is not parsed as part of a space, newline, number, identifier or string token is returned as a single-character token, Operators usually only needs to contain operators made of multiple characters.
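To illustrate, the sketch below tokenizes a made-up expression twice. Without Operators, each of <, =, ! and & would be expected to come back as its own single-character token; with the multi-character operators listed, <=, != and && should each be returned as one token. The expression and operator list are hypothetical.

```gambas
' Hypothetical expression; no Operators, so "<=" is split into "<" and "=".
Print Tokenize("a <= b && c != d").Join(" | ")

' Identifiers and Strings are left empty; multi-character operators are listed.
Print Tokenize("a <= b && c != d",,, ["<=", "!=", "&&"]).Join(" | ")
```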
Examples
Print Tokenize("Return Subst((\"&1 MiB\"), FormatNumber(Size / 1048576))").Join(" | ")
Return | Subst | ( | ( | " | & | 1 | MiB | " | ) | , | FormatNumber | ( | Size | / | 1048576 | ) | )
Print Tokenize("Return Subst((\"&1 MiB\"), FormatNumber(Size / 1048576))",, ["\""]).Join(" | ")
Return | Subst | ( | ( | "&1 MiB" | ) | , | FormatNumber | ( | Size | / | 1048576 | ) | )
See also