The Scala Expression Language
Contents
Introduction
General Syntax
Data Types
Variables
Unary Operators
Binary Operators
Built-In Functions
Flow Control
Functions
Java Reflection
Keywords
The Scala Expression Language (Sel) is an expression language
which can be interpreted by the Scala Expression Engine (See),
a Java(tm) library that has been written in Scala.
It parses mathematical expressions given as strings and evaluates
them either once, immediately, or repeatedly later on.
Possible usage scenarios:
- Parse numerical user input in an expressive manner, e.g. allow
"1/16" or "1/2**4" instead of the rather meaningless number "0.0625"
- Parse configuration files that may contain interdependencies
between different settings. Such dependencies may be expressed by
variables.
E.g. assume some XML content: <DefX>x = 1</DefX> ...
<Param1> x + 1 </Param1> ... <Param2> x + 2
</Param2>
- Allow configuration files to contain formulas, e.g. for sequence
generation or data checks like this:
<Generator> defined(x) ? x += PI / 16 : x = 0; sin(x)
</Generator>
<Check> x GT 0 AND x LT 100 </Check>
Sel is not intended to be used for high performance math calculations.
Those should probably be written in some native language anyway.
This document describes the Sel syntax itself. To get information about
integration into Java or other JVM languages, please consult the See
API documentation which may be found here.
Expression processing always consists of two steps:
- An input string is parsed into a node tree. The parser returns
the root node of the produced tree if there were no parsing errors.
Otherwise a RuntimeException is thrown.
- The node tree is evaluated to yield some kind of result. If a
problem occurs during this step, the evaluator either produces an
error result or throws a RuntimeException, depending on
the evaluation method.
See provides a number of convenience methods to combine both steps into
one and to enforce a certain result type, e.g. boolean or double.
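The two-step model (parse once, evaluate as often as needed) can be illustrated with a small Python sketch. It uses Python's own parser rather than See, so the function names `parse` and `evaluate` are hypothetical and the error type differs (Python raises SyntaxError where See would throw a RuntimeException). It only shows the shape of the workflow.

```python
import ast

def parse(expression: str):
    """Step 1: turn the input string into a tree; raises SyntaxError on bad input."""
    tree = ast.parse(expression, mode="eval")
    return compile(tree, "<expression>", "eval")

def evaluate(node, bindings: dict):
    """Step 2: evaluate the parsed tree against a set of variable bindings."""
    # Builtins are disabled so that only the supplied bindings are visible.
    return eval(node, {"__builtins__": {}}, dict(bindings))

# Parsed once ...
node = parse("x + 1")
# ... and evaluated repeatedly with different bindings.
results = [evaluate(node, {"x": x}) for x in (1, 2, 3)]

# Parse and evaluate in one go, similar in spirit to a convenience method.
fraction = evaluate(parse("1/2**4"), {})
```

Here `results` holds one value per binding and `fraction` is the numeric value of the user-friendly input "1/2**4".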
As Sel is a formally complete programming language, the halting problem
applies. This means it is possible to provide input which causes
evaluation to take infinite time. This may not be a major problem for
most applications, but there can be situations where it matters.
Therefore, See provides an alternate parser that rejects all constructs
which may cause infinite loops. In practice, this means that variable
support has to be dropped, because even the evaluation of a variable
may lead to endless recursion. To create a See context with this
alternate parser use "See.createConst()". As a side effect, a node
returned by the alternate parser is guaranteed to evaluate in constant
time and will always yield the same result, independent of the context
that is used for evaluation. There is no point in using the alternate
parser for performance reasons. The difference in parsing speed is
negligible.
General Syntax
The Sel parser expects a sequence of statements as input. A statement
can be e.g. a math expression, a variable assignment or a block
definition. Statements must be separated by semicolons
(';'). The semicolon following the last statement is optional.
Since Sel is an expression language, some constructs are expressed through
operators that most other languages implement using keywords. This results
in rather concise but often somewhat cryptic statements.
C-style comments are supported (both line and block comments).
Whitespace and line breaks do not matter in general, but are required in some
places to avoid operator ambiguities.
E.g. 1++2 is parsed as concatenation and produces the vector (1,2),
while 1 + +2 is parsed as addition with a prefix plus and yields 3.
A Sel program is therefore (with some simplifications) defined like this:
program ::= stmts
stmts ::= [statement [';' statement]]
statement ::= fdef | [variable assignOp] expr
fdef ::= function definition according to Functions
block ::= '{' stmts '}'
expr ::= operand [binOp operand]
operand ::= [unaryOp] atom
atom ::= literal | variable | "(" statement ")" | block
assignOp ::= assignment operator from Binary Operators
binOp ::= any binary operator from Binary Operators
unaryOp ::= any unary operator from Unary Operators
literal ::= any Sel literal, see Data Types
variable ::= a valid variable name according to Variables
Examples:
Input | Evaluation Result
1 | 1
1 + 1 | 2
1 + 1; 3 + 4 | 7 (the result of the first statement is discarded)
x - 2 | Depends on x; error if x is undefined
x = 1;
y = 2;
// y = 3;
r = (x + y) /* + z */ ;
q = r **2;
q | 9
Data Types
Sel supports a number of data types that may be returned as result of
some (possibly intermediate) evaluation step. These types fall into two
broad categories:
- Value types may act as regular operands. They can be created
using some kind of literal form and may be converted into a
reparsable string representation.
- Nonvalue types are the result of some evaluation step that
does not yield a regular value. Values of such types cannot be
created explicitly.
In addition, Sel supports some kinds of container types which
are meant to hold instances of some other type.
Whether a container may act as value type depends on its content.
Value types
Bool
Represents a boolean value, which may either be true or false.
Any other type may be forced into a Bool, independent of its value.
Conversion is done automatically whenever an operation requires a Bool
operand. In all other cases, it can be forced using the built-in
function bool().
Literal forms:
true (case insensitive)
false (case insensitive)
Bool conversion:
not necessary
String conversion:
true => "true"
false => "false"
Java equivalent:
java.lang.Boolean
String
Represents a sequence of Unicode characters of arbitrary length.
Conversion can be forced through the built-in function str().
To prevent ambiguities, no automatic
conversion is done except if explicitly noted for an operator.
Literal forms:
Any characters surrounded by double
quotes ("). If the String shall contain a " character itself, it has to
be escaped with '\'.
String escapes adhere to the Java rules, so other sequences like
Unicode and control characters work as well.
Example: "A string", "A \"quoted\" string."
Bool conversion:
false, if string is empty,
otherwise true
String conversion:
not necessary
Java equivalent:
java.lang.String
Regex
Represents a regular expression that may be used to match
strings or other types (which are forced into strings for the match).
Apart from match operations, a regex value behaves much like a String.
Conversion can be forced through the built-in function regex().
Literal forms:
Any characters surrounded by single
quotes ('). If the Regex shall contain a ' character itself, it has to
be escaped with '\'. Other than that, no characters need to be escaped
and the backslash may be used to form the ordinary regex expressions.
The regex syntax closely follows the rules that apply to
regular expressions in Java.
Example:
'\d+', '[a-zA-Z]', '"|\''
The expressions which Sel uses to parse value literals are available
through the following built-in constants:
- REGEX_DECIMAL
- Matches a decimal integer, either Int or BigInt.
- REGEX_FLOAT
- Matches a floating point number, either Real or BigReal.
- REGEX_BINARY
- Matches a binary integer, either Int or BigInt.
- REGEX_HEX
- Matches a hexadecimal integer, either Int or BigInt.
- REGEX_STRING
- Matches a string.
- REGEX_REGEX
- Matches a regular expression.
- REGEX_NAME
- Matches a function or variable name.
- REGEX_TRUE
- Matches the boolean value true.
- REGEX_FALSE
- Matches the boolean value false.
Bool conversion:
false, if empty,
otherwise true
String conversion:
regex
Java equivalent:
java.util.regex.Pattern
Int
Represents a signed integral number with a resolution of 64 bits.
Int values support a number of operations that make no sense for
floating point types. No automatic conversion is performed.
Note that operations upon Int values are susceptible to range overflows,
e.g. 9223372036854775807 + 1 will yield -9223372036854775808.
In some cases, this is even the expected behaviour.
If you want to avoid range overflow in any case, use BigInt instead.
Literal forms:
decimal, e.g. 0, 1, -1, 12346, -111111111111
binary, with 0b prefix, e.g. 0b0, 0b1, 0b1111, 0b1_111, 0b1111_1111_11
hexadecimal, with 0x prefix, e.g. 0x0, 0x1,
0xf, 0x11_ff, 0xffff_ABCD
Both binary and hexadecimal forms may contain '_' for better readability
after the first digit.
Binary and hex literals produce negative numbers if they result in a
number with bit 63 set.
Bool conversion:
true, if > 0
false, if <= 0
String conversion:
decimal representation of contained
value, reparsable
Java equivalent:
java.lang.Long
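The overflow and sign behaviour described for Int follows from 64-bit two's complement arithmetic. The following Python sketch models it. Python integers are unbounded, so the wrap-around is applied explicitly. The helper names `wrap_int` and `parse_bits_literal` are hypothetical and not part of Sel or See, and the sketch only handles the binary and hexadecimal literal forms.

```python
INT_BITS = 64

def wrap_int(n: int) -> int:
    """Reduce an unbounded integer to a signed 64-bit value (two's complement)."""
    half = 1 << (INT_BITS - 1)
    return (n + half) % (1 << INT_BITS) - half

def parse_bits_literal(text: str) -> int:
    """Parse a 0b or 0x literal; '_' separators are ignored."""
    lowered = text.lower()
    if lowered.startswith("0b"):
        digits, base = lowered[2:], 2
    elif lowered.startswith("0x"):
        digits, base = lowered[2:], 16
    else:
        raise ValueError("only binary and hexadecimal literals are modelled")
    return wrap_int(int(digits.replace("_", ""), base))

# Adding 1 to the largest Int wraps around to the smallest Int.
overflowed = wrap_int(9223372036854775807 + 1)

# A hex literal with bit 63 set is read as a negative number.
all_bits_set = parse_bits_literal("0xffff_ffff_ffff_ffff")
small_hex = parse_bits_literal("0x11_ff")
```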
Real
Represents a 64-bit floating point number according to IEEE 754.
Int values are automatically propagated for any operation that expects
a Real operand. Also, a binary operator will convert an Int operand to
Real, if the other operand is Real.
Literal forms:
regular: 0.0, -0.1, .1, 12345., 123.456
scientific: 1e5, 0.E5, .1e-1, 1.0e+3
Bool conversion:
true, if > 0.0
false, if <= 0.0
String conversion:
Regular or scientific representation,
whichever is shorter, reparsable
Java equivalent:
java.lang.Double
BigInt
Represents a signed integral number with arbitrary precision. The
BigInt type is implemented in terms of Java BigInteger. So the same
restrictions apply.
BigInt values support a number of operations that make no sense
for floating point types. Automatic conversion from Int is performed
wherever necessary.
Literal forms:
decimal, e.g. 0L, 1L, -1L, 12346L,
1234567890123456789012345678
binary, e.g. 0b0L, 0b1L, 0b1111L, 0b1_111L, 0b1111_1111_11L
hexadecimal, e.g. 0x0L, 0x1L, 0xfL, 0x11_ffL, 0xffff_ABCDL,
0x1_0000_0000_0000_0000
Both binary and hexadecimal forms may contain '_' for better readability
after the first digit.
Binary and hex literals always produce positive numbers.
Note that BigInt values are used automatically if an Int cannot hold
the parsing result, even if no L suffix is given.
Bool conversion:
true, if > 0
false, if <= 0
String conversion:
decimal representation of contained
value, reparsable (always with L suffix)
Java equivalent:
java.math.BigInteger
BigReal
Represents a floating point number with arbitrary precision. The
BigReal type is implemented in terms of Java BigDecimal, so the same
restrictions apply.
In particular, the decimal exponent must not exceed 9999999999.
Int and BigInt values are automatically propagated for any operation
that expects a Real operand. Also, a binary operator will convert any Int, BigInt or
Real operand to BigReal, if the other operand is BigReal.
Most mathematical functions like sin, sqrt, etc. cannot produce results of unlimited precision.
Applying such a function to a BigReal value causes reduction to Real before the
operation takes place. The resulting value will also be of type Real.
Note that in such cases a range overflow may happen.
Literal forms:
regular: 0.0, -0.1, .1, 12345., 123.456
scientific: 1e5, 0.E5, .1e-1, 1.0e+3
Bool conversion:
true, if > 0.0
false, if <= 0.0
String conversion:
Regular or scientific representation,
whichever is shorter, reparsable
Java equivalent:
java.math.BigDecimal
Symbol
A symbol is a constant identifier that represents some Sel type.
You cannot do much with a symbol except using it within type checks.
Nevertheless it is a value that may be assigned to a variable or
compared against some other symbol. Most other operations will
fail upon symbols.
Literal forms:
- Value
- Any value type.
- Scalar
- Any scalar type. String and Regex are regarded as scalars.
- Comparable
- Any type that supports relational operators.
- Number
- Any numeric type.
- Integral
- Some kind of integer (Int or BigInt).
- Int
- Integer as explained above.
- Real
- Floating point as explained above.
- BigInt
- Big Integer as explained above.
- BigReal
- Big Decimal as explained above.
- Bool
- Boolean as explained above.
- String
- String as explained above.
- Regex
- Regular expression as explained above.
- Closure
- A closure generated from a function or a block.
- Anonymous
- Anonymous closure created from a block.
- Function
- Closure created from a function.
- Any
- An object unknown to Sel.
- Null
- The null value.
- Container
- Some kind of container.
- Vector
- A Vector.
- Map
- A Map.
- Table
- A Table.
- Assoc
- An association.
Bool conversion:
false
String conversion:
Literal string containing the symbol name.
Java equivalent:
none, not converted
Nonvalue types
Function
A Closure that represents a function which was defined within a
certain scope.
To evaluate a function, it must be called with some argument list.
A function may also be assigned to some variable, but otherwise
cannot be used as operand (although functions may accept other
functions as parameters).
Bool conversion:
always false
String conversion:
Function description (not reparsable)
Java equivalent:
none, not converted
Anonymous
An Anonymous is a closure that will be created as result of a block
evaluation.
You will hardly encounter an Anonymous value in its free form, as
these constructs are collapsed whenever a value is required.
If an Anonymous has been assigned to some variable, an explicit type check
is the only way to distinguish it from the result it will produce.
See tries rather hard to defer collapsing an Anonymous as long as
possible, so any mentioned variable references will see the actual values.
Bool conversion:
Evaluation result
String conversion:
Evaluation result
Java equivalent:
none, not converted
Any
A See binding may hold any kind of Java object. However, any Sel
operation (except reflection) that encounters such an operand type
will return an Error result.
Bool conversion:
false
String conversion:
determined by the object's toString method
Java equivalent:
The object itself
Null
This type represents the null value and is used for defined,
but still unassigned variables.
Only one instance of this type exists.
A == comparison of any value against Null is guaranteed to return false.
Otherwise, if this type is ever encountered during evaluation, an error will be raised.
Bool conversion:
false
String conversion:
Null
Java equivalent:
null
Container types
Note that a vector resulting from some evaluation step may contain
nonvalue types as well.
Since all containers may hold functions as elements, they can be called themselves.
In that case, any non-function elements are returned verbatim.
Vector
A vector is a sequence of elements that may have any type.
A vector may be of arbitrary size.
Literal forms:
vector ::= "(" [elemts] ")";
elemts ::= element ["," element];
element ::= statement
Examples:
()
Empty vector
(1,2,3)
Vector containing three Ints
(1, true, 1.0, (1,2) )
Vector containing mixed types.
(f, g)
Vector containing functions, assuming f and g
were defined as functions before.
Bool conversion:
false, if vector is empty or any
element evaluates to false.
Otherwise true.
String conversion:
Vector content.
The resulting string will only be reparsable, if the vector
contains no functions.
Java equivalent:
java.lang.Object[]
Subscript restrictions:
Only numbers.
If the index exceeds vector size, an evaluation error is raised.
Map
A map holds associations from one value (the key) to another one (the value).
It is formed by calling the map() function with a vector of Assoc arguments.
A map may be of arbitrary size, including zero.
Caution: Map keys are compared using exact equality
(see Equality).
It might be a good idea to ensure that all keys are of the same type.
Otherwise you will end up with a map that contains different entries for
e.g. 1 (Int) and 1.0 (Real), which may or may not be what was intended.
Also keep in mind that during composition, maps will happily accept
any kind of value as key, even functions, vectors or other maps.
Apart from the equality problem, there may be syntactical reasons
preventing such a key from being used as subscript, which
results in map entries that will never be found.
Literal form:
None, use built-in function map()
Examples:
map()
Empty map
map(1,2,3)
Set containing three Ints
map(1 -> true, 2 -> false, "a" -> 5 )
Map containing mixed types.
map(0 -> f, 1 -> g)
Map containing functions, assuming f and g
were defined as functions before.
Bool conversion:
false, if map is empty or any value evaluates to false.
Otherwise true.
String conversion:
Map content.
The resulting string will only be reparsable, if both keys and
values have a reparsable string conversion.
Java equivalent:
none
Subscript restrictions:
Any.
If the map does not contain a key equal to the index, an evaluation error is raised.
Table
A table is a sequence of numerical coordinates that in combination form
a function-like lookup table over a one-dimensional domain.
Assuming coordinates are defined as (x,y)-pairs, the x value must be
monotonically increasing within the defining sequence.
A table is formed by calling the table() function with a vector of Assoc arguments.
Table values between two coordinates are calculated using linear interpolation.
Multi-dimensional lookups may be created by nesting.
Literal forms:
None, use built-in function table()
Examples:
table()
Empty table (Returns zero for any index)
table(1,2)
Returns 1 below 1, identity between 1 and 2 and 2 above
table(0, 10 -> 100, 20 -> 150)
Lookup from 0 to 20 with a bend at 10.
table(0 ->f, 100 -> g, 200 )
Lookup using f from 0 to 100 and g from 100 to 200, assuming f and g
were defined as functions before.
Bool conversion:
false, if mapping contains zero.
Otherwise true.
String conversion:
Table content.
The resulting string will only be reparsable, if the table
contains no functions.
Java equivalent:
java.lang.Object[]
Subscript restrictions:
Only numbers.
A table will produce a valid result for any numerical index.
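The lookup behaviour of a table can be modelled in a few lines of Python. This is a sketch under assumptions, not the See implementation. It assumes that a bare number n in the argument list stands for the coordinate (n, n), that x values are strictly increasing, and that lookups outside the defined range return the nearest end value. All three are inferred from the table(1,2) example above. The function name `table_lookup` is hypothetical.

```python
from bisect import bisect_right

def table_lookup(points, x):
    """Look up x in a list of (x, y) pairs with strictly increasing x values."""
    if not points:
        return 0                      # the empty table returns zero for any index
    xs = [px for px, _ in points]
    if x <= xs[0]:
        return points[0][1]           # below the first coordinate
    if x >= xs[-1]:
        return points[-1][1]          # above the last coordinate
    i = bisect_right(xs, x)           # xs[i - 1] <= x < xs[i]
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    # Linear interpolation between the two neighbouring coordinates.
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Models table(0, 10 -> 100, 20 -> 150): a lookup with a bend at 10.
bend = [(0, 0), (10, 100), (20, 150)]

# Models table(1, 2): 1 below 1, identity between 1 and 2, 2 above.
identity = [(1, 1), (2, 2)]
```

With these definitions, `table_lookup(bend, 5)` lies on the first segment and `table_lookup(bend, 15)` on the second, flatter one.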
Assoc
An Assoc forms an association of one value (the key) to some other (the actual value).
Although an Assoc is hardly useful by itself, it acts as building block
for maps and tables.
Note that an Assoc can take up exactly one value
(which may be some other container, however).
Literal forms:
assoc ::= key "->" value;
key ::= expression;
value ::= expression
Examples:
1 -> 2
Numeric association
"A" -> (1, 2, 3)
String lookup
Bool conversion:
value converted to Bool
String conversion:
"key -> value"
The resulting string will only be reparsable, if both key and value
have a reparsable string conversion.
Java equivalent:
none
Subscript restrictions:
None,
the subscript index is ignored and the value will be returned in any case.
Value Equality
By default, Sel takes a very liberal approach to equality
which is independent of the value type.
When using the == operator, all value types that can somehow be converted
into each other may be compared.
Therefore, even something like "5 - 4" == 1.0 holds true.
Note that the first operand of this relation is a string!
This concept of equality is called "weak equality".
Obviously there are cases where this notion of equality is not appropriate
and a stricter test is required. To work around this problem,
Sel introduces another set of operators (=== and !==) which test for
"strong equality". This means that only values sharing the same
type can be equal. In consequence, 5 - 4 === 1 still
holds true, while 1 === 1.0 is false.
Note that map keys are always compared using strong equality.
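The difference between weak and strong equality, and its effect on map keys, can be sketched in Python. This is a simplified model, not Sel's actual conversion logic. `weak_eq` only handles values that convert directly to a number (it does not evaluate expression strings such as "5 - 4"), and the names `weak_eq`, `strong_eq` and `strong_key` are hypothetical.

```python
def weak_eq(a, b) -> bool:
    """Compare by numeric value where possible, otherwise by plain equality."""
    try:
        return float(a) == float(b)
    except (TypeError, ValueError):
        return a == b

def strong_eq(a, b) -> bool:
    """Equal only if both operands have the same type and the same value."""
    return type(a) is type(b) and a == b

def strong_key(value):
    """Map key that keeps 1 (Int) and 1.0 (Real) apart, as strong equality does."""
    return (type(value).__name__, value)

# A map keyed by strong equality holds separate entries for 1 and 1.0.
entries = {strong_key(1): "Int entry", strong_key(1.0): "Real entry"}
```

The last line shows why mixing key types in one map is risky: a lookup with 1.0 will not find the entry stored under 1.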
Variables
A Sel variable is a value of any type that is bound to some name
within a certain scope. It does not need to be explicitly declared.
Instead, just assign a value to the name. Apart from constants
(see below), any variable may be reassigned to a value of any other type.
A variable name has to start with a letter or one of the special
characters '_', '$', '#' which may be followed by either of these
characters, a digit or '.'.
Names may be of any length and are case sensitive.
More formally:
varname ::= "[a-zA-Z$#_][a-zA-Z0-9$#._]*"
Examples:
x, x1234, _1234, X1, $1
The variable name "$" has a special meaning. Any evaluation result (even an
intermediate one) will be assigned to a variable with that name, so it is
possible to refer to the outcome of the last evaluation.
Therefore it is usually a bad idea to assign to $, because the result
will be lost after a statement has finished. There are exceptions, however.
Names that contain at least one uppercase letter and no lowercase letters
(digits and special characters are allowed) are
treated as constants. Such variables are already eliminated
within the simplification step that immediately follows the parsing stage.
Constants may be assigned once, but any attempt to reassign them
will result in an evaluation error.
Examples:
X, X1234, C_$, #1R
Note that names like $$$ or #123 are not constants,
since they do not contain an uppercase character.
Names starting with an underscore are implicitly local.
This means that such variables will never be seen from an outer scope,
even if the same name exists there. Nor will any assignment to such
a name leave the current scope.
Local names may be constants.
Examples:
_, _a, _X, __x
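The naming rules above can be checked mechanically. The following Python sketch classifies names using the varname pattern given earlier. The constant rule (at least one uppercase letter, no lowercase letter) is one reading of the examples in this section, and the function names are hypothetical.

```python
import re

# Same pattern as the varname rule above.
NAME = re.compile(r"[a-zA-Z$#_][a-zA-Z0-9$#._]*")

def is_valid_name(name: str) -> bool:
    """True if the whole string is a syntactically valid variable name."""
    return NAME.fullmatch(name) is not None

def is_constant(name: str) -> bool:
    """Assumed rule: at least one uppercase letter and no lowercase letter."""
    return (is_valid_name(name)
            and any(c.isupper() for c in name)
            and not any(c.islower() for c in name))

def is_local(name: str) -> bool:
    """Names starting with an underscore are implicitly local."""
    return is_valid_name(name) and name.startswith("_")

constants = [n for n in ("X", "X1234", "C_$", "#1R", "$$$", "#123", "x") if is_constant(n)]
locals_ = [n for n in ("_", "_a", "_X", "__x", "x") if is_local(n)]
```

Note that `_X` satisfies both predicates, matching the statement that local names may be constants.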
Scoping rules
As mentioned above, every variable is a member of some scope. Sel
evaluation starts off using the global scope which can be accessed as
binding from the embedding language. A new scope can be created by
surrounding some statements with braces ('{' '}').
Scopes are nested, so any new scope will become a child of the outer one.
Within an inner scope, all variables of the outer are visible
and accessible. A new variable created within the inner scope will not be
visible from the outer one, however.
Example:
_x = y = 1;
{
_x = y = z = 2;
}
_x; // 1 unchanged
y; // 2 changed within inner scope
z; // Unresolved eval error: z is only defined within inner scope
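The scoping example can be reproduced with a small Python model of nested scopes. It is a sketch of the rules as described here, not of See's internals. The class `Scope` and its methods are hypothetical, and the model assumes that assigning to a name that already exists in an outer scope updates it there, unless the name is an implicit local.

```python
class Scope:
    def __init__(self, parent=None):
        self.parent = parent
        self.vars = {}

    def _owner(self, name):
        """Find the nearest scope that defines name, or None."""
        scope = self
        while scope is not None:
            if name in scope.vars:
                return scope
            # Names starting with '_' are local: never look into outer scopes.
            scope = None if name.startswith("_") else scope.parent
        return None

    def assign(self, name, value):
        # Reassign where the name lives, otherwise create it in this scope.
        owner = self._owner(name) or self
        owner.vars[name] = value

    def lookup(self, name):
        owner = self._owner(name)
        if owner is None:
            raise NameError(f"unresolved: {name}")
        return owner.vars[name]

# _x = y = 1;
outer = Scope()
outer.assign("_x", 1)
outer.assign("y", 1)

# { _x = y = z = 2; }
inner = Scope(outer)
for name in ("_x", "y", "z"):
    inner.assign(name, 2)
```

After running this, the outer `_x` is untouched, the outer `y` has been changed by the inner block, and `z` exists only in the inner scope.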
Unary Operators
Unary operators are always evaluated from left to right.
This table contains all supported unary operators.
Operator | Operand type | Description
- | Any numeric | Negates the argument.
+ | Any numeric | Returns the argument unchanged. This operator is only available for symmetry with '-'.
!, NOT | Bool | Boolean inversion. The argument will be forced to Bool.
~, not | Int, BigInt | Bitwise inversion.
local | Variable name(s) | Creates explicit local variable(s). Multiple names may be given, e.g. as local (a, b, c). For every listed name, a variable is declared within the local scope, even if the same name already exists in some outer scope. Variables created this way are still undefined and must be assigned before they can be used. Therefore it is strongly recommended to use implicit locals wherever possible. Never raises an evaluation error.
defined | Any | Checks whether the argument is valid within the current context. If 'defined x' evaluates to true, x and all its possible sub-references are well defined. Does not check for domain errors, but descends into function and closure definitions. Never raises an evaluation error.
type | Any | Returns the type of its argument. Evaluates to a Symbol. Never raises an evaluation error.
Binary Operators
The following table lists all supported binary operators in ascending order of precedence (lowest first).
Operator | Operand type | Description
=, +=, -=, *=, /=, %=, |=, &=, ^=, <<=, >>= | Variable, any suitable lvalue type | = : Assignment; the lhs will be created if undefined, otherwise redefined. Anything else: reassigning operation; the lhs must be defined in such a case.
||, OR | Bool | Boolean Or; arguments will be forced to Bool.
^^, XOR | Bool | Boolean Xor; arguments will be forced to Bool.
&&, AND | Bool | Boolean And; arguments will be forced to Bool.
istype | any, Symbol | Checks whether the lhs is of the given type. Yields Bool.
==, !=, >, <, >=, <=, EQ, NE, GT, LT, GE, LE, ===, !==, EEQ, NEE, ~~, @?, @?? | any Comparable | Comparison, always yields Bool. == and != check for weak equality; they also work on most other types. === and !== check for strong equality (see Value Equality). ~~ requires that at least one operand is a Regex and yields true upon full match. @? and @?? are the weak and strong containment tests; see Set Operations for details.
+, - | any numeric | Addition, subtraction. + is also defined for String and Regex, where it causes concatenation.
|, or | any integral | Bitwise Or
^, xor | any integral | Bitwise Xor
&, and | any integral | Bitwise And
<<, LSHIFT | any integral | Left shift
>>, RSHIFT | any integral | Right shift
*, /, %, *+ | any numeric | Multiplication, division, modulo, scalar product (multiplication on scalars). * is also defined for functions, where f * g produces a new function that is equivalent to f(g()).
**, EXP | any numeric | Exponentiation
++ | any | Container concatenation, e.g. 1 ++ 2 -> (1,2); (1,2) ++ (3,4) -> (1,2,3,4). Concatenates two containers. The left operand determines the result type. Scalars are treated as if they were Vectors with a single element.
+++ | any | Vector concatenation, e.g. (1,2) +++ (3,4) -> ((1,2),(3,4)). The result of this operation is guaranteed to be a vector. Operands are not flattened.
@, @@ | any Container; Vector, any numeric | Subscripting and slicing, e.g. (1,2)@0 -> 1, (1,2,3)@@(0,2) -> (1,2). Subscripting works on any container as left operand. Slicing requires some kind of vector. Negative arguments count from the end of the vector, e.g. (1,2,3)@-1 -> 3, (1,2,3)@@(-1,0) -> (3,2,1). Note that an illegal subscript index will cause an evaluation error, while invalid slicing bounds will yield an empty vector.
@|, @&, @^ | any, except tables | Set-like operations: @| set union, @& set intersection, @^ set difference. See chapter Set Operations for details.
~~~, ~@, ~+, ~* | Regex, any | Regular expression operations. ~~~ performs a full match, yielding a vector of matching groups. ~@ searches for the first match, yielding the index after the match as Int, or -1 if none was found. ~+ searches for the first match, yielding a vector of matching groups, or an empty vector if no match is found. ~* searches for all matches, yielding a vector that contains vectors of matching groups, or an empty vector if no match is found.
:> | any, special | Java reflection. Calls a Java method or accesses a field. See chapter Java Reflection for details.
Built-In Functions
Sel provides the following built-in functions.
In general they are called as in y = f(x) or, formally:
fcall ::= fname arglist
arglist ::= '(' args ')'
args ::= [statement[',' statement]]
Note that the argument list looks exactly like a vector definition, and in
fact all functions called with multiple arguments will instead see just one vector.
A function that requires a certain argument pattern will cause an
evaluation error if the argument vector doesn't match that pattern.
If not otherwise noted, any function that expects a single parameter will also
operate upon vectors, if it encounters any during argument evaluation. In such
cases, the function is performed on an element by element basis, and the
result will have the same shape as the input.
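The element-by-element behaviour of single-parameter functions can be pictured as a recursive map over nested vectors. The Python sketch below models vectors as tuples. The wrapper name `elementwise` is hypothetical and the code illustrates the described rule rather than how See implements it.

```python
import math

def elementwise(f):
    """Lift a scalar function so that it descends into (nested) vectors."""
    def apply(x):
        if isinstance(x, (list, tuple)):
            # Keep the container type and shape, transform each element.
            return type(x)(apply(e) for e in x)
        return f(x)
    return apply

vec_abs = elementwise(abs)
vec_sqrt = elementwise(math.sqrt)

shaped = vec_abs((-1, (2, -3), -4.5))
roots = vec_sqrt((4, (9, 16)))
```

Both results have exactly the same nesting as their inputs; only the scalar leaves are transformed.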
Type conversion functions
These functions force a type conversion of their arguments.
An evaluation error will result if the conversion is not supported or
the result cannot be represented within the target type.
- bool(x)
- Converts its arguments into Bool values.
Possible for any value type.
Return type: Bool
- str(x)
- Converts its arguments into a reparsable string.
Possible for any value type
Return type: String
- regex(x)
- Converts its arguments into a regular expression.
Possible for any value type.
Return type: Regex
- int(x)
- Converts its arguments into Int values. Real values are rounded down.
Possible for any numeric type and reparsable string.
Return type: Int, may be truncated
- real(x)
- Converts its arguments into Real values.
Possible for any numeric type and reparsable string.
Return type: Real, may be truncated
- bigint(x)
- Converts its arguments into BigInt values.
Possible for any numeric type and reparsable string.
Return type: BigInt
- bigreal(x)
- Converts its arguments into BigReal values.
Possible for any numeric type and reparsable string.
Return type: BigReal
- round(x)
- Converts its arguments into integer type of same precision,
rounding to nearest.
Possible for any numeric type and reparsable string.
Return type: Int or BigInt
- floor(x)
- Converts its arguments into integer type of same precision,
rounding down.
Possible for any numeric type and reparsable string.
Return type: Int or BigInt
- ceil(x)
- Converts its arguments into integer type of same precision,
rounding up.
Possible for any numeric type and reparsable string.
Return type: Int or BigInt
Container Construction
Using these functions creates containers of a certain type that hold
the function arguments as elements.
- vector(x1, ... xn)
- Generates a vector.
In general, using this function is redundant,
because the form (x1, ... xn)
will yield a vector anyway,
but it may be useful to create a vector with exactly one element.
Return type: Vector of arguments
- map(x1, ... xn)
- Generates a Map.
In most cases, arguments will be associations, but scalar
values are also allowed.
It is even possible that all arguments are scalars.
Although this sounds rather pointless at first, such a map will
make sense when used in combination with set operations described below.
Return type: Map of arguments
- table(x1, ... xn)
- Generates a Table.
In general, arguments will be associations, but any kind
of number is also a valid argument.
Return type: Table of arguments
Basic Mathematics
The functions listed here will call into the Java math library to calculate results.
Arguments may be any numerical types.
- abs(x)
- Calculates the absolute value of its argument.
Arguments may be any numerical types or vectors composed of such.
Return type: Same as argument
- cos(x)
- Calculates cosine of its argument.
Return type: Real
- sin(x)
- Calculates sine of its argument.
Return type: Real
- tan(x)
- Calculates tangent of its argument.
Return type: Real
- acos(x)
- Calculates inverse cosine of its argument.
Return type: Real
- asin(x)
- Calculates inverse sine of its argument.
Return type: Real
- atan(x)
- Calculates inverse tangent of its argument.
Return type: Real
- sqrt(x)
- Calculates square root of its argument.
Return type: Real
- log(x)
- Calculates natural logarithm of its argument.
Return type: Real
- log10(x)
- Calculates logarithm base ten of its argument.
Return type: Real
Vector reducing functions
The functions listed here will reduce any vector argument to a single scalar.
The result will have the propagated type of all vector elements.
Arguments may be vectors composed of any numerical types.
If a scalar is given instead, it will be returned as is.
- max(x)
- Returns the maximum of all arguments.
Return type: Same as argument
- min(x)
- Returns the minimum of all arguments.
Return type: Same as argument
- sum(x)
- Calculates the sum of all arguments.
Return type: Same as argument
- prod(x)
- Calculates the product of all arguments.
Return type: Same as argument
- mean(x)
- Calculates the arithmetic mean of all arguments.
Return type: Same as argument
- fold(init, op, v)
- Folds a vector into a scalar using function 'op'.
Folding is done from left to right. 'init' is used as initial value.
So the function sum(x) could also be written as fold(0, add, x),
where 'add' is defined somewhere as add(x,y) := {x + y}
Return type: Some kind of scalar.
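A left fold as described for fold(init, op, v) corresponds to Python's functools.reduce with an initial value. The sketch below mirrors the argument order used by Sel. It is a model of the documented behaviour, and the helper `sub` is only there to show the evaluation order.

```python
from functools import reduce

def fold(init, op, v):
    """Left fold: op(...op(op(init, v[0]), v[1])..., v[n-1])."""
    return reduce(op, v, init)

def add(x, y):
    return x + y

def sub(x, y):
    return x - y

# sum(x) expressed as a fold.
total = fold(0, add, (1, 2, 3, 4))

# A non-commutative operation makes the left-to-right order visible:
# ((100 - 1) - 2) - 3
left_to_right = fold(100, sub, (1, 2, 3))
```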
Misc. functions
- len(x)
- Returns the length of a vector or string argument.
All other input types will return 1.
Return type: Int (always scalar)
- pad(x, l, v)
- Produces a vector of constant length l from input vector x.
If l is smaller than len(x), the rightmost elements are discarded.
If l is larger than len(x), the produced vector will be padded
with v.
Return type: Vector
- rep(l, v)
- Produces a vector of constant length l with elements initialized to v.
Return type: Vector
- rnd(x)
- Produces a random number between 0 and argument (exclusive).
BigReal result may be larger than x!
Return type: Same as argument
- sort(v)
- Sorts vector elements in ascending order.
This function is only guaranteed to work if all entries are
comparable and share the same type. Otherwise an error may be
raised or the resulting vector will remain partially unsorted.
Return type: Vector
- unique(v)
- Removes duplicate elements from vector.
Uses weak equality to identify duplicate entries.
Return type: Vector
- distinct(v)
- Removes duplicate elements from vector.
Uses strong equality to identify duplicate entries.
Return type: Vector
- zip(v1, v2, ...)
- Zips vectors into one, containing tuples of adjacent elements.
If vectors are of different length, the shortest one will determine result size.
It is also possible to call this function with a single argument
that is a multi-dimensional vector. For dimension 2,
the result will have rows and columns exchanged and the
relation zip(zip(v)) == v holds.
Return type: Vector
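The behaviour of pad, rep and zip can be sketched in Python as follows. This is a model of the descriptions above under the assumption that Sel vectors behave like Python lists and tuples; the function name zip_vectors is made up to avoid clashing with Python's built-in zip.

```python
def pad(x, l, v):
    """Vector of length l: x truncated on the right, or padded with v."""
    return list(x[:l]) + [v] * max(0, l - len(x))

def rep(l, v):
    """Vector of length l with every element set to v."""
    return [v] * l

def zip_vectors(*vs):
    """Zip several vectors, or transpose a single two-dimensional one."""
    if len(vs) == 1:
        vs = vs[0]  # single argument: treat its rows as the vectors
    # Python's zip stops at the shortest input, like the Sel function
    return [tuple(t) for t in zip(*vs)]

print(pad([1, 2, 3], 2, 0))            # [1, 2]
print(pad([1, 2, 3], 5, 0))            # [1, 2, 3, 0, 0]
print(rep(3, 7))                       # [7, 7, 7]
print(zip_vectors([1, 2, 3], [4, 5]))  # [(1, 4), (2, 5)]

v = [(1, 2, 3), (4, 5, 6)]
print(zip_vectors(v))                  # [(1, 4), (2, 5), (3, 6)]
print(zip_vectors(zip_vectors(v)) == v)  # True
```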
Set Operations
Although Sel does not provide an explicit set type, it contains
operations that let you treat vectors and maps as such.
The preferred type to use for sets is the map.
Therefore, if neither operand of a set operation is a map or a vector,
a map will be produced, possibly an empty one.
Operations will just affect the key sets.
The associated values are ignored and remain unchanged.
A vector may form a multi-set, which means it can contain multiple
instances of the same value.
Operation result depends on operands.
If the left operand is either a map or a vector, so will be the result.
Otherwise the right operand determines the result type.
A scalar combined with a map will produce an association to itself,
which means key and value are the same,
e.g. map(1 -> 2) @| 3 yields map(1 -> 2, 3 -> 3).
Set operations work like this:
- Union
The result will contain the union set of both operands.
If both operands are vectors, this is basically the same as concatenation.
Otherwise, a regular union operation is performed.
E.g. (1,2,3,3) @| (1,3,3,4) yields (1,2,3,3,1,3,3,4),
while (1,2,3,3) @| 3 yields (1,2,3,3).
If the left operand is a map, any overlapping entries will take
associations from the right.
- Intersection
The result will contain the intersection set of both operands, which may be empty.
If both operands are vectors, the result contains any elements
that occur in both operands, e.g.
(1,2,3,3,3) @& (1,3,3,4) yields (1,3,3).
For map operands, the values of the left side take precedence.
- Difference
This operator is asymmetric.
All elements occurring within the right operand will be removed
from the left one, which may leave an empty set.
If the right operand is a scalar, all equal elements will be
removed, otherwise only as many as the right side contains,
e.g. (1,2,3,3,3) @^ (1,3,3,4) yields (2,3),
while (1,2,3,3,3) @^ 3 yields (1,2).
- Containment
The operators @? and @?? yield true, if the right operand is found
at least once within the left operand.
@? performs a weak, @?? performs a strong equality check.
Even if the weak test succeeds for a given value, it is not
guaranteed that a map subscript will succeed.
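The multi-set behaviour of vectors under union, intersection and difference can be reproduced with a small Python model. It covers vector operands only (maps are left out), uses Python's == in place of Sel's equality, and assumes that a scalar which is not yet contained is appended by a union. The function names are made up for this sketch.

```python
def union(left, right):
    """Model of @| for a vector on the left."""
    if isinstance(right, (list, tuple)):
        return list(left) + list(right)  # two vectors: concatenation
    # scalar on the right: add it only if it is not contained yet (assumption)
    return list(left) if right in left else list(left) + [right]

def intersection(left, right):
    """Model of @& for two vectors, respecting multiplicities."""
    remaining = list(right)
    result = []
    for item in left:
        if item in remaining:
            remaining.remove(item)  # each right element matches only once
            result.append(item)
    return result

def difference(left, right):
    """Model of @^ for a vector on the left."""
    if not isinstance(right, (list, tuple)):
        return [item for item in left if item != right]  # scalar: remove all
    remaining = list(right)
    result = []
    for item in left:
        if item in remaining:
            remaining.remove(item)  # remove only as many as right contains
        else:
            result.append(item)
    return result

print(union((1, 2, 3, 3), (1, 3, 3, 4)))            # [1, 2, 3, 3, 1, 3, 3, 4]
print(union((1, 2, 3, 3), 3))                       # [1, 2, 3, 3]
print(intersection((1, 2, 3, 3, 3), (1, 3, 3, 4)))  # [1, 3, 3]
print(difference((1, 2, 3, 3, 3), (1, 3, 3, 4)))    # [2, 3]
print(difference((1, 2, 3, 3, 3), 3))               # [1, 2]
```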
Flow Control
Sel contains a few operators to control program flow.
In contrast to most other languages, those operators yield a result.
Conditional operator
The conditional operator is defined like this:
ifOp ::= cond '?' then-clause ':' else-clause
cond ::= expr (expr will be forced to Bool)
then-clause ::= stmts | block
else-clause ::= stmts | block
Depending on the result of the cond expression, either the then- or the
else-clause will be evaluated and its result returned as result of the
conditional statement.
Example:
x > 0 ? 1 : x < 0 ? -1 : 0
Loop operator
The loop operator is defined like this:
loopOp ::= cond '??' while-body ':' else-clause
while-body ::= stmts | block
else-clause ::= stmts | block
The while-body will be evaluated as long as the cond expression evaluates to true.
If it never does, the evaluation result of the else-clause is returned, otherwise
the last result of the while-body.
The while-body is required to modify the result of the cond clause,
most probably by redefining some variable. Otherwise an infinite loop will result.
Example:
y = 1; x > 0 ?? x -= 1; y *= 1 : 0
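The result rule of the loop operator (last result of the while-body, or the else-clause if the body never ran) may be easier to see in a conventional language. The Python sketch below models it with callables; loop and run are hypothetical helper names, and the body doubles y on each pass, a variation chosen so that the result is visible.

```python
def loop(cond, body, otherwise):
    """Model of: cond ?? body : otherwise"""
    executed = False
    result = None
    while cond():
        result = body()  # remember the most recent body result
        executed = True
    # the else-clause is only evaluated if the body never ran
    return result if executed else otherwise()

def run(x):
    state = {"x": x, "y": 1}

    def body():
        state["x"] -= 1  # modifies the condition, so the loop terminates
        state["y"] *= 2
        return state["y"]

    return loop(lambda: state["x"] > 0, body, lambda: 0)

print(run(3))  # 8: the body ran three times
print(run(0))  # 0: the body never ran, the else-clause is used
```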
Caution:
The else-clause of both conditional and loop operator may consist of
multiple statements. While this is convenient within simple expressions,
it means that these operators will always terminate a statement sequence.
Example:
x = 1;
y = x ? 0 : 1;
z = y * 10; // Probably causes eval failure, because y is undefined!
This happens because the above is actually interpreted as
y = x ? ( 0 ) : ( 1; z = y * 10 ).
So if you want to execute other statements after a loop or condition,
you have to enclose the whole expression within parentheses like this:
x = 1;
y = (x ? 0 : 1);
z = y * 10; // Works as expected now
Return operator
The return operator will terminate a block prematurely,
if its condition is true.
It is defined like this:
returnOp ::= cond '?=' statement
Example:
x = -1;
y = {
x <= 0 ?= 0;
log10(x); // not executed
}; // will end up here
10 * y;
// not here!
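For readers more familiar with conventional languages, the return operator corresponds roughly to a guarded early return. The following Python sketch models the example above; the function name block stands for the Sel block and is not part of any API.

```python
import math

def block(x):
    # x <= 0 ?= 0  : leave the block early with result 0
    if x <= 0:
        return 0
    # only reached for positive x
    return math.log10(x)

x = -1
y = block(x)   # the block ends prematurely, y is 0
print(10 * y)  # 0
```

Only the block is terminated. Evaluation continues with the statement that follows it.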
Error (or assert) operator
Generates an error condition if the condition is false.
The error will terminate evaluation, unless it is explicitly caught
by a catch block.
It is defined like this:
assertOp ::= cond ('?!' | 'assert') statement
Example:
x = 0;
{
y = 10 * {
x > 0 ?! "Undefined log10() operand";
log10(x); // not executed
};
y += 5 // not executed
}! // error caught here
0 // result returned in case of error
}
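The interplay of assert operator and catch block resembles exception handling in other languages. The Python sketch below models the example under that assumption; SelError and guarded are hypothetical names, and nothing here claims that See uses exceptions internally.

```python
import math

class SelError(Exception):
    """Stands in for the error condition raised by a failed assertion."""

def guarded(x):
    try:
        # x > 0 ?! "..."  : raise an error if the condition is false
        if not x > 0:
            raise SelError("Undefined log10() operand")
        y = 10 * math.log10(x)
        y += 5
        return y
    except SelError:
        # counterpart of the catch block: result in case of error
        return 0

print(guarded(0))    # 0
print(guarded(100))  # 25.0
```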
Pattern matching operator
This operator is obviously inspired by Scala's pattern matching,
which is an extremely powerful construct. Although Sel's counterpart
doesn't come close, it is still quite useful and offers much more
versatility than the usual switch statement.
Formal definition:
matchOp ::= selector '?~' alternatives
selector ::= expr
alternatives ::= alternative (':' alternative)*
alternative ::= pattern '->' result
pattern ::= '?' | typename | expr
alternative ::= statement
typename ::= Any type literal (a Symbol)
result ::= statement
Pattern matching works like this:
- At first, the selector expression is evaluated.
- Then, alternatives are tried from left to right.
- If any pattern matches the selector,
the operator yields its result and the remaining alternatives are ignored.
- If no pattern matches at all, an evaluation error is raised.
Whether a match is found depends on the pattern:
- The '?' acts as default pattern.
It matches any selector and should therefore be used only within the last alternative.
- If the pattern is a Regex, the selector is forced into a string
and a full regex match will cause the pattern to fit.
- If the pattern is a typename, the selector must fit this type.
- If the pattern is any other constant, it will be compared against the
selector, using the == operator. If that returns true, the pattern fits.
- If the pattern is a variable that refers to a function, it will be called
using the selector as argument. The return value is then forced into
a Bool and fits, if true.
- Any other variable is compared against the selector,
using the == operator. If that returns true, the pattern fits.
- Finally, if the pattern is some other kind of expression,
it will be evaluated and the result is forced into a Bool.
In such a case, it is assumed that the expression contains a
reference to $, which will be set to the selector value.
This isn't enforced, however. If the pattern expression doesn't mention $,
it will in fact be independent of the selector, so the operator
degrades into a short form of if() elseif() elseif() ...
The result expression of an alternative may refer to $ to build
a value that depends on the selector.
Example (rather contrived):
pred(x) := { 10 < x < 100 };
a = 101;
x = 20;
y = x ?~ "abc" -> 0x41L :
'\d+\*a' -> a * int($~~~'(\d*).'@1) :
111 -> 112 :
a -> a + 1:
$ < 0 -> $ + 1:
pred -> 2 * $:
Number -> bigint($) :
? -> 0L ; // note semicolon after last alternative!
Note that although the syntax was chosen to resemble Scala's pattern
matching, it doesn't work the same way.
Especially, the -> will not produce a function in this case,
just the result of its right hand side. If you want this to be
a function, you have to say so explicitly.
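The matching rules listed above can be summarised as a small dispatcher. The following Python sketch is a simplified model, not the See implementation: DEFAULT stands in for the '?' pattern, Python types stand in for Sel type names, and both predicate functions and expressions that mention $ are represented as callables taking the selector.

```python
import re

DEFAULT = object()  # stands in for the '?' default pattern

def matches(pattern, selector):
    if pattern is DEFAULT:
        return True
    if isinstance(pattern, re.Pattern):
        # regex: selector forced into a string, full match required
        return pattern.fullmatch(str(selector)) is not None
    if isinstance(pattern, type):
        return isinstance(selector, pattern)  # type name pattern
    if callable(pattern):
        return bool(pattern(selector))        # function or $ expression
    return pattern == selector                # constant or plain variable

def match(selector, alternatives):
    """Try alternatives from left to right, return the first fitting result."""
    for pattern, result in alternatives:
        if matches(pattern, selector):
            return result(selector)           # result may depend on $
    raise ValueError("no pattern matches selector %r" % (selector,))

a = 101
pred = lambda x: 10 < x < 100
alternatives = [
    ("abc", lambda s: 0x41),
    (re.compile(r"\d+\*a"), lambda s: a * int(s.split("*")[0])),
    (111, lambda s: 112),
    (a, lambda s: a + 1),
    (lambda s: s < 0, lambda s: s + 1),
    (pred, lambda s: 2 * s),
    (int, lambda s: s),
    (DEFAULT, lambda s: 0),
]

print(match(20, alternatives))     # 40, via pred
print(match("3*a", alternatives))  # 303, via the regex
print(match(2.5, alternatives))    # 0, via the default pattern
```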
Functions
The formal syntax of a function definition in Sel looks like this:
fundef ::= fname '(' params ')' ':=' block
fname ::= "[a-zA-Z][a-zA-Z0-9_]*"
params ::= [varname (',' varname)*]
This will not only define the function, but also assign it to a name,
which may be used to call it afterwards.
Example:
f(x, y) := { x + 2 * y };
f(10, 20) // yields 50
However this is only a more intuitive shortcut for the more general form
f = (x, y) => { x + 2 * y };
f(10, 20) // same as above
Formally:
function ::= '(' params ')' '=>' expr
This syntax allows functions to be defined without assigning them to any
name at all. A function defined in this way may e.g. be used as vector
element or as return value of some other function.
Similar to Scala, the definition above may be further
abbreviated by leaving off the parameter list,
if some placeholder syntax is used instead.
E.g. the example above could be also written as
f = { _1 + 2 * _2 };
f(10, 20) // again same as above
Any block that contains references to variables of the form "_1" .. "_9"
will be converted into a function definition with a parameter list
containing these placeholders. The variable with the largest number
determines the number of arguments the function will take,
even if placeholders with lower numbers are not used at all:
{ 2 * _4 } // is equivalent to (_1, _2, _3, _4) => { 2 * _4 }
If a function takes only a single
parameter, it may also be written as just "_" instead of "_1".
Note that there is a significant difference between Scala and Sel
if this kind of shortcut is used:
{ _ + _ } // is equivalent to (a, b) => { a + b } in Scala
// but (a) => { a + a } in Sel
A function definition must not mix list styles.
This means: use either explicitly named parameters,
numbered placeholders, or just an underscore, but no combination of those.
{ _ + _2 } // Won't work
(x) => { x + _2} // Illegal
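How the placeholder rules determine the number of parameters can be illustrated with a textual approximation in Python. The real parser works on tokens, not on regular expressions, so the helper placeholder_arity below is purely hypothetical and only mirrors the rules stated above for blocks without an explicit parameter list.

```python
import re

# A placeholder must not be part of a longer identifier such as x_1
NUMBERED = re.compile(r"(?<![A-Za-z0-9_])_([1-9])(?![A-Za-z0-9_])")
BARE = re.compile(r"(?<![A-Za-z0-9_])_(?![A-Za-z0-9_])")

def placeholder_arity(block):
    """Number of parameters implied by the placeholders in a block."""
    numbered = [int(n) for n in NUMBERED.findall(block)]
    bare = BARE.search(block) is not None
    if numbered and bare:
        raise ValueError("placeholder styles must not be mixed")
    if numbered:
        return max(numbered)  # largest number wins, gaps are allowed
    return 1 if bare else 0   # every "_" refers to the same parameter

print(placeholder_arity("{ _1 + 2 * _2 }"))  # 2
print(placeholder_arity("{ 2 * _4 }"))       # 4
print(placeholder_arity("{ _ + _ }"))        # 1
```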
Note that any function definition within Sel always generates a closure.
This means that all variables which are visible when the
definition is evaluated will be available throughout the lifetime
of the function, whether they live in the same scope or in some
enclosing one. However, their value may still change afterwards.
Whenever a function is called, a new scope is created within the
function's definition scope and all arguments are
then defined within the new scope.
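Closures in Python behave similarly with respect to the points made above, so they can serve as an illustration: the function keeps access to a variable of its definition scope, and a later change of that variable is visible to the function. The names in this sketch are made up.

```python
def make_scaler():
    factor = 2  # lives in the definition scope of scale

    def scale(x):
        # each call gets a new scope for x, nested in the definition scope
        return factor * x

    def set_factor(value):
        nonlocal factor
        factor = value  # changes the variable the closure refers to

    return scale, set_factor

scale, set_factor = make_scaler()
print(scale(10))  # 20
set_factor(3)
print(scale(10))  # 30: the captured variable changed afterwards
```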
Java Reflection
The Java Reflection operator (":>") allows an
expression to reuse existing Java (or Scala) code.
The operator works like this:
If the left hand side evaluates to type Any,
the contained object will be used as argument for a method invocation
or field access. In all other cases, the result of the left hand side
evaluation will be forced into a string and used as (fully qualified)
class name to generate a static call.
The right hand side of the operator acts as method- or field name.
Since Java- and especially Scala names do not conform to Sel naming
conventions, this name has to be terminated. For method invocations,
the opening parenthesis of the argument list implicitly terminates the
name, but for field accesses, a '!' character has to be explicitly
appended. The '!' just signals a field access, but will not be part
of the name. This way it is possible to call even Scala operator methods,
e.g. "name_=()" to change a property.
After class- and method name are known, the evaluator looks for a
matching method using the Java reflection interface.
Only public methods or fields are taken into account. If there are no
overloads for the given number of parameters, it tries to convert
all arguments into matching types and invokes the method afterwards.
If more than one overload with the same number of parameters exists,
a stricter matching algorithm is used, which just maps closely related
types with different precision onto each other. For example,
in the first case an Int-argument will satisfy a double parameter,
while in the latter case only a Real would. If there are more overloads
using similar types, the one with the highest precision will be used
(e.g. Double instead of Float or Long instead of Byte).
The result of the method call or field access will be converted into
the most appropriate Sel type available (which may be Any).
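The two-stage overload selection described above can be sketched as follows. This Python model is a strong simplification under several assumptions: overloads are represented as tuples of Java parameter type names, only the Sel types Int and Real are considered, and the tables FAMILY and RANK as well as the function select_overload are invented for this illustration. The actual matching rules of See may differ in detail.

```python
# Which Sel type a Java parameter type is "closely related" to (assumption)
FAMILY = {"byte": "Int", "short": "Int", "int": "Int", "long": "Int",
          "float": "Real", "double": "Real"}
# Precision order within each family (assumption)
RANK = {"byte": 0, "short": 1, "int": 2, "long": 3, "float": 0, "double": 1}

def select_overload(overloads, arg_types):
    """Pick a parameter signature for the given Sel argument types."""
    candidates = [sig for sig in overloads if len(sig) == len(arg_types)]
    if not candidates:
        raise LookupError("no method with %d parameters" % len(arg_types))
    if len(candidates) == 1:
        # no competing overload: arguments are converted to fit
        return candidates[0]
    # several overloads: only closely related types are accepted
    strict = [sig for sig in candidates
              if all(FAMILY.get(p) == a for p, a in zip(sig, arg_types))]
    if not strict:
        raise LookupError("no overload matches strictly")
    # prefer the overload with the highest precision
    return max(strict, key=lambda sig: [RANK[p] for p in sig])

print(select_overload([("double",)], ["Int"]))              # ('double',)
print(select_overload([("float",), ("double",)], ["Real"]))  # ('double',)
print(select_overload([("byte",), ("long",)], ["Int"]))      # ('long',)
```

With two competing overloads taking float and double, an Int argument is rejected in this model, which corresponds to the statement that only a Real would satisfy a double parameter in that case.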
To invoke a constructor of some class, use the synthetic method name
"new", e.g. "java.io.File":>new("somefile").
Some examples:
f = "java.io.File":>new("somefile");
(f:>exists() ? f:>delete(); 1 : 0);
"java.lang.System":>out!:>println("Hello");
// or, if you want to do that more than once:
sout = "java.lang.System":>out!;
sout:>println("Hello");
sout:>println("world");
fstr = "java.lang.String":>format("%04x, %.2f", (10, 1.5));
sout:>println(fstr); // prints "000a, 1.50"
As shown above, reflection is a rather powerful feature.
It may give an expression nearly full control over the JVM
of the embedding application as well as the file system
of the hosting machine. Of course, this creates a serious security
problem, if the input of the expression cannot be controlled.
Therefore, reflection can be disabled by calling
See.disableJavaReflection(). Nevertheless, I left the
feature on by default, because in such situations a user should
consider using a constant parser anyway (by means of
See.createConst() instead of See.create()).
Keywords
The following keywords have a predefined meaning and should not be used
to define variables or functions, although they are valid names:
$ AND and assert defined E EQ EEQ
EXP gcd GE GT istype LE local LSHIFT LT NE NEE NOT not
OR or PI RSHIFT XOR xor
All type names
All regex pattern names
The names of Built-in functions are not keywords. It is perfectly valid
to overwrite them, although doing so locally may yield unexpected results
due to pre-simplification.
Note that the parser may not always produce an error, if you use
a keyword as variable name.
Instead it will be able to determine meaning by context most of the time,
but in some cases you may experience rather surprising results.
So, if you are absolutely determined to write obfuscated code,
the following should work:
and = 1;
xor = 2;
or = 3;
if = and or xor xor and or not or and and or xor;
I leave it up to you to find out what if will be :-)