Language Syntax

Lexical Form


An SPP program consists of a sequence of lines of text. A line may be of any length, but SPP is guaranteed to handle only lines of up to 160 characters. The end of each line is marked by a "newline" character.

Character set

SPP uses the extended ASCII character set, which includes the characters listed in Table 1.1.

Table 1.1: SPP Character Set.

Some of these may be used in identifier names and numeric constants. The remaining ones have specific meaning within the language. SPP does not distinguish between lower case and upper case except for literal strings (inside double quotes). Any character may be used in a literal string. The specific meaning of special characters is described in the appropriate section.

White Space

White space is defined as one or more tabs or spaces. A newline normally marks the end of a statement, and is not considered to be white space. White space always delimits tokens, the smallest recognized elements of the language. Keywords and operators will not be recognized as such if they contain embedded white space. However, the absolute amount of white space is not relevant and there is no enforced structure of text on the line. Indentation and judicious use of white space greatly improve readability. Note, however, that spaces, including trailing blanks, are significant in literal quoted strings such as text to be written to standard output.

Comments

Comments begin with the # character and end at the end of the line. That is, anything after a # is ignored by the preprocessor until the next end of line. Thus, in-line comments may follow SPP statements.

Continuation

Statements may span several lines. A line that ends with an operator (excluding /) or punctuation character (comma or semicolon) is automatically understood to be continued on the following line.

Constants

SPP supports several types of constants. These are described below. (Predefined constants are described in Appendix A.)

Integer Constants

A decimal integer constant is a sequence of one or more of the digits in the range 0 through 9. An octal constant is a sequence of one or more of the digits in the range 0 through 7, followed by the letter b or B. A hexadecimal constant is one of the digits in the range 0 through 9, followed by zero or more of the digits 0 through 9, the letters in the range a through f, or the letters A through F, followed by the letter x or X. Note that a hexadecimal constant must begin with a decimal digit (zero through nine) to distinguish it from an identifier. The notation shown in Table 1.2 more concisely summarizes these definitions.

Table 1.2: Integer Constant Notation.

In the notation used above, + means one or more, * means zero or more, - implies a range, and | means "or". Brackets ([...]) define a class of characters. Thus, "[0-9]+" reads "one or more of the characters in the range 0 through 9." An integer constant has the same range as the underlying Fortran constant. Since this changes from machine to machine, SPP has the predefined constant MAX_INT as the maximum allowable integer (see Appendix A).
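The definitions above translate directly into regular expressions. The following Python sketch is illustrative only and is not part of SPP or xc; the function name parse_integer_constant is hypothetical. It classifies a token as decimal, octal, or hexadecimal and returns its value. It does not check the result against MAX_INT.

```python
import re

# Patterns transcribed from the prose definitions above.
DECIMAL = re.compile(r"[0-9]+")
OCTAL = re.compile(r"[0-7]+[bB]")
HEX = re.compile(r"[0-9][0-9a-fA-F]*[xX]")


def parse_integer_constant(text):
    """Return the value of an integer constant, or None if it is not valid."""
    # The trailing b/B or x/X letter decides the base; the letter itself
    # is dropped before conversion.
    if OCTAL.fullmatch(text):
        return int(text[:-1], 8)
    if HEX.fullmatch(text):
        return int(text[:-1], 16)
    if DECIMAL.fullmatch(text):
        return int(text, 10)
    return None
```

Under these rules 17B is octal (decimal 15), 0FFx is hexadecimal (decimal 255), and FFx is rejected because it does not begin with a decimal digit.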

Floating Point Constants

A floating point constant (type real or double) consists of a decimal integer, optionally preceded by a sign (+ or -), followed by a decimal point, optionally followed by a decimal fraction, optionally followed by an exponent: one of the characters e, E, d, or D followed by a decimal integer, which may be negative. Either the decimal integer or the decimal fraction must be present, and the number must contain the decimal point, the exponent, or both. Embedded white space is not permitted. The following are all legal floating point numbers: .01, 100., 100.01, 1E5, 1e-5, -1.00D5, 1.0d0.

A complex constant consists of two floating point constants, representing the real and imaginary parts, separated by a comma and enclosed in parentheses, (1.0,0.0) for example.

A floating point constant may also be given in sexagesimal, i.e., in hours and minutes, or in hours, minutes, and seconds, or in any other units in which places of the number vary by a factor of sixty. Numerical fields are separated by colon characters (:) and there must be either two or three fields. The second field and the integer part of the third field must each contain exactly two decimal digits. The decimal point and any fraction are optional. The low level procedures that parse input recognize this syntax as well, making it convenient for users to enter values in a natural format (time or equatorial coordinates).

Table 1.3: Coordinate and Floating Point Equivalents.

The last example has only two fields with the last including a fraction. These two fields are then the largest and next largest fields, such as hours and minutes of time or degrees and minutes of arc. Note that there may be some problems in rounding, however. The predefined constants MAX_REAL and MAX_DOUBLE contain the host-dependent maximum permissible values for real and double constants, respectively.
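The decimal and sexagesimal forms described in this section can be modeled in a few lines. The Python sketch below is an illustration under stated assumptions, not the SPP parser: the names is_float_constant and sexagesimal_to_float are hypothetical, the exponent accepts only an optional minus sign (as the text states), and a leading sign on sexagesimal values is not handled because the text does not describe one.

```python
import re

# Decimal form: optional sign, digits with an optional point and fraction
# (or a point followed by digits), then an optional e/E/d/D exponent.
FLOAT = re.compile(r"[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eEdD]-?[0-9]+)?")

# Sexagesimal form: two or three colon-separated fields. Every field after
# the first has exactly two integer digits; only the last may have a fraction.
SEXAGESIMAL = re.compile(r"[0-9]+(?::[0-9]{2}){1,2}(?:\.[0-9]*)?")


def is_float_constant(text):
    """True if text is a decimal floating point constant as described above."""
    if not FLOAT.fullmatch(text):
        return False
    # A bare integer such as "100" also matches the pattern, so require
    # a decimal point or an exponent letter as well.
    return any(ch in text for ch in ".eEdD")


def sexagesimal_to_float(text):
    """Convert a sexagesimal constant to a float, or return None if invalid."""
    if not SEXAGESIMAL.fullmatch(text):
        return None
    value = 0.0
    # Each successive field is worth one sixtieth of the previous one.
    for place, field in enumerate(text.split(":")):
        value += float(field) / 60.0 ** place
    return value
```

With this sketch 12:30 converts to 12.5. Because each field is divided by a power of sixty in binary floating point, most other inputs are only approximately representable, which is one source of the rounding problems noted above.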

Character Constants

A character constant consists of from one to four digits delimited at front and rear by single quotes ('), as opposed to the double quotes used to delimit string constants. A character constant is numerically equivalent to the corresponding decimal integer, and may be used wherever an integer constant may be used. On most systems characters are represented in ASCII, so the character values are the ASCII values.

Table 1.4: Character Constants.

The backslash character (\) is used to form escape sequences, which are special non-printed characters. SPP recognizes the following escape sequences:

Table 1.5: Character Constant Escape Sequences.

String Constants

A string constant is a sequence of characters enclosed in double quotes ("), "image" for example. The double quote itself may be included in the string by escaping it with a backslash ("abc\"xyz"). All of the escape sequences given above are recognized. The backslash character itself must be escaped to be included in the string. A string constant may not span lines of text. For example,

call strcpy ("This is a long character string
with an embedded newline.", outstr, SZ_LINE)

would result in the error "Newline while processing string." However, you may include a newline in a string explicitly with the newline escape sequence (\n), for example:

call strcpy ("A string\nwith a newline.", outstr, SZ_LINE) 
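
The string rules above (backslash escapes, no literal newline inside the quotes) can be sketched as a small scanner. This Python function is a hypothetical illustration, not the xc implementation; the name scan_string and the "unterminated" message are assumptions, while the newline message is the one quoted above.

```python
def scan_string(source, start):
    """Return the index just past the string constant that opens at source[start]."""
    if source[start] != '"':
        raise ValueError("not at an opening double quote")
    i = start + 1
    while i < len(source):
        ch = source[i]
        if ch == "\n":
            # A string constant may not span lines of text.
            raise ValueError("Newline while processing string.")
        if ch == '"':
            return i + 1
        if ch == "\\" and i + 1 < len(source) and source[i + 1] != "\n":
            # Skip the escaped character, such as \" or \\ or the n of \n.
            i += 2
        else:
            i += 1
    raise ValueError("unterminated string constant")
```

Applied to the first strcpy call above, the scanner raises the newline error; applied to the second, it returns normally because \n is an escape sequence rather than a literal newline.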

Identifiers

An identifier is the name used to refer to a variable or a procedure. Identifiers consist of an upper or lower case letter, followed by zero or more upper or lower case letters, digits, or underscore characters. Identifiers may be as long as desired, but only the first five characters and the last character are significant. Identifiers are used for variable names and procedure names, including built-in intrinsic functions, as well as for other language constructs.

SPP maps every identifier to a Fortran identifier that conforms to the Fortran 66 standard. That is, the Fortran identifier must be six characters or fewer and may not include underscores. SPP performs the mapping by first removing underscores and then taking up to the first five characters and the last character. If two SPP identifiers map to the same Fortran identifier, the last character of the mapped name is replaced with a digit in one of the names. It may be instructive to see the mappings: the SPP identifiers and the Fortran identifiers they map to are listed as comments at the end of the translated source in the Fortran output of xc (using the -f option).

The definition of an identifier may be summarized using the following rule:

[a-zA-Z][a-zA-Z_0-9]*

See "Constants" on page 3 for an explanation of the syntax of this shorthand. The following example illustrates valid and invalid SPP identifiers:

Figure 1.1: Identifier Syntax.

Note that the last two map to the same Fortran variable. Therefore, if they were in the same source file, SPP would change the mapping of one to make them unique.
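
The mapping rule described above (remove underscores, keep the first five characters and the last character) is easy to try out. The Python sketch below is illustrative only; fortran_name and find_conflicts are hypothetical names, and the sketch only detects conflicts because the text does not say which digit SPP substitutes. It also leaves letter case untouched.

```python
from collections import defaultdict


def fortran_name(identifier):
    """Map an identifier to at most six characters, as described above."""
    stripped = identifier.replace("_", "")
    if len(stripped) <= 6:
        return stripped
    # First five characters plus the last character.
    return stripped[:5] + stripped[-1]


def find_conflicts(identifiers):
    """Group identifiers that map to the same six-character name."""
    groups = defaultdict(list)
    for name in identifiers:
        groups[fortran_name(name)].append(name)
    return {short: names for short, names in groups.items() if len(names) > 1}
```

For instance, the hypothetical names max_pixel_value and max_pixel_size both reduce to maxpie, so SPP would have to alter one of them to keep the Fortran names unique.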

The identifiers in Figure 1.2 are reserved. That is, do not use them as variable or procedure names. Note that not all of them are actually used at present.

Figure 1.2: Reserved Identifiers.

Fortran statements

Fortran statements may be used in SPP source by preceding the statement with a percent character, %. The xc compiler then passes this statement through unchanged. Remember that Fortran does require specific positioning of the text on the line, unlike SPP. So you must include the necessary spaces between the % escape character and the beginning of the Fortran statement. For example:

# Fortran follows, note 
# 6 spaces after % 
%      INTEGER INTF 

Also keep in mind that while most SPP data types are the same as in Fortran, character strings are not. See "Calling Fortran Subprograms" on page 38 and "Fortran Strings" on page 125 for more details.
