10.  Predefined Scheme Types  

      This chapter introduces the Scheme types predefined by Elk. It begins with the ``pointer-less'' types such as boolean, whose values are stored directly in the pointer field of an Object; followed by the types whose members are C structs that reside on the Scheme heap.

10.1.  Booleans (T_Boolean)  

      Objects of type T_Boolean can hold the values #t and #f. Two Objects initialized to #t and #f, respectively, are available as the external C variables True and False. The macro

Truep(obj)
can be used to check whether an arbitrary Scheme object is regarded as true. Use of Truep() is not necessarily equivalent to
!EQ(obj, False)
because the empty list may count as false in addition to #f if backwards compatibility to older Scheme language versions has been enabled. Truep() may evaluate its argument twice and should therefore not be invoked with a function call or any other expression that is costly or has side effects.
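The double-evaluation hazard can be illustrated with a small self-contained sketch. TRUEP_SKETCH, SketchValue, and next_value() are hypothetical stand-ins, not Elk's definitions; the sketch only assumes that the macro mentions its argument twice, as Truep() may do.

```c
/* Hypothetical stand-ins for illustration only; not Elk's types or macros. */
typedef enum { SK_FALSE, SK_EMPTY_LIST, SK_OTHER } SketchValue;

/* Mentions its argument twice, as the text warns Truep() may do. */
#define TRUEP_SKETCH(x) ((x) != SK_FALSE && (x) != SK_EMPTY_LIST)

static int sketch_calls;   /* counts how often next_value() is evaluated */

static SketchValue next_value(void) {
    sketch_calls++;
    return SK_OTHER;
}

/* Passing the call directly: it is expanded into both comparisons. */
int unsafe_call_count(void) {
    sketch_calls = 0;
    (void)TRUEP_SKETCH(next_value());
    return sketch_calls;
}

/* Storing the result in a variable first: the call happens exactly once. */
int safe_call_count(void) {
    SketchValue v;
    sketch_calls = 0;
    v = next_value();
    (void)TRUEP_SKETCH(v);
    return sketch_calls;
}
```

Assigning the result of the call to a local variable and testing that variable avoids the repeated evaluation.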

The two functions

int Eqv(Object, Object);
int Equal(Object, Object);
are identical to the primitives P_Eqv() and P_Equal(), except that they return a C integer rather than a Scheme boolean and therefore can be used more conveniently in C/C++.

10.2.  Characters (T_Character)  

      The character value stored in an Object of type T_Character can be obtained by the macro

CHAR(char_obj)
as a non-negative int. A new character object is created by calling the function
Object Make_Char(int c);
The predefined external C variable Newline holds the newline character as a Scheme Object.

10.3.  Empty List (T_Null)  

      The type T_Null has exactly one member--the empty list; hence all Objects of this type are identical. The empty list is available as the external C variable Null. This variable is often used to initialize Objects that will be assigned their real values later, for example, as the fill element for newly created vectors or to initialize Objects in order to GC_Link() them. A macro Nullp() is provided as a shorthand for checking if an Object is the empty list:

#define Nullp(obj)  (TYPE(obj) == T_Null)
This macro is used frequently in the termination condition of for-loops that scan a Scheme list:
Object tail;
for (tail = some_list; !Nullp(tail); tail = Cdr(tail))
    /* ... process Car(tail) here ... */;
(Car() and Cdr() are essentially shorthands for P_Car() and P_Cdr() and will be revisited in the section on pairs.)

10.4.  End of File (T_End_Of_File)  

      The type T_End_Of_File has one member--the end-of-file object--and is only rarely used from within user-supplied C/C++ code. The external C variable Eof is initialized to the end-of-file object.

10.5.  Integers (T_Fixnum and T_Bignum)  

      Integers come in two flavors: fixnums and bignums. The former have their value stored directly in the pointer field and are wide enough to hold most C ints. Bignums can hold integers of arbitrary size and are stored in the heap. Two macros are provided to test whether a given signed (or unsigned, respectively) integer fits into a fixnum:

FIXNUM_FITS(integer)
UFIXNUM_FITS(unsigned_integer)

The former always returns 1 in Elk 3.0, but the range of integer values that can be represented as a fixnum may be restricted in future revisions. It is guaranteed, however, that at least two bits less than the machine's word size will be available for fixnums in future versions of Elk.
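As a rough illustration of what such a range test involves, the following sketch checks whether a signed value fits into a given number of payload bits. sketch_fixnum_fits() and its payload_bits parameter are hypothetical; Elk's FIXNUM_FITS() takes only the integer, and its actual definition may differ.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not part of Elk: returns 1 if v is representable
 * as a two's complement integer of payload_bits bits, 0 otherwise. */
int sketch_fixnum_fits(long v, int payload_bits) {
    long max, min;

    /* The shift below is only well defined for this range of widths. */
    assert(payload_bits >= 2 &&
           payload_bits < (int)(sizeof(long) * CHAR_BIT));

    max = (1L << (payload_bits - 1)) - 1;   /* largest representable value */
    min = -max - 1;                         /* smallest representable value */
    return v >= min && v <= max;
}
```

With 30 payload bits (a 32-bit word minus two bits), the representable range under this sketch is -536870912 through 536870911; anything outside would have to become a bignum.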

The value stored in a fixnum can be obtained as a C int by calling the macro

FIXNUM(obj)
A macro
Check_Integer(obj)
can be used as a shorthand for checking whether an Object is a fixnum or a bignum and raising an error otherwise.

The following functions are provided to convert C integers to Scheme integers:

Object Make_Integer(int);
Object Make_Unsigned(unsigned);
Object Make_Long(long);
Object Make_Unsigned_Long(unsigned long);
Make_Integer() returns a fixnum object if FIXNUM_FITS() returns true for the argument, otherwise a bignum. Likewise, Make_Long() usually returns a fixnum but may have to resort to bignums on architectures where a C long is wider than an int. Make_Unsigned() returns a bignum if the specified integer is larger than the largest positive int that fits into a fixnum (UFIXNUM_FITS() returns zero in this case). Another set of functions converts a Scheme number to a C integer:
int Get_Integer(Object);
int Get_Exact_Integer(Object);

unsigned Get_Unsigned(Object);
unsigned Get_Exact_Unsigned(Object);

long Get_Long(Object);
long Get_Exact_Long(Object);

unsigned long Get_Unsigned_Long(Object);
unsigned long Get_Exact_Unsigned_Long(Object);
These functions signal an error if one of the following conditions is true:

As all of the above functions include suitable type-checks, primitives receiving integer arguments can be written in a simple and straightforward way. For example, a primitive encapsulating the UNIX dup system call (which returns an integer file descriptor pointing to the same file as the original one) can be written as:

Object p_unix_dup(Object fd) {
    return Make_Integer(dup(Get_Exact_Unsigned(fd)));
}
Note that if Get_Unsigned() (or Get_Integer()) had been used here in place of the ``exact'' conversion function, it would be possible to write expressions such as:
(define fd (unix-dup (truncate 1.2)))
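In Scheme, truncate applied to an inexact number yields an inexact result (1.0 here), so only an ``exact'' conversion function rejects it. The difference can be sketched with stand-in types. SketchNum and both functions below are hypothetical, report failure through a return code instead of signaling a Scheme error, and assume that the non-exact variant accepts a flonum only when it has no fractional part.

```c
/* Hypothetical stand-ins for illustration only; not Elk's types. */
typedef enum { SK_FIXNUM, SK_FLONUM } SketchTag;
typedef struct { SketchTag tag; long fix; double flo; } SketchNum;

/* Non-exact conversion: also accepts a flonum holding a whole number.
 * Returns 1 on success, 0 where an error would be signaled. */
int sketch_get_integer(SketchNum n, long *out) {
    if (n.tag == SK_FIXNUM) { *out = n.fix; return 1; }
    if (n.flo != n.flo) return 0;                   /* NaN */
    /* Sketch-only range limit; keeps the casts below well defined. */
    if (n.flo < -1.0e9 || n.flo > 1.0e9) return 0;
    if ((double)(long)n.flo != n.flo) return 0;     /* fractional part */
    *out = (long)n.flo;
    return 1;
}

/* Exact conversion: rejects every flonum, even one such as 1.0. */
int sketch_get_exact_integer(SketchNum n, long *out) {
    if (n.tag != SK_FIXNUM) return 0;
    *out = n.fix;
    return 1;
}
```

Under these assumptions the value 1.0 passes the first function and fails the second, which is why the ``exact'' variant is the appropriate choice for arguments such as file descriptors.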

10.6.  Floating Point Numbers (T_Flonum)  

      Real and inexact numbers are represented as Objects of type T_Flonum. Each such object holds a pointer to a structure on the heap with a component val of type double, so that the expression

FLONUM(obj)->val
can be used to obtain the double value. To convert a Scheme number to a double regardless of its type, the more general function
double Get_Double(Object);
can be used. It raises an error if the argument is not a fixnum, bignum, or flonum, or if it is a bignum too large to fit into a double.

The functions

Object Make_Flonum(double);
Object Make_Reduced_Flonum(double);
convert a C double to a flonum; the latter returns a fixnum if the double is small enough to fit into a fixnum and has a fractional part of zero. The macro
Check_Number(obj)
checks whether the given Object is a number (that is, a fixnum, bignum, or flonum in the current revision of Elk) and raises an error otherwise.

10.7.  Pairs (T_Pair)  

      Pairs have two components of type Object, the car and the cdr, that can be accessed as:

PAIR(obj)->car
PAIR(obj)->cdr
Two macros Car() and Cdr() are provided as shorthands for these expressions, and another macro Cons() can be used in place of P_Cons() to create a new pair. The macro
Check_List(obj)
checks whether the specified Object is either a pair or the empty list and signals an error otherwise. The predefined function
int Fast_Length(Object list);
can be used to compute the length of the given Scheme list. This function is more efficient than the primitive P_Length(), because it neither checks the type of the argument nor whether the given list is proper, and the result need not be converted to a Scheme number. The function
Object Copy_List(Object list);
returns a copy of the specified list (including all its sublists).

      As explained in section @(ch-gc), care must be taken when mixing calls to these macros, because Cons() may trigger a garbage collection: an expression such as

Car(x) = Cons(y, z);
is wrong, even if x is properly ``GC_Linked'', and should be replaced by
tmp = Cons(y, z);
Car(x) = tmp;
or a similar sequence.
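The underlying problem is that C does not specify whether the destination of an assignment is located before or after the right-hand side is evaluated. The following self-contained sketch mimics this with a toy heap that moves on every allocation. All names are hypothetical, cells are referred to by index rather than by Elk Objects, and nothing here reflects Elk's actual collector.

```c
#include <stdlib.h>
#include <string.h>

/* Toy cell; the fields hold cell indices or plain integers. */
typedef struct { int car, cdr; } SketchCell;

static SketchCell *sketch_heap = NULL;   /* base address changes on allocation */
static int sketch_used = 0;

/* Allocates one cell and always copies the heap to a fresh block,
 * mimicking a collector that moves objects. */
int sketch_cons(int car, int cdr) {
    SketchCell *moved = malloc((size_t)(sketch_used + 1) * sizeof *moved);
    if (moved == NULL) abort();
    if (sketch_used > 0)
        memcpy(moved, sketch_heap, (size_t)sketch_used * sizeof *moved);
    free(sketch_heap);
    sketch_heap = moved;
    sketch_heap[sketch_used].car = car;
    sketch_heap[sketch_used].cdr = cdr;
    return sketch_used++;
}

/* Safe form: the allocation completes before the destination is located.
 * Writing  sketch_heap[x].car = sketch_cons(y, z);  instead would allow
 * the compiler to read the old heap address before the call. */
void sketch_set_car_to_new_pair(int x, int y, int z) {
    int tmp = sketch_cons(y, z);
    sketch_heap[x].car = tmp;
}

int sketch_car(int x) { return sketch_heap[x].car; }
int sketch_cdr(int x) { return sketch_heap[x].cdr; }
```

The temporary variable plays the same role as tmp in the corrected sequence above: the allocating call is finished before any address inside the heap is computed.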

10.8.  Symbols (T_Symbol)  

      Objects of type T_Symbol have one public component--the symbol's name as a Scheme string (that is, an Object of type T_String):

SYMBOL(obj)->name
A new symbol can be created by calling one of the functions
Object Intern(const char *);
Object CI_Intern(const char *);
with the new symbol's name as the argument. CI_Intern() is the case-insensitive variant of Intern(); it maps all upper case characters to lower case. EQ() yields true for all Objects returned by calls to Intern() with strings with the same contents (or calls to CI_Intern() with strings that are identical after case conversion). This is the main property that distinguishes symbols from strings in Scheme.
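The identity guarantee can be pictured with a minimal interning table. This is a sketch under stated assumptions, not Elk's implementation: sketch_intern() and sketch_ci_intern() are hypothetical, use a small fixed-size table with linear search, and return C string pointers in place of Scheme symbols.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_SYMS 64

static char *sketch_table[SKETCH_MAX_SYMS];   /* one entry per distinct name */
static int sketch_count;

/* Returns the unique stored copy of name, adding it on first use.
 * Returns NULL if the table is full or memory is exhausted. */
const char *sketch_intern(const char *name) {
    int i;
    char *copy;

    for (i = 0; i < sketch_count; i++)
        if (strcmp(sketch_table[i], name) == 0)
            return sketch_table[i];
    if (sketch_count == SKETCH_MAX_SYMS) return NULL;
    copy = malloc(strlen(name) + 1);
    if (copy == NULL) return NULL;
    strcpy(copy, name);
    sketch_table[sketch_count++] = copy;
    return copy;
}

/* Case-insensitive variant: maps upper case to lower case first. */
const char *sketch_ci_intern(const char *name) {
    char buf[128];
    size_t i, n = strlen(name);

    if (n >= sizeof buf) return NULL;
    for (i = 0; i < n; i++)
        buf[i] = (char)tolower((unsigned char)name[i]);
    buf[n] = '\0';
    return sketch_intern(buf);
}
```

Two calls with equal contents return the same pointer, so a pointer comparison suffices to compare names; this mirrors the way EQ() can be used on interned symbols.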

      A symbol that is used by more than one function can be stored in a global variable to save calls to Intern(). This can be done using the convenience function

void Define_Symbol(Object *var, const char *name);
Define_Symbol() is called with the address of a variable where the newly-interned symbol is stored and the name of the symbol to be handed to Intern(). The function adds the new symbol to the garbage collector's root set to make it reachable (as described in section @(ch-gcglobal)). Example:
static Object sym_else;
void elk_init_example(void) {
	Define_Symbol(&sym_else, "else");
}

10.8.1.  The Non-Printing Symbol  

      By convention, Scheme primitives that do not have a useful return value (for example the output primitives) return the ``non-printing symbol'' in Elk. The name of this symbol consists of the empty string; it does not produce any output when it is printed, for example, by the toplevel read-eval-print loop. In Scheme code, the non-printing symbol can be generated by using the reader syntax ``#v'' or by calling string->symbol with the empty string. On the C language level, the non-printing symbol is available as the external variable Void, so that primitives lacking a useful return value can use

return Void;

10.9.  Strings (T_String)  

      Objects of type string have two components--the length and the contents of the string as a pointer to char:

STRING(obj)->size
STRING(obj)->data
The data component is not null-terminated, as a string itself may contain a null-byte as a valid character in Elk. A Scheme string is created by calling the function
Object Make_String(const char *init, int size);
size is the length of the newly-created string. init is either the null-pointer or a pointer to size characters that are copied into the new Scheme string. For example, the sequence
Object str;
str = Make_String(0, 100);
bzero(STRING(str)->data, 100);
generates a string holding 100 null-bytes.
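The consequence of storing the length separately can be seen in a stand-alone sketch. SketchString and sketch_make_string() are hypothetical and allocate with malloc() rather than on the Scheme heap; they only mirror the size/data layout described above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string: a length plus unterminated contents. */
typedef struct { int size; char *data; } SketchString;

/* If init is NULL the contents are left for the caller to fill in;
 * otherwise size bytes are copied, embedded null-bytes included. */
SketchString sketch_make_string(const char *init, int size) {
    SketchString s;

    s.size = size;
    s.data = malloc(size > 0 ? (size_t)size : 1);
    if (s.data == NULL) abort();
    if (init != NULL && size > 0)
        memcpy(s.data, init, (size_t)size);
    return s;
}
```

A string built from the three bytes 'a', 0, 'b' keeps all three; a function such as strlen() would stop at the embedded null-byte, so code working on the data component should rely on the size component instead.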

      Most primitives that receive a Scheme string as one of their arguments pass the string's contents to a C function (for example a C library function) that expects an ordinary, null-terminated C string. For this purpose Elk provides a function

char *Get_String(Object);
that returns the contents of the Scheme string argument as a null-terminated C string. An error is raised if the argument is not a string. Get_String() has to create a copy of the contents of the Scheme string in order to append the null-character. To avoid requiring the caller to provide and release space for the copy, Get_String() operates on and returns NUMSTRBUFS internal, cyclically reused buffers (the value of NUMSTRBUFS is 3 in Elk 3.0). Consequently, no more than NUMSTRBUFS results of Get_String() can be used simultaneously (which is rarely a problem in practice). As an example, a Scheme primitive that calls the C library function getenv() and returns #f on error can be written as
Object p_getenv(Object name) {
	char *ret = getenv(Get_String(name));
	return ret ? Make_String(ret, strlen(ret)) : False;
}
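The limit on simultaneously usable results follows directly from the cyclic reuse of the buffers. A minimal sketch of that scheme, assuming three fixed-size buffers: sketch_get_string() and the SKETCH_ constants are hypothetical, and Elk's real buffers are not limited to a fixed length in this way.

```c
#include <string.h>

#define SKETCH_NUMBUFS 3     /* plays the role of NUMSTRBUFS */
#define SKETCH_BUFSIZE 64    /* sketch-only limit on the string length */

static char sketch_bufs[SKETCH_NUMBUFS][SKETCH_BUFSIZE];
static int sketch_next;      /* index of the buffer to hand out next */

/* Copies size bytes into the next buffer in the cycle and appends the
 * null-character. Returns NULL if the contents do not fit. */
char *sketch_get_string(const char *data, size_t size) {
    char *buf;

    if (size >= SKETCH_BUFSIZE) return NULL;
    buf = sketch_bufs[sketch_next];
    sketch_next = (sketch_next + 1) % SKETCH_NUMBUFS;
    memcpy(buf, data, size);
    buf[size] = '\0';
    return buf;
}
```

The fourth call returns the same buffer as the first, so the first result is overwritten at that point; callers that need more results at once must copy them or use a stack-based alternative.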

      If more strings are to be used simultaneously, the macro Get_String_Stack() can be used instead. It is called with the Scheme object and the name of a variable of type ``char*'' to which the C string will be assigned. Get_String_Stack() allocates space by means of Alloca() (as explained in section @(ch-alloca)); hence a call to Alloca_Begin must be placed in the declarations of the enclosing function or block, and Alloca_End must be called before returning from it.

      An additional function Get_Strsym() and an additional macro Get_Strsym_Stack() are provided by Elk; these are identical to Get_String() and Get_String_Stack(), respectively, except that the Scheme object may also be a symbol. In this case, the symbol's name is taken as the string to be converted.

      As an example of the use of Get_String_Stack(), here is a simple Scheme primitive exec that is called with the name of a program and one or more arguments and passes them to the execv() system call:

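Object p_exec(int argc, Object *argv) {
	char **argp; int i;
	Alloca_Begin;                     /* required before any Alloca() */

	Alloca(argp, char**, argc*sizeof(char *));
	for (i = 1; i < argc; i++)
		Get_String_Stack(argv[i], argp[i-1]);
	argp[i-1] = 0;                    /* execv() expects a null-terminated vector */
	execv(Get_String(*argv), argp);   /* must not return */
	Alloca_End;                       /* reached only if execv() failed */
	return False;                     /* assumption: report a failed execv() as #f */
}

void elk_init_example(void) {
	Define_Primitive(p_exec, "exec", 2, MANY, VARARGS);
}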
The primitive can be used as follows:
(exec "/bin/ls" "ls" "-l")
Get_String() could not be used in this primitive, because the number of string arguments may exceed the number of static buffers maintained by Get_String().

10.10.  Vectors (T_Vector)  

      The layout of Objects of type vector is identical to that of strings, except that the data component is an array of Objects. A function Make_Vector() creates a new vector as has been explained in section @(ch-gc) above.

10.11.  Ports (T_Port)  

      The components of Objects of type T_Port are not normally accessed directly from within C/C++ code, except for

PORT(obj)->closefun
which is a pointer to a function receiving an argument of type ``FILE*'' (for example, a pointer to fclose()), provided that the port is a file port. It is called automatically whenever the port is closed, either because close-input-port or close-output-port is applied to it or because the garbage collector has determined that the port is no longer reachable.

A new file port is created by calling

Object Make_Port(int flags, FILE *f, Object name);
with a first argument of either zero (output port), P_INPUT (input port) or P_BIDIR (bidirectional port), the file pointer, and the name of the file as a Scheme string. The macros
Check_Input_Port(port)
Check_Output_Port(port)
check whether the specified port is open and is capable of input (or output, respectively); an error is raised otherwise.

      To arrange for a newly-created port to be closed automatically when it becomes garbage, it must be passed to the function Register_Object() as follows:

Register_Object(the_port, 0, Terminate_File, 0);
Register_Object() will be described in section @(ch-term). The current input and output port as well as ports pointing to the program's initial standard input and output are available as four external variables of type Object:
Curr_Input_Port      Standard_Input_Port
Curr_Output_Port     Standard_Output_Port
The function
void Reset_IO(int destructive_flag);
clears any input queued at the current input port, then flushes the current output port (if destructive_flag is zero) or discards characters queued at the output port (if destructive_flag is non-zero), and finally resets the current input and current output port to their initial values (the program's standard input and standard output). This function is typically used in error situations to reset the current ports to a defined state.

      In addition to the standard Scheme primitives for output, extensions and applications can use a function

void Printf(Object port, char *fmt, ...);
to send output to a Scheme port using C printf conventions. The first argument to Printf() is the Scheme port to which the output will be sent (it must be an output port); the remaining arguments are those of the C library function printf().

To output a Scheme object, the following function can be used in addition to the usual primitives:

void Print_Object(Object obj, Object port, int raw_flag,
		  int print_depth, int print_length);
The arguments to Print_Object() are identical to the arguments of the ``print function'' that must be supplied for each user-defined Scheme type (as described in section @(ch-deftype)): the Object to be printed, the output port, a flag indicating that the object should be printed in human-readable form (display sets the flag, write does not), and the ``print depth'' and ``print length'' for that operation. For debugging purposes, the macro
Print(obj)
may be used to output an Object to the current output port.

A function

void Load_Source_Port(Object port);
can be used to load Scheme expressions from a file that has already been opened as a Scheme port.

10.12.  Miscellaneous Types  

      Other built-in Scheme types are lexical environments, primitive procedures, compound procedures, macros, continuations (also called ``control points'' at a few places in Elk), and promises. These types are not normally created or manipulated from within C or C++ code. If you are writing a specialized extension that depends on the C representation of these types, refer to the declarations in the public include file ``object.h'' (which is included automatically via ``scheme.h'').

      Lexical environments are identical to pairs except that the type is T_Environment rather than T_Pair. The current environment and the initial (global) environment are available as the external C variables The_Environment and Global_Environment. The predefined type constants for primitives, compound procedures (the results of evaluating lambda expressions), and macros are T_Primitive, T_Compound, and T_Macro, respectively. The function

void Check_Procedure(Object);
checks whether the specified object is either a compound procedure or a primitive procedure with a calling discipline different from NOEVAL and raises an error otherwise. The type constant for continuations is T_Control. ``Promise'' is the type of object returned by the special form delay; the corresponding type constant is named T_Promise.

Markup created by unroff 1.0,    September 24, 1996,    net@informatik.uni-bremen.de