This report suggests an efficient way of using FORTRAN IV with the PDP15, and discusses the machine code compiled for various statements.
This report is divided into nine sections as follows:
The rest of this introduction is concerned with the use of the compiler, and running compiled programs.
0.1 Much of the compiled code consists of subroutine calls, or their arguments; one FORTRAN statement generates about 6 words, 2 of which are subroutine calls.
Compiling into subroutine calls gives a great saving of space compared with inline coding, but much time is spent in entering subroutines; sometimes this doubles the time needed by the corresponding inline code.
P Nelson's graphical package took 3 times as long to run as it did on Atlas, doing a standard task.
The largest routines of the FORTRAN object time system (OTS) are BCDIO and REAL. BCDIO together with FIOPS makes up the system for interpreting FORMAT statements, and issuing the corresponding IOPS macros. REAL is the group of routines which simulate a real arithmetic module.
BCDIO is 3027₈ locations long, REAL 1022₈, and FIOPS 540₈.
Since input/output and real arithmetic are time consuming it is useful to use only INTEGER arithmetic, and purpose built input/output routines.
0.2 It is quite easy to run out of space while loading a large package of subroutines. The linking loader shows this by giving:
LOAD 1
and returning control to the keyboard monitor.
Programs and common areas must fit between 13077₈ and 37636₈ if DTA and the loader are resident.
There are 3 ways of avoiding LOAD 1.
0.2.1 Assign DTC2 to -4, and DTC0 to -1. This saves 3103₈ core locations. Use MTF instead of MTA; this saves about 2000₈ locations.
0.2.2 Try putting subroutines in a LIBR5 BIN file using UPDATE. Note that the LIBR5 file is only read once, so if one program calls others, it should make forward references. See 0.2.2.3.
0.2.2.1 To create a new .LIBR5 BIN file.
Suppose that three files are available on DEC tape unit one called A BIN, B BIN and C BIN. Then in keyboard monitor type
    A TTA -12/DTA1 -10/DTA2 -15
    UPDATE
Update then types
    UPDATE Vnn
    >
Type the command string
    L,N ← LIBR5
Update then replies
    >
Type
    I A
When UPDATE replies
    >
type
    I B
When UPDATE replies
    >
type
    I C
When UPDATE replies
    >
type
    C
When UPDATE gives its next prompt, type ↑C to return to the monitor. Update will also have given a listing of the library file it has created, showing each PROGRAM NAME and its SIZE in words as an octal number. It creates the LIBR5 BIN file on Unit 2. It is then necessary to assign
    DTA2 -5 or DTC2 -5
so that the file will be read by the linking loader.
0.2.2.2 To update an existing LIBR5 BIN file.
Suppose that we wish to delete a program called A in the LIBR5 file and insert a new one in the same place. Then type
    A TTA -12/DTA1 -10, -1/DTA2 -15
This requires that the old LIBR5 file and the new version of A are on DEC tape unit one, and creates a new LIBR5 file on Unit 2.
When Update replies
    UPDATE Vnn
    >
type
    L,U ← LIBR5
Update will answer
    >
Type
    R A
When Update answers
    >
type
    C
and when Update answers
    >
type ↑C to return to the monitor.
0.2.2.3 Ordering of programs in .LIBR5 files
The .LIBR5 file is only read once, so the order of programs in it is of great importance. As each program is read, so its unsatisfied global references are added to the list. A program is read down only if it satisfies some needed .GLOBL reference.
0.2.2.4 Why this is of value
In a .LIBR5 file there are no internal symbols; only .GLOBL references are stored.
0.2.3 By constructing an execute file
Use the Manual in one of the white binders, or the PDP15/20 User's Guide.
It is not necessary to create an overlay structure to get a benefit from using Chain and Execute. Since Chain writes each subprogram on tape in absolute form, it does not matter how much store Chain uses up while it is resident.
Chain takes its input from .DAT slot -4, and writes its output to .DAT -6. If .DAT -5 is assigned to a unit there must be a file .LIBR5 BIN on it, or the program will fail with file not found.
The process of creating an execute file and using it is long-winded and time-consuming.
Below I shall represent an ALT MODE by <>.
Suppose files A BIN, B BIN and C BIN are again available on .DAT-4, and .DAT -6 is some other unit, then in keyboard monitor type:
    CHAIN
Chain answers
    CHAIN Vnn
    >
Type some filename, terminated with <>; this will be the name used when running the EXECUTE structure.
Chain answers
    LIST OPTIONS AND PARAMETERS
    >
Type
    PGR, 16K, BGD *<>
Chain answers
    DEFINE RESIDENT CODE
Supposing A to be the main routine, type
A,B,C <>
If the list gets too long for one line, put a carriage return after a comma, and continue list.
Chain will answer
    DESCRIBE LINKS AND STRUCTURE
    >
Type <>
Chain will then try to build the structure, and will type out a core map as it goes.
Whether it succeeds or fails, from here on it returns to the monitor. If it fails, it gives an error message.
If the process works then the program can be executed by typing
E filename
where filename is on .DAT slot -4. Device handlers are now loaded as required.
NB. .XX (the linking loader's name for blank common, see Section 6) is not recognised by Chain, so do not refer to it.
If the program is still too big, it will reply with the message
CAN'T FIT
Then alter the overlay structure - or buy more core.
0.3 FORTRAN programs are executed in Page Mode, but no use is made of the index register, not even for accessing arrays. The compiler gives no indication of how much code has been produced.
0.4 All references in this report that are by page number, table name or chapter are to the Fortran IV Programming Manual. Cross references within the report are to a paragraph number, prefixed by "Section".
This section is divided into 4 parts. Integer arithmetic is described in Section 1.1, Real and Double Precision arithmetic in Section 1.2, the use of logical variables in Section 1.3, and the characters in Section 1.4.
There is no test in the arithmetic package for overflow or underflow. In real and double precision arithmetic overflow will cause undefined results.
Dividing by zero is always identical to dividing by one.
Mixed Mode arithmetic is allowed only between REAL and DOUBLE PRECISION constants or variables, except in exponentiation, where REAL or DOUBLE PRECISION variables may be raised to integer or real exponents.
1.1.0 Integers are stored as single words in two's complement. Thus integer variables may have values from -131,071₁₀ to 131,071₁₀. 400000₈ is not a Fortran integer. It can be put into the accumulator, but it does not mean anything.
There is no way of testing the link.
Addition is shown in 1.1.1, subtraction in 1.1.2, multiplication in 1.1.3 and division in 1.1.4. Exponentiation is described in 1.1.5, an example in 1.1.6, and fixing in 1.1.7.
Subroutines for multiplication and division are required since the EAE module does arithmetic in one's complement. It might well have been simpler to use this throughout for integer arithmetic, and put up with two representations of zero.
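The 18-bit two's complement representation described in 1.1.0 can be illustrated with a short Python sketch. This is a model for checking values by hand, not code from the FORTRAN system; the function names `to_word` and `from_word` are invented for the illustration.

```python
WORD_BITS = 18
MASK = (1 << WORD_BITS) - 1          # 0o777777, one PDP15 word


def to_word(n):
    """Encode a Python int as an 18-bit two's complement word."""
    return n & MASK


def from_word(w):
    """Decode an 18-bit word as a signed two's complement integer."""
    w &= MASK
    if w & (1 << (WORD_BITS - 1)):   # sign bit (bit 0 in PDP15 numbering)
        return w - (1 << WORD_BITS)
    return w


print(oct(to_word(131071)))    # largest Fortran integer
print(oct(to_word(-131071)))   # smallest Fortran integer
print(from_word(0o400000))     # the pattern that is not a Fortran integer
```

Read as plain two's complement, the pattern 400000₈ would be one less than the smallest Fortran integer, which is why it has no meaning to the arithmetic routines.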
1.1.1 Addition is compiled into a TAD instruction. Thus
    I = J + K
compiles as
    LAC J
    TAD K
    DAC I
Note that no use is made of the instruction AAC for small constants.
1.1.2 Subtraction compiles into calls of subroutine .AY or .AZ. Thus
    I = J-K
compiles as
    LAC J
    JMS* .AY
    LAC K
    DAC I
Note: the LAC K following the call is a parameter passing mechanism; control is returned to DAC I. I = -K+J generates identical code, and AAC is again not used.
This calling of the subroutine requires 19 cycles as against 7 for the inline code:
    LAC K
    TCA
    TAD J
    DAC I
Note:
    IT = -K
    I = J + IT
compiles as:
    LAC K
    CMA
    TAD (000001
    DAC IT
    LAC J
    TAD IT
    DAC I
This takes 13 cycles while occupying three more words of core than the subroutine call.
1.1.3 Multiplication is done by subroutine .AD
    I = J*K
compiles as
    LAC J
    JMS* .AD
    LAC K
    DAC I
The subroutine is needed because the EAE unit does one's complement arithmetic.
Note: .AD will not give the result 400000₈. The result of multiplying 400000₈ by 000001₈ is 400001₈. Also note that only the least significant 18 bits of the result are taken. Thus the multiplication of large numbers may give unexpected results.
The subroutine is:
.AD    0
       GSM            /Put sign of acc in link; if acc. neg., complement.
       SZL            /Was acc neg? Skip if it wasn't.
       IAC            /Acc was neg. GSM took one's comp.
       DAC C1         /deposit multiplicand
       XCT .AD        /pick up multiplier
       ISZ .AD        /reset return address
       SPA            /is multiplier positive? Skip, if yes.
       ADD (777776    /fudge factor if it wasn't
       MULS           /signed multiply, one's complement
C1     0              /multiplier
       LACQ           /least sig. 18 bits of product are in MQ
       SPA            /are they positive? Skip if they were.
       IAC            /Make 2's complement, if not
       JMP* .AD       /return
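The effect of keeping only the least significant 18 bits of a product can be sketched in Python as follows. This models the arithmetic result only, not the instruction sequence of .AD, and it does not reproduce the special case of 400000₈ noted above; the name `mul18` is invented for the illustration.

```python
def mul18(j, k):
    """Multiply, keep the low 18 bits, and read them as two's complement."""
    w = (j * k) & 0o777777
    return w - 0o1000000 if w & 0o400000 else w


print(mul18(300, 400))     # product fits in 18 bits
print(mul18(1000, 1000))   # product does not fit, so the result is wrong
```

A product such as 1000*1000 needs more than 18 bits, so the truncated word no longer represents the true answer and even the sign may change.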
1.1.4 Integer division is dealt with by .AE or .AF. It gives a result in two's complement, and stores a positive remainder in .CO.
    I = J/K
compiles as:
    LAC J
    JMS* .AE
    LAC K
    DAC I
To retrieve the remainder, define .CO as a global and write
    LAC* .CO
Note: division by zero has the effect of dividing by one; no error message is given.
1.1.5 Exponentiation is dealt with by .BB. The calling sequence is standard.
This routine is slower than repeated multiplication.
Note: error OTS 15 is given if zero is raised to a zero or negative power. This is recoverable; the value of the expression is zero.
1.1.6 An Example
    I = (J + K*(L**M-2)/N)*(I+J)
compiles into:
    LAC L
    JMS* .BB
    LAC M
    JMS* .AY
    LAC (2
    JMS* .AD
    LAC K
    JMS* .AE
    LAC N
    TAD J
    DAC %IA
    LAC I
    TAD J
    JMS* .AD
    LAC %IA
    DAC I
Note: intermediate addresses are called %IA, %IB etc. Repeated subexpressions are compiled separately. Thus
    I = (J + K)*(J + K)
compiles into
    LAC J
    TAD K
    DAC %IA
    LAC J
    TAD K
    JMS* .AD
    LAC %IA
    DAC I
1.1.7 Fixing
Real numbers are truncated to Integers, by assigning them to integers, or by use of the functions INT or IFIX. All these processes have the same result.
1.9 is truncated to 1
-1.9 is truncated to -1
This truncation is done by subroutine .AX.
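For readers used to other rounding conventions, the following Python sketch contrasts truncation toward zero, as described above, with rounding down. It illustrates the stated results only and is not a model of .AX; the name `fix` is invented here.

```python
import math


def fix(x):
    """Truncate toward zero, matching the results quoted for INT and IFIX."""
    return math.trunc(x)


print(fix(1.9), fix(-1.9))      # truncation toward zero
print(math.floor(-1.9))         # rounding down gives a different answer
```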
1.2.0 Introduction
Real and Double Precision Arithmetic is carried out on a 3-word floating accumulator labelled .AA, .AB, .AC (see p.9-2) and on a held accumulator labelled .CE-5, .CE-4, and .CE-3.
Real variables are held in core as two words (see p9-3).
    Word 1:   bits 0-8    low order mantissa     bits 9-17   exponent
    Word 2:   bit 0       sign                   bits 1-17   high order mantissa
This gives a mantissa of 26 bits and sign, about 7 significant decimal digits. The exponent is 9 bits long, permitting a range from -256 to +255, but as the exponent is a power of two, this converts to decimal -76 to +76. For safety the software rules any exponent whose absolute value is greater than 75 illegal at compile time. There is no test at run time.
Double precision variables are held in 3 words, giving a 35 bit mantissa with sign bit, or about 10 significant decimal digits. The exponent has 18 bits to itself but is still restricted to the same limits as for real variables.
The name of the variable is attached to the first of the sequence of words.
The exponent is held in two's complement, and the mantissa is a binary fraction.
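The decimal figures quoted for precision and range follow from multiplying a bit count by log10(2). The Python sketch below repeats that arithmetic; the helper names are invented for the illustration, and the value 255 assumes the largest exponent a 9-bit two's complement field can hold.

```python
import math

LOG10_2 = math.log10(2)


def decimal_digits(mantissa_bits):
    """Approximate number of decimal digits carried by a binary mantissa."""
    return mantissa_bits * LOG10_2


def decimal_exponent(binary_exponent):
    """Decimal exponent equivalent to a given power of two."""
    return binary_exponent * LOG10_2


print(round(decimal_digits(26), 2))      # real mantissa
print(round(decimal_digits(35), 2))      # double precision mantissa
print(round(decimal_exponent(255), 2))   # largest 9-bit two's complement exponent
```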
The real representation of
    zero is:  000000 000000
    one is:   000001 200000
    five is:  000003 240000
etc.
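The two-word real format can be checked against these examples with a small Python decoder. This is a sketch built from the word layout given above, not the code of the REAL package; the function name `decode_real` is invented, and the treatment of negative numbers as sign and magnitude is an assumption, since the report only gives positive examples.

```python
def decode_real(w1, w2):
    """Decode a two-word PDP15 FORTRAN real, given as two 18-bit words."""
    exp = w1 & 0o777                  # bits 9-17 of word 1: exponent
    if exp & 0o400:                   # 9-bit two's complement
        exp -= 0o1000
    low = (w1 >> 9) & 0o777           # bits 0-8 of word 1: low order mantissa
    high = w2 & 0o377777              # bits 1-17 of word 2: high order mantissa
    # ASSUMPTION: bit 0 of word 2 is a plain sign flag (sign and magnitude).
    sign = -1.0 if w2 & 0o400000 else 1.0
    fraction = ((high << 9) | low) / float(1 << 26)   # 26-bit binary fraction
    return sign * fraction * 2.0 ** exp


print(decode_real(0o000000, 0o000000))   # zero
print(decode_real(0o000001, 0o200000))   # one
print(decode_real(0o000003, 0o240000))   # five
```

The same decoder applied to 000007 310000 gives 100.0, the constant that appears in the logical IF example of Section 4.2.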
Integers are floated by .AW. It is a complicated process.
Section 1.2.1 introduces the routines of the package, Section 1.2.2 lists those not mentioned in Section 1.2.1, and Section 1.2.3 notes the difference between double precision and real arithmetic. Section 1.2.4 gives an example of a complicated expression, and Section 1.2.5 an introduction to the system supplied functions.
Real and double precision variables or constants may be mixed in one expression.
1.2.1 The floating accumulator is loaded by .AG and stored by .AH. Thus
    JMS* .AG
    .DSA A
loads A into the floating accumulator.
1.2.2 See table 9-1 for a summary of system functions. Argument A is held in the floating accumulator; B is at the store address given.
    Subroutine name    Action
    .AI                A := A + B
    .AJ                A := A - B
    .AK                A := A * B
    .AL                A := A/B
    .AM                A := B/A
    .BA                A := -A
    .BC                raise A to an integer exponent
    .BE                raise A to a real exponent
Division by zero sets the floating accumulator to zero. Raising zero to a zero or negative power generates the error message .OTS 15; the expression is set to zero, and execution continues.
1.2.3 The corresponding routines for double precision are also summarised in table 9.1.
Double precision function names must be defined as DOUBLE PRECISION, otherwise wrong mode conversion is used.
The Double precision Package is used in FOCAL.
1.2.4 Example of a complex expression
    C = A*(SQRT(B + C) - 4.0*A*C)/X
compiles into
    JMS* .AG               /load B into floating acc.
    .DSA B
    JMS* .AI
    .DSA C                 /add C
    JMS* .AH
    .DSA %RA               /intermediate result
    JMS* SQRT              /subroutine calling sequence
    JMP .+2
    .DSA %RA               /second intermediate result
    JMS* .AG
    .DSA (000003 200000    /floating point 4
    JMS* .AK               /multiply by A
    .DSA A
    JMS* .AK
    .DSA C
    JMS* .AM               /Reverse subtract
    .DSA %RB
    JMS* .AK               /as * and / have same priority, left one first
    .DSA A
    JMS* .AL               /Divide by X
    .DSA X
    JMS* .AH               /assignment
    .DSA C
1.2.5 Functions are supplied by the system for operating on real numbers. See tables 7.1 and 7.2.
The argument for SIN or COS is in radians.
1.3 Logical Arithmetic
The variables are stored in one word. The value .FALSE. is represented as zero, .TRUE. as -1 in two's complement. Mixed modes of integer and logical variables in one expression are faulted by the compiler. The variables may be equivalenced, see Section 7.2.
Comparisons are done by the routines of the arithmetic package, and values assigned by means of conditional skips.
1.3.1 .AND. is compiled as the machine order AND, and .NOT. as the machine order CMA.
Thus
    A.AND..NOT.B
compiles as
    LAC B
    CMA
    AND A
1.3.2 .OR. is compiled using SNA, skip on non zero acc. Thus
    A.OR.B
compiles as
    LAC A
    SNA
    LAC B
Using CMA, AND and XOR orders alone it would have taken 5 instructions.
    LAC B
    CMA
    DAC %LA
    XOR A
    AND %LA
This is suggested as an algorithm, should the user want to do logical bit manipulation.
No other logical operations are provided.
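The two ways of forming .OR. can be compared with a short Python model of 18-bit words. This is a sketch for checking the logic by hand; all function names are invented. Evaluated literally, the five-instruction sequence as printed in 1.3.2 appears to leave the complement of A.OR.B in the accumulator, which may be a transcription loss in this copy, so the sketch also shows one combination of CMA, AND and XOR that does give a bit-by-bit OR.

```python
MASK = 0o777777                      # one 18-bit word


def cma(x):
    """CMA: one's complement of the accumulator."""
    return ~x & MASK


def or_with_sna(a, b):
    """LAC A; SNA; LAC B. Correct for logical values 0 and -1 only."""
    return a if a != 0 else b


def printed_sequence(a, b):
    """LAC B; CMA; DAC %LA; XOR A; AND %LA, evaluated literally."""
    la = cma(b)
    return (la ^ a) & la


def bitwise_or(a, b):
    """One possible OR from CMA, AND and XOR: (A AND NOT B) XOR B."""
    return (a & cma(b)) ^ b


TRUE, FALSE = MASK, 0                # .TRUE. is -1, .FALSE. is 0
for a in (FALSE, TRUE):
    for b in (FALSE, TRUE):
        print(a, b, or_with_sna(a, b), printed_sequence(a, b), bitwise_or(a, b))
```

For bit manipulation on arbitrary words the SNA form is not suitable, since it returns A unchanged whenever A is non-zero.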
1.4 Characters
Characters are stored as 5/7 IOPS Ascii in Real variables. Templates may be set up by DATA statements. No more than five characters may be assigned to one variable. The rest of the variable is padded with zeros. This variable is then an acceptable real number.
    DATA A/3HABC/
results in A containing 406050 300000₈ (8.24638*10^11)
          READ(4,100) A
    100   FORMAT(A3)
          IF(AT.EQ.A) GO TO 20
is meaningful and will work.
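The packing of characters into a real variable can be reproduced with the Python sketch below. It assumes the usual 5/7 arrangement of five 7-bit characters left-justified in a 36-bit word pair with the last bit unused, and zero padding as described above; the function name `pack_5_7` is invented for the illustration.

```python
def pack_5_7(text):
    """Pack up to five 7-bit ASCII characters into two 18-bit words."""
    if len(text) > 5:
        raise ValueError("no more than five characters per variable")
    bits = 0
    for ch in text.ljust(5, "\0"):       # pad with zeros, as the report states
        bits = (bits << 7) | (ord(ch) & 0o177)
    bits <<= 1                           # 35 character bits, low bit unused
    return (bits >> 18) & 0o777777, bits & 0o777777


w1, w2 = pack_5_7("ABC")
print("%06o %06o" % (w1, w2))
```

The two words produced for ABC agree with the contents quoted for the DATA statement above.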
1.5 Summary
Dodges suggested:
    I = 2*J
goes faster as I = J + J but uses more store.
    I = J**2
goes much faster as I = J*J and uses less store.
    IT = -K
    I = IT + J
goes faster than
    I = J-K
but uses more store.
    A = B + B
goes faster than
    A = 2.*B
and uses no extra store.
    A = B*B
goes faster than
    A = B**2
and uses less store.
Do not do mode conversion, if you can avoid it.
Arrays are stored in core in increasing addresses in the normal Fortran layout. A four word leader is provided for each array, to give information for finding entries in multi-dimensioned arrays.
Entries in Integer or Logical arrays are stored in one word, those in real arrays in two, and in double precision arrays in three.
Up to 3 subscripts are allowed; only simplified integer arithmetic is permitted for these indices.
The routine .SS is used to compute the store location of the indexed-array element.
DIMENSION A(10)
With A real, generates
AT     .BLOCK 24
       020024
       000000
       000000
A.     .DSA AT        /if the array is in common, the most
                      /significant bit of this word is set.
Note: The index register is not used by .SS
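The address arithmetic that a routine such as .SS has to perform can be sketched in Python. This follows the normal Fortran layout (first subscript varying fastest) and the element sizes given above; it is not the algorithm of .SS itself, whose details the report does not give, and the function name `element_offset` is invented.

```python
def element_offset(subscripts, dims, words_per_element):
    """Word offset of an array element from the start of the array.

    subscripts and dims are tuples of up to three 1-based values;
    words_per_element is 1 (integer, logical), 2 (real) or 3 (double).
    """
    offset = 0
    stride = 1
    for s, d in zip(subscripts, dims):
        if not 1 <= s <= d:
            raise IndexError("subscript out of range")
        offset += (s - 1) * stride       # first subscript varies fastest
        stride *= d
    return offset * words_per_element


# DIMENSION A(10) with A real: 10 elements of 2 words, 24 octal words in all.
print(element_offset((10,), (10,), 2))   # offset of A(10)
print(oct(10 * 2))                       # size reserved by .BLOCK
```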
These DO loops are non-standard - it is possible for the body of the loop to be executed 0 times.
       DO 25 I=J,K,L
       -------
    25 CONTINUE
generates
       LAC J
       JMP A2
A1     LAC I
       TAD L
A2     DAC I
       CMA
       TAD (1
       TAD K
       SPA
       JMP $n
       body of loop
.25    JMP A1
$n     rest of program
Thus if J>K, the loop is skipped.
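The control flow of the generated code can be followed with a Python sketch. It mirrors the test made before each pass (K - I must not be negative) and ignores 18-bit overflow; it assumes a positive increment L, and the name `do_loop` is invented for the illustration.

```python
def do_loop(j, k, l):
    """Yield the values of I for which the loop body would be executed."""
    i = j                      # LAC J; JMP A2; A2 DAC I
    while k - i >= 0:          # CMA; TAD (1; TAD K; SPA
        yield i                # body of loop
        i += l                 # A1 LAC I; TAD L


print(list(do_loop(1, 5, 2)))  # normal case
print(list(do_loop(5, 1, 1)))  # J > K: body never executed
```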
Logical and Arithmetic IFs compile to give the same number of words; the arithmetic IF runs faster on average, i.e.
    IF(I+J) 1,2,1
is slightly faster than
    IF(I+J.EQ.0) GO TO 2
There is no difference between writing .GE., .GT., etc. All compile to use one skip.
An IF on an integer takes 9 or 10 cycles on average, and occupies 7 words.
4.1 Arithmetic IF
    IF(C) 10, 20, 30
compiles as
    JMS* .AG      /load variable
    .DSA C
    LAC* .AB      /load most sig. word of mantissa
    SPA           /skip if sign bit not set
    JMP .10       /jump if negative
    SNA           /skip if non zero
    JMP .20       /jump if zero
    JMP .30       /jump if positive
4.2 Logical IF
    IF(A.GT.100.0) GO TO 30
compiles as
    JMS* .AG                /load
    .DSA A
    JMS* .AJ                /subtract
    .DSA (000007 310000     /100.0
    LAC* .AB                /load most sig. word of mantissa
    SMA!SZA!CLA             /skip if acc. neg or zero, clear acc.
    CLC                     /set acc. to -1, logical true
    SNA                     /skip on non zero acc.
    JMP $n                  /jump to next statement
    JMP .30                 /jump if true
$n  next statement
5. Statement numbers are converted into labels by prefixing the number with a full stop. The label is attached to the first word that is generated from the numbered statement.
There are three types of GOTO statement: unconditional GOTO, see Section 5.1; computed GOTO, see Section 5.2; and assigned GOTO, see Section 5.3.
5.1 Unconditional GOTO
These are compiled as simple jumps. Thus
    GOTO 30
becomes
    JMP .30
5.2 Computed GOTO
Computed GOTOs compile into a call for the subroutine .GO. Subroutine .GO is 26 words long.
    GOTO(10,20,30,40), I
gives
    LAC I
    JMS* .GO
    .DSA -4
    .DSA .10
    .DSA .20
    .DSA .30
    .DSA .40
5.3 Assigned GOTO
The assigned GOTO is designed to be used with the ASSIGN statement which puts statement numbers in integers.
    ASSIGN 13 TO KAPPA
compiles to
       JMP A2
A1     .DSA .13
A2     LAC A1
       DAC KAPPA
The assigned GOTO compiles into an indirect jump. Thus
    GOTO KAP,(1,10,20)
compiles as
    JMP* KAP
where 1, 10 and 20 must be valid statement numbers; they do not have to be those ASSIGNed to KAP.
6. Introduction
Common statements are converted into linking loader code by the compiler, leaving the problems of cross reference and store allocation, to the Loader.
Blank common is given the name .XX for the linking loader. This name is not recognised by Chain.
6.1 Variables in COMMON for a Fortran Program
One word is reserved as a pointer for each variable in COMMON, unless the variable is in an array. Arrays in COMMON have the array descriptor in the program defining the COMMON.
The address of the variable is increased by 500000 if the variable is real or double precision; integer and logical variables are referred to by indirect memory access functions.
6.2 Variables in COMMON for a Macro Program
The name of the common block should be declared as a .GLOBL, and the variable required then found by indexing. Thus
    COMMON I,J,K
requires
    .GLOBL .XX
    CLX
    LAC* .XX,X    /to access I
    AXR 1
    LAC* .XX,X    /to access J
    etc.
It is possible to access the floating point variables if the floating point subroutines are declared as .GLOBLs. See Chapter 9.
6.3 Problems
Note problem with EQUIVALENCE in Section 7.2
7.1 DATA statements create words with the required contents in the subroutine where the DATA occurs (see Chapter 6.6)
DATA can be used to initialise variables in a named COMMON via a BLOCK DATA subroutine.
7.2 EQUIVALENCE statements generate one word of storage for each equivalence class. An EQUIVALENCE statement may be used to extend the length of a COMMON block (see Chapter 6.4). There are complex rules about this.
Integers and logical variables may be equivalenced together and share the same location in core.
Real variables may be equivalenced together, as may double precision variables.
Mixing variable lengths, eg equivalencing real and double length variables, results in an irrelevant compile time fault, such as 0IE.
8. Introduction
Fortran programs are made up of independently compiled subroutines (see 7). Each subroutine should not be longer than a 4K page; there is no limit to the number of subroutines.
Parameters are passed at a subroutine or Function call by means of a routine, .DA. The process of calling a subroutine is described in Section 8.1. Functions are described in section 8.2, BLOCK DATA subprograms in Section 8.3, EXTERNAL statements in Section 8.4.
The first location of the compiled code for a subroutine is zero, to store the return address. The subroutine is left by executing RETURN, which generates
    JMP* SUBR
where SUBR is the first location described above.
8.1 Calling sequence
8.1.0 Introduction
Subroutines or Functions are called identically. Parameters are called by reference, only addresses are passed. No mode checking is carried out. Mixing modes, where the modes have different word lengths will cause obscure run time faults.
Subroutine and Function calls need not define all the parameters specified in the Subroutine or Function definition. .DA always transfers the number of words specified in the definition, but does not alter return addresses. So junk will be passed if required.
Parameters are addressed indirectly within the subroutine.
8.1.1 Code generated
    CALL SUBR(A,B,C)
generates
    JMS* SUBR     /as the subroutine name is an external .GLOBL
    JMP .+4
    .DSA A
    .DSA B
    .DSA C
and
    SUBROUTINE SUBR(A,B,C,D)
generates
SUBR   0
       JMS* .DA
       JMP .+5
       .DSA A
       .DSA B
       .DSA C
       .DSA D
       JMP $n     /to jump to first executable statement
8.2 Functions
8.2.0 Introduction
Functions are identical to subroutines, except that integer or logical functions return with the result in the accumulator, while real or double precision functions return with results in the floating accumulator.
Functions must have at least one dummy argument; this does not have to be used.
Thus A = PI() generates
       JMS* PI
       JMP $n
       .DSA PI + 400000
$n     JMS* .AH
       .DS
8.2.1 Statement Functions
Statement functions compile into normal functions, with the normal parameter passing and result passing mechanisms. However, as their names are not declared as .GLOBL they cannot be called by other subprograms.
8.3 BLOCK DATA
BLOCK DATA subprograms generate the required values, with the linking loader code to ensure that they are correctly relocated.
8.4 EXTERNAL statements
EXTERNAL statements are a way of passing subprogram names as parameters. The code generated is identical to any other subroutine call.
9 Input/Output is described in chapter 8.1 starting on page 8-2. Since it is covered in detail, the following are notes only.
9.1 The system permits input and output to any of the positive .DAT slots, and gives OTS10 if any attempt is made to address a negative slot.
If an integer in common is used as a stream number, it is marked with bit 0 of its address set to one to show indirection. The system interprets this as an illegal stream number.
The system could be used by macro programmers; it requires 4700₈ locations in total.
9.2 Unformatted READ and WRITE generate IOPS binary records with sequence numbers, immediately after the header pair.
9.3 The system will not transfer data in exotic codes.
9.4 It would seem simpler, and less space consuming to write our own input/output routines, and make no use of FORTRAN's second level of device independence.
9.5 There is no way of checking if end of file, or parity fault occurs in the current record.
Illegal construction of statement, illegal equivalence relationships, illegal common declaration or non-common storage declared in a block data subprogram.
D ERRORS (DO LOOP ERROR)
Illegal DO construction or illegal statement terminating a DO Loop.
E ERRORS (FUNCTION/SUBROUTINE/EXTERNAL/CALL STATEMENT ERROR)
Illegal use of function/subroutine name, out of order, or illegal variable for external declaration.
F ERRORS (FORMAT STATEMENT ERROR)
Illegal format specification or illegal construction of format statement.
H ERRORS (HOLLERITH ERROR)
Hollerith data illegal in this statement or illegal use of Hollerith constant.
I ERRORS (CHARACTER/STATEMENT/TERM ERROR)
Illegal character, unrecognizable statement, illegal statement for program type, statement out of order or improper statement preceding end statement.
L ERRORS (NESTING ERROR)
Illegal nesting or DO nesting too deep.
M ERRORS (MAGNITUDE ERROR)
Program exceeds one core bank, Maximum number of dummy arguments or equivalence classes exceeded, or constant/variable exceeds specified limits.
N ERRORS (STATEMENT NUMBER ERROR)
Phase error, number more than 5 digits, no statement number where one is required, statement should not be labelled or statement numbers defined twice.
S ERRORS (ARGUMENT/SUBSCRIPT ERROR)
Missing argument or subscript, illegal use of subscripts, illegal construction of a subscripted variable, more than 3 subscripts or stated number of subscripts does not agree with declared number.
T ERRORS (TABLE OVERFLOW)
Symbol/Constant or Argument/Operator Table Limits Exceeded.
V ERRORS (VARIABLE/CONSTANT MODE ERROR)
Illegal mode mixing, missing constant, variable or exponent, or illegal matching of constants or variables in a data statement.
X ERRORS (SYNTAX ERROR)
Statement cannot be recognised as a properly constructed FORTRAN IV statement.