
Release notes for 64 bit FTN95

==============================

Introduction
-----------------------------------------------------------------------------------
----------------------------------------

FTN95 creates 64 bit executables and DLLs when:


a) the option /64 is used on the FTN95 command line,
b) SLINK64 is used in place of SLINK,
c) salflibc64.dll and clearwin64.dll are used in place of salflibc.dll,
d) SDBG64 is used in place of SDBG.

FTN95
-----------------------------------------------------------------------------------
----------------------------------------
This release includes 64 bit checking options such as /check and /checkmate.

FTN95 can also check winio@ arguments at compile time as an alternative to the
existing ClearWin+ library checks at run time.
The FTN95 command line option /CHECK_WINIO provides this argument checking at
compile time.

A 64 bit debugger called SDBG64 is provided. It can be used together with /debug etc. on the
FTN95 command line.

The FTN95 command line option /OPTIMISE is provided. This gives an initial set of
optimisations. More may be added at a later
date.

The FTN95 WINAPP directive is not always effective for 64 bit executables. When
using multiple source files, the explicit SLINK64
command 'windows' is required.

Extended precision (REAL*10) is not available when creating 64 bit applications.

SLINK64
-----------------------------------------------------------------------------------
----------------------------------------
SLINK64 can be used in
a) command line mode
b) interactive mode or
c) script file mode

Here is an example of using command line mode...


FTN95 prog.f95 /64
FTN95 sub.f95 /64
SLINK64 prog.obj sub.obj /file:prog.exe

Here is an example of using interactive mode...


SLINK64
$ lo prog.obj
$ lo sub.obj
$ file prog.exe

Here is an example of using a script file...


SLINK64 @prog.inf

where prog.inf contains...


lo prog.obj
lo sub.obj
file prog.exe

For further information see below or type...


SLINK64 /help

SLINK64 automatically scans commonly used Windows DLLs. If a Windows function located in
(say) xxx.dll is reported as missing then the DLL should be loaded by using a script
command of the form

lo C:\Windows\Sysnative\xxx.dll

where C:\Windows illustrates the value of the %windir% environment variable.

Note that SLINK64 can construct executables and DLLs and also static libraries. See
FTN95.chm for details.

Note that SLINK64 will usually accept the SLINK form for commands. For example
"dll", "archive" and "addobj" may also
be used with SLINK64.

SDBG64
-----------------------------------------------------------------------------------
----------------------------------------
The 64 bit debugger SDBG64 is provided. It operates in essentially the same way as
the corresponding 32 bit debugger.

ClearWin+
-----------------------------------------------------------------------------------
----------------------------------------
64 bit ClearWin+ was previously available for use with third party compilers via
clearwin64.dll. This DLL has
been extended for use with 64 bit FTN95. Users who have already adapted their code
for use with third-party compilers
can continue to use their modified code. Alternatively native FTN95/ClearWin+ code
can be used without change apart from the
following exceptions.

64 bit Microsoft Windows HANDLEs are addresses (64 bit integers). So if a Windows
handle is used explicitly in Fortran code, it
will currently appear as a 32 bit (KIND=3) integer and must become a 64 bit (KIND=4)
integer for 64 bit applications. FTN95 has a
special KIND value (7) that is interpreted as KIND=3 for 32 bit applications and
KIND=4 for 64 bit applications. Alternatively
INTEGER(KIND=CW_HANDLE) can be used together with standard INCLUDE and MOD files
because CW_HANDLE is defined as a parameter
with value 7. Windows HANDLEs are mainly used with %lc, %hw and some direct calls
to the Windows API. For further information
see the end of this file.
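
As a minimal sketch of the handle declaration described above, the following program receives the handle of its own window via %hw. The window layout and the variable name hwnd are illustrative assumptions. The point being shown is only the INTEGER(KIND=CW_HANDLE) declaration.

```fortran
      INCLUDE <clearwin.ins>
      ! CW_HANDLE has the value 7, so hwnd is a 32 bit integer for Win32
      ! and a 64 bit integer when compiling with /64.
      INTEGER(KIND=CW_HANDLE) hwnd
      INTEGER i
      ! %hw stores the handle of the current window in hwnd (illustrative layout).
      i = winio@('%ca[Handle demo]%hw%nl%bt[OK]', hwnd)
      END
```

Because the same declaration works in both modes, the source does not need a separate 32 bit and 64 bit version.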

The winio@ edit control %eb, when used with a grave accent, requires a user-supplied
"edit_info" array or TYPE. For 64 bit programs, some of the items in the array form
become 64 bit addresses so for 64 bit applications users are advised to use only
TYPE(edit_info) which is now included in the standard INCLUDE files (such as
clearwin.ins) and the associated modules.
For example:
  INCLUDE <clearwin.ins>
  INTEGER i
  INTEGER(2)::handle,err_code
  TYPE(edit_info) info
  i=winio@('%60.20`eb','*',0,info)
  IF(info%modified > 0)THEN
    i=winio@('%cnSave changes?%2nl%6bt[Yes] %6bt[No]')
    IF(i.EQ.1)THEN
      CALL openw@('myfile.txt',handle,err_code)
      CALL writef@(CCORE1(info%buffer),handle,info%buffer_size,err_code)
    ENDIF
  ENDIF
  END

The function CLEARWIN_INFO@ now returns an INTEGER(KIND=7) value.

A native form of the winio@ graph control %pl is now available for both 32-bit and
64-bit applications. Details can be
found in ClearWin+ enhancements file cwplus.enh and also in the main help file
ftn95.chm.

A few very old (DBOS) graphics routines have not been ported to 64 bit ClearWin+.
Other DBOS routines are available (e.g.
DRAW_TEXT@) but now require an explicit interface that can be found in the INCLUDE
file called dbos.ins.

The function cpu_clock@ is not available for 64 bit applications and has been
replaced by rdtsc_val@...
INTEGER(KIND=4) FUNCTION RDTSC_VAL@()

The following undocumented Win32 routines have been ported to 64 bits: FPOSLONG@,
RFPOSLONG@ and FPOS_EOFLONG@.
These are similar to FPOS@, RFPOS@ and FPOS_EOF@ but take INTEGER*8 position
arguments. Note that other arguments
are INTEGER*2.

SRC
-----------------------------------------------------------------------------------
----------------------------------------
Use the command line option /r for 64 bit applications and link the resulting .res file
(together with the .obj files) via SLINK64.

Current experience suggests that using the "default.manifest" in a resource script causes
the resulting 64 bit application to fail to load. However, a user-supplied manifest file
can improve the appearance. The text of a suitable manifest file is presented below.

RESOURCES
-----------------------------------------------------------------------------------
----------------------------------------
A RESOURCES directive can be used at the end of a 64 bit Fortran main program but
it only has effect when used with FTN95
command line options /LINK or /LGO. Otherwise a separate call to SRC is required.
(For Win32 main programs, FTN95 automatically adds the resources to the main object
file).

Silverfrost INCLUDE and MOD files
-----------------------------------------------------------------------------------
----------------------------------------
Silverfrost INCLUDE files have been modified so that Microsoft HANDLEs have type
INTEGER(KIND=7).
Silverfrost MOD files can be used without change provided they are updated to those
in this release.

Note that user FTN95 MOD files for 64 bit applications may differ from those for 32
bit applications. So FTN95 uses the
extension .mod64 for 64 bit MOD files whilst retaining the extension .mod for 32
bit MOD files. The corresponding
object files always differ and the respective linker (SLINK or SLINK64) will reject
object files of the wrong kind.

By default FTN95 uses the extension .obj for both 64 bit and 32 bit object files.
For projects, both Plato and Visual
Studio retain the default extension and use a system of sub-folders in order to
create executables for different
platforms (such as Win32, x64 and .NET) and for different configurations (such as
Debug, CheckMate and Release).

Users who prefer to build their applications using batch and/or makefiles can adopt
a similar sub-folder approach to
that used by Plato and Visual Studio. Alternatively, 64 bit object files can be
given a different extension (e.g. .o64)
by using /BINARY (together with the object file name) on the FTN95 command line. In
that way, 64 bit and 32 bit object
files could reside in the same folder.

Plato
-----------------------------------------------------------------------------------
----------------------------------------
Plato is now provided in the form of a 64 bit executable and is configured by
default to use FTN95 when you select
"x64" on the main toolbar. An earlier release used gFortran by default. This
default can be changed from the "Settings" dialog.

64 bit Plato contains a built-in version of SDBG64 and this is used by default when
debugging. Alternatively Plato can be
configured (via the "Settings" dialog) to launch SDBG64 as an external application.

SSE and AVX support
-----------------------------------------------------------------------------------
----------------------------------------
FTN95 /64 creates machine code that makes some use of the SSE and AVX instruction
sets
(see https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions). Users can also
provide direct SSE/AVX support via CODE/EDOC
statements in their code (see below for further details).

Four "BLAS" type library routines (DOT_PRODUCT8@, DOT_PRODUCT4@, AXPY8@ and AXPY4@) are
also provided and these make direct use of the SSE/AVX instruction sets. In addition, the
library function USE_AVX@ can be called in order to instruct these routines to use AVX
rather than SSE when the CPU and operating system make this possible.

REAL*8 FUNCTION DOT_PRODUCT8@(x,y,n)
REAL*8 x(n),y(n)
INTEGER*8 n

REAL*4 FUNCTION DOT_PRODUCT4@(x,y,n)
REAL*4 x(n),y(n)
INTEGER*8 n

SUBROUTINE AXPY8@(y,x,n,a)
REAL*8 x(n),y(n),a
INTEGER*8 n
(Y = Y + A*X)

SUBROUTINE AXPY4@(y,x,n,a)
REAL*4 x(n),y(n),a
INTEGER*8 n
(Y = Y + A*X)

INTEGER FUNCTION USE_AVX@(level)
INTEGER level
(Set level = 0 for SSE. Set level = 1 for AVX. The function returns the level that
will be used by the current CPU/OS.
The default level is 1 which means that AVX will be used when available otherwise
SSE. If USE_AVX@(1) is called before
an ALLOCATE statement then the resultant addresses will be 32 byte aligned. The
USE_AVX@ level must be the same at
a corresponding DEALLOCATE.)

For example:

INTEGER(4),PARAMETER::n=100
REAL(2) DOT_PRODUCT8@,prod,x(n),y(n)
INTEGER USE_AVX@,level
! x = ...; y = ...
level = USE_AVX@(0)
prod = DOT_PRODUCT8@(x,y,n)

Redistributing
-----------------------------------------------------------------------------------
----------------------------------------
Like salflibc.dll, salflibc64.dll and clearwin64.dll can be freely redistributed
with your applications and DLLs.

Additional notes on porting from 32 bit to 64 bit applications
-----------------------------------------------------------------------------------
----------------------------------------
1) When using the standard Fortran SIZE intrinsic, FTN95 with /64 returns a 64 bit
integer despite the fact that this
is not strictly Standard conforming. In certain very special circumstances, this
change can cause existing code to fail.
For example, failure will occur if SIZE(x) appears as the value of an argument to
an overloaded subprogram (i.e. a
subprogram that has various definitions depending on the types of its arguments). A
new command line option /SIZE32
is provided in order to resolve this conflict.

2) It is possible that there may be some slight loss of precision when porting from
32 bit to 64 bit calculations. This
is mainly because some FTN95 32 bit mode floating point calculations actually use
hidden extended precision on the way
to producing double or single precision results. It is therefore possible that the
process of porting to 64 bits may
expose a numerically unstable calculation (i.e. one that depends critically on the
level of round-off error). In the same
way, in extreme cases it is possible that new exceptions may appear at runtime due
to floating point overflow. Overflow
can occur directly or as the result of dividing by a value that has underflowed to
zero. In some cases it is possible to
resolve these issues by using a scaling factor in the calculations.
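
The situation in note 1 can be sketched as follows. The module, the generic name report and its two specific procedures are hypothetical. Wrapping SIZE in INT is an assumed source-level workaround (an alternative to the /SIZE32 option mentioned above) and relies only on the standard rule that INT without a KIND argument returns a default integer.

```fortran
MODULE report_mod
  IMPLICIT NONE
  INTERFACE report            ! generic name with two specific procedures
    MODULE PROCEDURE report_int, report_real
  END INTERFACE
CONTAINS
  SUBROUTINE report_int(n)
    INTEGER, INTENT(IN) :: n  ! default (32 bit) integer
    PRINT *, 'element count =', n
  END SUBROUTINE report_int
  SUBROUTINE report_real(v)
    REAL, INTENT(IN) :: v
    PRINT *, 'value =', v
  END SUBROUTINE report_real
END MODULE report_mod

PROGRAM size_demo
  USE report_mod
  IMPLICIT NONE
  REAL x(10)
  x = 0.0
  ! CALL report(SIZE(x)) can fail to resolve under /64 because SIZE then
  ! returns a 64 bit integer and no specific procedure accepts one.
  ! INT(...) with no KIND argument converts to a default integer in both modes.
  CALL report(INT(SIZE(x)))
  CALL report(x(1))
END PROGRAM size_demo
```

This keeps one source for both Win32 and x64 builds, at the cost of assuming the array size fits in a default integer.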
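
The scaling idea in note 2 can be illustrated with a small standard Fortran sketch. The program and its values are hypothetical and chosen only so that the naive formula overflows in double precision while the scaled form does not.

```fortran
PROGRAM scaled_norm
  IMPLICIT NONE
  DOUBLE PRECISION a, b, s, r
  a = 3.0d200
  b = 4.0d200
  ! Naive form SQRT(a*a + b*b): a*a is about 9.0d400, which exceeds the
  ! double precision range (about 1.8d308) unless the intermediate is held
  ! in extended precision.
  s = MAX(ABS(a), ABS(b))           ! scaling factor (assumed non-zero here)
  r = s*SQRT((a/s)**2 + (b/s)**2)   ! every intermediate is now of order 1
  PRINT *, r                        ! approximately 5.0d200
END PROGRAM scaled_norm
```

The scaled form gives the same mathematical result without relying on hidden extended precision, so it should behave the same way in 32 bit and 64 bit builds.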

Further information about SLINK64
-----------------------------------------------------------------------------------
----------------------------------------

The SLINK64 command line
-----------------------

SLINK64 can be used in 3 ways...

1) It can use a series of commands from a file (recommended). The commands are
placed in a file with the .inf or .link
suffix, and SLINK64 is invoked thus:

SLINK64 file.inf

2) It can be used interactively, using the same commands as in (1).

3) It can be used from the command line. This can be derived from the command
specifications. Thus the command
lo <obj file> can be coded on the command line as /lo <obj file>

SLINK64 commands
----------------

load(lo) <file>           - Loads the file, which must be FTN95/SCC 64-bit object code.

map <file>                - Requests a link map, to be placed in the specified file.
                            If the file argument is omitted, the map is placed in a file
                            whose name is derived from the name of the DLL or EXE file
                            being created.

file <exe or dll file>    - Completes the linking operation and puts the result in the
                            given file name. Note that the choice of suffix (DLL or EXE)
                            determines the type of file created. Currently all entry
                            points in the code are exported in the case of a DLL.

windows                   - This command forces the creation of a WINDOWS application,
                            which does not use a console. This is normally used in
                            conjunction with ClearWin+ code.

load(lo) <file.dll>       - Uses the entry points in the specified DLL to satisfy calls
                            in the code. The DLL must be available at run-time.

load(lo) <file.res>       - Loads a resource file created with SRC using the /r switch.
                            This is the same SRC command used in 32-bit mode except
                            that the /r switch must be used.

image_base <hex address>  - Specifies the base address for the link (not normally
                            required, and can be overridden at run time by Windows).

stack_size <hex number>   - Specifies the stack size. The default value is 0x1000000
                            (16 MB).

alias <name> <alias>      - Sets up an alias to an external name when making a DLL.
                            Note that the names are case sensitive. This was added to
                            enable gFortran to call a DLL built with FTN95. It
                            circumvents the problem that gFortran uses lower case names
                            while FTN95 uses upper case names! It may have other
                            specialised uses.

help                      - Prints out abbreviated help information to the console.

quit(q)                   - Quits SLINK64 without saving anything.

Typical use
-----------

SLINK64 is automatically called when /link or /lgo is used on the FTN95 command
line. The name of the
executable or DLL can be supplied after /link (this is optional for executables but
mandatory for DLLs).
Also /stack can be included followed by the stack size as a number of megabytes.
/map can also be used
in this context.
The WINAPP directive in the Fortran code creates a Windows application and this
directive can optionally
be followed by the name of a resource script. Alternatively a resource script can
be included by placing
the script after the main program by using the RESOURCES directive.

Here are the required SLINK64 commands for three slightly more complicated
scenarios:

1) To link a simple program that uses a DLL:

The file (say) ExtraDLL.dll is scanned for entry points but it isn't incorporated in
MyProgram.exe - so MyProgram.exe will require the DLL somewhere on the path at runtime.

lo MyProgram.obj
lo ExtraDLL.dll
file MyProgram.exe

2) To link a number of files to create a DLL that exports all subroutine/function names:

lo file1.obj
lo file2.obj
lo file3.obj
file MyLibrary.dll

3) To create a windows program with some ClearWin+ code that uses resources:

The resources are prepared by:

SRC MyResources.rc /r

Then the slink commands are:

lo MyProgram.obj
lo MyResources.res
windows
file MyProgram.exe

4) For details on creating static libraries see ftn95.chm.

Further general information about 64 bit FTN95
-----------------------------------------------------------------------------------
----------------------------------------
Programs compiled with FTN95 using the /64 option use the AMD64 instruction set
(subsequently adopted by Intel, and
referred to as x64 or x86_64) which is almost universally available on modern PCs.
This code cannot be mixed
with legacy 32-bit code, nor can it access legacy 32-bit DLL's. 64-bit object files
must be linked using the new
utility SLINK64. This object file format is incompatible with all third-party link
utilities.

The default size of INTEGER variables remains unchanged (32 bits, maximum value 2^31-1),
so INTEGER*8 (8-byte) variables
must be used to index extremely large arrays. These variables are implemented in a
more efficient and natural way
in 64 bits. Note that some arrays that would not fit in the old 4GB limit may still
be indexable using default sized integers,
for example a REAL*8 array of 2,000,000,000 elements would occupy nearly 16GB of
memory, but could be
indexed using default integers.
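
The arithmetic behind this example can be checked with a short standard Fortran sketch. The program name and variable names are hypothetical, and SELECTED_INT_KIND(18) is used here as a portable way to request a 64 bit integer rather than relying on a particular KIND number.

```fortran
PROGRAM index_limits
  IMPLICIT NONE
  ! SELECTED_INT_KIND(18) requests an integer kind with at least 18 decimal
  ! digits, which is a 64 bit integer (the same storage as INTEGER*8).
  INTEGER, PARAMETER :: i8 = SELECTED_INT_KIND(18)
  INTEGER :: n
  INTEGER(KIND=i8) :: nbytes
  n = 2000000000                 ! still fits in a default integer
  nbytes = 8_i8*n                ! size in bytes of a REAL*8 array with n elements
  PRINT *, 'largest default integer =', HUGE(n)
  PRINT *, 'bytes needed            =', nbytes
END PROGRAM index_limits
```

The element count (2000000000) is below the default integer limit of 2147483647, while the byte count (16000000000) is not, which is why the multiplication is carried out in the 64 bit kind.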

The main value of 64-bit compilation is that the available address space has
increased from 4GB to
approximately 1.8 x 10^19 bytes! This means that for the foreseeable future
(possibly forever!), the size of programs
will be limited only by the amount of physical memory available on a system.

Arrays that are ALLOCATEd, or which are in COMMON or in MODULEs can exceed the 4GB
limit, except that initialised
arrays must fit within the .EXE or .DLL file to which they belong, and the size
of these files cannot extend beyond
the 4GB limit. This is a Microsoft limit, but is fairly reasonable, since the time
needed to load a 4GB file would be excessive!

COMMON blocks and MODULE arrays are allocated dynamically as a program starts in
order to enjoy no 4GB restrictions.
This is applied to all such storage blocks, because a program may exceed the 4GB
limit even though each individual array lies
within this limit.

Local arrays (static or dynamic) are restricted as in 32 bits. This is because it is not
feasible to extend the hardware stack to sizes > 4GB, and SAVE'd variables must fit within
the EXE or DLL file to which they belong. Users who require a very large local array
should put it in a COMMON block or MODULE referenced by only the one routine.

Since the code can be distributed across multiple DLL's plus an EXE file, the code
itself is also not limited to
4GB - although this is not usually a serious concern.

The various 64-bit Windows operating systems provide less than the full 1.8 x 10^19
address space, and the size
of this space varies somewhat with the available physical memory on the system.
Nevertheless, these limits are very generous
and will increase as physical memory becomes more plentiful. In part, these limits
are due to the fact that the
paging mechanism itself requires memory.

For further information see...

https://msdn.microsoft.com/en-us/library/aa366778.aspx

The pair of DLL's SALFLIBC64.DLL and CLEARWIN64.DLL in 64 bits take the place of
the 32 bit SALFLIBC.DLL. Currently CLEARWIN64.DLL
(which contains much more than ClearWin+) is compiled with Microsoft C++. In the
future this may be absorbed into SALFLIBC64.DLL
but will remain independent for use with third-party compilers.

Perhaps surprisingly, FTN95.exe and SLINK64.exe are 32-bit executables, and so still
require access to SALFLIBC.DLL at compile time.

Note that the extra executables and DLL's to support 64-bit mode can coexist with
those that support 32-bit operations
because they have different names.

Contents of a clrwin.manifest file...
-----------------------------------------------------------------------------------
----------------------------------------

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">


<trustInfo xmlns="urn:schemas-microsoft-com:asm.v2">
<security>
<requestedPrivileges>
<requestedExecutionLevel level="asInvoker" uiAccess="false"/>
</requestedPrivileges>
</security>
</trustInfo>
<dependency>
<dependentAssembly>
<assemblyIdentity type="Win32" name="Microsoft.Windows.Common-Controls"
version="6.0.0.0"
processorArchitecture="*" publicKeyToken="6595b64144ccf1df"
language="*"/>
</dependentAssembly>
</dependency>
</assembly>

CIF conditional compilation
-----------------------------------------------------------------------------------
-----------------------------------------
_WIN32 and _WIN64 are predefined for use with CIF. For example:

CIF(_WIN64)
k = 64
CELSE
k = 32
CENDIF
print*, k

will print "64" if /64 (with /fpp) is used on the FTN95 command line.
This is particularly useful with CODE/EDOC blocks. In other contexts it is possible
to use an equivalent run-time condition...

IF(KIND(1_7) == 4)THEN
k = 64
ELSE
k = 32
ENDIF
print*, k

64-bit CODE/EDOC in FTN95
-----------------------------------------------------------------------------------
-----------------------------------------

The AMD 64-bit architecture
---------------------------
This architecture was invented by AMD, and was later adopted by Intel when their own
Itanium 64-bit architecture was not received with enthusiasm. It is often referred to as
x86-64 (Intel's own name for it is Intel 64). It is the basis of most modern PCs, and is
targeted by FTN95 when the /64 switch is used.

The AMD 64-bit architecture has 16 general purpose integer registers:

RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI, R8, R9, R10, R11, R12, R13, R14, R15.

The bottom eight registers correspond to the 32-bit register set, and retain some
of the same functionality.
Thus RSP is the stack pointer and descends as the stack expands, RCX, RSI and RDI
are used for string operations just
as they are in 32-bits, and RAX is used by convention to return integer function
values. RBP does not correspond in
function to EBP, however it is given a special function in Silverfrost code
(explained later), and should not be modified
in normal circumstances.

All these registers hold 64 bits (8 bytes) and can therefore hold a pointer to
anywhere in the 64-bit address space.

64-bit programs can access two sets of different floating point registers - the old
floating point stack of eight 80-bit
registers, and a set of registers designated XMM0 - XMM15, and known as the SSE
registers. These registers can hold
multiple values simultaneously - four REAL*4 floating point values, or two REAL*8
values. They can also hold integer values.
Thus these registers are 16 bytes in width. These registers do not 'know' what data
they contain - so it is up to the
programmer to keep track. In particular, if you load a REAL*8 value into an XMM
register and wish to store it as a REAL*4,
you must first use the appropriate conversion instruction.

Strangely, the old coprocessor stack instructions do offer some functionality that
is not present in the newer SSE
instruction set - for example SIN and COS can be evaluated in one instruction.

Silverfrost CODE/EDOC conventions
---------------------------------
Let us start with a simple executable example of a 64-bit CODE/EDOC sequence that
simply sums a vector of REAL*8 values.
It is not meant to be optimal because it does not use the parallel execution
facilities of the SSE registers.

  REAL*8 vec(3),ans
  DATA vec/3.0d0,4.0d0,5.0d0/
  CALL sum(vec,3,ans)
  PRINT*,ans
  END

  SUBROUTINE sum(vec,n,ans)
  INTEGER n
  REAL*8 vec,ans
  CODE
    MOV_Q   RDX,=VEC      ! The '=' denotes a (non-immediate) constant or, as in
                          ! this case, the address of an argument
    MOV_Q   R14,=N        ! Remember all addresses are 64-bit - hence the use of MOV_Q
    MOVSX_Q R14,[R14]     ! Instructions and register names are case insensitive
                          ! N is only a 32-bit integer, so it is sign extended to 64 bits
    XORPD   XMM0,XMM0     ! This is one way to zeroise an XMM register; it does a
                          ! bitwise exclusive OR
1   ADDSD   XMM0,[RDX]
    ADD_Q   RDX,8         ! This uses an immediate constant
    DEC_Q   R14
    JNE     $1            ! Labels are denoted by a '$'
    MOV_Q   RCX,=ans
    MOVSD   [RCX],XMM0    ! Store the low 8 bytes (the accumulated REAL*8 sum) in the
                          ! argument ANS; a 16-byte MOVDQU store would overrun it
  EDOC
  END

This illustrates a variety of points:

1) The instructions that operate on the integer registers can operate on 1, 2, 4, or 8
byte operands. These are distinguished by a suffix, thus the MOV instruction takes
the forms MOV_B, MOV_H, MOV, MOV_Q.

2) Unlike the 32-bit CODE/EDOC, the register name does not change when the operation
operates on a smaller number of bytes.

3) Operations that work on 4 bytes of a register (MOV, ADD, etc) also clear the upper
4 bytes of the register, whereas 2-byte and 1-byte instructions do not change the
other bytes of the register. This is a feature of the hardware, not a Silverfrost
convention.

4) Labels are prefixed by a '$' when used, just as is the case in 32-bit mode.

5) When accessing a Fortran argument, you need to first access its address (an 8-byte
quantity). The notation =N is used to access the address of argument N. The '='
notation can also be used to address a constant in memory, for example:
MOVSD XMM3,=2.0d0

6) The MOVSX_Q instruction sign extends a 32-bit integer to 64 bits. In situations
where a number is known to be non-negative, this extension can be obtained for
free using point 3 above.

In general a good way to learn to write instructions inside CODE/EDOC is to compile simple
code samples with the /EXPLIST option, which will display the instructions generated by the
compiler line by line in essentially the same format that you will use.

Referencing COMMON, MODULE, and ALLOCATE'd variables
----------------------------------------------------
Because most COMMON blocks are allocated as the program starts up (as are large
arrays in MODULE's) the simplest way to
access these objects, as well as explicitly ALLOCATE'd arrays, is to take their
address before entering the CODE/EDOC.
For example:

  COMMON/FRED/alpha,beta(100),gamma
  INTEGER*8 alpha,beta,gamma
  INTEGER*8 addressof_beta
  addressof_beta=loc(beta)
  CODE
    MOV_Q R10,addressof_beta
    MOV_Q [R10+8],42   !This sets beta(2) to the value 42
  EDOC
  END

The 64-bit address space
------------------------
The 32-bit address space provided a theoretical maximum 2^32 (4 x 10^9) addressable
bytes. Correspondingly,
the 64-bit address space offers a theoretical maximum 2^64 (1.8 x 10^19)
addressable bytes. This means that, rather like
in the early days of the 32-bit architecture, when a typical computer might have
vastly less than 2^32 bytes (4 GB) of
memory, the virtual address space is only very sparsely populated.

Indeed, the 64-bit virtual address space is so large that it isn't possible to
provide page tables to cover the
address space. This means that the amount of virtual address space available to a
program is determined in a way
that depends on the version of Windows in use, and the total amount of main memory
on the computer (say 16 GB).
This number is still extremely large. However, it is relevant if you use calls to
VirtualAlloc to access high memory
addresses in an absolute way.

Using the SSE registers for parallel computation
------------------------------------------------
Instructions like MOVDQA will load a pair of REAL*8 numbers into an XMM register.
Since these numbers are just bits,
the instruction can also be used to move four REAL*4 numbers into an XMM register.
However this instruction will fault
if the data is not 16-byte aligned. This is problematic because REAL*4 and REAL*8
numbers are aligned wherever possible
(EQUIVALENCE can prevent alignment) to 4 and 8 bytes respectively. In practice it
turns out that the MOVDQU instruction (which is
reputed to be slower than MOVDQA) seems to run at the same speed for aligned data,
and only somewhat slower for non-aligned
data, but generates no alignment faults.

It is also worth reading this discussion about alignment issues:

http://lemire.me/blog/2012/05/31/data-alignment-for-speed-myth-or-reality/

ClearWin+ and INTEGER(7) values
-----------------------------------------------------------------------------------
-----------------------------------------

Format codes that output a Microsoft integer(7) handle
------------------------------------------------------
%hw handle of current window
%lc handle of previous control
%`cw insert ClearWin window

Format codes that require a Microsoft integer(7) handle as input
---------------------------------------------------------------
%fh use a Windows API font
%`bm draw a bitmap
%`cu cursor
%dw owner draw graphics
%`mi caption icon
%nw new window

ClearWin+ functions returning integer(7) values
-----------------------------------------------
ADD_GRAPHICS_ICON@
CLEARWIN_INFO@
CREATE_CURSOR@
CREATE_BITMAP@
CREATE_ICON@
CREATE_SHARED_MEMORY@
CREATE_WINDOW@
DOWNLOAD@
GET_BITMAP_DC@
GET_GRAPHICS_DC@
GET_WINDOW_HANDLE@
HANDLE_FROM_CONTROL_NAME@
IDENTIFY_WINDOW_HANDLE@
IMPORT_BMP@
IMPORT_GIF@
IMPORT_PCX@
MAKE_BITMAP@
MAKE_ICON@
MAP_FILE_FOR_READING@
MAP_FILE_FOR_READ_WRITE@
MOVIE_PLAYING@
OPEN_INTERPROCESS_SHAREMEM@
OPEN_WAV_FILE_READ@
OPEN_WAV_FILE_WRITE@
SET_DEFAULT_WINDOW@
START_THREAD@
WINDOWS_INSTANCE@
WINDOW_HANDLE@

ClearWin+ subroutines returning an integer(7) value via its argument
--------------------------------------------------------------------
GET_CURRENT_DC@

ClearWin+ subprograms with integer(7) input arguments (the first argument except
where stated)
-----------------------------------------------------------------------------------
-----------
ADD_ACCELERATOR@
ADD_ACCELERATOR1@
ADD_TEXT_DESCRIPTOR@ (first two arguments)
ATTACH_BITMAP_PALETTE@ (both arguments)
CHANGE_BUTTON_TEXT@
CHANGE_HYPERTEXT@
CHANGE_PEN@
CLEAR_BITMAP@
IDENTIFY_WINDOW_HANDLE@
RELEASE_BITMAP_DC@
REMOVE_ACCELERATOR@
REMOVE_BITMAP@
REMOVE_ICON@
REPLACE_ENHANCED_MENU@
REPLACE_ENHANCED_POPUP_MENU@
RESIZE_WINDOW@
SELECT_GRAPHICS_OBJ_BY_WINDOW@
SELECT_PALETTE@ (second argument)
SET_CONTROL_VISIBILITY@
SET_HIGHLIGHTED@
SET_MAX_LINES@
SET_MOUSE_CURSOR_POSITION@
UPDATE_WINDOW@

Trapping and counting underflows
--------------------------------
By default, floating point underflows do not raise a runtime exception. An initial
call to PERMIT_UNDERFLOW@(0_2) will cause
underflows to be trapped.

SUBROUTINE PERMIT_UNDERFLOW@(opt)
INTEGER(2),INTENT(IN)::opt

Set opt=0 to trap underflows; otherwise they are permitted.

When underflows are permitted, the total number of underflows can be returned by
calling UNDERFLOW_COUNT@ but only after calling
SET_SOFTWARE_UNDERFLOWS@(1) which adds a significant overhead and should not be
used in production mode.
SUBROUTINE UNDERFLOW_COUNT@(count)
INTEGER,INTENT(OUT)::count

SUBROUTINE SET_SOFTWARE_UNDERFLOWS@(opt)
INTEGER,INTENT(IN)::opt

opt = 0 Use hardware to convert underflows to zero.
opt = 1 Use software to handle underflows - enabling underflow counting.
opt = 2 Treat underflows as errors.

The 32 bit routine MASK_UNDERFLOWS@ is not implemented for 64 bits.
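
A minimal sketch of counting underflows with the routines listed above might look like this. The arithmetic is purely illustrative and the variable names are hypothetical. Only the two calls and their argument types are taken from the interfaces given in this section, and no particular count is claimed.

```fortran
      DOUBLE PRECISION x
      INTEGER count, i
      ! Select software handling so that underflows are counted (opt = 1).
      CALL SET_SOFTWARE_UNDERFLOWS@(1)
      x = 1.0d-200
      DO i = 1, 3
        x = x*1.0d-200       ! illustrative arithmetic intended to underflow
      END DO
      CALL UNDERFLOW_COUNT@(count)
      PRINT *, 'underflows counted:', count
      END
```

Because software handling adds a significant overhead, a sketch like this is suited to diagnostic builds rather than production code.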
