4 files changed, 322 insertions, 0 deletions
diff --git a/Documentation/arm/nwfpe/NOTES b/Documentation/arm/nwfpe/NOTES
new file mode 100644
index 000000000000..40577b5a49d3
--- /dev/null
+++ b/Documentation/arm/nwfpe/NOTES
@@ -0,0 +1,29 @@
+There seems to be a problem with exp(double) and our emulator.  I haven't
+been able to track it down yet.  This does not occur with the emulator
+supplied by Russell King.
+I also found one oddity in the emulator.  I don't think it is serious, but I
+will point it out.  The ARM calling conventions require floating point
+registers f4-f7 to be preserved over a function call.  The compiler quite
+often uses an stfe instruction to save f4 on the stack upon entry to a
+function, and an ldfe instruction to restore it before returning.
+I was looking at some code that calculated a double result, stored it in f4,
+then made a function call.  Upon return from the function call, the number in
+f4 had been converted to an extended value in the emulator.
+This is a side effect of the stfe instruction.  The double in f4 had to be
+converted to extended, then stored.  If an lfm/sfm combination had been used,
+then no conversion would occur.  This has performance considerations.  The
+result from the function call and f4 were used in a multiplication.  If the
+emulator sees a multiply of a double and extended, it promotes the double to
+extended, then does the multiply in extended precision.
+This code will cause this problem:
+double x, y, z;
+z = log(x)/log(y);
+The result of log(x) (a double) will be calculated, returned in f0, then
+moved to f4 to preserve it over the log(y) call.  The division will be done
+in extended precision, due to the stfe instruction used to save f4 in log(y).
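+The promotion described above can be modelled with a small sketch.  It is
+not taken from the emulator source: the type tag, the structure and the
+function names are hypothetical, and long double merely stands in for the
+emulator's extended format.
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an emulated FPA register: a value plus a tag
 * recording the precision the register currently holds. */
enum fp_type { FP_SINGLE, FP_DOUBLE, FP_EXTENDED };

struct fp_reg {
    enum fp_type type;
    long double  value;
};

/* Models a callee that saves the register with stfe on entry and restores
 * it with ldfe before returning.  The value survives, but the register
 * comes back tagged as extended. */
void save_restore_with_stfe_ldfe(struct fp_reg *r)
{
    assert(r != NULL);
    long double stack_slot = r->value;  /* stfe: converted to extended, stored */
    r->type  = FP_EXTENDED;             /* ldfe: reloaded as an extended value */
    r->value = stack_slot;
}

/* A dyadic operation runs in the wider of its two operand precisions. */
enum fp_type result_type(const struct fp_reg *a, const struct fp_reg *b)
{
    assert(a != NULL && b != NULL);
    return a->type > b->type ? a->type : b->type;
}
```
+In this model, two double operands give a double operation, but once f4 has
+passed through save_restore_with_stfe_ldfe() the same operation is carried
+out in extended precision, which is the effect seen in log(x)/log(y).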
diff --git a/Documentation/arm/nwfpe/README b/Documentation/arm/nwfpe/README
new file mode 100644
index 000000000000..771871de0c8b
--- /dev/null
+++ b/Documentation/arm/nwfpe/README
@@ -0,0 +1,70 @@
+This directory contains the version 0.92 test release of the NetWinder 
+Floating Point Emulator.
+The majority of the code was written by me, Scott Bambrough.  It is
+written in C, with a small number of routines in inline assembler
+where required.  It was written quickly, with a goal of implementing a
+working version of all the floating point instructions the compiler
+emits as the first target.  I have attempted to be as optimal as
+possible, but there remains much room for improvement.
+I have attempted to make the emulator as portable as possible.  One of
+the problems is with leading underscores on kernel symbols.  ELF
+kernels have no leading underscores; a.out compiled kernels do.  I
+have attempted to use the C_SYMBOL_NAME macro wherever this may be
+important.
+Another choice I made was in the file structure.  I have attempted to
+contain all operating system specific code in one module (fpmodule.*).
+All the other files contain emulator specific code.  This should allow
+others to port the emulator to NetBSD, for instance, relatively easily.
+The floating point operations are based on SoftFloat Release 2, by
+John Hauser.  SoftFloat is a software implementation of floating-point
+that conforms to the IEC/IEEE Standard for Binary Floating-point
+Arithmetic.  As many as four formats are supported: single precision,
+double precision, extended double precision, and quadruple precision.
+All operations required by the standard are implemented, except for
+conversions to and from decimal.  We use only the single precision,
+double precision and extended double precision formats.  The port of
+SoftFloat to the ARM was done by Phil Blundell, based on an earlier
+port of SoftFloat version 1 by Neil Carson for NetBSD/arm32.
+The file README.FPE contains a description of what has been implemented
+so far in the emulator.  The file TODO contains information on what
+remains to be done, and other ideas for the emulator.
+Bug reports, comments, and suggestions should be directed to me at
+<scottb@netwinder.org>.  General reports of "this program doesn't
+work correctly when your emulator is installed" are useful for
+determining that bugs still exist, but are virtually useless when
+attempting to isolate the problem.  Please report them, but don't
+expect quick action.  Bugs still exist.  The difficulty lies in isolating
+which instruction contains the bug.  Small programs illustrating a specific
+problem are a godsend.
+Legal Notices
+-------------
+The NetWinder Floating Point Emulator is free software.  Everything Rebel.com
+has written is provided under the GNU GPL.  See the file COPYING for copying
+conditions.  Excluded from the above is the SoftFloat code.  John Hauser's 
+legal notice for SoftFloat is included below.
+-------------------------------------------------------------------------------
+SoftFloat Legal Notice
+SoftFloat was written by John R. Hauser.  This work was made possible in
+part by the International Computer Science Institute, located at Suite 600,
+1947 Center Street, Berkeley, California 94704.  Funding was partially
+provided by the National Science Foundation under grant MIP-9311980.  The
+original version of this code was written as part of a project to build
+a fixed-point vector processor in collaboration with the University of
+California at Berkeley, overseen by Profs. Nelson Morgan and John Wawrzynek.
+THIS SOFTWARE IS DISTRIBUTED AS IS, FOR FREE.  Although reasonable effort
+has been made to avoid it, THIS SOFTWARE MAY CONTAIN FAULTS THAT WILL AT
+TIMES RESULT IN INCORRECT BEHAVIOR.  USE OF THIS SOFTWARE IS RESTRICTED TO
+PERSONS AND ORGANIZATIONS WHO CAN AND WILL TAKE FULL RESPONSIBILITY FOR ANY
+AND ALL LOSSES, COSTS, OR OTHER PROBLEMS ARISING FROM ITS USE.
+-------------------------------------------------------------------------------
diff --git a/Documentation/arm/nwfpe/README.FPE b/Documentation/arm/nwfpe/README.FPE
new file mode 100644
index 000000000000..26f5d7bb9a41
--- /dev/null
+++ b/Documentation/arm/nwfpe/README.FPE
@@ -0,0 +1,156 @@
+The following describes the current state of the NetWinder's floating point
+emulator.
+The following nomenclature is used to describe the floating point
+instructions.  It follows the conventions in the ARM manual.
+<S|D|E> = <single|double|extended>, no default
+{P|M|Z} = {round to +infinity,round to -infinity,round to zero},
+          default = round to nearest
+Note: items enclosed in {} are optional.
+Floating Point Coprocessor Data Transfer Instructions (CPDT)
+------------------------------------------------------------
+LDF/STF - load and store floating
+<LDF|STF>{cond}<S|D|E> Fd, Rn
+<LDF|STF>{cond}<S|D|E> Fd, [Rn, #<expression>]{!}
+<LDF|STF>{cond}<S|D|E> Fd, [Rn], #<expression>
+These instructions are fully implemented.
+LFM/SFM - load and store multiple floating
+Form 1 syntax:
+<LFM|SFM>{cond}<S|D|E> Fd, <count>, [Rn]
+<LFM|SFM>{cond}<S|D|E> Fd, <count>, [Rn, #<expression>]{!}
+<LFM|SFM>{cond}<S|D|E> Fd, <count>, [Rn], #<expression>
+Form 2 syntax:
+<LFM|SFM>{cond}<FD,EA> Fd, <count>, [Rn]{!}
+These instructions are fully implemented.  They store/load three words
+for each floating point register into the memory location given in the 
+instruction.  The format in memory is unlikely to be compatible with
+other implementations, in particular the actual hardware.  Specific
+mention of this is made in the ARM manuals.  
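+As a small worked example of the three-words-per-register layout, the sketch
+below computes how much memory one LFM/SFM transfer touches.  The function
+name is hypothetical, and the sketch assumes 32-bit words and a register
+count of 1 to 4, as in the FPA instruction encoding.
```c
#include <assert.h>
#include <stddef.h>

enum {
    WORDS_PER_FP_REG = 3,   /* three words per register, as described above */
    BYTES_PER_WORD   = 4    /* assumption: 32-bit ARM words */
};

/* Number of bytes read or written by one LFM/SFM of `count` registers. */
size_t fm_transfer_bytes(unsigned count)
{
    assert(count >= 1 && count <= 4);   /* assumed valid range for <count> */
    return (size_t)count * WORDS_PER_FP_REG * BYTES_PER_WORD;
}
```
+Saving f4-f7 with a single SFM (a count of 4) therefore uses 48 bytes of
+stack, or 12 bytes per register.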
+Floating Point Coprocessor Register Transfer Instructions (CPRT)
+----------------------------------------------------------------
+Conversions, read/write status/control register instructions
+FLT{cond}<S|D|E>{P,M,Z} Fn, Rd          Convert integer to floating point
+FIX{cond}{P,M,Z} Rd, Fn                 Convert floating point to integer
+WFS{cond} Rd                            Write floating point status register
+RFS{cond} Rd                            Read floating point status register
+WFC{cond} Rd                            Write floating point control register
+RFC{cond} Rd                            Read floating point control register
+FLT/FIX are fully implemented.
+RFS/WFS are fully implemented.
+RFC/WFC are fully implemented.  RFC/WFC are supervisor-only instructions; the
+emulator presently checks the CPU mode and raises an invalid instruction trap
+if they are not executed from supervisor mode.
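+A minimal sketch of such a mode check follows.  It is not the emulator's
+code: the structure and function names are hypothetical, and it assumes a
+32-bit ARM mode in which the low five bits of the CPSR hold the processor
+mode (0x13 for supervisor, 0x10 for user).
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CPSR_MODE_MASK 0x1Fu    /* mode field, bits 4:0 of the CPSR */
#define CPSR_MODE_SVC  0x13u    /* supervisor mode */

/* Hypothetical stand-in for the saved register state of the trapping task. */
struct cpu_regs {
    uint32_t cpsr;
};

enum fp_action { FP_EMULATE, FP_TRAP_INVALID };

/* Decide whether an RFC/WFC instruction may be emulated or must trap. */
enum fp_action check_fpcr_access(const struct cpu_regs *regs)
{
    assert(regs != NULL);
    if ((regs->cpsr & CPSR_MODE_MASK) == CPSR_MODE_SVC)
        return FP_EMULATE;
    return FP_TRAP_INVALID;
}
```
+With this check, an RFC or WFC issued from user mode (mode bits 0x10) is
+reported as an invalid instruction instead of being emulated.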
+Compare instructions
+CMF{cond} Fn, Fm        Compare floating
+CMFE{cond} Fn, Fm       Compare floating with exception
+CNF{cond} Fn, Fm        Compare negated floating
+CNFE{cond} Fn, Fm       Compare negated floating with exception
+These are fully implemented.
+Floating Point Coprocessor Data Instructions (CPDO)
+---------------------------------------------------
+Dyadic operations:
+ADF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - add
+SUF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - subtract
+RSF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - reverse subtract
+MUF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - multiply
+DVF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - divide
+RDV{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - reverse divide
+These are fully implemented.
+FML{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - fast multiply
+FDV{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - fast divide
+FRD{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - fast reverse divide
+These are fully implemented as well.  They use the same algorithm as the
+non-fast versions.  Hence, in this implementation their performance is
+equivalent to the MUF/DVF/RDV instructions.  This is acceptable according
+to the ARM manual.  The manual notes these are defined only for single
+precision operands; on the actual FPA11 hardware they do not work for double
+or extended precision operands.  The emulator currently does not check the
+requested precision, and performs the requested operation.
+RMF{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - IEEE remainder
+This is fully implemented.
+Monadic operations:
+MVF{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - move
+MNF{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - move negated
+These are fully implemented.
+ABS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - absolute value
+SQT{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - square root
+RND{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - round
+These are fully implemented.
+URD{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - unnormalized round
+NRM{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - normalize
+These are implemented.  URD is implemented using the same code as the RND
+instruction.  Since URD cannot return an unnormalized number, NRM becomes
+a NOP.
+Library calls:
+POW{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - power
+RPW{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - reverse power
+POL{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - polar angle (arctan2)
+LOG{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - logarithm to base 10
+LGN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - logarithm to base e 
+EXP{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - exponent
+SIN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - sine
+COS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - cosine
+TAN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - tangent
+ASN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arcsine
+ACS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arccosine
+ATN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arctangent
+These are not implemented.  They are not currently issued by the compiler,
+and are handled by routines in libc.  These are not implemented by the FPA11
+hardware, but are handled by the floating point support code.  They should 
+be implemented in future versions.
+Signalling:
+Signals are implemented.  However, current ELF kernels produced by Rebel.com
+have a bug in them that prevents the module from generating a SIGFPE.  This
+is caused by a failure to alias fp_current to the kernel variable
+current_set[0] correctly.
+The kernel provided with this distribution (vmlinux-nwfpe-0.93) contains
+a fix for this problem and also incorporates the current version of the
+emulator directly.  It is possible to run with no floating point module
+loaded with this kernel.  It is provided as a demonstration of the 
+technology and for those who want to do floating point work that depends
+on signals.  It is not strictly necessary to use the module.
+A module (either the one provided by Russell King, or the one in this 
+distribution) can be loaded to replace the functionality of the emulator
+built into the kernel.
diff --git a/Documentation/arm/nwfpe/TODO b/Documentation/arm/nwfpe/TODO
new file mode 100644
index 000000000000..8027061b60eb
--- /dev/null
+++ b/Documentation/arm/nwfpe/TODO
@@ -0,0 +1,67 @@
+TODO LIST
+---------
+POW{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - power
+RPW{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - reverse power
+POL{cond}<S|D|E>{P,M,Z} Fd, Fn, <Fm,#value> - polar angle (arctan2)
+LOG{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - logarithm to base 10
+LGN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - logarithm to base e 
+EXP{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - exponent
+SIN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - sine
+COS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - cosine
+TAN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - tangent
+ASN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arcsine
+ACS{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arccosine
+ATN{cond}<S|D|E>{P,M,Z} Fd, <Fm,#value> - arctangent
+These are not implemented.  They are not currently issued by the compiler,
+and are handled by routines in libc.  These are not implemented by the FPA11
+hardware, but are handled by the floating point support code.  They should 
+be implemented in future versions.
+There are a couple of ways to approach the implementation of these.  One
+method would be to use accurate table methods for these routines.  I have 
+a couple of papers by S. Gal from IBM's research labs in Haifa, Israel that
+seem to promise extreme accuracy (in the order of 99.8%) and reasonable speed.
+These methods are used in GLIBC for some of the transcendental functions.
+Another approach, which I know little about, is CORDIC.  This stands for
+Coordinate Rotation Digital Computer, and is a method of computing 
+transcendental functions using mostly shifts and adds and a few
+multiplications and divisions.  The ARM excels at shifts and adds,
+so such a method could be promising, but requires more research to 
+determine if it is feasible.
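+To give a feel for the shift-and-add structure, here is a minimal sketch of
+CORDIC in rotation mode computing sine and cosine.  It is an illustration
+only, not a proposal for the emulator: it works in Q16.16 fixed point rather
+than in any IEEE format, the names are hypothetical, and it assumes the
+compiler implements >> on negative integers as an arithmetic shift (as gcc
+and clang do).
```c
#include <assert.h>
#include <stdint.h>

#define CORDIC_ITERS   16
#define CORDIC_GAIN    39797     /* 0.607253 in Q16.16: product of 1/sqrt(1 + 2^-2i) */
#define CORDIC_HALF_PI 102944    /* pi/2 in Q16.16 */

/* atan(2^-i) in Q16.16 for i = 0..15 */
static const int32_t cordic_atan[CORDIC_ITERS] = {
    51472, 30386, 16055, 8150, 4091, 2047, 1024, 512,
    256, 128, 64, 32, 16, 8, 4, 2
};

/* Compute sin and cos of `angle` (radians, Q16.16, |angle| <= pi/2).
 * Each step rotates the vector (x, y) by +/- atan(2^-i) using only
 * shifts and adds, driving the residual angle z towards zero. */
void cordic_sincos(int32_t angle, int32_t *sin_out, int32_t *cos_out)
{
    int32_t x = CORDIC_GAIN;    /* pre-scaled so the result has unit length */
    int32_t y = 0;
    int32_t z = angle;

    assert(angle >= -CORDIC_HALF_PI && angle <= CORDIC_HALF_PI);

    for (int i = 0; i < CORDIC_ITERS; i++) {
        int32_t dx = y >> i;
        int32_t dy = x >> i;
        if (z >= 0) {
            x -= dx;  y += dy;  z -= cordic_atan[i];
        } else {
            x += dx;  y -= dy;  z += cordic_atan[i];
        }
    }
    *cos_out = x;
    *sin_out = y;
}
```
+The loop body contains no multiplications or divisions at all.  An emulator
+version would need more iterations, a wider accumulator and argument
+reduction to reach double or extended accuracy.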
+Rounding Methods
+The IEEE standard defines 4 rounding modes.  Round to nearest is the
+default, but rounding to + or - infinity or round to zero are also allowed.
+Many architectures allow the rounding mode to be specified by modifying bits
+in a control register.  Not so with the ARM FPA11 architecture.  To change
+the rounding mode one must specify it with each instruction.
+This has made porting some benchmarks difficult.  It is possible to
+introduce such a capability into the emulator.  The FPCR contains
+bits describing the rounding mode.  The emulator could be altered to
+examine a flag which, if set, forces it to ignore the rounding mode in
+the instruction and use the mode specified in the bits in the FPCR.
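+The selection logic for such a flag could be sketched as follows.  Everything
+here is hypothetical: the structure, the field names and the function do not
+exist in the emulator, and the decoding of the mode bits from the instruction
+and from the FPCR is left out.
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rounding_mode {
    ROUND_NEAREST,      /* default */
    ROUND_PLUS_INF,     /* P */
    ROUND_MINUS_INF,    /* M */
    ROUND_ZERO          /* Z */
};

/* Hypothetical per-task emulator state for the proposed feature. */
struct fp_round_state {
    bool               use_fpcr_rounding;  /* the proposed flag */
    enum rounding_mode fpcr_rounding;      /* mode held in the FPCR bits */
};

/* Pick the rounding mode for one instruction: the mode encoded in the
 * instruction, unless the flag says the FPCR setting overrides it. */
enum rounding_mode effective_rounding(const struct fp_round_state *st,
                                      enum rounding_mode insn_mode)
{
    assert(st != NULL);
    return st->use_fpcr_rounding ? st->fpcr_rounding : insn_mode;
}
```
+With the flag clear, behaviour is unchanged.  With it set, a benchmark could
+switch rounding modes once and have every later instruction follow.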
+This would require a method of getting/setting the flag, and the bits
+in the FPCR.  This requires a kernel call in ArmLinux, as WFC/RFC are
+supervisor-only instructions.  If anyone has any ideas or comments, I
+would like to hear them.
+[NOTE: pulled out from some docs on ARM floating point, specifically
+ for the Acorn FPE, but not limited to it:
+ The floating point control register (FPCR) may only be present in some
+ implementations: it is there to control the hardware in an implementation-
+ specific manner, for example to disable the floating point system.  The user
+ mode of the ARM is not permitted to use this register (since the right is
+ reserved to alter it between implementations) and the WFC and RFC
+ instructions will trap if tried in user mode.
+ Hence, the answer is yes, you could do this, but then you will run a high
+ risk of becoming isolated if and when hardware FP emulation comes out
+                -- Russell].
