DVCon 2005

Fixed- and floating-point packages for VHDL 2005
David Bishop, Eastman Kodak Company, Rochester, NY

Abstract
The pending update to the VHDL LRM contains several new packages and functions. The new packages include support for
both fixed-point and floating-point binary math. These fully synthesizable packages will raise the level of abstraction in
VHDL. DSP applications, which previously needed an independent processor core or required very difficult manual
translation, can now be performed within your VHDL source code. In addition, schematic-based DSP algorithms can
now be translated directly to VHDL. This paper will describe these packages and give examples of their use.
Introduction:
For the past 15 years we have been using HDL to increase the level of abstraction in our ASIC and FPGA designs. HDL
was a major leap from schematics. What have we done since? Little.
Attempts have been made. E, System-C and System-Verilog are good examples. These are great ideas, but they do not
give the designer the control and tool maturity that VHDL and Verilog provide.
Why not simply increase the level of abstraction in a language that is already well known? The potential of VHDL has
not yet been fully tapped. Designed from the ground up as a software language, it is easily extendable and flexible.
Constructed at a higher level than Verilog, it has the ability to provide higher levels of abstraction directly, with already
mature tools.
Typically designers use integer math in their RTL code. For fixed point they tend to just "remember" where the decimal
point is. For floating point you use a DSP, which may even be off chip. Designers tend to use math solutions in order of
"integer math", "fixed point math" and "floating point math", where 80% of designs are done in integer, and of the
remaining 20%, 80% are done in fixed point. Note that the complexity of fixed point math is not that much higher than
integer math, but that floating point is about 3x as complex as integer math.
The integer math problem has been effectively solved with the NUMERIC_STD packages (1076.3, now part of
VHDL-200X-FT). This package has been well adopted and has been in use for many years.
In this paper, I intend to describe a new set of packages, which are being added to the VHDL language in the
VHDL-2005 update. These packages include VHDL overloads that allow you to do fixed and floating point math directly,
without the user having to perform any conversions. These packages raise the level of abstraction in VHDL AND give the
user the flexibility and power of an HDL.
Fixed-point package:
Fixed-point math is basically integer math with numbers that can be less than 1.0. A fixed-point number has an assigned
width and an assigned location for the decimal point. As long as the number is big enough to provide enough precision,
fixed point is fine for most DSP applications. Since it is based on integer math it is extremely efficient as long as
the data does not vary too much in magnitude.
The fixed-point math packages are based on the VHDL 1076.3 numeric_std package and use the signed and unsigned
arithmetic from within that package. This makes them highly efficient as the numeric_std package is well supported by
simulation and synthesis tools. This package defines two new types: "ufixed", which is unsigned fixed point, and "sfixed",
which is signed fixed point.
Usage model:
use ieee.fixed_pkg.all;
....
signal a, b : sfixed (7 downto -6);
signal c: sfixed (8 downto -6);
begin
....
a <= to_sfixed (-3.125, 7, -6);
b <= to_sfixed (inp1, b'high, b'low);
c <= a + b;
The two data types are defined as follows:
type ufixed is array (integer range <>) of std_logic;
-- base Unsigned fixed point type, downto direction assumed
type sfixed is array (integer range <>) of std_logic;
-- base Signed fixed point type, downto direction assumed
This data type uses a negative index to show you where the decimal point is. The decimal point is assumed to be
between the "0" and "-1" index. Thus if we assume "signal y : ufixed (4 downto -5)" as the data type (unsigned fixed
point, 10 bits wide, 5 bits of decimal), then y = 6.5 = "00110.10000", or simply:
y <= "0011010000";
You can also say:
y <= to_ufixed (6.5, 4, -5);
where "4" is the upper index, and "-5" is the lower index, so you could also say:
y <= to_ufixed (6.5, y'high, y'low);
The signed version uses two's complement to represent a negative number, just like the "numeric_std" package.
Any index range is valid; it does not have to include index 0. Thus:
signal z : ufixed (-2 downto -3);
z <= "11"; -- 0.375 = 0.011
signal x : sfixed (4 downto 1);
x <= "1111"; -- -2 = 11110.0
The data widths in the fixed-point package were designed (by Ryan Hilton) so that there is no possibility of an overflow.
This is a departure from the "numeric_std" model, which simply throws away underflow and overflow bits.
For unsigned fixed point:
ufixed(a downto b) + ufixed(c downto d) = ufixed(max(a,c)+1 downto min(b,d))
ufixed(a downto b) - ufixed(c downto d) = ufixed(max(a,c)+1 downto min(b,d))
ufixed(a downto b) * ufixed(c downto d) = ufixed(a+c+1 downto b+d)
ufixed(a downto b) / ufixed(c downto d) = ufixed(a-d+1 downto b-c-1)
reciprocal (ufixed(a downto b)) = ufixed(a-b+1 downto b-a-1)
ufixed(a downto b) rem ufixed(c downto d) = ufixed(c downto d)
ufixed(a downto b) mod ufixed(c downto d) = ufixed(a downto b)
For signed fixed point:
sfixed(a downto b) + sfixed(c downto d) = sfixed(max(a,c)+1 downto min(b,d))
sfixed(a downto b) - sfixed(c downto d) = sfixed(max(a,c)+1 downto min(b,d))
sfixed(a downto b) * sfixed(c downto d) = sfixed(a+c downto b+d)
sfixed(a downto b) / sfixed(c downto d) = sfixed(a-d downto b-c)
reciprocal (sfixed(a downto b)) = sfixed(a-b downto b-a)
sfixed(a downto b) rem sfixed(c downto d) = sfixed(c downto d)
sfixed(a downto b) mod sfixed(c downto d) = sfixed(a downto b)
Unsigned Example:
signal x : ufixed (7 downto -3);
signal y : ufixed (2 downto -9);
If we multiply x by y we would get a signal which would be:
x * y = ufixed (7+2+1 downto -3+(-9)) or ufixed (10 downto -12);
Signed Example:
signal x : sfixed (-1 downto -3);
signal y : sfixed (3 downto 1);
If we divide x by y we would get a signal which would be:
x/y = sfixed (-1-1 downto -3-3) or sfixed (-2 downto -6);
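The addition and unsigned multiplication rules above can be exercised in a small self-checking simulation. This is only a sketch: it assumes the "ieee.fixed_pkg" package as later released with VHDL-2008 rather than the draft described in this paper, and the entity name "sizing_demo" is made up for illustration. The released package may size some of the other operators (divide in particular) differently from the table above, so only "+" and unsigned "*" are shown.

```vhdl
library ieee;
use ieee.fixed_pkg.all;   -- assumption: the released VHDL-2008 package

entity sizing_demo is     -- hypothetical name, illustration only
end entity sizing_demo;

architecture sim of sizing_demo is
begin
  check : process
    variable a, b : sfixed (7 downto -6);
    variable c    : sfixed (8 downto -6);    -- max(7,7)+1 downto min(-6,-6)
    variable x    : ufixed (7 downto -3);
    variable y    : ufixed (2 downto -9);
    variable p    : ufixed (10 downto -12);  -- 7+2+1 downto -3+(-9)
  begin
    a := to_sfixed (-3.125, a'high, a'low);
    b := to_sfixed (1.5, b'high, b'low);
    -- If the result width did not match the target width, this
    -- assignment would fail at run time with a length mismatch.
    c := a + b;
    assert to_real (c) = -1.625
      report "unexpected sum" severity error;

    x := to_ufixed (2.5, x'high, x'low);
    y := to_ufixed (0.25, y'high, y'low);
    p := x * y;
    assert to_real (p) = 0.625
      report "unexpected product" severity error;
    wait;
  end process check;
end architecture sim;
```

All of the values used are exactly representable in the chosen formats, so the comparisons against real literals do not depend on rounding.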
The "resize" function can be used to fix the size of the output. However, rounding and saturate rules are applied:
x <= resize (x * y, x'high, x'low);
What about an accumulator? An accumulator is a fixed width number that you continually add to. To implement an
accumulator in the fixed-point packages, you can use the "resize" function as follows:
signal X : ufixed (7 downto 0);
X <= resize (X + 1, X'high, X'low, false, false);
Where the first "false" is the round_style. Since we do not need to do any rounding, we set this to false. The second
"false" is the overflow_style. If this is set to true, we saturate, or go to the maximum possible number. When set to "false"
we wrap, meaning that the uppermost bit is dropped and the number simply recycles. Note that the default for both
overflow_style and round_style is "true".
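For comparison, here is a sketch of the same wrapping accumulator as a complete clocked design. It assumes the released VHDL-2008 "ieee.fixed_pkg", in which the two style arguments are enumerated values (fixed_wrap, fixed_truncate) from "ieee.fixed_float_types" rather than the booleans of the draft described here. Named association is used so that the argument order does not matter. The entity and port names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.fixed_float_types.all;  -- fixed_wrap, fixed_truncate (VHDL-2008)
use ieee.fixed_pkg.all;          -- assumption: the released VHDL-2008 package

entity accumulator is            -- hypothetical name, illustration only
  port (
    clk, rst : in  std_ulogic;
    acc      : out ufixed (7 downto 0));
end entity accumulator;

architecture rtl of accumulator is
  signal x : ufixed (7 downto 0);
begin
  process (clk)
  begin
    if rising_edge (clk) then
      if rst = '1' then
        x <= (others => '0');
      else
        -- x + 1 grows to (8 downto 0); resize brings it back to the
        -- accumulator width, wrapping instead of saturating.
        x <= resize (arg            => x + 1,
                     left_index     => x'high,
                     right_index    => x'low,
                     overflow_style => fixed_wrap,
                     round_style    => fixed_truncate);
      end if;
    end if;
  end process;

  acc <= x;
end architecture rtl;
```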
Integer and real are overloaded for all operators, thus you can say:
Signal x : sfixed (4 downto -5);
Signal y : real;
....
Z := x + y;
In the case where an operation is performed which includes both a fixed-point number and an integer or real, the
sizing rules are modified. A real number is converted to a fixed-point number that is the same size as the
fixed-point number that has been passed as the other argument. Thus in the above example:
Z := x + to_sfixed (y, 4, -5);
would be called, which would result in Z being an "sfixed (5 downto -5)" type. For an integer, the number is also converted
to a fixed-point number, but the size is only "downto 0", as an integer can never have a fraction. Thus, if "y" were an integer
the above example would look like:
Z := x + to_sfixed (y, 4, 0);
which in this case would not affect the resultant number's size. However this has a fairly large effect on the size of the
output numbers in the multiply and divide routines.
The following operations are defined for ufixed:
+, -, *, /, rem, mod, =, /=, <, >, >=, <=, sll, srl, rol, ror, sla, sra
The following functions are defined for ufixed:
divide, reciprocal, scalb, maximum, minimum, find_lsb, find_msb, resize, To_01, Is_X
Conversion functions are defined for ufixed:
to_ufixed (natural), to_ufixed (real), to_ufixed (unsigned), to_ufixed (signed), remove_sign (sfixed), to_unsigned,
to_real, to_integer, to_UFix
The following operations are defined for sfixed:
+, -, *, /, rem, mod, =, /=, <, >, >=, <=, sll, srl, rol, ror, sla, sra, abs, - (unary)
The following functions are defined for sfixed:
divide, reciprocal, scalb, maximum, minimum, find_lsb, find_msb, resize, to_01, Is_X
Conversion functions are defined for sfixed:
to_sfixed (natural), to_sfixed (real), to_sfixed (unsigned), to_sfixed (signed), add_sign (ufixed), to_signed, to_real,
to_integer, to_Fix
All of the operators are overloaded for "real" and "integer" data types. In each case the number is converted into fixed
point before the operation is done. Thus the fixed-point operand must be of a format large enough to accommodate the
converted input or a "vector truncated" warning is produced. In the case of an integer, the number is converted in the form
"integer_width downto 0", which causes the size of the output vector to change accordingly. In these functions
"fixed_saturate" is set to true regardless of what the "overflow_style" constant is set to.
This package defines 3 constants that are used to manipulate fixed-point numbers:
constant fixed_round : boolean := true; -- Round or truncate
constant fixed_saturate : boolean := true; -- saturate or wrap
constant fixed_guard_bits : natural := 3; -- guard bits for rounding
These constants are defaults, and can be overridden everywhere they are used.
"round_style" defaults to fixed_round (true), which turns on the rounding routines. If false then the number is truncated. If
the MSB of the remainder is a "1" AND the LSB of the unrounded result is a '1' or the lower bits of the remainder include a '1'
then the result will be rounded. This is similar to the floating-point "round_nearest" style.
"overflow_style" defaults to fixed_saturate (true), which returns the maximum possible number if the number is too large to
represent; otherwise a "wrap" routine is used which simply truncates the top bits. Unlike the way it is done in "numeric_std",
the sign bit is not preserved when wrapping. Thus it is possible to get a positive result when resizing a negative number in this
mode.
Finally "guard_bits" defaults to "fixed_guard_bits", which defaults to 3. Guard bits are used in the rounding routines. If
guard is set to 0, then the rounding is automatically turned off. These extra bits are added to the end of numbers in the
division and "to_real" functions to make the numbers more accurate.
The "resize" function is defined as follows:
function resize (arg : sfixed;
 constant integer_width : INTEGER;
 constant fraction_width : INTEGER;
 constant round_style : BOOLEAN := fixed_round;
 constant overflow_style : BOOLEAN := fixed_saturate)
 return sfixed;
In "saturate" mode (where overflow_style is true) if the input value is too large for the output size then the number will
"saturate". An unsigned fixed point will saturate to all "1", a signed positive number will be all "1" with the first bit a "0",
and a signed negative number will saturate to be all "0" with the first bit a "1".
If in "wrap" mode (where overflow_style is false) the number will be truncated. In this case the top of the number is
simply truncated without regard to the sign bits, so you can truncate a negative number to be a positive one. The rounding
routines are left intact in "wrap" mode.
If "round_style" is true, then the rounding routines are turned on. Otherwise the number is simply truncated.
Shift operators are functionally the same as the 1076-1993 shift operators with the exception of the arithmetic shift
operations. An arithmetic shift ("sra" or "sla") on an unsigned number is the same as a logical shift. An arithmetic shift on a
signed number is a logical shift if you are shifting left, and an arithmetic shift (sign bit replicated) if you are shifting right.
The divide function is defined as follows:
function divide (
 l, r : sfixed;
 guard_bits : NATURAL := fixed_guard_bits;
 round_style : BOOLEAN := fixed_round)
 return sfixed;
The output is sized with the same rules as the "/" operator. The function allows you to override the number of guard bits
and the rounding operation. Note that the output size is calculated so that overflow is not possible.
The reciprocal function is defined in a very similar manner to the divide function:
function reciprocal (
 arg : ufixed;
 guard_bits : NATURAL := fixed_guard_bits;
 round_style : BOOLEAN := fixed_round)
 return ufixed;
This function performs a "1/X" function, with the output vector following the sizing rules as noted above. This function
is very useful for dividing by a constant, for example:
A := B/Cons;
can be rewritten as:
A := B*reciprocal(Cons);
Since a multiply uses less logic than a divide this can save you significant hardware resources.
The "scalb" function is a fixed-point version of a very common floating-point function. The function looks like this:
function scalb (y : ufixed; N : SIGNED) return ufixed;
This function computes y * 2**N without computing 2**N by using a shift operator. The size of the output number is the
same as the input. For this function overflow and rounding functions are ignored, as this is treated like a shift operator. The
"N" input is also overloaded for the type "INTEGER".
The "maximum" and "minimum" functions do a compare operation and return the appropriate value. These functions
are not overloaded for integer and real inputs. The size of the inputs does not need to match.
The "find_lsb" and "find_msb" functions are used to find the least significant bit or most significant bit of a fixed-point
number. The function looks like the following:
function find_msb (arg : ufixed; y : STD_ULOGIC) return INTEGER;
In this case, "y" can be any "std_ulogic" value. These functions search for the first occurrence of "y" in the fixed-point
number. "find_msb" starts at the MSB (arg'high) and goes down. "find_lsb" starts at the LSB (arg'low) and goes up. If that
value is not found in the "find_msb" function, then "arg'low-1" is returned. If the value is not found in the "find_lsb"
function then "arg'high+1" is returned.
"to_01" and "Is_X" are similar in function to the numeric_std functions with the same name.
Most synthesis tools do not support any I/O format other than "std_logic_vector" and "std_logic". Thus functions have
been created to convert between std_logic_vector and ufixed or sfixed and vice versa:
Uf7_3 <= to_ufixed (slv7, uf7_3'high, uf7_3'low); and Slv7 <= to_slv (sf7_3);
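A sketch of how these conversions might sit at the boundary of a design, keeping std_logic_vector on the ports and fixed-point types inside. It assumes the released VHDL-2008 "ieee.fixed_pkg". The entity name, port names, and the interpretation of "uf7_3" as a 7-bit number with 3 fractional bits are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.fixed_pkg.all;   -- assumption: the released VHDL-2008 package

entity slv_wrapper is     -- hypothetical name, illustration only
  port (
    din  : in  std_logic_vector (6 downto 0);
    dout : out std_logic_vector (7 downto 0));
end entity slv_wrapper;

architecture rtl of slv_wrapper is
  signal uf7_3 : ufixed (3 downto -3);  -- 7 bits, 3 of them fractional
  signal sum   : ufixed (4 downto -3);  -- one growth bit for the addition
begin
  -- Reinterpret the port bits as fixed point; no bits are changed.
  uf7_3 <= to_ufixed (din, uf7_3'high, uf7_3'low);
  sum   <= uf7_3 + uf7_3;
  -- Back to a plain vector for the outside world (8 bits).
  dout  <= to_slv (sum);
end architecture rtl;
```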
One of the changes made to all packages in vhdl-2005 is that the read and write routines for all data types are now
defined in the same package that defines that type. Thus the READ, WRITE, HREAD, HWRITE, OREAD, and OWRITE
routines are defined for fixed-point data types. A "." separator is added between the integer part and the fractional part of the
fixed-point number. Thus if you write out our "6.5" example from above you will get the string "00110.10000", which you
can also read into that data type.
New to vhdl-2005 are the functions "to_string", "to_ostring" and "to_hstring". These are very useful in "assert"
statements. Example:
Assert x = y
 Report to_string(x) & " /= " & to_string(y) severity error;
Or, if you prefer to see the numbers as "real" numbers, you can use:
Assert x = y
 Report to_string(to_real(x)) & " /= " & to_string(to_real(y)) severity error;
MathWorks Simulink is these days the most common way to define a fixed point DSP algorithm. It would seem to
be a major step into the past as it is schematic based. In Simulink an unsigned fixed point number is described as ufix[14,10],
which specifies a 14 bit long word with 10 bits after the binary point. This translates into "ufixed (3 downto -10)" in the
unsigned fixed-point type. The Simulink "sfix" notation translates much better because of the extra sign bit that must be
generated. Sfix(14, 10) will translate into "sfixed (4 downto -10)" in the notation of the "fixed_pkg".
Issues:
A negative or "to" index is flagged as an error by the fixed point routines. Thus if you define a number as "ufixed (-1 to 5)"
the routines will automatically error out.
String literals are also a problem. By default, if you do the following:
Z <= a + "011011";
the index of the fixed-point number is undefined. The VHDL compiler will assume that this number has the
range "integer'low to integer'low+5", making it very small. To avoid crashing the simulator with a 32,000 bit wide number
this also will automatically error out.
Floating-point numbers:
After fixed point the next step is floating point. Floating-point numbers are well defined by the IEEE-754 (32 and 64 bit)
and IEEE-854 (variable width) specifications. Floating point has been used in processors and IP for years and is a
well-understood format.
There are many concepts in floating point that make it different from our well understood signed and unsigned number
notations. These come from how a floating-point number is defined. Let's first take a look at a 32-bit floating-point number:
 S  EEEEEEEE  FFFFFFFFFFFFFFFFFFFFFFF
31  30    23  22                    0
+/-   exp.           Fraction
Basically, a floating-point number comprises a sign bit (+ or -), a normalized exponent, and a fraction. To convert this
number back into a real number, the following equation can be used:
S * 2 ** (exponent - exponent_base) * (1.0 + Fraction/Max fraction)
where the "exponent_base" is 2**(exponent_width - 1) - 1 (127 for an 8-bit exponent), "Max fraction" is 2**fraction_width,
and "Fraction/Max fraction" is always a number less than one. Thus for 32 bit floating point an example would be:
0 10000001 10100000000000000000000
= +1 * 2**(129 - 127) * (1.0 + 5242880/8388608) = +1 * 1.625 * 4.0 = 6.5
There are also "denormal numbers", which are numbers smaller than can normally be represented with this structure. The tag
for a denormal number is that the exponent is "0". This forces you to invoke another formula:
0 00000000 10000000000000000000000
= +1 * 2**-126 * (4194304/8388608) = +1 * 2**-1 * 2**-126 = 2**-127
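The two worked examples can be cross-checked in simulation. The sketch below assumes the "ieee.float_pkg" package as released with VHDL-2008 (type "float32", functions "to_float", "to_slv", "to_real" and "classfp"), not the "fphdl" draft packages described later in this paper. The entity name is hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.float_pkg.all;   -- assumption: the released VHDL-2008 package

entity float_decode_demo is   -- hypothetical name, illustration only
end entity float_decode_demo;

architecture sim of float_decode_demo is
begin
  check : process
    variable f : float32;     -- indexed (8 downto -23)
  begin
    -- 6.5 = +1 * 2**(129 - 127) * 1.625
    f := to_float (6.5, f);
    -- 0 10000001 10100000000000000000000 is 40D00000 in hex
    assert to_slv (f) = x"40D00000"
      report "unexpected encoding of 6.5" severity error;
    assert to_real (f) = 6.5
      report "unexpected decode of 6.5" severity error;

    -- An exponent field of 0 with a nonzero fraction marks a denormal:
    -- 0 00000000 10000000000000000000000 is 00400000 in hex
    f := to_float (std_ulogic_vector'(x"00400000"), 8, 23);
    assert classfp (f) = pos_denormal
      report "expected a positive denormal" severity error;
    wait;
  end process check;
end architecture sim;
```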
Next are the "constants" that exist in the floating-point context:
0 00000000 00000000000000000000000 = 0
1 00000000 00000000000000000000000 = -0 (which = 0)
0 11111111 00000000000000000000000 = positive infinity
1 11111111 00000000000000000000000 = negative infinity
If you get a number with an infinite (all "1"s) exponent and anything other than an all zero fraction, then it is said to
be a NAN, or "Not A Number". NANs come in two types, signaling and non-signaling. For the purposes of these packages I
chose a fraction with an MSB of "1" to be a signaling NAN and anything else to be a quiet NAN.
Thus you wind up with the following classes (or states) that each floating-point number can fall into:
nan            Signaling NaN
quiet_nan      Quiet NaN
neg_inf        Negative infinity
neg            Negative normalized nonzero
neg_denormal   Negative denormalized
neg_zero       -0
zero           +0
denormal       Positive denormalized
normal         Positive normalized nonzero
infinity       Positive infinity
In the packages I use these states to both examine and create numbers needed for floating point operations. This defines
the type "valid_fpstate". The constants zerofp, nanfp, qnanfp, pos_inffp, neginf_fp, neg_zerofp are also defined.
Rounding comes in 4 different flavors:
Round nearest
Round positive infinity
Round negative infinity
Round zero
"Round nearest" has the extra caveat that if the remainder is exactly 1/2 then you need to round so that the LSB of the number
you will get is a zero. The implementation of this feature requires two compare operations, but they can be consolidated.
Round negative infinity rounds down, and round positive infinity always rounds up. Round zero is a mix of the two, and has
the effect of doing a truncation (no rounding).
The floating-point packages:
The new floating-point packages take advantage of a new feature in VHDL-2005 called package generics. The 32 bit
floating point package looks like the following:
package fphdl32_pkg is new IEEE.fphdl_pkg
 generic map (
 fp_fraction_width => 23, -- 23 bits of fraction
 fp_exponent_width => 8, -- exponent 8 bits
 fp_round_style => round_nearest, -- round nearest algorithm
 fp_denormalize => true, -- Turn on Denormalized numbers
 fp_check_error => true, -- Turn on NAN and overflow processing
 fp_guard_bits => 3); -- number of guard bits
Package generics allow you to specify any data width or size of floating point number you like.
The resulting data type will be called "fp". Thus you have the following use model:
signal a, b, c : fp;
signal x : unsigned (5 downto 0);
constant PI : real := 3.14;
begin
b <= to_fp (x);
c <= a + PI;
The actual floating-point type is defined as follows:
type fp is array (fp_exponent_width downto -fp_fraction_width) of STD_LOGIC;
Once again we are using the negative index trick to separate the fraction part of the floating-point number from the exponent.
The top bit is the sign bit ('high), the next bits are the exponent ('high-1 downto 0) and the negative bits are the fraction (-1
downto 'low). For a 32-bit representation that specification makes the number look as follows:
0 00000000 00000000000000000000000
8 7      0 -1                  -23
+/-  exp.        fraction
where the sign is bit 8, the exponent is contained in bits 7 downto 0 (8 bits) with bit 7 being the MSB, and the mantissa is
contained in bits -1 downto -23 (32 - 8 - 1 = 23 bits) where bit -1 is the MSB.
The negative index format turns out to be a very natural format for the floating-point number, as the fraction is always
assumed to be a number between 1.0 and 2.0 (unless we are denormalized). Thus the implied "1.0" can be assumed on the
positive side of the index, and the negative side represents the fraction less than one.
Valid values for fp_exponent_width and fp_fraction_width are 3 and up. Thus the smallest (width wise) number that
can be made is fp (3 downto -3) or a 7-bit floating-point number.
A generic called "fp_denormalize" is also provided for all operations. This parameter allows you to disable the
creation of denormalized numbers. In normal (aka poor man's) floating point, the number closest to "0" consists of an
exponent of "1" and a mantissa of "0" (2**-126 in the 32 bit case). Denormal numbers allow for numbers smaller than this
by assuming that if the exponent is "0" then the mantissa represents a fraction less than 1. This adds a great deal of overhead
to the floating point operations, and was thus left as an option defaulted to "true" in the IEEE 32 and 64 bit implementations,
but can be shut off.
Setting "fp_check_error" to false turns off overflow and NAN processing. As every number must go through this check for
every operation according to IEEE-754, this represents a significant hardware savings.
"fp_guard_bits" are bits that are added to the end of every operation to maintain precision. Most implementations of
floating point use 3 bits. Any number of bits (including 0) is valid. Note that setting the number of guard bits to 0 is similar
to turning off rounding with the "round_zero" round_type.
Defined operations for floating point numbers are:
Unary -, abs, "+", "-", "*", "/", "rem", "mod", "=", "/=", "<", ">", "<=", ">="
All of these operations are overloaded for "integer" and "real" types. The non floating-point type is first converted into
floating point and the operation is performed. If the number is out of bounds for that number then the appropriate infinity or
zero is returned. Errors from these routines are treated as described in IEEE-754.
Defined functions for floating point numbers are dividbyp2 (divide by a power of 2), reciprocal (1/x), maximum,
minimum, to_unsigned, to_signed, to_ufixed, to_sfixed, to_real, to_integer, To_fp(SIGNED), To_fp(UNSIGNED),
To_fp(ufixed), To_fp(sfixed), To_fp(integer), To_fp(real), to_01. These functions operate silently, that is to say they give
no warnings for overflow or underflow. Outputting either infinity or NAN signals errors in the to_fp routines. Errors from
the routines that read FP numbers are returned the same way.
Functions recommended by IEEE-854:
Copysign (x, y) - Returns x with the sign of y.
Scalb (y, N) - Returns y*(2**n) (where N is an integer or SIGNED) without computing 2**n.
Logb (x) - Returns the unbiased exponent of x.
Nextafter (x, y) - Returns the next representable number after x in the direction of y.
Finite (x) - Boolean, true if X is not positive or negative infinity.
Isnan (x) - Boolean, true if X is a NAN or quiet NAN.
Unordered (x, y) - Boolean, returns true if either X or Y are some type of NAN.
Class (x) - valid_fpstate, returns the type of floating point number (see valid_fpstate definition above).
Two extra functions named break_number and normalize are also provided. "break_number" takes a floating-point
number and returns a "SIGNED" exponent (biased by -1) and a "ufixed" fixed point number. "normalize" takes a SIGNED
exponent and a fixed-point number and returns a floating-point number. These functions are useful for times when you want
to operate on the fraction of a floating-point number without having to do the shifts on every operation.
To_slv (aliased to to_std_logic_vector and to_StdLogicVector) as well as to_fp(std_logic_vector) are used to convert
between std_logic_vector and fp types. These should be used on the interface of your designs. The result of "to_slv" is a
std_logic_vector with the length of the input fp type.
Procedures for reading and writing floating point numbers are also included in this package. Procedures read,
write, oread, owrite (octal), bread, bwrite (binary), hread and hwrite (hex) are defined. To_string, to_ostring, and to_hstring
are also provided for string results. Floating point numbers are written in the format "0:000:000" (for a 7 bit FP). They can
be read as a simple string of bits, or with a "." or ":" separator.
Changing from one floating point format to another can be done through the "resize" function provided. Example:
use ieee.fphdl32_pkg.all;
architecture RTL of XXX is
alias fp32 is ieee.fphdl32_pkg.fp; -- or just "fp"
alias fp64 is ieee.fphdl64_pkg.fp;
signal x : fp32;
signal y : fp64;
begin
-- the width generics passed here must be those of the 64-bit target format
y <= ieee.fphdl64_pkg.resize (arg => x, exponent_width => fp_exponent_width,
fraction_width => fp_fraction_width, denormalize => fp_denormalize, round_style
=> fp_round_style);
Challenges for Synthesis Vendors:
Now that we are bringing numbers that are less than 1.0 into the realm of synthesis, the type "REAL" becomes
meaningful. To_fp (MATH_PI) will now evaluate to a string of bits. This means that synthesis vendors will now have to not
only understand the "real" type, but the functions in the "math_real" IEEE package as well.
Both of these packages depend on a negative index. Basically, everything that is at an index that is less than zero is
assumed to be to the right of the decimal point. By doing this we were able to avoid using record types. This also represents
a challenge for some synthesis vendors, but it makes these functions portable to Verilog.
References
1. Floating point for VHDL and Verilog - David Bishop, Eastman Kodak - DVCon 2003.
2. IEEE Std 754-1985 - IEEE Standard for Binary Floating-Point Arithmetic.
3. IEEE Std 854-1987 - IEEE Standard for Radix-Independent Floating-Point Arithmetic.
4. Lecture Notes on the Status of IEEE Standard 754 for Binary Floating-Point Arithmetic - Prof.
W. Kahan, University of California.
5. "What Every Computer Scientist Should Know About Floating-Point Arithmetic," by David
Goldberg.
6. Floating point types for Synthesis - Dr. Alex Zamfirescu.
7. RSVP based bandwidth allocation - Ananda Rangan and Vignesh Nandakumar, Washington
University in St. Louis.
8. http://babbage.cs.qc.edu/courses/cs341/IEEE-754.html - IEEE-754
Floating-Point Conversion.
9. http://www.markworld.com/showfloat.html - Decompose IEEE Floating Point
Number.
10. http://www.ecs.umass.edu/ece/koren/arith/simulator/FPAdd/ -
Floating-point addition and subtraction.
11. IEEE 1076.3 - VHDL Standard Synthesis packages.
12. Cadence's Verilog-XL Reference Manual.
