This document summarizes new fixed-point and floating-point packages being added to VHDL in the 2005 update. The packages allow fixed-point and floating-point math to be performed directly in VHDL code without conversions. They define new fixed-point and floating-point data types and overload operators to operate on these types, raising the level of abstraction. All common arithmetic operations, functions, and conversions are supported for the new data types.
David Bishop, Eastman Kodak Company, Rochester, NY
Abstract

The pending update to VHDL contains several new packages and functions. The new packages include support for both fixed-point and floating-point binary math. These fully synthesizable packages will raise the level of abstraction in VHDL. DSP applications, which previously needed an independent processor core, or required very difficult manual translation, can now be performed within your VHDL source code. In addition, schematic-based DSP algorithms can now be translated directly to VHDL. This paper will describe these packages and give examples of their use.

Introduction:

For the past 15 years we have been using HDL to increase the level of abstraction in our ASIC and FPGA designs. HDL was a major leap from schematics. What have we done since? Little. Attempts have been made. E, System-C and System-Verilog are good examples. These are great ideas, but they do not give the designer the control and tool maturity that VHDL and Verilog provide. Why not simply increase the level of abstraction in a language that is already well known?

The potential of VHDL has not yet been fully tapped. Designed from the ground up as a software language, it is easily extendable and flexible. Constructed at a higher level than Verilog, it has the ability to provide higher levels of abstraction directly, with already mature tools.

Typically designers use integer math in their RTL code. For fixed point they tend to just "remember" where the decimal point is. For floating point you use a DSP, which may even be off chip. Designers tend to use math solutions in order of "integer math", "fixed point math" and "floating point math", where 80% of designs are done in integer, and of the next 20%, 80% of those are done in fixed point. Note that the complexity of fixed point math is not that much higher than integer math, but that floating point is about 3x as complex as integer math.

The integer math problem has been effectively solved with the NUMERIC_STD packages (1076.3, now part of VHDL-200X-FT). This package has been well adopted and been in use for many years.

In this paper, I intend to describe a new set of packages, which are being added to the VHDL language in the VHDL-2005 update. These packages include VHDL overloads that allow you to do fixed and floating point math directly, without the user having to perform any conversions. These packages raise the level of abstraction in VHDL AND give the user the flexibility and power of an HDL.

Fixed-point package:

Fixed-point math is basically integer math with numbers that can be less than 1.0. A fixed-point number has an assigned width and an assigned location for the decimal point. As long as the number is big enough to provide enough precision, fixed point is fine for most DSP applications. Since it is based on integer math it is extremely efficient, as long as the data does not vary too much in magnitude.

The fixed-point math packages are based on the VHDL 1076.3 numeric_std package and use the signed and unsigned arithmetic from within that package. This makes them highly efficient, as the numeric_std package is well supported by simulation and synthesis tools.

This package defines two new types: "ufixed", which is unsigned fixed point, and "sfixed", which is signed fixed point.

Usage model:

    use ieee.fixed_pkg.all;
    ....
    signal a, b : sfixed (7 downto -6);
    signal c : sfixed (8 downto -6);
    begin
    ....
    a <= to_sfixed (-3.125, 7, -6);
    b <= to_sfixed (inp1, b'high, b'low);
    c <= a + b;

The two data types are defined as follows:

    type ufixed is array (integer range <>) of std_logic; -- base Unsigned fixed point type, downto direction assumed
    type sfixed is array (integer range <>) of std_logic; -- base Signed fixed point type, downto direction assumed

This data type uses a negative index to show you where the decimal point is. The decimal point is assumed to be between the "0" and "-1" index. Thus if we can assume "signal y :
ufixed (4 downto -5)" as the data type (unsigned fixed point, 10 bits wide, 5 bits of decimal), then y = 6.5 = "00110.10000", or simply:

    y <= "0011010000";

You can also say:

    y <= to_ufixed (6.5, 4, -5);

where "4" is the upper index, and "-5" is the lower index, so you could also say:

    y <= to_ufixed (6.5, y'high, y'low);

The signed version uses two's complement to represent a negative number, just like the "numeric_std" package. Any non-zero index range is valid. Thus:

    signal z : ufixed (-2 downto -3);
    z <= "11";   -- 0.375 = 0.011
    signal x : sfixed (4 downto 1);
    x <= "1111"; -- -2 = 11110.0

The data widths in the fixed-point package were designed (by Ryan Hilton) so that there is no possibility of an overflow. This is a departure from the "numeric_std" model, which simply throws away underflow and overflow bits.

For unsigned fixed point:

    ufixed(a downto b) + ufixed(c downto d)   = ufixed(max(a,c)+1 downto min(b,d))
    ufixed(a downto b) - ufixed(c downto d)   = ufixed(max(a,c)+1 downto min(b,d))
    ufixed(a downto b) * ufixed(c downto d)   = ufixed(a+c+1 downto b+d)
    ufixed(a downto b) / ufixed(c downto d)   = ufixed(a-d+1 downto b-c-1)
    reciprocal (ufixed(a downto b))           = ufixed(a-b+1 downto b-a-1)
    ufixed(a downto b) rem ufixed(c downto d) = ufixed(c downto d)
    ufixed(a downto b) mod ufixed(c downto d) = ufixed(a downto b)

For signed fixed point:

    sfixed(a downto b) + sfixed(c downto d)   = sfixed(max(a,c)+1 downto min(b,d))
    sfixed(a downto b) - sfixed(c downto d)   = sfixed(max(a,c)+1 downto min(b,d))
    sfixed(a downto b) * sfixed(c downto d)   = sfixed(a+c downto b+d)
    sfixed(a downto b) / sfixed(c downto d)   = sfixed(a-d downto b-c)
    reciprocal (sfixed(a downto b))           = sfixed(a-b downto b-a)
    sfixed(a downto b) rem sfixed(c downto d) = sfixed(c downto d)
    sfixed(a downto b) mod sfixed(c downto d) = sfixed(a downto b)

Unsigned Example:

    signal x : ufixed (7 downto -3);
    signal y : ufixed (2 downto -9);

If we multiply x by y we would get a signal which would be:
    x * y = ufixed (7+2+1 downto -3+(-9)) or ufixed (10 downto -12);

Signed Example:

    signal x : sfixed (-1 downto -3);
    signal y : sfixed (3 downto 1);

If we divide x by y we would get a signal which would be:

    x/y = sfixed (-1-1 downto -3-3) or sfixed (-2 downto -6);

The "resize" function can be used to fix the size of the output. However, rounding and saturate rules are applied:

    X <= resize (x * y, x'high, x'low);

What about an accumulator? An accumulator is a fixed width number that you continually add to. To implement an accumulator in the fixed-point packages, you can use the "resize" function as follows:

    signal X : ufixed (7 downto 0);
    X <= resize (X + 1, X'high, X'low, false, false);

Where the first "false" is the round_style. Since we do not need to do any rounding, we set this to false. The second "false" is the overflow_style. If this is set to true, we saturate, or go to the maximum possible number. When set to "false" we wrap, meaning that the uppermost bit is dropped and the number simply recycles. Note that the default for both overflow_style and round_style is "true".

Integer and real are overloaded for all operators, thus you can say:

    signal x : sfixed (4 downto -5);
    signal y : real;
    ...
    Z := x + y;

In the case where an operation is performed which includes both a fixed-point number and an integer or real, the sizing rules are modified. A real number is converted to a fixed-point number that is the same size as the fixed-point number that has been passed as the other argument. Thus in the above example:

    Z := x + to_sfixed (y, 4, -5);

would be called, which would result in Z being an "sfixed (5 downto -5)" type. For an integer, the number is also converted to a fixed-point number, but the size is only "downto 0", as an integer can never have a fraction. Thus, if "y" were an integer the above example would look like:
    Z := x + to_sfixed (y, 4, 0);

which in this case would not affect the resultant number's size. However, this has a fairly large effect on the size of the output numbers in the multiply and divide routines.

The following operations are defined for ufixed:
+, -, *, /, rem, mod, =, /=, <, >, >=, <=, sll, srl, rol, ror, sla, sra

The following functions are defined for ufixed:
divide, reciprocal, scalb, maximum, minimum, find_lsb, find_msb, resize, To_01, Is_X

Conversion functions are defined for ufixed:
to_ufixed (natural), to_ufixed (real), to_ufixed (unsigned), to_ufixed (signed), remove_sign (sfixed), to_unsigned, to_real, to_integer, to_UFix

The following operations are defined for sfixed:
+, -, *, /, rem, mod, =, /=, <, >, >=, <=, sll, srl, rol, ror, sla, sra, abs, - (unary)

The following functions are defined for sfixed:
divide, reciprocal, scalb, maximum, minimum, find_lsb, find_msb, resize, to_01, Is_X

Conversion functions are defined for sfixed:
to_sfixed (natural), to_sfixed (real), to_sfixed (unsigned), to_sfixed (signed), add_sign (ufixed), to_signed, to_real, to_integer, to_Fix

All of the operators are overloaded for "real" and "integer" data types. In each case the number is converted into fixed point before the operation is done. Thus the fixed-point operand must be of a format large enough to accommodate the converted input, or a "vector truncated" warning is produced. In the case of an integer, the number is converted in the form "integer_width downto 0", which causes the size of the output vector to change accordingly. In these functions "fixed_saturate" is set to true regardless of what the "overflow_style" constant is set to.

This package defines 3 constants that are used to manipulate fixed-point numbers:
    constant fixed_round      : boolean := true; -- Round or truncate
    constant fixed_saturate   : boolean := true; -- saturate or wrap
    constant fixed_guard_bits : natural := 3;    -- guard bits for rounding

These constants are defaults, and can be overridden everywhere they are used.

"round_style" defaults to fixed_round (true), which turns on the rounding routines. If false, the number is truncated. If the MSB of the remainder is a "1" AND (the LSB of the unrounded result is a '1' or the lower bits of the remainder include a '1') then the result will be rounded. This is similar to the floating-point "round_nearest" style.

"overflow_style" defaults to fixed_saturate (true), which returns the maximum possible number if the number is too large to represent; otherwise a "wrap" routine is used which simply truncates the top bits. Unlike the way it is done in "numeric_std", the sign bit is not preserved when wrapping. Thus it is possible to get a positive result when resizing a negative number in this mode.

Finally, "guard_bits" defaults to "fixed_guard_bits", which defaults to 3. Guard bits are used in the rounding routines. If guard is set to 0, then the rounding is automatically turned off. These extra bits are added to the end of numbers in the division and "to_real" functions to make the numbers more accurate.

The "resize" function is defined as follows:
    function resize (arg : sfixed;
                     constant integer_width  : INTEGER;
                     constant fraction_width : INTEGER;
                     constant round_style    : BOOLEAN := fixed_round;
                     constant overflow_style : BOOLEAN := fixed_saturate)
      return sfixed;

In "saturate" mode (where overflow_style is true), if the output size is smaller than the input number then the number will "saturate". An unsigned fixed point will saturate to all "1", a signed positive number will be all "1" with the first bit a "0", and a signed negative number will saturate to be all "0" with the first bit a "1".

In "wrap" mode (where overflow_style is false) the number will be truncated. In this case the top of the number is simply truncated without regard to the sign bits, so you can truncate a negative number to be a positive one. The rounding routines are left intact in "wrap" mode.

If "round_style" is true, then the rounding routines are turned on. Otherwise the number is simply truncated.

Shift operators are functionally the same as the 1076-1993 shift operators with the exception of the arithmetic shift operations. An arithmetic shift ("sra" or "sla") on an unsigned number is the same as a logical shift. An arithmetic shift on a signed number is a logical shift if you are shifting left, and an arithmetic shift (sign bit replicated) if you are shifting right.

The divide function is defined as follows:

    function divide (l, r : sfixed;
                     guard_bits  : NATURAL := fixed_guard_bits;
                     round_style : BOOLEAN := fixed_round)
      return sfixed;

The output is sized with the same rules as the "/" operator. The function allows you to override the number of guard bits and the rounding operation. Note that the output size is calculated so that overflow is not possible.

The reciprocal function is defined in a very similar manner to the divide function:
    function reciprocal (arg : ufixed;
                         guard_bits  : NATURAL := fixed_guard_bits;
                         round_style : BOOLEAN := fixed_round)
      return ufixed;

This function performs a "1/X" function, with the output vector following the sizing rules as noted above. This function is very useful for dividing by a constant. For example:

    A := B/Cons;

can be rewritten as:

    A := B*reciprocal(Cons);

Since a multiply uses less logic than a divide, this can save you significant hardware resources.

The "scalb" function is a fixed-point version of a very common floating-point function. The function looks like this:

    function scalb (y : ufixed; N : SIGNED) return ufixed;

This function computes y * 2**N without computing 2**N, by using a shift operator. The size of the output number is the same as the input. For this function overflow and rounding functions are ignored, as this is treated like a shift operator. The "N" input is also overloaded for the type "INTEGER".

The "maximum" and "minimum" functions do a compare operation and return the appropriate value. These functions are not overloaded for integer and real inputs. The size of the inputs does not need to match.

The "find_lsb" and "find_msb" functions are used to find the least significant bit or most significant bit of a fixed-point number. The function looks like the following:
    function find_msb (arg : ufixed; y : STD_ULOGIC) return INTEGER;

In this case, "y" can be any "std_ulogic" value. These functions search for the first occurrence of "y" in the fixed-point number. "find_msb" starts at the MSB (arg'high) and goes down. "find_lsb" starts at the LSB (arg'low) and goes up. If that value is not found in the "find_msb" function, then "arg'low-1" is returned. If the value is not found in the "find_lsb" function then "arg'high+1" is returned.

"to_01" and "Is_X" are similar in function to the numeric_std functions with the same name.

Most synthesis tools do not support any I/O format other than "std_logic_vector" and "std_logic". Thus functions have been created to convert between std_logic_vector and ufixed or sfixed and vice versa:

    Uf7_3 <= to_ufixed (slv7, uf7_3'high, uf7_3'low);

and

    Slv7 <= to_slv (sf7_3);

One of the changes made to all packages in vhdl-2005 is that the read and write routines for all data types are now defined in the same package that defines that type. Thus the READ, WRITE, HREAD, HWRITE, OREAD, and OWRITE routines are defined for fixed-point data types. A "." separator is added between the integer part and the fractional part of the fixed-point number. Thus if you write out our "6.5" example from above you will get the string "00110.10000", which you can also read into that data type.

New to vhdl-2005 are the functions "to_string", "to_ostring" and "to_hstring". These are very useful in "assert" statements. Example:

    assert x = y report to_string(x) & " /= " & to_string(y) severity error;

Or, if you prefer to see the numbers as "real" numbers, you can use:
    assert x = y report to_string(to_real(x)) & " /= " & to_string(to_real(y)) severity error;

MathWorks Simulink is these days the most common way to define a fixed point DSP algorithm. It would seem to be a major step into the past, as it is schematic based. In Simulink an unsigned fixed point number is described as ufix[14,10], which specifies a 14 bit long word with 10 bits after the decimal point. This translates into "ufixed (3 downto -10)" in the unsigned fixed-point type. The Simulink "sfix" notation translates much better because of the extra sign bit that must be generated. Sfix(14, 10) will translate into "sfixed (4 downto -10)" in the notation of the "fixed_pkg".

Issues:

A negative or "to" index is flagged as an error by the fixed point routines. Thus if you define a number as "ufixed (-1 to 5)" the routines will automatically error out.

String literals are also a problem. By default, if you do the following:

    Z <= a + "011011";

the index of the fixed-point number is undefined. The VHDL compiler will assume that this number has the range "integer'low to integer'low+5", making it very small. To avoid crashing the simulator with a 32,000 bit wide number, this also will automatically error out.

Floating-point numbers:

After fixed point the next step is floating point. Floating-point numbers are well defined by the IEEE-754 (32 and 64 bit) and IEEE-854 (variable width) specifications. Floating point has been used in processors and IP for years and is a well-understood format.

There are many concepts in floating point that make it different from our well understood signed and unsigned number notations. These come from how a floating-point number is defined. Let's first take a look at a 32-bit floating-point number:

    S  EEEEEEEE FFFFFFFFFFFFFFFFFFFFFFF
    31 30    23 22                    0
    +/-  exp.          Fraction

Basically, a floating-point number comprises a sign bit (+ or -), a normalized exponent, and a fraction. To convert this number back into an integer, the following equation can be used:
    S * 2 ** (exponent - exponent_base) * (1.0 + Fraction/Max fraction)

where the "exponent_base" is 2**(exponent width - 1) - 1, which is 127 for the 8-bit exponent of the 32-bit format, and "Fraction/Max fraction" is always a number less than one. Thus for 32 bit floating point an example would be:

    0 10000001 10100000000000000000000
    = +1 * 2** (129 - 127) * (1.0 + 10485760/16777216)
    = +1 * 1.625 * 4.0
    = 6.5

There are also "denormal numbers", which are numbers smaller than can normally be represented with this structure. The tag for a denormal number is that the exponent is "0". This forces you to invoke another formula:

    0 00000000 10000000000000000000000
    = +1 * 2** -126 * (8388608/16777216)
    = +1 * 2**-1 * 2**-126
    = 2**-127

Next are the "constants" that exist in the floating-point context:

    0 00000000 00000000000000000000000 = 0
    1 00000000 00000000000000000000000 = -0 (which = 0)
    0 11111111 00000000000000000000000 = positive infinity
    1 11111111 00000000000000000000000 = negative infinity

If you get a number with an infinite (all "1"s) exponent and anything other than an all zero fraction, then it is said to be a NAN, or "Not A Number". NANs come in two types, signaling and non-signaling. For the purposes of these packages I chose a fraction with an MSB of "1" to be a signaling NAN and anything else to be a quiet NAN.

Thus you wind up with the following classes (or states) that each floating-point number can fall into:
    nan            Signaling NaN
    quiet_nan      Quiet NaN
    neg_inf        Negative infinity
    neg            Negative normalized nonzero
    neg_denormal   Negative denormalized
    neg_zero       -0
    zero           +0
    denormal       Positive denormalized
    normal         Positive normalized nonzero
    infinity       Positive infinity

In the packages I use these states to both examine and create numbers needed for floating point operations. This defines the type "valid_fpstate". The constants zerofp, nanfp, qnanfp, pos_inffp, neginf_fp, neg_zerofp are also defined.

Rounding comes in 4 different flavors:

    Round nearest
    Round positive infinity
    Round negative infinity
    Round zero

"Round nearest" has the extra caveat that if the remainder is exactly 1/2 then you need to round so that the LSB of the number you will get is a zero. The implementation of this feature requires two compare operations, but they can be consolidated. Round negative infinity rounds down, and round positive infinity always rounds up. Round zero is a mix of the two, and has the effect of doing a truncation (no rounding).

The floating point packages:

The new floating-point packages take advantage of a new feature in VHDL-2005 called package generics. The 32 bit floating point package looks like the following:

    package fphdl32_pkg is new IEEE.fphdl_pkg
      generic map (
        fp_fraction_width => 23,            -- 23 bits of fraction
        fp_exponent_width => 8,             -- exponent 8 bits
        fp_round_style    => round_nearest, -- round nearest algorithm
        fp_denormalize    => true,          -- Turn on Denormalized numbers
        fp_check_error    => true,          -- Turn on NAN and overflow processing
        fp_guard_bits     => 3);            -- number of guard bits

Package generics allow you to specify any data width or size of floating point number you like. The resulting data type will be called "fp". Thus you have the following use model:

    signal a, b, c : fp;
    signal x : unsigned (5 downto 0);
    constant PI : real := 3.14;
    begin
    b <= to_fp (x);
    c <= a + PI;

The actual floating-point type is defined as follows:
    type fp is array (fp_exponent_width downto -fp_fraction_width) of STD_LOGIC;

Once again we are using the negative index trick to separate the fraction part of the floating-point number from the exponent. The top bit is the sign bit ('high), the next bits are the exponent ('high-1 downto 0) and the negative bits are the fraction (-1 downto 'low). For a 32-bit representation that specification makes the number look as follows:

    0   00000000 00000000000000000000000
    8   7      0 -1                  -23
    +/-   exp.          fraction

where the sign is bit 8, the exponent is contained in bits 7 to 0 (8 bits) with bit 7 being the MSB, and the mantissa is contained in bits -1 to -23 (32 - 8 - 1 = 23 bits) where bit -1 is the MSB. The negative index format turns out to be a very natural format for the floating-point number, as the fraction is always assumed to be a number between 1.0 and 2.0 (unless we are denormalized). Thus the implied "1.0" can be assumed on the positive side of the index, and the negative side represents the fraction less than one.

Valid values for fp_exponent_width and fp_fraction_width are 3 and up. Thus the smallest (width wise) number that can be made is fp (3 downto -3), or a 7-bit floating-point number.

A generic called "fp_denormalize" is also provided for all operations. This parameter allows you to disable the creation of denormalized numbers. In normal (aka poor man's) floating point, the number closest to "0" consists of an exponent of "1" and a mantissa of "0" (2**-126 in the 32 bit case). Denormal numbers allow for numbers smaller than this by assuming that if the exponent is "0" then the mantissa represents a fraction less than 1. This adds a great deal of overhead to the floating point operations, and was thus left as an option defaulted to "true" in the IEEE 32 and 64 bit implementations, but it can be shut off.

"fp_check_error" allows you to turn off overflow and NAN processing. As every number must go through this check for every operation according to IEEE-754, this represents a significant
hardware savings.

"fp_guard_bits" are bits that are added to the end of every operation to maintain precision. Most implementations of floating point use 3 bits. Any number of bits (including 0) is valid. Note that setting the number of guard bits to 0 is similar to turning off rounding with the "round_zero" round_type.

Defined operations for floating point numbers are:
Unary -, abs, "+", "-", "*", "/", "rem", "mod", "=", "/=", "<", ">", "<=", ">="

All of these operations are overloaded for "integer" and "real" types. The non floating-point type is first converted into floating point and the operation is performed. If the number is out of bounds for that number then the appropriate infinity or zero is returned. Errors from these routines are treated as described in IEEE-754.

Defined functions for floating point numbers are:
dividbyp2 (divide by a power of 2), reciprocal (1/x), maximum, minimum, to_unsigned, to_signed, to_ufixed, to_sfixed, to_real, to_integer, To_fp(SIGNED), To_fp(UNSIGNED), To_fp(ufixed), To_fp(sfixed), To_fp(integer), To_fp(real), to_01.

These functions operate silently, which is to say that they give no warnings for overflow or underflow. Errors in the to_fp routines are signaled by outputting either infinity or NAN. Errors from the routines that read FP numbers are returned the same way.

Functions recommended by IEEE-854:
    Copysign (x, y)  - Returns x with the sign of y.
    Scalb (y, N)     - Returns y*(2**n) (where N is an integer or SIGNED) without computing 2**n.
    Logb (x)         - Returns the unbiased exponent of x.
    Nextafter (x, y) - Returns the next representable number after x in the direction of y.
    Finite (x)       - Boolean, true if X is not positive or negative infinity.
    Isnan (x)        - Boolean, true if X is a NAN or quiet NAN.
    Unordered (x, y) - Boolean, returns true if either X or Y is some type of NAN.
    Class (x)        - valid_fpstate, returns the type of floating point number (see valid_fpstate definition above).

Two extra functions named break_number and normalize are also provided. "break_number" takes a floating-point number and returns a "SIGNED" exponent (biased by -1) and a "ufixed" fixed point number. "normalize" takes a SIGNED exponent and a fixed-point number and returns a floating-point number. These functions are useful for times when you want to operate on the fraction of a floating-point number without having to do the shifts on every operation.

To_slv (aliased to to_std_logic_vector and to_StdLogicVector) as well as to_fp(std_logic_vector) are used to convert between std_logic_vector and fp types. These should be used on the interface of your designs. The result of "to_slv" is a std_logic_vector with the length of the input fp type.

Procedures for reading and writing floating point numbers are also included in this package. Procedures read, write, oread, owrite (octal), bread, bwrite (binary), hread and hwrite (hex) are defined. To_string, to_ostring, and to_hstring are also provided for string results. Floating point numbers are written in the format "0:000:000" (for a 7 bit FP). They can be read as a simple string of bits, or with a "." or ":" separator.

Changing from one floating point format to another can be done through the "resize" function provided. Example:
    use ieee.fphdl32_pkg.all;
    architecture RTL of XXX is
      alias fp32 is ieee.fphdl32_pkg.fp; -- or just "fp"
      alias fp64 is ieee.fphdl64_pkg.fp;
      signal x : fp32;
      signal y : fp64;
    begin
      x <= ieee.fphdl64_pkg.resize (arg            => y,
                                    exponent_width => fp_exponent_width,
                                    fraction_width => fp_fraction_width,
                                    denormalize    => fp_denormalize,
                                    round_style    => fp_round_style);

Challenges for Synthesis vendors:

Now that we are bringing numbers that are less than 1.0 into the realm of synthesis, the type "REAL" becomes meaningful. To_fp (MATH_PI) will now evaluate to a string of bits. This means that synthesis vendors will now have to not only understand the "real" type, but the functions in the "math_real" IEEE package as well.

Both of these packages depend on a negative index. Basically, everything that is at an index that is less than zero is assumed to be to the right of the decimal point. By doing this we were able to avoid using record types. This also represents a challenge for some synthesis vendors, but it makes these functions portable to Verilog.

References

1. Floating point for VHDL and Verilog - David Bishop, Eastman Kodak - DVCon 2003.
2. IEEE Std 754-1985 - IEEE Standard for Binary Floating-Point Arithmetic.
3. IEEE Std 854-1987 - IEEE Standard for Radix-Independent Floating-Point Arithmetic.
4. Lecture Notes on the Status of IEEE Standard 754 for Binary Floating-Point Arithmetic - Prof W. Kahan, University of California.
5. "What Every Computer Scientist Should Know About Floating-Point Arithmetic," by David Goldberg.
6. Floating point types for Synthesis - Dr. Alex Zamfirescu.
7. RSVP based bandwidth allocation - Ananda Rangan and Vignesh Nandakumar, Washington University in St. Louis.
8. http://babbage.cs.qc.edu/courses/cs341/IEEE-754.html - IEEE-754 Floating-Point Conversion.
9. http://www.markworld.com/showfloat.html - Decompose IEEE Floating Point Number.
10. http://www.ecs.umass.edu/ece/koren/arith/simulator/FPAdd/ - Floating-point addition and subtraction.
11. IEEE 1076.3 - VHDL Standard Synthesis packages.
12. Cadence's Verilog-XL Reference Manual.
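
As a closing illustration of the unsigned sizing rules given in the fixed-point package section, the following self-checking testbench is a minimal sketch, not part of the original paper. It assumes the package is available as ieee.fixed_pkg in the form later standardized in VHDL-2008 (the draft described here may differ in details), and the entity name ufixed_sizing_demo and the signal names are hypothetical. Only the unsigned add and multiply rules are exercised, since the signed and divide rules quoted in this paper come from the draft.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.fixed_pkg.all;  -- assumption: VHDL-2008 name of the fixed-point package

entity ufixed_sizing_demo is
end entity ufixed_sizing_demo;

architecture sim of ufixed_sizing_demo is
  -- Operands with the same index ranges as the unsigned example in the text
  signal x : ufixed (7 downto -3) := to_ufixed (6.5, 7, -3);
  signal y : ufixed (2 downto -9) := to_ufixed (0.5, 2, -9);
  -- Product: ufixed (a+c+1 downto b+d) = ufixed (7+2+1 downto -3+(-9))
  signal p : ufixed (10 downto -12);
  -- Sum: ufixed (max(a,c)+1 downto min(b,d)) = ufixed (8 downto -9)
  signal s : ufixed (8 downto -9);
begin
  -- Full-precision results; the target ranges must match the sizing rules
  p <= x * y;
  s <= x + y;

  check : process
  begin
    wait for 1 ns;
    -- 6.5 * 0.5 and 6.5 + 0.5 are exactly representable, so "=" is safe here
    assert to_real (p) = 3.25 report "unexpected product" severity error;
    assert to_real (s) = 7.0 report "unexpected sum" severity error;
    wait;
  end process check;
end architecture sim;
```

If a narrower result is wanted, the text's approach is to wrap the expression in "resize" with the target's 'high and 'low, accepting the rounding and saturation rules that come with it.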