Thermal-Aware Testing of Network-on-Chip Using Multiple-Frequency Clocking


Chunsheng Liu
Computer and Electronic Engineering, University of Nebraska-Lincoln, Omaha, NE 68182, USA
chunshengliu@unlnotes.unl.edu

Vikram Iyengar
IBM Microelectronics, 1000 River Road, Building 863B, Essex Jct, VT 05452, USA
vikrami@us.ibm.com

D.K. Pradhan
Computer Science, University of Bristol, Bristol BS8 1UB, UK
pradhan@cs.bris.ac.uk

Abstract
Chip overheating due to excessive and unbalanced power dissipation has become a critical problem during test of complex core-based systems. In this paper, we address the overheating problem in network-on-chip systems by using on-chip multiple-frequency clocking. We control the core temperatures during test scheduling by varying the test clock frequency assigned to each core, so that the power dissipation of each core during test can be adjusted individually and thermal balance is achieved. We present a heuristic where the optimization process can be integrated with test scheduling. Experimental results for NoC benchmarks show that the proposed method can guarantee thermal safety and yield better thermal balance.

1 Introduction
Network-on-chip (NoC) has been proposed as the preferred interconnection scheme for next-generation complex VLSI systems [1, 17], replacing the traditional system-on-chip (SoC) methodology. This new paradigm relies on a packet-switching network implemented on the chip to provide high-performance interconnection to embedded cores. Compared to traditional SoCs, testing NoC-based systems poses considerable challenges due to the existence of highly complex network components [17]. Figure 1 shows the implementation of the system d695 [12] in an NoC architecture.

Figure 1. System d695 implemented in NoC architecture.

Thermal management during the design process has been studied using layout redesign and thermal placement [2, 4, 6]. However, efficient thermal management during test remains a challenge. High power dissipation during test can cause high power density, which forms hot spots. The problem becomes even more acute for core-based systems, since embedded cores can have a large variation in die size and power dissipation. In an ad hoc test schedule, cores having lower power dissipation or larger die size may remain cool, while those having higher power dissipation or smaller die size can overheat and cause damage. In addition, thermal re-placement of cores is impossible because the layout is optimized for functional operation and is already fixed at the time of test. Finally, switching activity across the chip differs considerably between functional operation and test. For example, cores that operate concurrently in functional mode may not be tested in parallel, which further aggravates thermal imbalance on the chip.

A simple strategy is to use a slower test clock to reduce power dissipation and to guarantee thermal safety. However, it is inefficient because thermal balance among cores cannot be achieved. As a result, the cores generating less heat are unnecessarily cooled down, which adversely affects test time and increases test cost.

Prior work attempts to reduce excessive test power by using power-constrained scheduling. However, it has been shown that power constraints cannot guarantee thermal safety [11, 14]. Recent work has attempted to achieve thermal safety for traditional SoC testing through test scheduling [11, 14, 15]. These methods are based on the use of dedicated test access mechanisms (TAMs) and the results are tightly related to the TAM design. In an NoC-based system, however, the implementation of network components (routers, channels, etc.) has already imposed a considerable amount of area overhead. Therefore, recent advances tend to reuse the existing on-chip network for test data transportation without introducing new overhead [5, 10]. Consequently, existing thermal-aware testing methods for SoCs are not directly applicable to reuse-based NoC testing.

In this paper, we propose a new method for thermal-aware test scheduling in NoC-based systems. It is based on the use of multiple-frequency on-chip clocking [10]. This is specifically designed for NoC-based systems because here cores are globally asynchronous, and they communicate by sending and receiving messages in the form of packets via the network. In the proposed method, each core can receive one of several test clock frequencies generated by on-chip logic. During test application, a core can vary its power dissipation, and hence temperature, by choosing a different clock frequency based on the test control information carried in test packets. Slower clocks are used to reduce temperature while faster clocks are used to reduce test time. This dynamic clock frequency scaling scheme can not only guarantee thermal safety but also achieve thermal balance and optimized test time. We present a heuristic algorithm by which the assignment of variable clock frequencies to each core can be optimized to achieve thermal balance. This can eventually reduce hot spot temperature and also reduce test time. The effectiveness of the method is corroborated by experimental results on several NoC benchmarks.

Note that this scheme is different from the one proposed in [15], because the proposed method does not require variable tester clock frequencies during test application and cores scheduled simultaneously can use different on-chip clock frequencies. Moreover, the test clock frequency of a core at a specific time is selected from several available on-chip clock frequencies but cannot be scaled arbitrarily as in [15].

In this paper we use the term NoC to denote an on-chip interconnection network of routers and channels that may be implemented on an SoC, and the term NoC-based system to denote the system including the NoC and embedded cores. The rest of the paper is organized as follows. In Section 2, we review some related prior work. In Section 3, we introduce the use of multiple-frequency on-chip clocking in testing NoC-based systems and the use of thermal constraints in this work. In Section 4, we present a heuristic that integrates the assignment of test clocks to cores in test scheduling. Finally, in Section 5, we present experimental results on several NoC benchmarks.

Proceedings of the 24th IEEE VLSI Test Symposium (VTS'06), 0-7695-2514-8/06 $20.00 © 2006 IEEE

2 Prior work
Thermal management and hot spot removal during the design process for functional operation have been proposed [2, 4, 6]. These methods rely on either layout replacement or task management to achieve an even thermal distribution on the chip. However, they are optimized for functional operation but not for testing. Therefore, these approaches are not suitable for thermal management in testing.

Most prior work deals with the overheating problem by using power constraints. Test scheduling algorithms for SoCs with power constraints have been extensively investigated [8, 9, 13]. More recently, power-aware test scheduling algorithms have been presented for NoC-based systems [5, 10]. However, it has been shown that using power constraints cannot guarantee thermal safety [11, 14]. This will be further corroborated in our experimental results.

Recently, some thermal-aware test scheduling methods have been proposed [11, 14, 15, 18]. In [14], a test session thermal model is used to determine heat transfer characteristics during test, and a heuristic is used to obtain a schedule without violating a temperature constraint. In [15], variable tester clock frequencies are used to control the power dissipation in different test sessions to guarantee thermal safety. A disadvantage of this approach is the requirement of variable-frequency clocks on the tester. In [11], heuristics using layout information and progressive weighting are proposed to achieve thermal balance on the chip and reduced hot spot temperatures. All these approaches are designed for SoC systems based on dedicated TAMs, and cannot be directly applied to NoC-based systems. The most recent work [18] attempts to obtain thermal safety and reduced test time simultaneously by using both faster and slower on-chip clocking. However, the thermal optimization process is applied on top of the scheduling and the test clock frequency assigned to a core cannot be varied during test application.

3 Multiple-frequency clocks
Here, we introduce the use of multiple-frequency on-chip clocking in testing NoC-based systems and the use of thermal constraints.

3.1 On-chip clocking in NoC-based systems

The multiple-frequency on-chip clocking scheme [10] used in this paper for thermal management is based on recent advances in on-chip clocking for test [3]. On-chip clock generation is a design-for-test technique in which the slow tester clock is multiplied by on-chip circuitry, e.g. a PLL, and used to launch and capture test data at internal flip-flops. It is shown in [10] that for an NoC-based system, it is often the case that some cores cannot efficiently utilize the entire width of the network channel (e.g. 32 bits). Therefore, one can use the idle channel width to transport more test data and assign a faster test clock to the core. On the other hand, some cores may have excessive power dissipation and need slower test clocks to guarantee thermal safety. Since using a slower clock can cause channel bandwidth to be wasted, time-division multiplexing is used such that several cores with slower test clocks can share the bandwidth on a channel. It is shown that this variable-rate clocking scheme can significantly reduce test application time while power constraints are met [10].

The test architecture using on-chip clocking is illustrated in Figure 2 [10]. In this simplified example, two blocks of test data are transported through the network channel and are presented to the core test wrapper. These test data remain stable on the network channel for the period of the original tester clock. Test data is loaded into the wrapper scan chains one block at a time in every on-chip clock cycle, which is twice as fast as the tester clock. No changes are required to the original core test wrapper, thereby protecting the core vendor's IP as well as smoothing the core's integration into the NoC. The use of on-chip clocks slower than the tester clock is even simpler. It does not require any hardware modification, only a time-division test schedule in which test data for several cores can share the same routing channel [10], as illustrated in Figure 3.
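The bandwidth reasoning behind the multiple-frequency scheme of Section 3.1 can be illustrated with a small sketch. It is not the implementation of [10]. It assumes, purely for illustration, that a core consuming w bits per on-chip clock cycle on a W-bit channel can be clocked up to floor(W/w) times the tester clock, and that a core clocked at 1/k of the tester clock occupies a fraction 1/k of the channel's time slots. The function names and parameters are hypothetical.

```python
from fractions import Fraction


def max_clock_multiplier(channel_width, core_width):
    """Largest integer speed-up of the test clock that the channel can feed.

    Assumption (not from [10]): each tester cycle delivers channel_width
    bits, and the core consumes core_width bits per on-chip clock cycle.
    """
    if core_width <= 0 or core_width > channel_width:
        raise ValueError("core_width must be in 1..channel_width")
    return channel_width // core_width


def can_share_channel(clock_divisors):
    """Check whether slow-clocked cores fit on one channel by time division.

    Assumption: a core clocked at tester_clock / k needs a fraction 1/k of
    the channel's time slots, so the fractions must sum to at most 1.
    """
    return sum(Fraction(1, k) for k in clock_divisors) <= 1
```

Under these assumptions, a 32-bit channel feeding a hypothetical 16-bit wrapper interface allows a multiplier of 2, which corresponds to the two-block example in the text, and cores running at 1/2, 1/4 and 1/4 of the tester clock exactly fill one channel.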

Figure 2. Test architecture using on-chip clocking [10]. (Labels in the figure: tester, slow tester clock, PLL, fast on-chip clock, mux, test data, network channel, routers, wrappers; Core A is tested using the slow clock and Core B using the on-chip clock.)

Figure 3. Slower on-chip clocking in a time-division scheme [10].

3.2 Variable-rate on-chip clocking profile

Since power dissipation is directly related to heat generation and core temperature, it is intuitive that we can use this multiple-frequency clocking scheme for thermal management. Hot cores will receive slower clocks while cool cores will receive faster clocks to achieve a global thermal balance on the chip. Using a slower clock can reduce hot spot temperature while using a faster clock can reduce test time, as long as thermal safety is guaranteed. If the original tester clock frequency is , we can assume there are a total of different on-chip clock frequencies. E.g. for , the set of available clock frequencies can be .

However, using a fixed test clock frequency (either lower or higher than the original clock frequency) for a core throughout its test application may not be efficient in all cases. This is illustrated in Figure 4, where three cores are being scheduled. Cores are represented by rectangles and the heights of the rectangles correspond to their test clock frequency (and hence power dissipation and temperature). In Figure 4(a), after the test of Core 1 is finished, Core 3 can be scheduled on the available network resource (details shown in Section 4). However, it is possible that Core 3 is a relatively cool core and the overall chip temperature is well under the thermal-safe constraint. Therefore, we can increase the test clock frequency of Core 2 from this time, as long as the thermal-safe constraint is not violated. As a result, the overall test time is reduced from to . In Figure 4(c), after Core 1 is finished, scheduling Cores 2 and 3 simultaneously will cause a high core temperature (hot spot) that violates the thermal constraint. Therefore, we decrease the test clock frequency of Core 2. Note that although the test time of Core 2 is increased accordingly, the overall test time may not be compromised, in this example . Note that this change may occur whenever the schedule is changed, hence the test clock frequency of a core may vary several times during its test. It can be seen that this variable-rate clocking scheme (or clock scaling) can reduce test time and achieve thermal safety.

Figure 4. Using clock scaling in test application.

A possible hardware architecture is shown in Figure 5. A test packet routed to the core should first be unpacked to obtain test control information (carried in the packet header) and test data (carried in the payload). Note that this does not require additional hardware since all hardware is implemented for functional operation and reused in testing [5, 10]. One field in the test control information can be specified as test clock frequency selection, which is used to select one of the on-chip clock frequencies for testing the core.

Figure 5. Clock frequency selection in NoC for core testing.

It can be seen that in essence each test vector can be applied using a different clock frequency. However, test scheduling in NoC is NP-complete [10], and frequently switching test clock frequencies can significantly affect the efficiency of the test scheduling algorithm and yield compromised test time. Therefore, in this paper we only switch test clock frequencies when a change occurs in scheduling, i.e., whenever the test of a core is finished or a new test is started.

It can be concluded that in an NoC-based system, cores that are being scheduled simultaneously can be tested using different clock frequencies. Moreover, the test clock frequency on each core can be varied during the scheduling process. Note that this scheme is different from the clock scaling scheme used in [15], which is designed for traditional SoC systems and requires a variable tester clock. Here we do not require variable-rate tester clocks during test. Instead, variable-rate clocks are generated on chip. Moreover, in [15] cores being tested use the same test clock frequency, while the proposed scheme takes advantage of the NoC architecture and can apply different clock frequencies to different cores. Finally, in [15] the tester clock frequency can be scaled arbitrarily, whereas in this paper the clock frequency only needs to be selected from the available on-chip clock frequencies.
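The effect of changing a core's test clock only at scheduling events, as discussed for Figure 4, can be sketched as follows. This is an illustrative model rather than the paper's scheduler: it assumes that a core's test length is a fixed number of test clock cycles, so that running at a multiple m of the tester clock applies m cycles per tester cycle. The function name, the profile representation and the numbers in the usage note are hypothetical.

```python
def finish_time(work_cycles, start, profile):
    """Return the time (in tester clock cycles) at which a core's test ends.

    work_cycles: test length in cycles of the core's own test clock.
    start: time at which the test begins.
    profile: list of (time, multiplier) pairs sorted by time; from each
        time onward the core runs at multiplier * tester clock. The first
        entry must not be later than start.
    """
    if not profile or profile[0][0] > start:
        raise ValueError("profile must cover the start time")
    if any(mult <= 0 for _, mult in profile):
        raise ValueError("clock multipliers must be positive")
    remaining = float(work_cycles)
    now = float(start)
    for idx, (_, mult) in enumerate(profile):
        last = idx + 1 == len(profile)
        seg_end = float("inf") if last else profile[idx + 1][0]
        if seg_end <= now:
            continue  # this segment ended before the test started
        # Test cycles that can be applied before the next clock change.
        capacity = (seg_end - now) * mult
        if capacity >= remaining:
            return now + remaining / mult
        remaining -= capacity
        now = seg_end
    raise AssertionError("unreachable: the last segment is unbounded")
```

For a hypothetical core with 1000 test cycles, doubling the clock at time 400 gives finish_time(1000, 0, [(0, 1.0), (400, 2.0)]) = 700, while halving it at the same point gives 1600. This is the trade-off between test time and power that the scheduler exploits.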

3.3 Thermal simulation and parameters

In this paper we use the HotSpot thermal simulation tool and the corresponding RC-equivalent thermal model proposed in [16]. Unlike the assumptions in [14], we allow heat transfer between any cores at any time, and cores not being tested can also be heated up due to heat conduction and convection. This represents a more realistic scenario.

The NoC-based systems in this paper are created either from ITC'02 SoC benchmarks or from industrial cores. The network is based on a 2-D mesh topology and X-Y routing [5]. Test scheduling is based on the reuse of the network as a TAM to transport test data and test responses through a dedicated path routing method [10]. Figure 1 shows the implementation of the system d695 in such a network, where Cores 8 and 10 are being tested using dedicated routing paths. We assume in this work that the NoC itself (routers, channels, etc.) has already been tested and is fault free, and we focus on testing the cores.

We handcraft a floorplan for each benchmark. We also assume that the router associated with each core is integrated with the core in layout, so that the floorplan of a core represents the core, the router and the interconnections between them. The power generated by a core during testing is calculated based on its complexity, e.g. the number of flip-flops, I/Os, test vectors, etc. Router power is assumed to be of the average core power. We omit the details for lack of space.
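To make the idea of an RC-equivalent thermal model concrete, the following sketch integrates a lumped network in which each core is one thermal node. It is not HotSpot and does not reproduce the model of [16]; it is a minimal illustration, under the assumption of one capacitance per core, one conductance from each core to ambient, and lateral conductances between cores, of why a core that is not being tested still heats up. All names and parameter values are hypothetical.

```python
def simulate(power, cap, g_amb, g_lat, t_amb, dt, steps):
    """Explicit-Euler integration of a lumped RC thermal network.

    power[i]: heat dissipated in core i.
    cap[i]: thermal capacitance of core i.
    g_amb[i]: thermal conductance from core i to ambient.
    g_lat[i][j]: lateral thermal conductance between cores i and j.
    Returns the core temperatures after `steps` time steps of length dt,
    starting from ambient temperature.
    """
    n = len(power)
    temps = [t_amb] * n
    for _ in range(steps):
        nxt = []
        for i in range(n):
            # Net heat flow into core i: its own power minus the heat
            # lost to ambient and to neighbouring cores.
            flow = power[i] - g_amb[i] * (temps[i] - t_amb)
            for j in range(n):
                if j != i:
                    flow -= g_lat[i][j] * (temps[i] - temps[j])
            nxt.append(temps[i] + dt * flow / cap[i])
        temps = nxt
    return temps
```

With two cores of unit capacitance and unit conductances, where only the first core dissipates power, the second core settles well above ambient because heat conducts laterally from its neighbour.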

4 Test scheduling with thermal optimization

In this paper, we propose to integrate thermal optimization, through the use of multiple-frequency clocking, into test scheduling. The goal of thermal optimization is to achieve both thermal balance (hence reduced hot spot temperature) and reduced test time simultaneously. Such a test scheduling problem for an NoC-based system can be formally expressed as follows:

Given the test set parameters for the set of cores C, the NoC-based system N (including inputs, outputs, routing algorithm and the network topology), the set of available clock frequencies CLK provided by the on-chip clock generator, the set of thermal parameters P, the chip floorplan F, and the maximum temperature constraint for any core, determine a clock frequency assignment profile with variable on-chip clocking for each core and a test schedule, such that 1) the maximum temperature over all cores does not exceed the maximum temperature constraint, and 2) the testing time is minimized.

The above problem can be easily reduced to a standard NoC scheduling problem, which is proved to be NP-complete [10]. Therefore, the problem is NP-complete and a fast heuristic is required.

The pseudocode for NoC test scheduling using multiple-frequency clocking with thermal optimization is outlined in Figure 6. The scheduling strategy is similar to the one proposed in [10], but here thermal simulation is included for calculating core temperature. It is different from [18] in that the clock frequency assigned to each core can be varied along the schedule to achieve an optimized thermal distribution. Before the scheduling starts, we need to verify that a single core can indeed be scheduled without violating the constraint. This is done by the Thermal_integrity_check subroutine, in which each core is assumed to be tested individually using the slowest clock frequency and the core temperatures are calculated by thermal simulation. We omit the details because of limited space.

Procedure NoC_schedule(C, N, P, F, CLK)
/* Create ordered list of various on-chip clocks */
1. Thermal_integrity_check(C, P, F, CLK);
2. Set initial clock assignment to cores;
3. Sort cores in decreasing order of test time under current clock assignment;
4. Permute the combinations of I/O pairs;
5. For each permutation
6.   While there are unscheduled cores
7.     For each unscheduled core
8.       Find a free I/O pair;
9.       If no free I/O pair
10.        Clock_adjustment(C, N, P, F, CLK);
11.        Update temperatures and current time, repeat from Line 6;
12.      Else
13.        If NoC_check_path = PATH_BLOCK
14.          If all cores have been attempted
15.            Clock_adjustment(C, N, P, F, CLK);
16.            Update temperatures and current time, repeat from Line 6;
17.          Else
18.            Try next core in the list, repeat from Line 13;
19.        Else
20.          Assign core to path, update time tags.

Figure 6. NoC test scheduling using multiple-frequency clocking with thermal optimization.

We first obtain an initial clock frequency assignment and create a list of cores in decreasing order of their test times, which are calculated from the clock frequency assignment, because the test time of a core is determined by the test clock frequency assigned to it. We then permute all the possible combinations of I/O pairs, and the test scheduling begins in Line 5 for each permutation. We try to assign the first core in the list to the first available I/O pair. If no I/O pair is available, all network resources have been utilized and no more cores can be scheduled at the current time, i.e. the current schedule is determined. We then invoke the Clock_adjustment procedure (Figure 7) in Line 10, trying to obtain thermal safety and reduced test time by adjusting the clock frequency assignment of the cores currently being scheduled. After the clock frequencies are adjusted, we update the temperatures of cores by thermal simulation, as well as the current time, in Line 11. The scheduling then repeats at the new time.

Otherwise, if a free I/O pair is found, a tentative routing path is created, and the subroutine NoC_check_path is used to check if there is any resource conflict (network channels and I/Os) on this path, see Line 13. We omit the details of path checking for lack of space. A time tag is maintained on every network resource to indicate its availability for routing. If no core can be scheduled due to resource conflicts, Clock_adjustment is invoked again in Line 15 and the temperatures and current time are updated in Line 16. If no path conflict is detected in Line 13, the core is scheduled at the current time and the next core is attempted.

Procedure Clock_adjustment(C, N, P, F, CLK)
1. Find set of cores currently being tested;
2. While 1 /* adjust clock assignment */
3.   Thermal simulation to obtain core temperatures;
4.   Find core with the highest temperature;
5.   If thermal safety is obtained and the maximum number of optimization runs is reached
6.     break; /* adjustment finished */
7.   If the highest temperature exceeds the constraint /* Thermal violation */
8.     Save assignment in list L;
9.     If Adjust_clk(tested cores, slow, L, CLK) = FAIL
10.      Return FAIL;
11.  Else /* Thermal safety met */
12.    If Adjust_clk(tested cores, fast, L, CLK) = FAIL
13.      Return FAIL;
14.  Update test time and all time tags.

Figure 7. Clock frequency adjustment.

The Clock_adjustment routine is shown in Figure 7. It is invoked when the schedule at a specific time is determined, hence we only consider the set of cores that are currently being scheduled. We first run thermal simulation to determine the core temperatures under the current clock frequency assignments in Line 3. If the hottest core temperature exceeds the maximum temperature constraint, the current clock frequency assignment violates the thermal-safe constraint. We save the assignment in a list L so that it will not be attempted again in the subsequent optimization process, which saves simulation time. We then invoke the Adjust_clk process to slow down the clock frequency on a hot core to obtain thermal safety in Line 9. If the constraint is not exceeded, we use Adjust_clk to apply a faster clock to a cool core to reduce test time in Line 12. If Adjust_clk succeeds, we need to update the test time of the core and all time tags in the NoC. This is because after clock frequency adjustment, not only does the test time of the core need to be updated, but the time tags on all network resources and the bandwidth in the time-division scheme (if a slower clock is used) may also need to be updated. These changes will affect all subsequent scheduling. The process quits when a maximum number of optimization runs is reached and thermal safety is obtained.

Procedure Adjust_clk(tested cores, clk, L, CLK)
1. If clk = slow /* use slower clock */
2.   Sort tested cores in decreasing order of temperatures;
3.   For each tested core
4.     If slower clock is in CLK and new assignment is not in L
5.       Return clock frequency assignment;
6.   Return FAIL;
7. Else /* use faster clock */
8.   Sort tested cores in increasing order of temperatures;
9.   For each tested core
10.    If faster clock is in CLK and new assignment is not in L
11.      Return clock frequency assignment;
12.  Return FAIL.

Figure 8. Adjust clock frequency assignment in test scheduling.

The Adjust_clk process is outlined in Figure 8. We simply check if the new slower or faster clock is available in CLK; note that this new clock frequency should have passed the thermal integrity check to guarantee that the core can be scheduled. In addition, the new clock frequency assignment should not appear in L, otherwise thermal safety will be violated.

We note that the optimization process at each point of the schedule is similar to a simulated annealing process, where hot spot temperatures are gradually reduced below the constraint while test time is reduced by increasing the test clock frequency on cool cores. The optimization of multiple-frequency clocks during scheduling will eventually achieve thermal balance over the chip. We do not directly apply simulated annealing algorithms because they require many optimization runs and the current HotSpot tool for thermal simulation is still computationally intensive. The proposed heuristic is fast and efficient, as corroborated by the experimental results in Section 5.

5 Experimental Results
In this section, we present experimental results for NoC benchmarks: four are crafted from four ITC'02 SoC Test Benchmarks [12], and three are created using complex industrial cores from IBM [7]. The IBM benchmarks include processor cores, digital logic cores, embedded PowerPC register arrays and fast serial links. We created a hypothetical layout for each of them. We set , and the ambient temperature is set to . All simulations complete on a Sun Blade 2000 workstation with a 1.2 GHz CPU in less than 5 minutes.

Before the experiments, we set up the parameters for thermal simulation, including layers (silicon, interface material, heat spreader, heat sink), thickness and thermal resistances of all layers and materials, chip dimensions, convection capability of the heat sink, etc. We omit the details for lack of space. Note that the HotSpot thermal model takes into account both lateral and vertical heat conduction and convection. In the experiments, we compare the proposed scheme with the method using power constraints in [10].

We first show that power constraints cannot guarantee thermal safety. We perform test scheduling with thermal simulation using the power-constrained test scheduling in [10]. We set the power constraint to be and of the sum of all cores' power, corresponding to loose and tight power constraints, respectively. We obtain the corresponding highest core temperature (hot spot) during the schedule (in ) and the system test time (in clock cycles). We also calculate the average of core temperatures , the maximum variation of core temperatures (temperature difference between the hottest and coolest cores) and the average variation of core temperatures (average difference between core temperature and ). These values reflect the thermal balance characteristic on the chip. Results are shown in Tables 1 and 2.

NoCs     Time      Hot spot  Average  Max. variation  Avg. variation
d695     12917     264.0     134.8    191.8           129.2
g1023    11624     298.9     158.2    201.4           140.6
p22810   144802    311.0     184.1    221.3           126.8
p93791   523224    256.9     130.3    191.3           126.6
IBM-1    7873606   292.0     192.6    220.7           99.4
IBM-2    781424    276.3     158.5    215.0           117.8
IBM-3    3277504   241.0     148.0    148.5           93.0

Table 1. Test scheduling results with power constraints (loose constraint).
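The thermal balance figures reported in the tables can be computed from a set of core temperatures along the following lines. This is a sketch under stated assumptions, not the authors' tooling. The definition of the average variation is only partly legible in the text, so the code assumes it is the hot spot temperature minus the average temperature, which agrees with the tabulated values (for the d695 row of Table 1, 264.0 - 134.8 = 129.2). The function and key names are hypothetical.

```python
def thermal_metrics(core_temps):
    """Summarise the thermal balance of a list of core temperatures.

    Assumption: "average variation" is taken as hot spot minus average,
    which matches the tabulated values but is not stated legibly in the
    text.
    """
    if not core_temps:
        raise ValueError("at least one core temperature is required")
    t_max = max(core_temps)
    t_min = min(core_temps)
    t_avg = sum(core_temps) / len(core_temps)
    return {
        "hot_spot": t_max,                # highest core temperature
        "average": t_avg,                 # mean core temperature
        "max_variation": t_max - t_min,   # hottest minus coolest core
        "avg_variation": t_max - t_avg,   # hot spot minus average (assumed)
    }
```

Smaller variation values indicate that the heat is spread more evenly over the cores, which is the sense in which the text speaks of better thermal balance.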

NoCs     Time      Hot spot  Average  Max. variation  Avg. variation
d695     -         -         -        -               -
g1023    23061     288.0     138.7    203.7           149.3
p22810   280856    303.5     165.7    237.2           137.8
p93791   -         -         -        -               -
IBM-1    9264966   229.1     149.5    164.1           79.7
IBM-2    1032201   201.4     119.7    144.7           81.7
IBM-3    6554936   240.0     149.3    148.1           90.4

Table 2. Test scheduling results with power constraints (tight constraint).

It can be seen that although stringent power constraints are used to restrict power dissipation, the maximum temperature is still far over the thermal-safe threshold. Note that under the tight power constraint, no schedule can be generated for d695 and p93791 because the schedule of a single core will violate the constraint. We also observe that using a tighter constraint causes a significant increase in test time, but does not necessarily reduce temperatures. Therefore, a thermal constraint must be used instead of a power constraint to guarantee thermal safety.

In Table 3, we present the results when the proposed method is used for thermal safety and thermal optimization. We also show the change in test time (in percent) and in the temperatures, compared to the results in Table 2. It can be seen that compared to Tables 1 and 2, the proposed algorithm can significantly reduce core temperatures and achieve thermal safety. Meanwhile, the temperature variations are also substantially reduced, indicating that a much better thermal balance is achieved. Moreover, in most cases (with the only exception of IBM-2) the test times are also significantly reduced (by up to 56.6%). These results corroborate the effectiveness of the proposed thermal-aware scheduling method.

NoCs     Time               Hot spot        Average        Max. variation  Avg. variation
d695     11474              121.3           101.7          50.1            19.6
g1023    14825 (-50.2%)     126.7 (-161.3)  112.8 (-25.9)  35.0 (-168.7)   13.9 (-135.4)
p22810   150958 (-46.3%)    124.9 (-178.6)  111.3 (-54.4)  56.5 (-180.7)   13.6 (-124.2)
p93791   321264             126.6           116.1          26.7            10.4
IBM-1    7908686 (-14.6%)   126.6 (-102.5)  116.6 (-32.9)  34.8 (-129.3)   10.0 (-69.7)
IBM-2    1201219 (+16.4%)   124.1 (-77.3)   91.3 (-28.4)   61.6 (-83.1)    32.7 (-49)
IBM-3    2840846 (-56.6%)   120.0 (-120.0)  109.9 (-39.4)  29.6 (-118.5)   10.1 (-80.3)

Table 3. Test scheduling results using proposed thermal-aware algorithm.

6 Conclusions
We have addressed thermal-aware test scheduling in NoC systems using on-chip multiple-frequency clocking. We proposed to assign test clock frequencies to cores during test scheduling to dynamically adjust core temperatures. We presented a heuristic where the thermal optimization process is integrated with test scheduling. Experimental results for NoC benchmarks show that the proposed method can guarantee thermal safety, yield better thermal balance and reduce test time.

References
[1] L. Benini and G. D. Micheli. Networks on chips: a new SoC paradigm. IEEE Computer, vol. 35, pp. 70-78, 2002.
[2] G. Chen and S. Sapatnekar. Partition-driven standard cell thermal placement. Proc. Int. Symp. on Physical Design, pp. 75-80, 2003.
[3] V. Chickermane, P. Gallagher, S. Gregor and T. St. Pierre. A building block BIST methodology for SOC designs: A case study. Proc. Int. Test Conf., pp. 111-120, 2001.
[4] C. N. Chu and D. F. Wong. A matrix synthesis approach to thermal placement. IEEE Trans. on CAD, vol. 17, pp. 1166-1174, Nov 1998.
[5] E. Cota, L. Carro and M. Lubaszewski. Reusing an On-Chip Network for the Test of Core-based Systems. ACM Trans. on Design Automation of Electronic Systems, vol. 18, pp. 471-499, 2004.
[6] W. Hung et al. Thermal-aware IP virtualization and placement for networks-on-chip architecture. Proc. Int. Conf. on Computer Design, pp. 430-437, 2004.
[7] ASICs Test Methodology, IBM Corporation, Essex Junction, VT 05403.
[8] V. Iyengar and K. Chakrabarty. System-on-a-chip test scheduling with precedence relationships, preemption, and power constraints. IEEE Trans. on CAD, vol. 21, pp. 1088-1094, Sep 2002.
[9] E. Larsson, K. Arvidsson, H. Fujiwara, and Z. Peng. Efficient test solutions for core-based designs. IEEE Trans. on CAD, vol. 23, pp. 758-775, 2004.
[10] C. Liu, V. Iyengar, J. Shi, and E. Cota. Power-aware test scheduling in network-on-chip using variable-rate on-chip clocking. Proc. VLSI Test Symp., pp. 349-354, 2005.
[11] C. Liu, K. Veeraraghavan and V. Iyengar. Thermal-aware test scheduling and hot spot temperature minimization for core-based systems. Proc. Int. Symp. DFT, pp. 552-560, 2005.
[12] E. J. Marinissen, V. Iyengar, and K. Chakrabarty. A set of benchmarks for modular testing of SOCs. Proc. Int. Test Conf., pp. 521-528, 2002.
[13] M. Nourani and J. Chin. Test scheduling with power-time tradeoff and hot-spot avoidance using MILP. Proc. IEE Computer and Digital Techniques, vol. 151, pp. 341-355, 2004.
[14] P. Rosinger, B. Al-Hashimi and K. Chakrabarty. Rapid generation of thermal-safe test schedules. Proc. Design, Automation and Test in Europe (DATE) Conf., pp. 840-845, 2005.
[15] E. Tafaj, P. Rosinger and B. Al-Hashimi. Improving thermal-safe test scheduling for core-based system-on-chip using shift frequency scaling. Proc. Int. Symp. DFT, pp. 544-551, 2005.
[16] K. Skadron et al. Temperature-aware microarchitecture. Proc. Int. Symp. on Computer Architecture, pp. 2-13, 2003.
[17] B. Vermeulen, J. Dielissen, K. Goossens, and C. Ciordas. Bringing communication networks on-chip: the test and verification implications. IEEE Communications Mag., vol. 41, pp. 74-81, 2003.
[18] C. Liu and V. Iyengar. Test scheduling with thermal optimization for network-on-chip systems using variable-rate on-chip clocking. Proc. Design Automation and Test in Europe Conf., 2006, to appear.
