extraction process for multi-level logic. How do we extract good single-cube or multiple-cube divisors from Boolean equations expressed in the algebraic model in a Boolean logic network? And the one mechanical step that we still have missing is how do I actually find a good prime rectangle in any of the matrices that are associated with the extraction process. So in this final lecture, we're going to show how to heuristically do a rectangle cover on one of these matrices. So let's go see how we do that. So how do you find the prime rectangle? Well, first let me just tell you that there's a name for this problem, this is called rectangle covering problem. And recall that we showed you a different kind a covering problem where we also had a matrix of ones and zeros, and we also had rows and columns, and we had a particular rule for what kind of a cover we were looking for. This is a different kind of a cover with different kind of a rules. These are the prime rectangles that we're looking for here. So, how are we going to do it? We're going to use a greedy heuristic. So, a heuristic means we do not hope to get the most perfect, ideal, best answer. We're going to get a very good answer. We're going to get it quickly. And greedy means that at every point in the algorithm where we have the opportunity to make a decision We're only going to look ahead one step. And we're going to make a quick perfect best decision for just this one simple single next step in the problem. We're not going to sort of take a big top level view and try to get the best answer. So in this particular case, what greedy means is that we're going to grow the prime rectangle, row and a column at a time, and every time we have the opportunity to add either a row or a column, we're to do the best thing we can, but just one row and one column at a time. So this is from Rick Rudell's PhD thesis in 1989. Rick wrote one of the first multilevel synthesis tools at Burkley for his PhD and Ping Pong is a nice, simple example to talk about. So, how do you grow a big prime rectangle? The first thing you do is you pick the best single row. So this is a very simple kind of a rectangle, it's got just one row in it. What row do you pick? You pick the one row rectangle with the best number of literals saved, so that's easy. And then what do you do? You look at the other rows with ones in them in the same places, and maybe more than your best single row, and you add them one at a time, trying to maximize the number of literals saved. So you start with the best single row, and then you start adding individual rows one at a time in a greedy fashion, so that the rectangle you get has as many literal savings as possible. Then what do you do? Then you add other columns with ones in the same places as this growing rectangle. And you add them one at a time trying to maximize the number of literals saved, and you add them until you can't find any more, and then you go back up and you try to add some more rows. So you add some rows to your potential rectangle. And when you can't add any more rows, you add some columns to your rectangles, and then maybe you got some ones in some new places, so you can add some more rows. And you just go back and forth and back and forth. And when you can't grow the rectangle any more in any direction, you're done. So I know this is kind of a high level description, but what I wanted to share is that it's actually not very complicated. How well does something simple like that work? And the answer is really very, very well. And so these are just some results from Rick's 1989 Berkeley PhD. I've got a big table here, so the table has a column for examples. And so these are some industrial circuits, old, but industrial. It's got a column that says how many inputs for the circuit. Got a column that says how many outputs for the circuit. It's got a column that says how many literals when we started. It's got a column saying how many literals when we were done. It's got a column that says how much CPU time, and this is for using the ping pong algorithm. And this is Rick's heuristic that rows a good prime alternatively by rows and columns. And then its got another pair of columns using a different algorithm called best rect, and this is uses a very aggressive search to find the ideal best rectangle, so best rect is really expensive cause it finds the best single rectangle you can, and ping pong is not really expensive because it uses heuristic. So for ping pong there's a column that says how many literals in the final result, and how much cpu time. And this is on a terribly old cpu, so don't even worry about what the number is. And also for best directed column that says how many literals in a column, that says how much cpu time. Just look at the difference. So there are three examples, apex one, two, and five. You know, apex1 has 45 inputs, 45 outputs, 2,887 literals, 1,314 literals using the ping pong algorithm, 1,304 literals using best rect. That's not very different, but the difference between 14 CPU, whatever, seconds, whatever, and 858, that's a gigantic big number. Apex2 has 39 inputs and 3 outputs. It starts with around 15,000 literals. It ends with around 4,000 literals in both ping pong and best rec, but again roughly 200 CPU units versus 3500 CPU units, apex five, 117 inputs and 88 outputs. Remember way back at the beginning of this class I said look, we need computational methods, because I can't ask you to use manual Boolean algebra or a cardinal map for something with, In this case 117 inputs and 88 outputs and something like 7000 literals, things on the right hand side of the equal sign. The number of literals when we are done is about 3800 in both ping pong and best rect. But again, the difference between 40 some units of CPU time and 300 some units of CPU time. So this is a nice little table, and there's a lot more results of this type. But look, the relevant result is, all of the examples are well within 1% of the perfect, best solution for pulling out, in this case, just a single cube from all of these networks. But they're running 10 to 100 times faster, and today our computers are zillions of times faster than the computers were in 1989. So you can do this on really big designs, really fast. So here's our high-level summary of the extraction process. For single cube extraction, we're going to build the cube-literal matrix. We're going to find a prime rectangle, and that prime rectangle is a good single-cube divisor. And there's some simple bookkeeping that lets us estimate the savings of the number of literals. For the multiple cube case, we're going to Kernell all the expressions in the network, and then use the Kernells and co-Kernell from all that algebraic division to build the co-kernel cube matrix. Each prime rectangle is a good multiple cube divisor. And again, there's a simple bookkeeping that lets us make saving in the number of literals in Boolean network. And mechanically, these are both rectangle cover problems. They look a lot like karno maps, which is very interesting. You got to admit. That's just kind of surprising and cool. And they're good heuristics for solving for a good prime rectangle, but are fast and effective. Now, there are also ways, going to put this one in sort of parenthesis down at the bottom, there are ways to extract more than a single divisor from the network, but we just didn't cover those. They're kind of tedious in terms of the special case stuff. I just don't think it's worth your time. Now, I want to do a little bit of a reality admission here. How do we really do this? And the answer is we still actually can use rectangle covering, but we don't typically use rectangle covering on all the kernels and the co-kernels. And that's the key thing here. It is really too expensive to do the rectangle problem on big circuits. Let's say things with like 20,000 or more gates to pull out all of the kernels and all of the co-kernels from all of the nodes for all of the functions in a really big bullion logic network. It's too expensive to go and do all of those kernels and co-kernels. So nevertheless, we can still use all of the ideas that we've seen so far, and what people often do is they use heuristics to find a quick set of likely divisors. So instead of fully kerneling every node in the network, because there's just too many cubes, there's actually some tricks to extract a subset of the useful kernels quickly. Roughly speaking, you don't search really deep in the kernel hierarchy. You just find some quick kernels to pull them out, and then you try to use those, and then you can either do the rectangle cover on this smaller problem and their smaller just because you've got a fewer things to build the matrices. Or you can try to do something even simpler and the overall network restructuring, for example can you try all pairwise substitution of one node substitute into another. You can just keep the ones you find. So remember what I said at the start of multi-level synthesis, that this is just like BDDs, a universe of data structures and operators, and the operators are all of these operations. Simplifying the insides of a node, that's an espresso like thing. Taking nodes that aren't really very interesting and shoving them back into their fan outs, that's a macroscopic restructuring. Taking complicated sets of nodes and trying to pull factors out, we write scripts to do these things. There's lots of different scripts that can do these things. There's lots of different operators. Some of the operators are big and powerful, and complex, and expensive. They give really good results, but they're kind of slow. Some of the techniques are not so big, not so complex, not so expensive. You trade off a little quality for CPU time. It's just like everything there else is in the software universe. But I just wanted to mention that all of the techniques we used are applicable, but one gets to choose how completely one applies these in any real logic network. And finally, just to complete, there's lots and lots of good references about these papers, so though the paper by Brayton and Rudell and Sangiovanni-Vincentelli about MIS: A Multiple-Level Optimization System. That's a really iconic paper in this business worth checking out. Manny De Micheli said this is an ultimization of the Joules circuits, because it's a really good book for this stuff. The ping pong algorithm is actually from a paper by Brayton and Rick Rudell, Alfredo San, Giovanni, Vincentelli and Albert Wang from an international conference on CAD in 1987. Rick Rudell's Berkeley PhD thesis is pretty good, and I think it's actually out on the web. And there's another book by Srinivas Devadas, Abhijit Gosh, and Kurt Keutzer called Logic Synthesis that was done in 1994 that also a rather good summary of this stuff. And so with that, we're done with the extraction problem. And so now on to a new set of problems in multi-level logic. [SOUND]