Computer Science 294-7 Lecture #12
Interconnect


Notes by Varghese George

"It's the interconnect, stupid" - Tom Knight, Jr.

1.0 Interconnect

Some numbers from Lecture 4




The interconnect constitutes ~90% of the total area and 70-80% of the total delay.

In the reconfigurable domain, the interconnect has many issues to deal with.
To evaluate the different options, let us assume that all of the operators have to be connected simultaneously.

1.1 Bisection Bandwidth

If a design is partitioned into two equal halves, the bisection bandwidth is the minimum number of wires that must cross between the halves, taken over all possible partitions.
Bisection bandwidth = 2
Fig. 1.1

Bisection bandwidth = 1
Fig. 1.2

Bisection bandwidth = 3
Fig. 1.3

The bisection bandwidth of Fig. 1.3 is slightly tricky: the connections have to be modified so that the fanout of three from each crossing line occurs on one side of the partition.
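
The definition can be checked by brute force on small designs. The sketch below is only an illustration: the function name, the net representation (one set of nodes per wire), and the two example circuits are assumptions, not the circuits of Figs. 1.1-1.3. A net with fanout counts as a single crossing wire, in line with the remark about Fig. 1.3.

```python
from itertools import combinations

def bisection_bandwidth(nodes, nets):
    """Minimum number of nets cut over all equal-size bipartitions.

    nodes: iterable of hashable node names (an even count is assumed).
    nets:  list of sets of nodes; a net with fanout is one wire and is
           counted once if it has pins on both sides of the cut.
    """
    nodes = list(nodes)
    half = len(nodes) // 2
    first = nodes[0]
    best = None
    # Pin the first node to the left half so each partition is tried once.
    for rest in combinations(nodes[1:], half - 1):
        left = {first, *rest}
        cut = sum(1 for net in nets if (net & left) and (net - left))
        if best is None or cut < best:
            best = cut
    return best

# Hypothetical examples: a 4-node ring and a 4-node chain.
ring = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d", "a"}]
chain = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
print(bisection_bandwidth("abcd", ring))
print(bisection_bandwidth("abcd", chain))
```

The search is exponential in the number of nodes, so it is only useful for checking small hand examples.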

1.2 Interconnect Design Issues

The requirements on bisection bandwidth are contradictory: flexibility calls for a high bisection bandwidth, while area calls for a low one.

1.3 Attempt 1 - Crossbar

Any operator may want to take as input the output of any other operator.

Fig. 1.4 Crossbar

1.3.1 Crossbar Analysis

Let k be the number of inputs of a LUT, n the number of LUTs, and l the length of wire.

    Delay

    Parasitic loads = kn
    Switches/path = l
    Wire length = O(kn)
    Delay = O(kn)

    Area

    Bisection BW = n
    Switches = kn^2
    Area = O(n^2)
The crossbar approach is guaranteed to route everything.
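
To get a feel for how these terms scale, the sketch below evaluates the crossbar formulas for a few sizes. The function name and the choice k = 4 are assumptions made for illustration; constant factors are taken as 1.

```python
def crossbar_cost(n, k):
    """Evaluate the crossbar terms from the analysis (constant factors = 1)."""
    return {
        "switches": k * n * n,     # kn^2 crosspoints
        "parasitic_loads": k * n,  # kn loads, which also sets the O(kn) delay
        "bisection_bw": n,
    }

# k = 4 is an assumed LUT input count, not a value from the lecture.
for n in (64, 128, 256):
    print(n, crossbar_cost(n, k=4))
```

Doubling n quadruples the switch count while the delay terms only double, which is why area is the main problem with this approach.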

1.4 Attempt 2 - Mesh

This is based on the following reasoning

Fig. 1.5 Mesh Network

In this structure, if you try to place everything close, the transitive fanin grows faster than the "close" sites. This is illustrated in Fig. 1.6.

Fig. 1.6 Trying to place everything close

The number of sites at Manhattan distance n grows only as 4n, whereas the transitive fanin of k-input LUTs can grow as fast as k^n.
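
The mismatch can be seen numerically. The sketch below counts the grid sites at Manhattan distance d from a LUT and compares that with an upper bound of k^d on the transitive fanin d levels back. The function names and k = 4 are assumptions for illustration.

```python
def sites_at_distance(d):
    """Count grid sites at Manhattan distance exactly d from the origin."""
    return sum(
        1
        for x in range(-d, d + 1)
        for y in range(-d, d + 1)
        if abs(x) + abs(y) == d
    )

def max_transitive_fanin(k, depth):
    """Upper bound on distinct signals feeding a k-input LUT 'depth' levels back."""
    return k ** depth

# k = 4 is an assumed LUT input count.
for d in range(1, 6):
    print(d, sites_at_distance(d), max_transitive_fanin(4, d))
```

The site count grows linearly in d while the fanin bound grows exponentially, so beyond the first level there may be more fanin LUTs than sites at the matching distance.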

1.4.1 Mesh Analysis

    Let l be the length of the wire, w the channel width, and n the number of LUTs.

    Delay

    Switches/path = l
    Wire length = O(l)
    Best Delay = O(w)
    Worst Delay = O(n^{1/2} w)

    Area

    Bisection BW = w n^{1/2}
    Switches = O(n w^2)
    Area = O(n w^2)
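
A rough comparison against the crossbar, as a sketch only: the function name and the values n = 1024, w = 8, and k = 4 are assumptions, n is taken to be a perfect square, and all constant factors are set to 1.

```python
import math

def mesh_cost(n, w):
    """Evaluate the mesh terms from the analysis (constant factors = 1)."""
    side = math.isqrt(n)  # n is assumed to be a perfect square
    return {
        "switches": n * w * w,      # O(n w^2)
        "bisection_bw": w * side,   # w n^(1/2)
        "worst_delay": side * w,    # O(n^(1/2) w)
    }

n, w, k = 1024, 8, 4                # assumed example values
mesh = mesh_cost(n, w)
crossbar_switches = k * n * n       # kn^2, from the crossbar analysis
print(mesh)
print("crossbar/mesh switch ratio:", crossbar_switches // mesh["switches"])
```

The saving in switches comes at the price of a lower bisection bandwidth (w n^{1/2} instead of n), so the mesh no longer guarantees that every design routes.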
In this analysis the things to watch out for are

2.0 Rent's Rule

Rent's rule is an empirical relationship which estimates the interconnect requirements based on the gate count.
For a collection of n gates the number of I/O signals is given by

IO = C n^p

Consider an extreme case,

For this case we get

The local connections modify C and p from the above values depending on the circuit being implemented.
Rent's rule can be used to estimate the Bisection BW by recursively partitioning the chip.

Fig. 2.1 Recursive Partitioning of circuit
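
A minimal sketch of this use of Rent's rule follows. It assumes that the wires crossing the top-level cut can be estimated as the I/O of one half of the chip, and it uses C = 5 and p = 0.7 purely as illustrative values. The function names are hypothetical.

```python
def rent_io(n, C, p):
    """Rent's rule: estimated I/O signal count for a block of n gates."""
    return C * n ** p

def bisection_estimate(n, C, p):
    """Estimate wires crossing the top-level cut as the I/O of one half."""
    return rent_io(n / 2, C, p)

def recursive_io(n, C, p):
    """List (gates, estimated I/O) at each level of recursive bisection."""
    levels = []
    size = n
    while size >= 1:
        levels.append((size, rent_io(size, C, p)))
        size //= 2
    return levels

# C = 5 and p = 0.7 are assumed values, not measurements from the lecture.
for gates, io in recursive_io(1024, C=5, p=0.7):
    print(gates, round(io, 1))
print("bisection estimate:", round(bisection_estimate(1024, 5, 0.7), 1))
```

With p < 1 the I/O per block shrinks more slowly than the block size as the chip is subdivided, which is what drives the wiring demand at the upper levels of the partition.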

The average wire length and the channel width (w) are direct consequences of Rent's rule. Typically 0.5 <= p <= 0.75. For unpartitioned or unstructured architectures it is more like 0.9 <= p <= 1.0.

Fig. 2.2 Rent's numbers for combinational logic

    Random Logic

    Specific Architectures


2.1 Caveats