Question 1: Assume a Weight Stationary (WS) dataflow for 1D convolution, where the filter
size is 6, input fmap size is 12 and stride is 1.
a) Show a corresponding loop nest of this WS dataflow.
b) Show space-time reference patterns of this WS dataflow for input activations, filter weights,
and output fmap psums, separately.
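For orientation, one possible shape of a WS loop nest for these sizes is sketched below. It is a sequential simulation under stated assumptions, not a definitive answer: the name `conv1d_ws` and the list-based layout are illustrative, and no padding is assumed, so the output fmap has 12 - 6 + 1 = 7 elements. The weight index is placed in the outer loop so that each weight is fetched once and reused across all outputs.

```python
S, W_IN, STRIDE = 6, 12, 1        # filter size, input fmap size, stride (from the question)
Q = (W_IN - S) // STRIDE + 1      # output fmap size, assuming no padding: 7


def conv1d_ws(inputs, weights):
    """1D convolution with the weight loop outermost (weight stationary sketch)."""
    outputs = [0] * Q
    for s in range(S):            # each weight is held in place for the whole inner loop
        for q in range(Q):        # inputs and psums stream past the resident weight
            outputs[q] += inputs[q * STRIDE + s] * weights[s]
    return outputs
```

In a spatial implementation the outer loop would typically become a parallel loop across PEs, one weight per PE; that mapping is an assumption here and is not fixed by the question.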
Question 2: Assume an Output Stationary (OS) dataflow for 1D convolution, where the filter size is
6, input fmap size is 12 and stride is 1.
a) Show a corresponding loop nest of this OS dataflow.
b) Show space-time reference patterns of this OS dataflow for input activations, filter weights,
and output fmap psums, separately.
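As a point of comparison, an OS loop nest can be sketched by moving the output index to the outer loop, so that each partial sum is accumulated in place before the next output is started. This is a minimal sketch under assumptions: `conv1d_os` is an illustrative name, and no padding is assumed, giving 12 - 6 + 1 = 7 outputs.

```python
S, W_IN, STRIDE = 6, 12, 1        # filter size, input fmap size, stride (from the question)
Q = (W_IN - S) // STRIDE + 1      # output fmap size, assuming no padding: 7


def conv1d_os(inputs, weights):
    """1D convolution with the output loop outermost (output stationary sketch)."""
    outputs = [0] * Q
    for q in range(Q):            # each psum stays in place until it is complete
        acc = 0
        for s in range(S):        # weights and inputs stream past the resident psum
            acc += inputs[q * STRIDE + s] * weights[s]
        outputs[q] = acc
    return outputs
```

The arithmetic is the same as for any other loop order; only the order of accesses, and therefore which operand is reused from local storage, changes.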
Question 3: Assume a Weight Stationary (WS) dataflow for 1D convolution, where the filter size is 12, input
fmap size is 23 and stride is 1.
a) Show a tiled loop nest of this WS dataflow where the number of tiles for weights is 4 (i.e., size
of each tile would be 3).
b) Show a tiled loop nest of this WS dataflow where the number of tiles for output activations is 3
(i.e., size of each tile would be 4).
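Tiling splits one loop index into an outer tile index and an inner within-tile index. The sketch below shows one possible loop ordering for each part; the function names, the loop order, and the no-padding assumption (output size 23 - 12 + 1 = 12) are illustrative choices, and other valid orderings exist.

```python
S, W_IN = 12, 23                  # filter size and input fmap size (from the question)
Q = W_IN - S + 1                  # output fmap size, assuming stride 1 and no padding: 12


def conv1d_ws_weight_tiled(inputs, weights, tile=3):
    """Weights split into S // tile = 4 tiles of 3 weights each."""
    outputs = [0] * Q
    for s1 in range(S // tile):   # loop over weight tiles
        for s0 in range(tile):    # weights inside the current tile
            s = s1 * tile + s0
            for q in range(Q):    # all outputs are visited for each resident weight
                outputs[q] += inputs[q + s] * weights[s]
    return outputs


def conv1d_ws_output_tiled(inputs, weights, tile=4):
    """Outputs split into Q // tile = 3 tiles of 4 outputs each."""
    outputs = [0] * Q
    for q1 in range(Q // tile):   # loop over output tiles
        for s in range(S):        # each weight is reused across one output tile
            for q0 in range(tile):
                q = q1 * tile + q0
                outputs[q] += inputs[q + s] * weights[s]
    return outputs
```

Both functions compute the same result as the untiled convolution; the tiling only changes how much data has to be resident at once.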
Question 4: Assume a 2D Convolution to be performed as Row Stationary (RS) dataflow in a
Processing Element Array of 3x3 (i.e., 3 rows of PEs, where there are 3 PEs per row). The filter
size is 3x3, and input feature map size is 4x4.
a) Show graphically, how this 2D convolution would map on the given PE Array (i.e., show which
row of filters and which row of input fmap values would map to which PE, and which output
fmap rows would be produced by accumulation of which psums).
b) What should be the size of the local register file for each PE to maximize reuse of filter weights
and input activations?
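One common way to describe an RS mapping is that the PE in row i and column j holds filter row i, receives input fmap row i + j, and contributes psums to output row j. The sketch below enumerates that mapping for the sizes in this question. It assumes stride 1 and no padding (so the output fmap is 2x2), and `rs_mapping` is an illustrative name; under these assumptions only a 3x2 portion of the 3x3 array is occupied.

```python
R, H = 3, 4                       # filter rows and input fmap rows (from the question)
E = H - R + 1                     # output fmap rows, assuming stride 1 and no padding: 2


def rs_mapping():
    """Map (pe_row, pe_col) -> (filter_row, input_row, output_row)."""
    mapping = {}
    for pe_row in range(R):       # one PE row per filter row
        for pe_col in range(E):   # one PE column per output row
            mapping[(pe_row, pe_col)] = (pe_row, pe_row + pe_col, pe_col)
    return mapping


if __name__ == "__main__":
    for (pe_row, pe_col), (f, i, o) in sorted(rs_mapping().items()):
        print(f"PE({pe_row},{pe_col}): filter row {f}, input row {i} -> psum for output row {o}")
```

Psums for one output row are accumulated down a PE column, and the same input row is shared along a diagonal of the array.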
Question 5: Assume we want to build a hierarchical mesh network whose source, router and
destination clusters are shown below.
a) Show the corresponding connections needed to have a mesh network between source and
router clusters, and all-to-all network between router and destination clusters.
b) Show the active connections when source 1 (S1) broadcasts to all destinations (i.e., destinations
from D1 to D8).
[Figure: Source Cluster 1, Source Cluster 2; Router Cluster 1, Router Cluster 2; Destination Cluster 1, Destination Cluster 2]