Ir2Vhdl#
Translation of IR (Intermediate Representation) to VHDL happens in two stages, which are performed separately for each IR graph. For those familiar with the C toolchain, an IR graph is roughly analogous to a translation unit. In the following we show the translation from IR to VHDL in the context of the full translation pipeline and describe the two stages in more detail.
Overview#
Translation of a neural network model into VHDL code typically involves several lowering passes, each of which transforms a high-level IR into a lower-level IR. While all these IRs rely on the same core data structures `Node`, `Edge`, and `Graph`, the concrete representations may contain considerably more information.
Let’s assume we have a neural network defined in an ML framework that looks like the following.
Network from ML framework

```mermaid
classDiagram
    direction LR
    class Conv1d
    class Linear
    Conv1d --|> Module
    Linear --|> Module
    Sequential --|> Module
    Conv1d : +Tensor weight
    Conv1d : +Tensor bias
    Conv1d : +int stride
    Conv1d : +int in_channels
    Conv1d : +int out_channels
    Linear : +Tensor weight
    Linear : +Tensor bias
    Linear : +int in_features
    Linear : +int out_features
    Module : +__call__(Tensor x)
    class Tensor
    class neural_network["neural_network :Sequential"]
    class conv_0["conv_0 :Conv1d"]
    class linear_0["linear_0 :Linear"]
    neural_network *-- conv_0
    neural_network *-- linear_0
    conv_0 --> linear_0 : data flow
```
Stage 1: High-level IR#
In a first step we translate this network into a high-level intermediate representation (IR), containing just enough data to describe all relevant parameters and the structure of the network.
Network after translation to IR

```mermaid
classDiagram
    direction LR
    class Node {
        +str name
        +str type
        +dict attributes
    }
    class Edge {
        +str src
        +str dst
        +dict attributes
    }
    class Graph {
        +dict[str, Node] nodes
        +dict[tuple[str, str], Edge] edges
        +add_node(Node n)
        +add_edge(Edge e)
        +get_successors(str name)
        +get_predecessors(str name)
    }
    Graph *-- Node
    Graph *-- Edge
    class neural_network["neural_network :Graph"]
    class input["input :Node"]
    class conv_0["conv_0 :Node"]
    class linear_0["linear_0 :Node"]
    class output["output :Node"]
    neural_network *-- conv_0
    neural_network *-- linear_0
    neural_network *-- input
    neural_network *-- output
    note for input "type = 'input' name = 'input' attributes = {}"
    note for output "type = 'output' name = 'output' attributes = {}"
    note for conv_0 "name = 'conv_0' type = 'conv1d' attributes = { 'weights': [[1.0, 2.0], [0.0, 1.0]], 'input_shape': (3, 2), 'output_shape': (2, 1), 'bias': ..., 'in_channels': 2, 'out_channels': 1, ... }"
    note for linear_0 "name = 'linear_0' type = 'linear' attributes = { 'weights': [[1.0], [0.5]], 'input_shape': (2, 1), 'output_shape': (1, 1), 'bias': ..., ... }"
    input --> conv_0 : Edge
    conv_0 --> linear_0 : Edge
    linear_0 --> output : Edge
```
Caution

As most nodes have input and output shapes, we are considering making those members required fields for the `Node` data class.
This representation will be consumed by a lowering pass. The lowering pass will transform the high-level IR into a lower-level IR. Depending on its structure, this lower-level IR can be either translated to a target language (e.g., VHDL) or further transformed into an even lower-level IR.
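The data structures from the class diagram above can be sketched in a few lines of Python. This is a minimal illustration of how the example network maps onto `Node`, `Edge`, and `Graph`; the real classes in `elasticai.creator` carry more functionality than shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    type: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Edge:
    src: str
    dst: str
    attributes: dict = field(default_factory=dict)


class Graph:
    def __init__(self) -> None:
        self.nodes: dict[str, Node] = {}
        self.edges: dict[tuple[str, str], Edge] = {}

    def add_node(self, n: Node) -> None:
        self.nodes[n.name] = n

    def add_edge(self, e: Edge) -> None:
        self.edges[(e.src, e.dst)] = e

    def get_successors(self, name: str) -> list[str]:
        return [dst for src, dst in self.edges if src == name]

    def get_predecessors(self, name: str) -> list[str]:
        return [src for src, dst in self.edges if dst == name]


# Build the example graph: input -> conv_0 -> linear_0 -> output
g = Graph()
g.add_node(Node("input", "input"))
g.add_node(
    Node(
        "conv_0",
        "conv1d",
        {"weights": [[1.0, 2.0], [0.0, 1.0]], "in_channels": 2, "out_channels": 1},
    )
)
g.add_node(Node("linear_0", "linear", {"weights": [[1.0], [0.5]]}))
g.add_node(Node("output", "output"))
for src, dst in [("input", "conv_0"), ("conv_0", "linear_0"), ("linear_0", "output")]:
    g.add_edge(Edge(src, dst))

print(g.get_successors("conv_0"))    # ['linear_0']
print(g.get_predecessors("conv_0"))  # ['input']
```

A lowering pass can then walk the graph via `get_successors`/`get_predecessors` and rewrite nodes and edges into the next, lower-level representation.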
Stage 2: Low-level IR#
For simplicity we assume that we apply a lowering pass that produces an IR suitable for translation to VHDL. We provide the result as JSON, since the class diagrams from above would become increasingly complex. The result could look like this:
```json
[
  {
    "name": "neural_network",
    "type": "clocked_combinatorial",
    "attributes": {
      "runtime_input_shape": [2, 4],
      "top_kernel_size": 2,
      "top_stride": 1,
      "runtime_output_shape": [1, 1]
    },
    "nodes": [
      {
        "name": "input",
        "type": "input",
        "implementation": ""
      },
      {
        "name": "conv_0_i0",
        "type": "mac",
        "implementation": "conv_0",
        "input_shape": [2, 2],
        "output_shape": [1, 1]
      },
      {
        "name": "conv_0_i1",
        "type": "mac",
        "implementation": "conv_0",
        "input_shape": [2, 2],
        "output_shape": [1, 1]
      },
      {
        "name": "linear_0_i0",
        "type": "mac",
        "implementation": "linear_0",
        "input_shape": [2, 1],
        "output_shape": [1, 1]
      },
      {
        "name": "output",
        "type": "output",
        "implementation": ""
      }
    ],
    "edges": [
      {
        "src": "input",
        "dst": "conv_0_i0",
        "src_dst_indices": [
          [0, 0],
          [1, 1],
          [2, 2],
          [3, 3]
        ]
      },
      {
        "src": "input",
        "dst": "conv_0_i1",
        "src_dst_indices": [
          [2, 0],
          [3, 1],
          [4, 2],
          [5, 3]
        ]
      },
      {
        "src": "conv_0_i0",
        "dst": "linear_0_i0",
        "src_dst_indices": [[0, 0]]
      },
      {
        "src": "conv_0_i1",
        "dst": "linear_0_i0",
        "src_dst_indices": [[0, 1]]
      },
      {
        "src": "linear_0_i0",
        "dst": "output",
        "src_dst_indices": [[0, 0]]
      }
    ]
  },
  {
    "name": "conv_0",
    "type": "mac",
    "weight": [[1.0, 0.0, 1.0, 0.1]],
    "bias": [],
    "nodes": [],
    "edges": []
  },
  {
    "name": "linear_0",
    "type": "mac",
    "weight": [[1.0], [0.0]],
    "bias": [],
    "nodes": [],
    "edges": []
  }
]
```
Several noteworthy things happened in this lowering pass:

- The node `conv_0` was replaced by `conv_0_i0` and `conv_0_i1`, because the convolution operation was split into two parts: the first computes the first output element and the second computes the second. This is a common optimization in hardware design to increase parallelism. The two new names will be used directly for the entity instances in the generated VHDL code.
- Two new IR graphs were created: `conv_0` and `linear_0` (at the end of the file). These graphs do not contain any nodes or edges, but their weights and biases, together with their type, will be used to generate the corresponding VHDL designs in the next step.
- The types for convolutions and linear layers were changed to `"mac"`, because that is the operation we want to apply at this level.
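The `src_dst_indices` for the two `conv_0` mac instances can be derived mechanically from the kernel size and stride. The sketch below assumes the `(channels, time)` input is flattened time-major (flat index = `t * in_channels + c`); the actual lowering pass may use a different flattening scheme.

```python
def window_indices(
    in_channels: int, kernel_size: int, stride: int, window: int
) -> list[list[int]]:
    """src_dst_indices for the `window`-th mac instance of a split convolution.

    Assumes the (channels, time) input is flattened time-major,
    i.e. flat index = t * in_channels + c.
    """
    start_t = window * stride
    return [
        [(start_t + k) * in_channels + c, k * in_channels + c]
        for k in range(kernel_size)  # kernel position (time step)
        for c in range(in_channels)  # channel within the time step
    ]


# conv_0 from the JSON above: 2 input channels, kernel size 2, stride 1
print(window_indices(2, 2, 1, 0))  # [[0, 0], [1, 1], [2, 2], [3, 3]]
print(window_indices(2, 2, 1, 1))  # [[2, 0], [3, 1], [4, 2], [5, 3]]
```

Each inner pair maps one flat index of the producer's output to one flat index of the consumer's input, which is exactly how the edges above wire `input` to the two mac instances.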
Stage 3: VHDL Generation#
In the third stage, the low-level IR is translated into VHDL code. Usually, one IR graph is translated into one VHDL entity. Exceptions are possible though, e.g., to split complex designs into multiple smaller entities.
Caution
We might limit this behaviour in the future, to ensure a one-to-one relationship between IR graphs and resulting VHDL code, in case problems like name clashes arise.
Each node in a graph is translated into an instance of a VHDL entity. We interpret the edges as data flow between instances. Depending on the IR graph we need to translate, we have to solve one or more of the following problems:
- decide which algorithm to use for the following steps
  - determined by the type of the implementation
- set correct default generic parameters for the currently generated entity
  - determined by the attributes and input/output nodes of the implementation
- define control signals for each instance
  - determined by the type of each node
- define data signals for each instance
  - determined by the input/output shape of each node
- connect the instances using the correct data/control signals
  - determined by the edges of the implementation and the node types
- instantiate the entities with the correct generic parameters
  - determined by node attributes and type
- instantiate the entities with the correct port mappings
  - determined by node type and implementation edges
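To illustrate the data-signal step above, a generator could derive one input and one output vector signal per instance from the node's shapes. The signal naming scheme, the flattening, and the fixed data width below are illustrative assumptions, not the actual output of the toolchain.

```python
from math import prod

DATA_WIDTH = 8  # assumed bit width per data element


def data_signals(name: str, input_shape, output_shape) -> list[str]:
    """Declare one flattened input and output data signal for an instance.

    The total width is the product of the shape dimensions times the
    per-element bit width (an assumption for this sketch).
    """
    n_in = prod(input_shape) * DATA_WIDTH
    n_out = prod(output_shape) * DATA_WIDTH
    return [
        f"signal {name}_d_in : std_logic_vector({n_in - 1} downto 0);",
        f"signal {name}_d_out : std_logic_vector({n_out - 1} downto 0);",
    ]


# conv_0_i0 from the JSON above: input_shape (2, 2), output_shape (1, 1)
for line in data_signals("conv_0_i0", (2, 2), (1, 1)):
    print(line)
# signal conv_0_i0_d_in : std_logic_vector(31 downto 0);
# signal conv_0_i0_d_out : std_logic_vector(7 downto 0);
```

The connection step would then assign slices of these vectors to each other according to the `src_dst_indices` of the edges.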
The following software components play an important role during code generation:

- `template.TemplateBuilder` allows generating templates from given strings.
- `vhdl_template.EntityTemplateDirector` uses the `TemplateBuilder` to generate templates for VHDL entities.
- `ir2vhdl.Instance` contains the knowledge about the interface of an entity, i.e., the ports and generics, and provides a function to generate VHDL code for instantiating the entity.
- `ir2vhdl.InstanceFactory` is responsible for creating `Instance`s from nodes in an IR graph.
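The following is a rough sketch of how `Instance` and `InstanceFactory` could interact. The class shapes, the builder registration, and the generated VHDL snippet are simplified assumptions for illustration, not the actual `ir2vhdl` API.

```python
class Instance:
    """Knows an entity's interface and emits its instantiation code (sketch)."""

    def __init__(
        self, name: str, entity: str, generics: dict[str, int], ports: dict[str, str]
    ):
        self.name = name
        self.entity = entity
        self.generics = generics
        self.ports = ports

    def instantiate(self) -> str:
        generics = ", ".join(f"{k} => {v}" for k, v in self.generics.items())
        ports = ", ".join(f"{k} => {v}" for k, v in self.ports.items())
        return (
            f"{self.name}: entity work.{self.entity}\n"
            f"  generic map ({generics})\n"
            f"  port map ({ports});"
        )


class InstanceFactory:
    """Maps node types to builder functions that create Instances (sketch)."""

    def __init__(self):
        self._builders = {}

    def register(self, node_type: str, builder):
        self._builders[node_type] = builder

    def create(self, node: dict) -> Instance:
        return self._builders[node["type"]](node)


factory = InstanceFactory()
factory.register(
    "mac",
    lambda node: Instance(
        name=node["name"],
        entity=node["implementation"],
        generics={"DATA_WIDTH": 8},  # assumed generic, for illustration
        ports={"clk": "clk", "d_in": f"{node['name']}_d_in"},
    ),
)
inst = factory.create({"name": "conv_0_i0", "type": "mac", "implementation": "conv_0"})
print(inst.instantiate())
```

Note how the node's `implementation` field selects the entity to instantiate, while the node's `name` becomes the instance label, matching the naming convention from the lowering pass above.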
While the classes above live in the `elasticai.creator` package, specific implementations live in the `elasticai.creator_plugins` namespace. As such, it is up to each plugin that supports translating a specific type of implementation to know how to handle each of the encountered node types. As an example, we consider the `combinatorial` plugin. It features a set of different node types with two different interfaces. The supported node types are:
- `shift_register`
- `striding_shift_register`
- `sliding_window`
- `clocked_combinatorial`
- `unclocked_combinatorial`
- `input`
- `output`

Including (un)clocked combinatorial allows us to nest combinatorial designs.
The `combinatorial` plugin defines an `Instance` type for each of these node types and uses the `InstanceFactory` to create instances from nodes in an IR graph.