Convolutional Neural Network (CNN) Compact Accelerator IP

Implement Machine Learning Inferencing in mWs

Take advantage of the FPGA’s parallel processing capability to implement compact CNNs, including binarized versions known as BNNs. This IP enables you to implement CNNs on Lattice iCE40 UltraPlus FPGAs with power consumption in the mW range.

This IP uses the on-chip DSP resources of the iCE40 UltraPlus devices to implement CNNs. The acceleration engine uses eleven Embedded Block RAM (EBR) blocks as working memory. Users can choose either EBR or the larger Single Port Memory (SPRAM) blocks to store the weights and instructions used by the engine.

This IP is paired with the Lattice Neural Network Compiler tool. The compiler takes networks developed in Caffe or TensorFlow and compiles them into instructions that the Accelerator IP can run.

  • Implement CNNs including BNNs in iCE40 UltraPlus using on-chip DSP and memory blocks
  • Implement deep learning with mW power consumption
  • Network weight and operation sequence stored in either EBR or SPRAM blocks
  • Adjust operations and network weights for different BNN functions without changing the FPGA RTL
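
As background on what "binarized" means in the bullets above, here is a minimal Python sketch of a dot product over +1/-1 values. It is only an illustration of the general BNN idea and makes no claim about how the Accelerator IP is built internally; the function names are hypothetical, and the +1/0 blob type is not covered. It shows that multiplying and summing +1/-1 values gives the same result as counting matching sign bits, which is why such networks map well onto small FPGA logic.

```python
def binarize(values):
    """Map real values to +1/-1 by sign (zero maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def dot_pm1(a, w):
    """Reference dot product of two +1/-1 vectors using multiply-accumulate."""
    return sum(x * y for x, y in zip(a, w))


def dot_xnor_popcount(a, w):
    """Same dot product computed by counting positions where the signs agree.

    Each agreeing position contributes +1 and each disagreeing position -1,
    so the result is 2 * matches - length.
    """
    matches = sum(1 for x, y in zip(a, w) if (x > 0) == (y > 0))
    return 2 * matches - len(a)


activations = binarize([0.5, -1.2, 0.0, 3.0])
weights = binarize([-0.1, -2.0, 1.0, -4.0])
result = dot_xnor_popcount(activations, weights)
```

Both functions return the same value for any pair of equal-length +1/-1 vectors; the second form needs only bit comparisons and a counter rather than multipliers.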

Block Diagram

BNN Implementations

CNN Implementations

Performance and Size

iCE40 UltraPlus Performance and Resource Utilization in BNN Mode¹
Memory Type    BNN Blob Type   Registers   LUTs   EBR   SRAM   clk Fmax² (MHz)
EBRAM          +1/0            1822        2419   27    0      41.762
DUAL_SPRAM     +1/0            1803        2447   11    2      31.565
SINGLE_SPRAM   +1/0            1802        2430   11    1      41.103
SINGLE_SPRAM   +1/-1           1992        2706   11    1      40.748

1. Generated using Lattice Radiant Software 1.0.0.350.0 with Lattice Synthesis Engine, targeting the iCE40 UP5K-SG48I. Performance may vary when using a different software version or targeting a different device density or speed grade.
2. Fmax is obtained when the FPGA design contains only the Compact CNN Accelerator IP Core. These values may be reduced when user logic is added to the FPGA design.

iCE40 UltraPlus Performance and Resource Utilization in CNN Mode¹
Memory Type    Scratch Pad³   Registers   LUTs   EBR   SRAM   clk Fmax² (MHz)
EBRAM          1K             1725        2816   23    0      28.164
DUAL_SPRAM     1K             1706        2867   7     2      27.672
SINGLE_SPRAM   1K             1705        2841   7     1      26.782
SINGLE_SPRAM   4K             2052        3989   19    1      25.950

1. Generated using Lattice Radiant Software 1.0.0.350.0 with Lattice Synthesis Engine, targeting the iCE40 UP5K-SG48I. Performance may vary when using a different software version or targeting a different device density or speed grade.
2. Fmax is obtained when the FPGA design contains only the Compact CNN Accelerator IP Core. These values may be reduced when user logic is added to the FPGA design.
3. In the Scratch Pad column, K denotes kilobytes. For example, 1K is 1 kB of scratch pad memory.
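
To put the resource counts in the tables above in context, the following Python sketch converts a few rows into percentages of the target device. The UP5K totals used here (5280 LUTs, 30 EBR blocks, 4 SPRAM blocks) are assumptions taken from the iCE40 UltraPlus family data sheet rather than from this page, and the "SRAM" column is assumed to count SPRAM blocks. The dictionary layout and function name are hypothetical.

```python
# Assumed iCE40 UP5K device totals (from the family data sheet, not this page).
UP5K_TOTALS = {"luts": 5280, "ebr": 30, "spram": 4}

# A few rows copied from the BNN and CNN tables above.
# The "spram" values come from the "SRAM" column (assumed to be SPRAM blocks).
CONFIGS = {
    ("BNN", "EBRAM", "+1/0"): {"luts": 2419, "ebr": 27, "spram": 0},
    ("BNN", "SINGLE_SPRAM", "+1/-1"): {"luts": 2706, "ebr": 11, "spram": 1},
    ("CNN", "EBRAM", "1K"): {"luts": 2816, "ebr": 23, "spram": 0},
    ("CNN", "SINGLE_SPRAM", "4K"): {"luts": 3989, "ebr": 19, "spram": 1},
}


def utilization(used, totals=UP5K_TOTALS):
    """Return each resource as a percentage of the device total, to 0.1%."""
    return {name: round(100.0 * used[name] / totals[name], 1) for name in totals}


for key, used in CONFIGS.items():
    print(key, utilization(used))
```

Under these assumptions, the percentages give a rough idea of how much of the device is left for user logic, which is also when the Fmax caveat in note 2 applies.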

Documentation

Quick Reference
Downloads
TITLE                                        NUMBER            VERSION   DATE        FORMAT   SIZE
Compact-CNN-Accelerator-IP-Core-User-Guide   FPGA-IPUG-02038   1.1       9/24/2018   PDF      898.6 KB
Compact CNN Accelerator IP Package                             1.0       9/24/2018   ZIP      199.6 KB

