Yaman Umuroglu edited this page Feb 13, 2016 · 4 revisions

The fpga-tidbits platform wrapper has several components for building cross-platform hardware accelerators. You (as the developer) use a "standardized" set of interfaces for developing your accelerator, and use the platform wrapper to deploy it onto different FPGA cards/systems.

Services

Accessing the platform's memory system. Your accelerator is provided with a configurable number of GenericMemoryMasterPort instances (see dma) for accessing the platform's memory system. You send "generic" memory requests on these ports and receive generic responses. The wrapper installs the adapters that convert the generic requests and responses to and from the platform-specific memory operations.

Easier software integration. Any signal-level I/Os you have in the accelerator are automatically packed into a register file, and a simple C++ driver is generated to give read/write access to these registers (with proper names retrieved from the I/O signals, instead of just register indices).
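
To illustrate the idea of named accessors layered over register indices, here is a minimal C++ sketch. It is not the code that the wrapper generates: RegFile, ExampleAccelDriver, the signal names (op0, op1, sum) and the register indices are all hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a register file: a flat array of registers
// addressed by index. The 32-bit register width is an assumption.
class RegFile {
public:
  explicit RegFile(std::size_t numRegs) : regs_(numRegs, 0) {}
  void writeReg(std::size_t index, std::uint32_t value) {
    assert(index < regs_.size());
    regs_[index] = value;
  }
  std::uint32_t readReg(std::size_t index) const {
    assert(index < regs_.size());
    return regs_[index];
  }
private:
  std::vector<std::uint32_t> regs_;
};

// Hypothetical shape of a name-based driver: one accessor per I/O signal,
// each hiding the register index that the signal was packed into.
class ExampleAccelDriver {
public:
  explicit ExampleAccelDriver(RegFile* regs) : regs_(regs) {
    assert(regs_ != nullptr);
  }
  void set_op0(std::uint32_t v) { regs_->writeReg(kOp0, v); }
  void set_op1(std::uint32_t v) { regs_->writeReg(kOp1, v); }
  std::uint32_t get_sum() const { return regs_->readReg(kSum); }
private:
  // Illustrative indices only; the real mapping is decided by the wrapper.
  enum : std::size_t { kOp0 = 1, kOp1 = 2, kSum = 3 };
  RegFile* regs_;
};
```

Application code then calls set_op0(7) rather than writing to register 1 directly, so it keeps working if the signal-to-index mapping changes. No accelerator sits behind this RegFile, so get_sum() just returns whatever is stored in the sum register.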

A virtual platform for testing in emulation. One of the target platforms is a "virtual platform" which (naively) implements the "main memory" and register file access services normally provided by the FPGA hardware platform. This allows you to use Chisel's fast emulator with the exact same software source code for testing your accelerator designs.

Jumpstart

  1. Derive your accelerator from GenericAccelerator instead of Module and its io Bundle from GenericAcceleratorIF. Set numMemPorts to the desired number of memory ports.
  2. Implement your accelerator functionality. You can add regular I/O signals and use the io.mainMem() ports for accessing the platform memory system.
  3. Instantiate the desired target's wrapper, and pass a function that instantiates your accelerator as the constructor argument. You can use the TesterWrapper to test your accelerator in emulation.
  4. Call the generateRegDriver function on the wrapper to create a register driver for your accelerator. Add this driver, together with the files for the target platform from platform-wrapper/regdriver, to your software project.

Examples

There are a few examples under platform-wrapper/tests that can help get you started. There is example code under platform-wrapper/tests/cpp for testing these accelerators in software.

  • TestRegOps is the most basic example: it uses no memory system access and simply adds two I/O signals, returning the result in a third.
  • TestSum is a basic example that uses memory system access to read and sum a sequence of 32-bit values.
  • TestMultiChanSum is similar to TestSum, except it creates a multi-channel setup to read and sum several sequences at once.
  • TestCopy copies a block of memory from one address to another.
  • TestRandomRead sums selected indices from a sequence ("gather" and sum).
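
When testing these examples from software, it can help to have a host-side reference computation to compare the accelerator's result against. The sketch below shows what such reference models might look like for the sum, multi-channel sum and gather-and-sum examples. The function names are hypothetical, and the 32-bit wrapping accumulator is an assumption rather than something taken from the example sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference for a TestSum-style computation: sum a sequence of 32-bit
// values. The accumulator wraps modulo 2^32 (assumed width).
std::uint32_t referenceSum(const std::vector<std::uint32_t>& data) {
  std::uint32_t acc = 0;
  for (std::uint32_t v : data) acc += v;
  return acc;
}

// Reference for a TestMultiChanSum-style computation: one independent
// sum per channel (sequence).
std::vector<std::uint32_t> referenceMultiChanSum(
    const std::vector<std::vector<std::uint32_t>>& channels) {
  std::vector<std::uint32_t> sums;
  sums.reserve(channels.size());
  for (const auto& ch : channels) sums.push_back(referenceSum(ch));
  return sums;
}

// Reference for a TestRandomRead-style computation: gather the elements
// at the given indices and sum them. Indices may repeat.
std::uint32_t referenceGatherSum(const std::vector<std::uint32_t>& data,
                                 const std::vector<std::uint32_t>& indices) {
  std::uint32_t acc = 0;
  for (std::uint32_t i : indices) {
    assert(i < data.size());  // an out-of-range index is a test bug
    acc += data[i];
  }
  return acc;
}
```

A test would fill a host buffer, copy it to the accelerator, run the accelerator, and compare the value it returns with the matching reference function.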

The Main.scala of fpga-tidbits is configured to generate these examples. For example:

  • sbt "run emulator TestSum Tester" -- will generate an emulator/ folder with contents for testing TestSum in Chisel emulation
  • sbt "run verilog TestSum ZedBoard" -- will generate Verilog for TestSum targeting the ZedBoard.

Supported FPGA platforms

Each FPGA platform to be targeted must be supported in the framework. Currently supported platforms are:

  • ZedBoard (software targets bare-metal) -- use ZedBoardWrapper
  • Convey Wolverine WX690T -- use WolverinePlatformWrapper
  • A "virtual platform" for testing in the Chisel emulator -- use TesterWrapper, but note that main memory is restricted to 512 MB

Building the hardware

The generated Verilog must still be fed to FPGA synthesis with the appropriate top-level module.

  • ZedBoard: use the IP core under platform-wrapper/axi/ip-cores/ZedBoardWrapper and instantiate it with the Vivado IP integrator. Add the generated Verilog to the project, set the compile order to manual, and move the generated file to the top of the compilation order.
  • Convey Wolverine WX690T: use the cae_pers.v under platform-wrapper/convey as part of a Convey project, but remember to set the correct memory port count in the project's Makefile.

Software and drivers

Similar to the hardware-level adapters in the wrapper, there is a thin software layer for each target platform that implements the register access and memory coherency functions. These files are found under platform-wrapper/regdriver. Here is a brief overview of what's there:

  • wrapperregdriver.h -- abstract class with the functions expected from each platform (register read/write, copy memory between host and accelerator)
  • platform.h -- declaration of the initPlatform() and deinitPlatform() functions
  • platform-(PLATFORMNAME).cpp -- implementation of the initPlatform() and deinitPlatform() functions
  • (PLATFORMNAME)regdriver.hpp -- implementation of the WrapperRegDriver functionality for the platform

When using your own accelerator's generated register driver, just pass the WrapperRegDriver pointer returned by initPlatform() to its constructor.
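
The layering described above can be sketched as follows. This is a self-contained illustration under stated assumptions, not the contents of platform-wrapper/regdriver: everything in the sketch namespace (RegDriver, FakePlatform, initFakePlatform, deinitFakePlatform, ExampleAccel, and all method names and signatures) is hypothetical, and the real WrapperRegDriver, initPlatform() and deinitPlatform() may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

namespace sketch {

// Hypothetical per-platform interface: register read/write plus copying
// memory between host and accelerator.
class RegDriver {
public:
  virtual ~RegDriver() = default;
  virtual void writeReg(std::size_t index, std::uint32_t value) = 0;
  virtual std::uint32_t readReg(std::size_t index) = 0;
  virtual void copyHostToAccel(const void* host, std::size_t accelAddr,
                               std::size_t numBytes) = 0;
  virtual void copyAccelToHost(std::size_t accelAddr, void* host,
                               std::size_t numBytes) = 0;
};

// Software-only platform used for illustration: registers and "accelerator
// memory" are plain host-side vectors.
class FakePlatform : public RegDriver {
public:
  FakePlatform(std::size_t numRegs, std::size_t memBytes)
      : regs_(numRegs, 0), mem_(memBytes, 0) {}
  void writeReg(std::size_t index, std::uint32_t value) override {
    assert(index < regs_.size());
    regs_[index] = value;
  }
  std::uint32_t readReg(std::size_t index) override {
    assert(index < regs_.size());
    return regs_[index];
  }
  void copyHostToAccel(const void* host, std::size_t accelAddr,
                       std::size_t numBytes) override {
    assert(accelAddr <= mem_.size() && numBytes <= mem_.size() - accelAddr);
    std::memcpy(mem_.data() + accelAddr, host, numBytes);
  }
  void copyAccelToHost(std::size_t accelAddr, void* host,
                       std::size_t numBytes) override {
    assert(accelAddr <= mem_.size() && numBytes <= mem_.size() - accelAddr);
    std::memcpy(host, mem_.data() + accelAddr, numBytes);
  }
private:
  std::vector<std::uint32_t> regs_;
  std::vector<unsigned char> mem_;
};

// Stand-ins for the platform init/deinit pair (sizes chosen arbitrarily).
RegDriver* initFakePlatform() { return new FakePlatform(16, 1024); }
void deinitFakePlatform(RegDriver* platform) { delete platform; }

// Accelerator-specific driver: receives the platform pointer in its
// constructor and only talks to the platform through the interface.
class ExampleAccel {
public:
  explicit ExampleAccel(RegDriver* platform) : platform_(platform) {
    assert(platform_ != nullptr);
  }
  void set_baseAddr(std::uint32_t v) { platform_->writeReg(1, v); }
  std::uint32_t get_baseAddr() { return platform_->readReg(1); }
private:
  RegDriver* platform_;
};

}  // namespace sketch
```

The accelerator driver depends only on the abstract interface, so the same application code can run against a different platform by swapping the init/deinit implementation that is compiled in.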

Recent changes

  • support resetting the accelerator via register file accesses -- just write a 0 or 1 to the signature register to set reset low or high
  • ZedBoard Linux platform support (via totally unsafe mmap'ing of the accelerator register memory addresses)
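
The register-based reset can be sketched as a pulse: write 1 to the signature register to raise reset, then 0 to lower it. In the sketch below, the signature register index (0), the writeReg signature and the RecordingRegs helper are assumptions made for illustration. Check the generated driver for the actual index and accessor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical register writer that just records every (index, value)
// write, so the reset sequence can be inspected without hardware.
struct RecordingRegs {
  std::vector<std::pair<std::size_t, std::uint32_t>> writes;
  void writeReg(std::size_t index, std::uint32_t value) {
    writes.emplace_back(index, value);
  }
};

// ASSUMPTION: the signature register lives at index 0.
constexpr std::size_t kSignatureRegIndex = 0;

// Pulse reset: writing 1 sets reset high, writing 0 sets it low again.
// Works with any type that offers writeReg(index, value).
template <typename Regs>
void pulseReset(Regs& regs) {
  regs.writeReg(kSignatureRegIndex, 1);
  regs.writeReg(kSignatureRegIndex, 0);
}
```

A real platform may need reset held high for some minimum time before it is released. This sketch does not model any such delay.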

TODO

  • add the signal-index mapping data (a map of string -> int) as part of the regdriver base interface
  • add illegal memory address checks in the Tester (virtual platform)
  • allow declaring "constraints" on I/O signal values, and propagate these to the software driver (e.g. "value must be divisible by 4")
  • tighter packing of registers into the register file; it's at least 1 register per signal right now, regardless of the actual signal width
  • ensure same regdriver is generated for all platforms (need to check signal I/O does not refer to the PlatformWrapperParameters)
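
To get a feel for what tighter register packing could save, the following sketch compares the register count when every signal gets its own register(s) against an idealized packing where signals share registers bit by bit. The function names, the example signal widths and the 32-bit register width are assumptions. The idealized count also ignores alignment, so a real packing scheme that avoids splitting a signal across registers could need more.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Registers needed when no two signals share a register: each signal
// takes ceil(width / regBits) registers, so at least one.
std::size_t regsOnePerSignal(const std::vector<unsigned>& widths,
                             unsigned regBits) {
  assert(regBits > 0);
  std::size_t total = 0;
  for (unsigned w : widths) {
    assert(w > 0);
    total += (w + regBits - 1) / regBits;
  }
  return total;
}

// Idealized lower bound: all signal bits laid out back to back, with
// signals allowed to straddle register boundaries.
std::size_t regsTightlyPacked(const std::vector<unsigned>& widths,
                              unsigned regBits) {
  assert(regBits > 0);
  std::size_t bits = 0;
  for (unsigned w : widths) bits += w;
  return (bits + regBits - 1) / regBits;
}
```

For signal widths of 1, 1, 8, 32 and 64 bits with 32-bit registers, the first function gives 6 registers and the second gives 4.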