- Workshop presentation
- Open Access
FPGA-based acceleration of MRI registration: an enabling technique for improving MRI-guided cardiac therapy
© Kwok et al.; licensee BioMed Central Ltd. 2014
- Published: 16 January 2014
- Registration Method
- Float Point Operation
- Robust Registration
- Computation Bottleneck
- Neighboring Gradient
Quantification of edema and scar maps with cardiac MR images (cMRIs) enables effective radiofrequency ablation (RFA) of arrhythmias during the electrophysiology (EP) procedure. This is a major advantage over EP catheterization under X-ray and ultrasound guidance. High-contrast, high-resolution cMRIs can be obtained preoperatively as an EP roadmap for surgical planning of RFA, whilst real-time MRI (rt-MRI) can be used to guide catheterization and update the cMRI model to provide intraoperative visualization of a 3D vascular map. This requires a fast and efficient technique for non-rigid image co-registration. Although feature-based registration methods can be processed rapidly by computing sparse features, their outcome is sensitive to blurring and artifacts, which occur regularly in low-resolution rt-MRI and cause significant errors in feature detection. We hypothesized that, on a field-programmable gate array (FPGA), a novel data structure and memory-access architecture can allow robust registration based on comparison of image intensity patterns, thus fulfilling the real-time requirements of clinical practice.
Acquiring the image gradient is a common step in intensity-based registration methods (e.g. Demons), but it is also the primary computation bottleneck. Image gradient computation requires information from the pixel/voxel neighborhood, leading to a large number of non-coalesced memory accesses and floating-point operations. We propose a customized FPGA-based computation kernel for Demons. Multiple pixel/voxel processing units (PUs) are placed in the FPGA, each with its own pixel/voxel memory. Input pixels/voxels are processed as a data stream that propagates through the kernel. The workload is distributed among the PUs such that neighboring gradients are handled by neighboring PUs, which exchange data directly; memory bandwidth is thereby further reduced. Rapid computation of image registration is achieved by 1) the highly customized PUs; 2) the parallelism of multiple PUs and pixel/voxel memories; and 3) bandwidth reduction through inter-PU information exchange channels.
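To make the gradient bottleneck concrete, the following is a minimal software sketch of one Demons-style update on a 2D image, assuming the classic Thirion form of the update and central differences for the gradient. It is not the FPGA kernel described here; the function names `central_gradient` and `demons_update`, the clamped border handling, and the sign convention are all assumptions made for illustration.

```python
def central_gradient(img):
    """Central-difference gradient of a 2D image given as a list of rows.

    Neighbor indices are clamped at the border (an assumption of this
    sketch), so border values are half of a one-sided difference.
    """
    h, w = len(img), len(img[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            xl, xr = max(x - 1, 0), min(x + 1, w - 1)
            yu, yd = max(y - 1, 0), min(y + 1, h - 1)
            # Each output pixel reads four neighbors: this neighborhood
            # access is what drives the memory traffic.
            gx[y][x] = (img[y][xr] - img[y][xl]) / 2.0
            gy[y][x] = (img[yd][x] - img[yu][x]) / 2.0
    return gx, gy


def demons_update(fixed, moving, eps=1e-9):
    """One per-pixel displacement update, Thirion-style (assumed form):

        u = (m - f) * grad(f) / (|grad(f)|^2 + (m - f)^2)
    """
    gx, gy = central_gradient(fixed)
    h, w = len(fixed), len(fixed[0])
    ux = [[0.0] * w for _ in range(h)]
    uy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            diff = moving[y][x] - fixed[y][x]
            denom = gx[y][x] ** 2 + gy[y][x] ** 2 + diff ** 2
            if denom > eps:  # leave the update at zero where it is undefined
                ux[y][x] = diff * gx[y][x] / denom
                uy[y][x] = diff * gy[y][x] / denom
    return ux, uy


# Toy input: a horizontal intensity ramp and a copy brightened by 1.
fixed = [[float(x) for x in range(4)] for _ in range(3)]
moving = [[v + 1.0 for v in row] for row in fixed]
ux, uy = demons_update(fixed, moving)
```

Every output pixel depends only on its immediate neighbors, so in a streaming design adjacent pixels can be assigned to adjacent units that pass boundary values to each other instead of re-reading them from memory.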
The performance of the proposed computing architecture demonstrates its high potential for accelerating registration of 3D-gated MRI images to improve visualization in MRI-guided cardiac therapy.
Funding: NIH U41-RR019703, R43 HL110427-01, AHA 10SDG261039, EPSRC and Croucher Foundation Fellowship.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.