 Technical note
 Open access
Accelerating the computation for real-time application of the sinc function using graphics processing units
Journal of Analytical Science and Technology volume 11, Article number: 8 (2020)
Abstract
In magnetic resonance imaging, the fidelity of image reconstruction is an important criterion. It has been suggested that the infinite-extent sinc kernel is the ideal interpolation kernel for ensuring the reconstruction quality of non-Cartesian trajectories. However, the application of the sinc function has been limited owing to its computational overhead. Recently, graphics processing units (GPUs) have been employed as fast computation tools because of their efficient and versatile parallel computation abilities. We implemented an accelerated convolution function with the sinc kernel using GPU computing and evaluated the reconstruction performance. The computation time was significantly improved: computation using the proposed method was approximately 270 times faster than that on a central processing unit (CPU) and approximately 4.6 times faster than that on a CPU optimized by level-3 Basic Linear Algebra Subprograms. The images reconstructed using the fast sinc function exhibited no adverse errors at any matrix size (resolution). The total reconstruction time was approximately 0.3–3 s for all matrices, indicating that the sinc function could be a practical option for image reconstruction. Ultimately, its application would present a fundamental improvement to the performance of image reconstruction, and the GPU implementation of the convolution function with the sinc kernel could resolve various challenges in image data processing.
Introduction
Magnetic resonance imaging (MRI) has been widely used in medical imaging as a safe and noninvasive method for the detection and prognosis of diseases (Kraff et al. 2015; Stone et al. 2008; Wright et al. 2014). It has advanced from two- and three-dimensional imaging to four-dimensional acquisition and has been combined with parallel imaging or compressed sensing techniques for rapid scanning (Hansen et al. 2008; Nam et al. 2013; Pratx and Xing 2011; Smith et al. 2012). The benefits of these acquisition methods are generally due to superior spatial resolution, which enhances the diagnosis of diseases (Hansen et al. 2008; Kraff et al. 2015; Nam et al. 2013). However, high computational overheads are incurred by the large datasets and the complex reconstruction process that solves the optimization problem (Hansen et al. 2008; Kraff et al. 2015; Nam et al. 2013; Pratx and Xing 2011; Smith et al. 2012). As the entire sequence time depends on both the amount of data processing and the collection of k-space data (Kasper et al. 2018), a reasonable balance between higher image quality (IQ) and acquisition efficiency is required in a clinical setting (Hansen et al. 2008; Nam et al. 2013; Pratx and Xing 2011).
The drop in IQ related to motion artifacts can be mitigated by decreasing the total scan time (Kasper et al. 2018; Stone et al. 2008). Non-Cartesian (NC) imaging has emerged as an alternative to standard Cartesian imaging owing to its scanning speed (Schiwietz et al. 2006; Wright et al. 2014). There are various approaches to reconstructing an image from NC raw data (Pauly 2005), but the convolution function method of resampling the data is preferred because it preserves reasonable IQ (Rasche et al. 1999). Its performance is closely related to the characteristics of the kernel used in the reconstruction process (Jackson et al. 1991; O’Sullivan 1985; Rasche et al. 1999). Jackson et al. improved results using an oversampling factor and distinct kernel widths (Jackson et al. 1991). However, the optimal performance depended on particular parameter settings. The infinite-extent sinc kernel has been suggested as an ideal interpolation kernel; however, its clinical application has been hindered by computational limitations (Bernstein et al. 2004; Jackson et al. 1991; O’Sullivan 1985). Several studies have stated that the use of the sinc interpolation function could reduce digitization error in various fields (Bernstein et al. 2004; Wang and Liu 2015). A previous study employed the sinc kernel in mechanics rather than in MRI; the frequency offset yielded by the sinc function was significantly less than that yielded by traditional fixed-kernel functions (Li et al. 2017). Hence, reducing the reconstruction time of the ideal kernel may substantially improve the accuracy of the approximation, leading to a remarkable reconstruction performance by reducing artifacts related to mismatches.
Graphics processing units (GPUs) have been considered a tool for fast computation because of their efficient and versatile parallel computation (Pratx and Xing 2011). GPU programs, which are written as extensions of standard C programs, can be employed without a detailed understanding of the hardware structure (Hansen et al. 2008; Nam et al. 2013; Pratx and Xing 2011). Consequently, approximately 1600 articles on the use of GPUs in the MRI field were published from 2005 to 2016. These reports have demonstrated that intensive calculations on a GPU could be accelerated by a factor of 2–285 compared with computation on a central processing unit (CPU) (Wang et al. 2018). Guo et al. demonstrated that the reconstruction performance for the PROPELLER trajectory can be improved by a factor of 9, with suitable image quality (Guo et al. 2009). In addition, an enhanced algorithm, the reverse gridding algorithm, improved the computation by approximately 7.5 times by using GPUs (Yang et al. 2013). Moreover, compressed sensing reconstruction for 3D radial trajectories has been accelerated by approximately 54 times in cardiac MR imaging (Nam et al. 2013); the authors suggested that GPU computing is suitable for real-time reconstruction. Although GPU implementations have mostly focused on massive reconstruction algorithms, the use of a GPU could sufficiently resolve the fundamental challenge of sinc function computation.
Many functions are presently implemented on GPUs, for instance, a parallel implementation of the non-equispaced fast Fourier transform for arbitrary trajectories (Sørensen et al. 2008), but the application of the sinc function on a GPU for the NC reconstruction process is yet to be reported. We hypothesized that the images reconstructed by sinc interpolation on GPUs do not differ from the reference images and that the fast sinc function can be practically utilized in clinical settings. In this article, we review part of the theoretical concept and present an implementation of an accelerated convolution function with the sinc kernel. We then report its computational performance for different spatial resolutions and evaluate its reconstruction performance. Using the proposed strategy, the computation time is reduced to a level suitable for real-time applications. Reconstruction fidelity is demonstrated by the close reproduction of reference images. Lastly, we conclude the paper with a short summary of the current study and mention the scope for future work.
Methods
Theory
We utilized the formulation of the inverse gridding (INV) operation on a 2D non-Cartesian trajectory because it is the most time-consuming step of non-Cartesian reconstruction [5]. This mathematical formulation partially follows those of Rasche et al. (1999) and Pauly (2005), and the objective is to pass a function over the data sampled on a rectilinear grid. Let m(x, y) and M(k_{x}, k_{y}) be a Fourier transform pair. To perform INV, m is divided by the kernel c(x, y) for deapodization. Here, m is the intermediate image on Cartesian sampling points obtained by the gridding function as follows:
In k-space, this yields
At this stage, M_{k} remains on the Cartesian data points. To estimate the non-Cartesian data, the image M_{k} is convolved with the kernel C(k_{x}, k_{y}) used in the gridding operation. Subsequently, it is sampled by the Shah function.
In this equation, we change the kernel C into the sinc function in the k-space domain as follows:
where ω′ = ω/N∆ω and N is the matrix size in the x and y directions. To reduce the computation time as much as possible, this equation is reformulated as the following matrix multiplication operation:
where M is the acquired k-space data, and H and E are matrices for the x- and y-axes, respectively. The first step is y-axis interpolation in k-space, yielding
To complete the convolution function of the 2D image, the successive step is performed by using the Hadamard product for the x-axis components.
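The two-step scheme described above (a matrix product for the y-axis, then a Hadamard product for the x-axis) can be illustrated with a minimal pure-Python sketch. This is not the authors' cuBLAS code, and it does not reproduce the paper's equations. It assumes a normalized sinc, sin(πt)/(πt), target coordinates expressed in grid-index units, and kernel matrices E and H whose entries are sinc(target − grid index). The helper names `kernel_matrix`, `matmul`, and `sinc_resample` are hypothetical.

```python
import math


def sinc(t):
    """Normalized sinc, sin(pi*t)/(pi*t), with sinc(0) = 1 (assumed convention)."""
    if t == 0.0:
        return 1.0
    return math.sin(math.pi * t) / (math.pi * t)


def kernel_matrix(targets, n):
    """Row j holds sinc(targets[j] - i) for the grid indices i = 0..n-1."""
    return [[sinc(t - i) for i in range(n)] for t in targets]


def matmul(a, b):
    """Dense product a @ b for lists of lists (stand-in for a BLAS GEMM call)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def sinc_resample(grid, kx, ky):
    """Interpolate grid[y][x] at the points (kx[j], ky[j]), in grid-index units."""
    n_rows, n_cols = len(grid), len(grid[0])
    E = kernel_matrix(ky, n_rows)  # J x n_rows weights for the y-axis
    H = kernel_matrix(kx, n_cols)  # J x n_cols weights for the x-axis
    T = matmul(E, grid)            # step 1: y-axis interpolation, J x n_cols
    # Step 2: Hadamard product of H and T, summed along each row.
    return [sum(h * t for h, t in zip(h_row, t_row)) for h_row, t_row in zip(H, T)]


if __name__ == "__main__":
    grid = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    # Sampling exactly on the grid points should return the grid values
    # up to floating-point round-off.
    kx = [0, 1, 2, 0, 1, 2, 0, 1, 2]
    ky = [0, 0, 0, 1, 1, 1, 2, 2, 2]
    print(sinc_resample(grid, kx, ky))
```

When the target points coincide with the grid, every kernel row reduces to a unit vector up to round-off, so the output matches the input to roughly machine precision. In a GPU version, the `matmul` step is the part that would map onto a GEMM routine.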
Reconstruction performance measurements
Reconstruction algorithms for the matrix calculations were built for both the CPU and the GPU. The sinc function was implemented on the CPU in two ways. First, MATLAB (MathWorks, MA, USA) was utilized to evaluate general computation. Subsequently, an advanced method was employed to maximize the speed of the data processing. This method is the level-3 Basic Linear Algebra Subprograms (BLAS) technique, which performs high-performance matrix–matrix operations (Dongarra et al. 1990). The GPU program was constructed almost identically to the CPU program by using the cuBLAS technique of the CUDA library (version 7.2). The CPU algorithm was implemented on a CPU with 8 GB of memory and a core clock speed of approximately 1200 MHz. The GPU algorithm was implemented on an NVIDIA GeForce GTX 1070 with 8 GB of global memory and a core clock speed of 1506 MHz.
The Shepp–Logan phantom (SL) image was employed for measuring the computational time and evaluating reconstruction performance. The matrix size was varied from 64 × 64 to 512 × 512 in a field of view of 240 × 240 mm. For the interpolation factor (IPT) setting of the sinc kernel, we evaluated the zero-padding effect with several values, ranging from 1 to 2 in steps of 0.25, at a single resolution setting (384 × 384). To fairly compare the original and reconstructed images, the voxel-based root-mean-squared error (RMSE) was employed. The RMSE increased once the IPT reached 1.25 (RMSE = 1.3677 × 10^{−15}, 0.0010, 0.0011, 0.0012, and 0.0012 for IPT = 1, 1.25, 1.5, 1.75, and 2, respectively; the first value is effectively 0.0 in digital processing), as illustrated in Fig. 1. This could be because the increased sampling rate due to the IPT enhances Gibbs-ringing artifacts in MR images (Bernstein et al. 2004). Thus, zero padding was not considered in our study. Each reconstruction process (the GPU and the two types of CPU processing) was performed 20 times. The average reconstruction times were recorded, and the reconstructed images were acquired. To validate the fidelity of the reconstruction on the GPU, the GPU images at all resolutions were subtracted from the images computed by the CPU libraries and from the SL image. Subsequently, an analysis of the RMSE and percent error (PE) was conducted. The PE is the RMSE of the reconstructed image divided by the root-mean-squared value of the reference image, as presented by Stone et al. (2008).
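The two error metrics described above can be sketched as follows. This is an illustrative pure-Python version rather than the authors' code. The function names are hypothetical, and scaling the PE by 100 is an assumption based on the word "percent"; the text only states that the PE is the RMSE divided by the root-mean-squared value of the reference.

```python
import math


def rmse(image, reference):
    """Voxel-based root-mean-squared error between two equally sized 2D images."""
    sq = [(a - b) ** 2
          for row_a, row_b in zip(image, reference)
          for a, b in zip(row_a, row_b)]
    return math.sqrt(sum(sq) / len(sq))


def rms(image):
    """Root-mean-squared value of a 2D image."""
    sq = [v * v for row in image for v in row]
    return math.sqrt(sum(sq) / len(sq))


def percent_error(image, reference):
    """RMSE divided by the RMS of the reference; the factor 100 is an assumption."""
    return 100.0 * rmse(image, reference) / rms(reference)


if __name__ == "__main__":
    reference = [[1.0, 0.0], [0.0, 1.0]]
    reconstructed = [[1.0, 0.0], [0.0, 0.8]]
    print(rmse(reconstructed, reference))           # about 0.1
    print(percent_error(reconstructed, reference))  # about 14.14
```

An RMSE on the order of 10^{−15} for images with values of order one is at the level of double-precision round-off, which is why such values are treated as zero in the text.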
Results
The average processing times of the sinc convolution were measured to compare the computational performances of the CPU and the GPU. Relative to the CPU, the computational time on the GPU decreased substantially as the amount of data increased. At the highest resolution, the total reconstruction times were 727.6, 12.5, and 2.7 s for the normal CPU (CPU_{ref}), the BLAS-optimized CPU (CPU_{opt}), and the GPU, respectively. The corresponding values for the respective matrix sizes are listed in Table 1. To evaluate the GPU’s performance, the speedup factors were calculated as follows: (1) each CPU time was divided by the GPU time and (2) CPU_{ref} was divided by CPU_{opt}. Although CPU_{opt} was faster than CPU_{ref} by approximately 58 times, the GPU was faster still, at approximately 270 times faster than CPU_{ref} and 4.6 times faster than CPU_{opt} (Fig. 2). In contrast, CPU_{opt} was faster than the GPU at low resolutions. This implies that, at low resolutions, the time required to transfer data to the GPU device was longer than the time needed to activate the threads (Cheng et al. 2014; Hansen et al. 2008; Smith et al. 2012). These results are in good agreement with those of previous studies (Hansen et al. 2008; Nam et al. 2013; Pratx and Xing 2011; Sørensen et al. 2008; Smith et al. 2012).
We evaluated the image reconstruction errors to validate the reconstruction fidelity of the GPU program. The GPU images were identical to the images reconstructed using the CPU methods in terms of RMSE (= 0.0), as shown in Fig. 3. Subsequently, we compared the GPU images to the SL image. Reconstruction errors appeared in the subtraction images; however, they were sufficiently trivial (RMSE = 1.58 × 10^{−15}). No meaningful reconstruction error was observed at any resolution (Fig. 4). The highest RMSE value was 0.0 (1.08 × 10^{−15}) and the PE was similar (4.38 × 10^{−13}). This indicates that the GPU-based reconstructed images closely matched the reference image.
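The speedup factors quoted above follow directly from the three timings reported for the highest resolution. The short sketch below only repeats that arithmetic; the dictionary keys and the `speedup` helper are illustrative names.

```python
# Total reconstruction times at the highest resolution, in seconds (from the text).
times_s = {"CPU_ref": 727.6, "CPU_opt": 12.5, "GPU": 2.7}


def speedup(slow_s, fast_s):
    """Speedup factor: how many times faster the second timing is than the first."""
    return slow_s / fast_s


ref_vs_gpu = speedup(times_s["CPU_ref"], times_s["GPU"])      # about 269.5
opt_vs_gpu = speedup(times_s["CPU_opt"], times_s["GPU"])      # about 4.63
ref_vs_opt = speedup(times_s["CPU_ref"], times_s["CPU_opt"])  # about 58.2

if __name__ == "__main__":
    print(f"CPU_ref / GPU     = {ref_vs_gpu:.1f}")
    print(f"CPU_opt / GPU     = {opt_vs_gpu:.2f}")
    print(f"CPU_ref / CPU_opt = {ref_vs_opt:.1f}")
```

These values round to the factors of approximately 270, 4.6, and 58 stated in the text.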
Discussion
The objective of this study was to reduce the computation time of the convolution function using the sinc kernel. Compared with the CPU-based computations, the GPU-based computation achieved a significant acceleration of 4.6 to 270 times. Moreover, the reconstructed images were virtually identical to the reference images. GPU-based processing tends to perform substantially better than CPU-based processing. This could lead to fundamental improvements in image reconstruction.
In clinical and/or research MRI settings, a fast reconstruction time is required for instantaneous and reliable responses with respect to IQ (Kasper et al. 2018; Pratx and Xing 2011; Smith et al. 2012). However, the computation time is closely related to the square of the number of data points (Pauly 2005) and the number of algorithms connected to the process (Oppenheim and Schafer 2014). Thus, the application of the sinc kernel, which is of infinite extent, has been practically limited (Bernstein et al. 2004; Jackson et al. 1991; O’Sullivan 1985). We significantly reduced the computation time of the sinc function to approximately 3 s (a reduction of around 63% compared with CPU_{opt}), suggesting that the GPU-based sinc function could be practically used in image reconstruction. The effect of GPU computing has been demonstrated by reducing massive calculations such as compressed sensing and/or parallel imaging techniques by 5–65% (Hansen et al. 2008; Nam et al. 2013; Pratx and Xing 2011; Stone et al. 2008). More specifically, the total reconstruction time in prior studies (Hansen et al. 2008; Nam et al. 2013; Smith et al. 2012) was approximately 3–150 s, which has been described as a real-time reconstruction condition at resolutions of 196^{2}–512^{2}. Hansen et al. suggested that low-latency reconstruction is suitable for real-time reconstruction, provided that the reconstruction is substantially faster than the data acquisition (Hansen et al. 2008). In addition, spiral acquisition for high-resolution 2D brain imaging (FOV 230 mm, 0.5 mm in-plane resolution) requires a minute at 7 T MRI (Kasper et al. 2018). Our reconstruction at 512 × 512 resolution took less than 3 s to complete. This indicates that our approach can achieve real-time image reconstruction using the sinc kernel.
Alternative kernels with a reasonable computation time not only generate aliasing artifacts but also attenuate the signal toward the edges of the field of view (Bernstein et al. 2004; Jackson et al. 1991; Rasche et al. 1999). They require compensation steps, which can accentuate artifacts (O’Sullivan 1985; Pauly 2005). In our results, the images reconstructed by the fast sinc function showed no adverse effects. This outstanding performance (RMSE and PE = 0.0) is attributable to the wide range of the kernel, which multiplies the center of the image domain by a constant value in a rectangular shape (Bernstein et al. 2004; O’Sullivan 1985; Rasche et al. 1999). This supports the view that a band-limited function undergoes the least transition due to apodization (Pauly 2005). We anticipate that a realistic application of this function could simplify the non-Cartesian reconstruction process. An efficient way to decrease the influence of several kernels has been demonstrated by applying an oversampling factor (Rasche et al. 1999), but zero-filling could increase the reconstruction time and induce truncation artifacts (Bernstein et al. 2004). We observed reconstruction errors caused by the IPT (RMSE = 0.001–0.0012), although they were minor. Hence, there was no IPT application step in our study. Moreover, the deapodization stage, which compensates for alternative kernels (Pauly 2005; Rasche et al. 1999), could be omitted owing to the excellent performance of the sinc function. Furthermore, this could presumably relieve the overheads of the iterative step for an actual trajectory estimation (Pauly 2005), which can additionally reduce the computation time. Consequently, the GPU-based sinc function would present fundamental improvements in image data processing.
Our GPU-based implementation is restricted to a maximum resolution of 512 × 512. This inherently depends on the size of the global memory in the GPU and could be improved by further parallelization methods such as utilizing shared cache memory access, grid-level concurrency, and multi-GPU techniques (Cheng et al. 2014; Pratx and Xing 2011). These methods should further increase the computation power for larger datasets. We used the Shepp–Logan phantom image as an intermediate image and employed a uniform sampling pattern. Hence, the initial reconstruction condition in the inverse gridding operation was not satisfied. To complete the process of non-Cartesian reconstruction, a gridding function with an identical kernel should be implemented for an intermediate image (Rasche et al. 1999) and the arbitrary sampling should be estimated for the actual trajectory (Pauly 2005; Wright et al. 2014). Because the IQ obtained by non-Cartesian acquisition has been competitively improved (Kasper et al. 2018), the performance of the entire reconstruction process should be demonstrated in in vivo imaging by a comparison with Cartesian acquisition.
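The quadratic dependence on the number of data points mentioned above can be made concrete with a rough operation count. The sketch below is a back-of-the-envelope estimate, not a measurement from the study. It assumes that a full-extent sinc kernel makes every output sample depend on every one of the N × N input samples, and that the number of output samples also equals N × N. The function name `direct_sinc_macs` is hypothetical.

```python
def direct_sinc_macs(n, targets=None):
    """Multiply-accumulate count for a full-extent kernel on an n x n grid.

    Each target sample sums over all n*n grid samples; by default the number
    of targets is assumed to equal the number of grid samples (n*n).
    """
    if targets is None:
        targets = n * n
    return targets * n * n


if __name__ == "__main__":
    for n in (64, 128, 256, 384, 512):
        print(n, direct_sinc_macs(n))
    # Going from 64 x 64 to 512 x 512 multiplies the data points by 64
    # and the operation count by 64 squared.
    print(direct_sinc_macs(512) // direct_sinc_macs(64))
```

Under these assumptions the count grows with the fourth power of the matrix size, which illustrates why a finite-width kernel has traditionally been preferred and why a highly parallel device helps at the larger matrix sizes.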
Conclusion
We implemented a GPU-based method for the accelerated computation of the sinc function. Its application enables a band-limited function to be practically used, resulting in an improved performance with few errors. A GPU-based MRI reconstruction could be used to dramatically reduce image delivery time to physicians and researchers. In addition, the GPU-based implementation of the convolution function with the sinc kernel may help resolve various challenges in the field of MRI research.
Availability of data and materials
The computational code sets used in this study are available from the corresponding author on reasonable request.
Abbreviations
BLAS: Basic Linear Algebra Subprograms
CPU: Central processing unit
CPU_{opt}: BLAS-optimized CPU time
CPU_{ref}: Normal CPU time
GPUs: Graphics processing units
INV: Inverse gridding
IPT: Interpolation factor
IQ: Image quality
MRI: Magnetic resonance imaging
NC: Non-Cartesian
PE: Percent error
RMSE: Root-mean-squared error
SL: Shepp–Logan phantom
References
Bernstein MA, King KF, Zhou XJ. Handbook of MRI pulse sequences. Amsterdam: Elsevier; 2004.
Cheng J, Grossman M, McKercher T. Professional CUDA C programming. Hoboken: Wiley; 2014.
Dongarra JJ, Duff I, Hammarling S, Du Croz J. A set of level 3 basic linear algebra subprograms: model implementation and test programs. ACM Trans Math Softw. 1990;16(1):1–17.
Guo H, Dai J, He Y. GPU acceleration of propeller MRI using CUDA. In: 2009 3rd international conference on bioinformatics and biomedical engineering: IEEE; 2009. p. 1–4. https://doi.org/10.1109/icbbe.2009.5162890.
Hansen MS, Atkinson D, Sorensen TS. Cartesian SENSE and k-t SENSE reconstruction using commodity graphics hardware. Magn Reson Med. 2008;59:463–8.
Jackson JI, Meyer CH, Nishimura DG, Macovski A. Selection of a convolution function for Fourier inversion using gridding (computerised tomography application). IEEE Trans Med Imaging. 1991;10:473–8.
Kasper L, Engel M, Barmet C, Haeberlin M, Wilm BJ, Dietrich BE, Schmid T, Gross S, Brunner DO, Stephan KE. Rapid anatomical brain imaging using spiral acquisition and an expanded signal model. Neuroimage. 2018;168:88–100.
Kraff O, Fischer A, Nagel AM, Monninghoff C, Ladd ME. MRI at 7 Tesla and above: demonstrated and potential capabilities. J Magn Reson Imaging. 2015;41:13–33.
Li Z, Zhu M, Chu F, He X. Adaptive radial sinc kernel distribution and its application in mechanical fault diagnosis. Proc Inst Mech Eng C. 2017;231:485–93.
Nam S, Akcakaya M, Basha T, Stehning C, Manning WJ, Tarokh V, Nezafat R. Compressed sensing reconstruction for whole-heart imaging with 3D radial trajectories: a graphics processing unit implementation. Magn Reson Med. 2013;69:91–102.
Oppenheim AV, Schafer RW. Discrete-time signal processing. Upper Saddle River: Prentice Hall; 2014.
O’Sullivan J. A fast sinc function gridding algorithm for Fourier inversion in computer tomography. IEEE Trans Med Imaging. 1985;4:200–7.
Pauly J. Non-Cartesian reconstruction. 2005. http://mriq.com/uploads/3/4/5/7/34572113/paulynoncartesian_recon.pdf. Accessed 8 Jan 2020.
Pratx G, Xing L. GPU computing in medical physics: a review. Med Phys. 2011;38:2685–97.
Rasche V, Proksa R, Sinkus R, Bornert P, Eggers H. Resampling of data between arbitrary grids using convolution interpolation. IEEE Trans Med Imaging. 1999;18:385–92.
Schiwietz T, Chang TC, Speier P, Westermann R. MR image reconstruction using the GPU. In: Medical imaging: physics of medical imaging; 2006. https://doi.org/10.1117/12.652223.
Smith DS, Gore JC, Yankeelov TE, Welch EB. Real-time compressive sensing MRI reconstruction using GPU computing and split Bregman methods. Int J Biomed Imaging. 2012;2012:1–6. https://doi.org/10.1155/2012/864827.
Sørensen TS, Schaeffter T, Noe KØ, Hansen MS. Accelerating the nonequispaced fast Fourier transform on commodity graphics hardware. IEEE Trans Med Imaging. 2008;27:538–47.
Stone SS, Haldar JP, Tsao SC, Hwu WM, Sutton BP, Liang ZP. Accelerating advanced MRI reconstructions on GPUs. J Parallel Distrib Comput. 2008;68:1307–18.
Wang H, Peng H, Chang Y, Liang D. A survey of GPU-based acceleration techniques in MRI reconstructions. Quant Imaging Med Surg. 2018;8(2):196–208.
Wang Y, Liu T. Quantitative susceptibility mapping (QSM): decoding MRI data for a tissue magnetic biomarker. Magn Reson Med. 2015;73:82–101.
Wright KL, Hamilton JI, Griswold MA, Gulani V, Seiberlich N. Non-Cartesian parallel imaging reconstruction. J Magn Reson Imaging. 2014;40:1022–40.
Yang J, Feng C, Zhao D. A CUDA-based reverse gridding algorithm for MR reconstruction. Magn Reson Imaging. 2013;31(2):313–23.
Acknowledgements
Not applicable.
Funding
This work was supported by the Technology Innovation Program (10067787) funded by the Ministry of Trade, Industry and Energy, Republic of Korea.
Author information
Contributions
Both authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Kim, S., Lee, C. Accelerating the computation for real-time application of the sinc function using graphics processing units. J Anal Sci Technol 11, 8 (2020). https://doi.org/10.1186/s4054302002051
DOI: https://doi.org/10.1186/s4054302002051