# Parallel implementation of Thomas algorithm for the 2D heat equation

### Abstract

In this paper was considered a parallel implementation of the Thomas algorithm for the 2D heat equation. MPI was chosen as the technology for parallelization. The numerical solution of the two-dimensional heat conduction problem was solved using the two step iteration process of alternating direction implicit method (ADI). The Thomas algorithm is simple to implement a sequential program, but difficult to parallelize due to dependent data transfers. When applying this method to solve the 2D heat equation, there is a need to implement Thomas algorithm along each direction x-axis and y-axis. It was implemented the parallelization of this problem using the Yanenko method using 1D and 2D data decomposition. In particular, in 2D data decomposition, the Yanenko method was used along each x-axis and y-axis direction. In the article the speedup and efficiency of parallel programs using 1D and 2D data decomposition were shown in the form of tables and graphs. The presented algorithm was tested on a cluster of the computing center of Novosibirsk State University for a different number of points in the computational domain (from 512x512 to 4096x4096). The obtained test results are presented and analyzed, on the basis of which the features of the used decompositions are described.### References

[1] Samarskiy A.A., Nickolayev Ye.S., Metody resheniya setochnykh uravneniy [Methods of grid equations solving] , (Moscow: The science, 1987), 130.

[2] Samarskiy A.A., Vvedeniye v chislennyye metody [Introduction to numerical methods] , (St. Petersburg: Deer, 2005).

[3] Davidson A., Zhang Y., Owens J., "An auto-tuned method for solving large tridiagonal systems on the GPU" , (Conference Parallel & Distributed Processing Symposium (IPDPS), IEEE International, May 16-20, 2011): 956-965.

[4] Kim H.S., Wu S., Chang L.W., Hwu W.M., "A scalable tridiagonal solver for GPUs." , (In Parallel Processing (ICPP), International Conference, September 13-16, 2011): 444 453.

[5] Hockney R.W., Jesshope C.R., Parallel computers: architecture, programming and algorithm , (Bristol: 1986), 274-280.

[6] Chang L.W., Stratton J.A., Kim H.S., Hwu W.M., "A scalable, numerically stable, highperformance tridiagonal solver using GPUs. " ,(High Performance Computing, Networking, Storage and Analysis (SC), for IEEE Computer Society Press, November 10-16, 2012): 11.

[7] Polizzi E., Sameh A., "A parallel hybrid banded system solver: The SPIKE algorithm " , Parallel Computing, vol. 32, no 2 (2006): 177-194.

[8] Polizzi E., Sameh A., "A parallel environment for solving banded linear systems" , Computers and Fluids, vol. 36, no 1 (2007): 113-120.

[9] Loganova L.V., Golovashkin D.L., "Parallel'nyy algoritm realizatsii metoda vstrechnykh tsiklicheskikh progonok dlya dvumernykh setochnykh oblastey[Parallel algorithm for implementing the counter cyclic sweep method for two-dimensional grid regions]" , Journal of Computing Technologies, vol. 16, no 4 (2011): 64-71.

[10] Yanenko N.N., Konovalov A.N., Bugrov A.N., Shustov G.V., "Ob organizatsii parallel'nykh vychisleniy i rasparallelivanii progonki [On the organization of parallel computing and parallelizing sweeps]" , Computational methods of continuum mechanics, vol. 9, no 7 (1978): 139-146.

[11] Sapronov I.S., Bykov A.N., "Parallel'no-konveyyernyy algoritm [Parallel pipeline algorithm]" , Atom, no 44 (2009): 24-25.

[12] Povitsky A., "Parallelization of the pipelined Thomas algorithm" , Journal of Parallel and Distributed Computing, vol. 59, no 1 (1999): 68-97.

[13] Sengupta S., Harris M., Zhang Y., Owens J.D., "Scan primitives for GPU computing" , (Proceedings of the 22nd ACM SIGGRAPH/EUROGRAPHICS symposium on Graphics hardware, August 04-05, 2007): 97-106.

[14] Goddeke D., Strzodka R., "Cyclic reduction tridiagonal solvers on GPUs applied to mixed-precision multigrid" , IEEE Transactions on Parallel and Distributed Systems vol. 22, (2011): 22-32.

[15] Davidson A., Owens J.D., "Register packing for cyclic reduction: A case study" , (Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units, March, 2011): 65-79.

[16] Eglo D., "High performance nite di erence PDE solvers on GPUs" , Wilmott magazine, (2010): 32-40.

[17] Zhang Y., Cohen J., "Fast tridiagonal solvers on the GPU" , (Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, January 09-14, 2010): 127-136

[18] Fedorov A.A., Bykov A.N., "Metod dvukhurovnevogo rasparallelivaniya progonki dlya resheniya trekhdiagonal'nykh lineynykh sistem na gibridnykh EVM s mnogoyadernymi soprotsessorami [A method of two-level parallelization of sweeps for solving tridiagonal linear systems on hybrid computers with multi-core coprocessors]" , Computational methods and programming, vol. 17 (2016): 234-244.

[19] Terekhov A.V., "Parallel dichotomy algorithm for solving tridiagonal system of linear equations with multiple right-hand sides" , Parallel Computing, vol. 36 (2010): 423-438.

[20] Akimova Y.N., "Rasparallelivaniye algoritma matrichnoy progonki [Parallelization of matrix Thomas algorithm ]" , Mathematical modeling, no 9 (1994): 61-67.

[21] Ilyin V.P., "Parallel'nyye neyavnyye metody peremennykh napravleniy [Parallel implicit methods of alternating directions]" , Journal of Computational Mathematics and Physics, vol. 37, no 8 (1997): 899-907.

[22] Bykov A.N., Yerofeyev A.M., Sizov Ye.A., Fedorov A.A., "Metod rasparallelivaniya progonki na gibridnykh EVM [Parallel Thomas method on hybrid computers]" , Computational methods and programming, vol. 14, no 2 (2013): 43-47.

[23] Ilyin S.A., Starchenko A.V., "Rasparallelivaniye skhemy pokomponentnogo rasshchepleniya dlya chislennogo resheniya uravneniya teploprovodnosti [Parallelization of the component-wise splitting scheme for the numerical solution of the heat equation]" , (Parallel Computing Technologies, Proceedings of the international scienti c conference, August 31 - September 4, 2015):399-402.

[24] Kenzhebek Y.G., "Algoritm Yanenko dlya odnomernogo uravneniya teploprovodnosti [Yanenko algorithm for the one- dimensional heat equation]" , Herald of KazNITU, no 2 (2019): 512-516.

[25] "Using software" , Information and Computing Center of Novosibirsk State University, accessed July 13, 2019, http://nusc.nsu.ru/wiki/doku.php.

[2] Samarskiy A.A., Vvedeniye v chislennyye metody [Introduction to numerical methods] , (St. Petersburg: Deer, 2005).

[3] Davidson A., Zhang Y., Owens J., "An auto-tuned method for solving large tridiagonal systems on the GPU" , (Conference Parallel & Distributed Processing Symposium (IPDPS), IEEE International, May 16-20, 2011): 956-965.

[4] Kim H.S., Wu S., Chang L.W., Hwu W.M., "A scalable tridiagonal solver for GPUs." , (In Parallel Processing (ICPP), International Conference, September 13-16, 2011): 444 453.

[5] Hockney R.W., Jesshope C.R., Parallel computers: architecture, programming and algorithm , (Bristol: 1986), 274-280.

[6] Chang L.W., Stratton J.A., Kim H.S., Hwu W.M., "A scalable, numerically stable, highperformance tridiagonal solver using GPUs. " ,(High Performance Computing, Networking, Storage and Analysis (SC), for IEEE Computer Society Press, November 10-16, 2012): 11.

[7] Polizzi E., Sameh A., "A parallel hybrid banded system solver: The SPIKE algorithm " , Parallel Computing, vol. 32, no 2 (2006): 177-194.

[8] Polizzi E., Sameh A., "A parallel environment for solving banded linear systems" , Computers and Fluids, vol. 36, no 1 (2007): 113-120.

[9] Loganova L.V., Golovashkin D.L., "Parallel'nyy algoritm realizatsii metoda vstrechnykh tsiklicheskikh progonok dlya dvumernykh setochnykh oblastey[Parallel algorithm for implementing the counter cyclic sweep method for two-dimensional grid regions]" , Journal of Computing Technologies, vol. 16, no 4 (2011): 64-71.

[10] Yanenko N.N., Konovalov A.N., Bugrov A.N., Shustov G.V., "Ob organizatsii parallel'nykh vychisleniy i rasparallelivanii progonki [On the organization of parallel computing and parallelizing sweeps]" , Computational methods of continuum mechanics, vol. 9, no 7 (1978): 139-146.

[11] Sapronov I.S., Bykov A.N., "Parallel'no-konveyyernyy algoritm [Parallel pipeline algorithm]" , Atom, no 44 (2009): 24-25.

[12] Povitsky A., "Parallelization of the pipelined Thomas algorithm" , Journal of Parallel and Distributed Computing, vol. 59, no 1 (1999): 68-97.

[13] Sengupta S., Harris M., Zhang Y., Owens J.D., "Scan primitives for GPU computing" , (Proceedings of the 22nd ACM SIGGRAPH/EUROGRAPHICS symposium on Graphics hardware, August 04-05, 2007): 97-106.

[14] Goddeke D., Strzodka R., "Cyclic reduction tridiagonal solvers on GPUs applied to mixed-precision multigrid" , IEEE Transactions on Parallel and Distributed Systems vol. 22, (2011): 22-32.

[15] Davidson A., Owens J.D., "Register packing for cyclic reduction: A case study" , (Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units, March, 2011): 65-79.

[16] Eglo D., "High performance nite di erence PDE solvers on GPUs" , Wilmott magazine, (2010): 32-40.

[17] Zhang Y., Cohen J., "Fast tridiagonal solvers on the GPU" , (Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, January 09-14, 2010): 127-136

[18] Fedorov A.A., Bykov A.N., "Metod dvukhurovnevogo rasparallelivaniya progonki dlya resheniya trekhdiagonal'nykh lineynykh sistem na gibridnykh EVM s mnogoyadernymi soprotsessorami [A method of two-level parallelization of sweeps for solving tridiagonal linear systems on hybrid computers with multi-core coprocessors]" , Computational methods and programming, vol. 17 (2016): 234-244.

[19] Terekhov A.V., "Parallel dichotomy algorithm for solving tridiagonal system of linear equations with multiple right-hand sides" , Parallel Computing, vol. 36 (2010): 423-438.

[20] Akimova Y.N., "Rasparallelivaniye algoritma matrichnoy progonki [Parallelization of matrix Thomas algorithm ]" , Mathematical modeling, no 9 (1994): 61-67.

[21] Ilyin V.P., "Parallel'nyye neyavnyye metody peremennykh napravleniy [Parallel implicit methods of alternating directions]" , Journal of Computational Mathematics and Physics, vol. 37, no 8 (1997): 899-907.

[22] Bykov A.N., Yerofeyev A.M., Sizov Ye.A., Fedorov A.A., "Metod rasparallelivaniya progonki na gibridnykh EVM [Parallel Thomas method on hybrid computers]" , Computational methods and programming, vol. 14, no 2 (2013): 43-47.

[23] Ilyin S.A., Starchenko A.V., "Rasparallelivaniye skhemy pokomponentnogo rasshchepleniya dlya chislennogo resheniya uravneniya teploprovodnosti [Parallelization of the component-wise splitting scheme for the numerical solution of the heat equation]" , (Parallel Computing Technologies, Proceedings of the international scienti c conference, August 31 - September 4, 2015):399-402.

[24] Kenzhebek Y.G., "Algoritm Yanenko dlya odnomernogo uravneniya teploprovodnosti [Yanenko algorithm for the one- dimensional heat equation]" , Herald of KazNITU, no 2 (2019): 512-516.

[25] "Using software" , Information and Computing Center of Novosibirsk State University, accessed July 13, 2019, http://nusc.nsu.ru/wiki/doku.php.

Published

2019-10-28

How to Cite

KENZHEBEK, Yerzhan et al.
Parallel implementation of Thomas algorithm for the 2D heat equation.

**Journal of Mathematics, Mechanics and Computer Science**, [S.l.], v. 103, n. 3, p. 31-42, oct. 2019. ISSN 2617-4871. Available at: <https://bm.kaznu.kz/index.php/kaznu/article/view/652>. Date accessed: 22 oct. 2020. doi: https://doi.org/10.26577/JMMCS-2019-3-24.
Section

Computer Science

Keywords
high-performance computing, Thomas algorithm, Yanenko method, parallel computing, ADI method, MPI