Cluster analysis application in the compulsory insurance of civil-legal liability of the vehicles’ owners
DOI:
https://doi.org/10.26577/JMMCS-2019-2-28
Keywords:
cluster analysis, unsupervised machine learning, k-means algorithm, insurance, underwriting analysis
Abstract
As insurance organizations in Kazakhstan process and store more information, driven by growing
customer bases, mergers and acquisitions, and the introduction of new insurance products,
preliminary processing of that information (structuring it, identifying distinctive attributes,
generalizing and sorting) becomes increasingly important. Without an appropriate scientific and
methodological approach, data processing and analysis become more difficult for insurance
organizations and may require significant computing and financial resources. As a modern research
approach to this problem, the present article proposes cluster analysis by the k-means algorithm,
which simplifies the processing and further analysis of a data set by arranging the data into
relatively homogeneous groups. In particular, the article describes the application of k-means
cluster analysis to loss data for the class of compulsory insurance of civil-legal liability of
vehicle owners. The purpose of the article is to split the losses in this class of insurance into
homogeneous qualitative groups (clusters) based on the frequency and severity of losses, and to
interpret the resulting clusters. The results of the k-means algorithm confirm that each resulting
cluster contains statistically significant data with a similar impact on the loss process, which
may be used in the future to evaluate the losses of an insurance organization. The methodological
approaches and results obtained in this article will be of interest primarily to professional
participants in the insurance market of the Republic of Kazakhstan, who can use them to conduct
better underwriting research on forming an efficient structure of the insurance portfolio for
compulsory insurance of civil-legal liability of vehicle owners in accordance with tariff rates.
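
The grouping of losses by frequency and severity described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, and the function name kmeans, the two-column layout (claim frequency, mean claim severity), the choice of k = 2 and the standardization step are all assumptions made for illustration only.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on an (n, d) array; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct observations chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    # Final assignment so that labels match the returned centroids.
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), centroids


# Hypothetical data: one row per policy group,
# columns = (claim frequency, mean claim severity). Not real insurance data.
rng = np.random.default_rng(42)
low_risk = rng.normal([0.05, 200.0], [0.01, 20.0], size=(50, 2))
high_risk = rng.normal([0.20, 900.0], [0.02, 50.0], size=(50, 2))
losses = np.vstack([low_risk, high_risk])

# Standardise both columns so severity (large units) does not dominate frequency.
z = (losses - losses.mean(axis=0)) / losses.std(axis=0)
labels, centroids = kmeans(z, k=2)

# Interpret clusters in the original units: mean frequency and severity per cluster.
for j in range(2):
    members = losses[labels == j]
    print(j, len(members), members.mean(axis=0))
```

Standardizing before clustering is a design choice of this sketch rather than something stated in the abstract; without it, Euclidean distance would be driven almost entirely by the severity column.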