Modeling the processing of a large amount of data

Authors

  • G. T. Balakayeva, al-Farabi Kazakh National University, Almaty, Republic of Kazakhstan
  • D. K. Darkenbayev, al-Farabi Kazakh National University, Almaty, Republic of Kazakhstan

DOI:

https://doi.org/10.26577/jmmcs-2018-1-490

Keywords:

Large amounts of data, data processing, analysis, modeling, methods

Abstract

The term Big Data refers to technologies for storing and analyzing volumes of data large enough
that processing them requires high speed and real-time decision-making. Typically, when serious
analysis is mentioned, especially under the term Data Mining, it is implied that a huge amount of
data is involved. There are no universal analysis methods or algorithms suitable for every case
and every volume of information. Data analysis methods differ significantly in performance,
quality of results, usability, and data requirements. Optimization can be carried out at various
levels: hardware, databases, the analytical platform, preparation of source data, and specialized
algorithms. Big Data is a set of technologies designed to perform three operations. First, they
must process amounts of data that are large compared with "standard" scenarios. Second, they must
be able to work with data that arrives quickly and in very large volumes; that is, the data is not
only large but also constantly growing. Third, they must be able to work with structured and
poorly structured data in parallel and from different perspectives. Big Data implies that the
algorithms receive as input a stream of information that is not always structured, and that more
than one idea can be extracted from it. The authors use the results of the study in modeling
large data and in developing a web application.

Published

2018-08-27