Web-Oriented Tools for Data Analytics
The amount of open data is increasing, and it can be used effectively in different fields of study.
There are numerous web-oriented services that allow users to visualize open data, such as World Bank Open Data (https://data.worldbank.org/), Google Public Data Explorer (https://www.google.com/publicdata/directory), the Global Health Observatory (https://www.who.int/gho/en/), and the Registry of Open Data (RODA) on AWS (https://registry.opendata.aws/). These services present open data as ready-made graphical visualizations. However, a tool that allows users to manipulate open data flexibly could be more efficient.
An online service was developed by integrating the following frameworks: Laravel (back-end), Vue.js (front-end), Bootstrap, Pyodide, Highcharts, CodeMirror.
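As a rough illustration of the kind of flexible manipulation such a tool could support, the following Python sketch reshapes an open-data response and derives a new quantity from it (Python being the language that Pyodide runs in the browser). It is not the service's actual code. It assumes the payload follows the two-element JSON layout `[metadata, records]` used by the World Bank API, with `date` and `value` fields in each record; the function names `to_series` and `year_over_year_change` are hypothetical.

```python
import json


def to_series(payload_text):
    """Turn a [metadata, records] JSON payload into sorted (year, value) pairs.

    Assumption: each record carries a "date" (year as a string) and a
    "value" (number or null), as in World Bank style indicator responses.
    """
    _metadata, records = json.loads(payload_text)
    points = [
        (int(record["date"]), float(record["value"]))
        for record in records
        if record["value"] is not None  # skip missing observations
    ]
    return sorted(points)


def year_over_year_change(series):
    """Percentage change between consecutive available observations."""
    return [
        (year, 100.0 * (value - prev_value) / prev_value)
        for (_prev_year, prev_value), (year, value) in zip(series, series[1:])
        if prev_value != 0  # avoid division by zero
    ]
```

The resulting list of pairs could then be passed to a charting library such as Highcharts, so that the user plots a derived series rather than only the published one.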
The developed service has the following functions (Figure 9.7):
Figure 9.7 Interface of the developed service.
The typical scenario for the usage of the developed service includes the following steps:
Ethics, Regulations, and Law Constraints for Data Analytics
Analytics and AI are powerful tools that have real-world outcomes. Applying practical, ethical, and legal constructs and scenarios helps to obtain effective analytics results.
The GDPR, which became applicable on May 25, 2018, guarantees citizens the ability to decide how their data is processed, either through a series of options linked to each of the uses that companies make of that data or by exercising the rights recognized in the regulation itself.
Consequently, the new legislation will also limit the use of Big Data that many companies have been developing for commercial or security purposes. It was precisely the concern this area raised among the European authorities that led to the development of a unifying regulation intended to close the regulatory gap that the digital revolution has opened in recent years.
Therefore, entities must change their strategies in this regard, as the illicit use of customer data could expose them to fines of up to EUR 20 million, or 4% of worldwide annual turnover if that is higher, for the most serious infringements.
In the search for new solutions that allow them to make legal use of the data they generate, companies will first have to carry out a risk analysis and appoint a person responsible for ensuring that customer data is used properly. To guarantee this, the companies’ own control bodies must have sufficient human and material resources to determine whether that use of data yields an illicit benefit.
In short, with the application of the GDPR, customers gain power and control over their data, while companies must comply with a series of obligations that limit their commercial use of that data. However, proper management of this information can be a great opportunity for companies, because it can lead to more direct and personalized advertising that goes beyond the current analysis and segmentation of customers.
Although the GDPR formally entered into force on May 24, 2016, its application was deferred until May 25, 2018, to give companies enough time to adapt their internal rules to the new legislation. However, according to a study by Leet Security, 88% of companies had not yet completed the process of adapting to the regulation.
This investigation reviewed some relevant data analytics techniques that have been applied to three different case studies: the prediction of sports competition results based on open data, the prediction of cold sickness, and companies’ cyber risk assessments.
We have demonstrated that the use of appropriate analysis tools can provide relevant information for the final purpose of each case study. Different parameters have been obtained to assess the quality of the model and the prediction obtained.
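The text does not list the quality parameters at this point. As an assumption, the sketch below computes three measures that are commonly used to assess forecast quality: mean absolute error (MAE), root mean squared error (RMSE), and mean absolute percentage error (MAPE). The function name `forecast_errors` and the choice of measures are illustrative, not taken from the case studies.

```python
import math


def forecast_errors(actual, predicted):
    """Return MAE, RMSE, and MAPE (in percent) for paired observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")

    residuals = [a - p for a, p in zip(actual, predicted)]
    n = len(residuals)

    mae = sum(abs(e) for e in residuals) / n
    rmse = math.sqrt(sum(e * e for e in residuals) / n)

    # MAPE is undefined where the actual value is zero, so those points are skipped.
    ratios = [abs(e / a) for e, a in zip(residuals, actual) if a != 0]
    mape = 100.0 * sum(ratios) / len(ratios) if ratios else float("nan")

    return {"mae": mae, "rmse": rmse, "mape": mape}
```

MAE and RMSE are expressed in the units of the predicted quantity, whereas MAPE is scale-free. That is one reason no single measure suits every scenario.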
The conclusion is that no single method, model, or approach provides the best results in every scenario. The different parameters must be studied appropriately; in some cases the data must be preprocessed, and selecting the most appropriate regression or classification method is not trivial.