Knowledge Mining Methodology

Step 1 concerns the proper preparation of data for modeling: handling missing values and reducing overall error by eliminating outliers and invalid values. Filtering should be applied at this stage, that is, choosing the type and range of data to be analyzed. For example, this may mean selecting a specific group of products, or narrowing the analysis to a subset of what is often millions of records. The data should also be transformed (mostly by normalization or standardization) if a specific DM method requires it. For example, MIN-MAX normalization before the model-building step is recommended for neural networks, because it reduces the risk that differences in the scale of the variables unduly influence the model results.
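The preparation described for step 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed procedure: the function name `prepare`, the sample `prices` list, the plausible range of 0 to 100, and the choice of the median to fill missing values are all assumptions made for the example.

```python
from statistics import median

def prepare(values, lower, upper):
    """Filter out-of-range values, fill missing ones, then min-max scale."""
    # Filtering: keep only values inside the plausible range [lower, upper].
    # None marks a missing value and is kept so that it can be filled in.
    kept = [v for v in values if v is None or lower <= v <= upper]
    # Missing values: replace None with the median of the observed values
    # (the median is an assumption; other imputation rules are possible).
    observed = [v for v in kept if v is not None]
    fill = median(observed)
    filled = [fill if v is None else v for v in kept]
    # MIN-MAX normalization: x' = (x - min) / (max - min), mapping to [0, 1].
    lo, hi = min(filled), max(filled)
    if hi == lo:
        return [0.0 for _ in filled]  # constant column: avoid division by zero
    return [(v - lo) / (hi - lo) for v in filled]

# Hypothetical sample: one missing value (None) and one invalid value (9999.0).
prices = [10.0, 12.0, None, 14.0, 9999.0, 20.0]
scaled = prepare(prices, lower=0.0, upper=100.0)
print(scaled)
```

After scaling, every variable lies in the same interval from 0 to 1, so no input dominates a neural network simply because it is measured in larger units.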

Step 2 is carried out mainly in software. Selected data mining methods are applied to perform the exploration tasks, using the dataset prepared in step 1. The speed of working with the database and building the model depends not only on the complexity of the problem itself, but also on the type, volume, and structure of the data (qualitative or quantitative data, supervised or unsupervised learning, the number of records, and the dependency between input and output variables). This stage is largely automatic, and its execution time depends not only on the complexity of the problem but also on the performance of the computing hardware. Data mining software does not require a powerful graphics card or large hard disks (distributed data storage and grid computing are used more and more often), but a good CPU and a large amount of RAM are important. That said, a powerful machine with a properly configured graphics card can speed up computation significantly.
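The distinction between models obtained with and without a teacher (supervised and unsupervised learning) can be illustrated with a small sketch. The data, the labels, and the two methods chosen here (a nearest-centroid classifier and plain k-means) are assumptions made for the example; the text does not prescribe any particular method.

```python
from statistics import mean

# Hypothetical records, already normalized to [0, 1]: (feature_1, feature_2).
records = [(0.1, 0.2), (0.15, 0.1), (0.2, 0.25),
           (0.8, 0.9), (0.85, 0.8), (0.9, 0.95)]
# Class labels are available only in the supervised case.
labels = ["low", "low", "low", "high", "high", "high"]

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(mean(axis) for axis in zip(*points))

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_supervised(points, targets):
    """With a teacher: compute one centroid per known class label."""
    groups = {}
    for p, t in zip(points, targets):
        groups.setdefault(t, []).append(p)
    return {t: centroid(ps) for t, ps in groups.items()}

def predict(model, point):
    """Assign the label whose centroid is closest to the point."""
    return min(model, key=lambda t: dist2(model[t], point))

def fit_unsupervised(points, k=2, iterations=10):
    """Without a teacher: plain k-means, seeded with the first k points."""
    centers = list(points[:k])
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:  # every iteration passes over all records
            nearest = min(range(len(centers)),
                          key=lambda i: dist2(centers[i], p))
            clusters[nearest].append(p)
        # Keep the old center if a cluster happens to be empty.
        centers = [centroid(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

model = fit_supervised(records, labels)
print(predict(model, (0.3, 0.3)))
print(sorted(fit_unsupervised(records, k=2)))
```

Both functions scan every record, and k-means does so once per iteration, which is one reason why the number of records directly affects the time needed to build a model.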

Step 3 concerns the interpretation of the results obtained in step 2. It is important that people with expertise in statistical methods and data mining take part in this work. Step 2 may seem relatively simple given the capabilities of a modern computer, but the other steps require specialized knowledge. Enterprises are coming to realize that data exploration projects require the cooperation of experts from multiple fields and departments. The literature proposes various data mining methodologies in the form of procedures for collecting and preparing data for analysis and for deploying the results in specific solutions.

Exercise

  • 1. What is meant by a time series?
  • 2. Describe the various components of a time series.
  • 3. What is the difference between data mining and knowledge mining?
  • 4. Describe the various stages of knowledge mining.
  • 5. Justify the use of Python in time series analysis.


 