Sequential Pattern - Data Mining

I am new to data mining, so I apologize if the answer is obvious. I know there are quite a few data mining techniques and algorithms out there, such as sequential pattern mining and the Apriori algorithm. I have a database of approximately 20,000 students. Would the following code count as data mining, specifically sequential pattern mining, or do I have to use one of the existing data mining algorithms?

String x = "SELECT STUDENTS.ROW, STUDENTS.MAJOR, STUDENTS.NAME, " +
    "CASE WHEN prior_row.NAME IS NOT NULL " +
    "AND EXISTS(SELECT 'x' FROM STUDENTS prior_row " +
    "WHERE STUDENTS.MAJOR = prior_row.MAJOR " +
    "AND STUDENTS.ROW > prior_row.ROW + 1 " +
    "SELECT STUDENTS.MAJOR, STUDENTS.ROW, STUDENTS.NAME WHERE " +
    "MAJOR < (SELECT MAJOR FROM STUDENTS WHERE MAJOR = 'MATH' " +
    "AND WHERE MAJOR > (SELECT MAJOR FROM STUDENTS WHERE MAJOR = 'SCIENCE' THEN 1 ELSE NULL END Flagged_Values";

st.executeQuery(x);

String y = "SELECT STUDENTS.ROW, STUDENTS.MAJOR, STUDENTS.NAME, " +
    "CASE WHEN previous.NAME IS NOT NULL " +
    "AND EXISTS(SELECT 'y' FROM STUDENTS previous " +
    "WHERE STUDENTS.MAJOR = previous.MAJOR " +
    "AND STUDENTS.ROW > previous.ROW + 1 " +
    "SELECT STUDENTS.MAJOR, STUDENTS.ROW, STUDENTS.NAME WHERE " +
    "MAJOR < (SELECT THE_OUTCOME FROM STUDENTINFO WHERE MAJOR = 'Math' " +
    "AND WHERE MAJOR > (SELECT MAJOR FROM STUDENTS WHERE MAJOR = 'SCIENCE' " +
    "AND WHERE MAJOR > (SELECT MAJOR FROM STUDENTS WHERE MAJOR = 'Engineering' " +
    "THEN 1 ELSE NULL END Flag ";

st.executeQuery(y);

What you are writing are plain SQL SELECT statements: projection, selection and aggregation.

Have you read the Wikipedia article on data mining?

The actual data mining task is the automatic or semi-automatic analysis of large quantities of data to extract previously unknown interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection) and dependencies (association rule mining). This usually involves using database techniques such as spatial indices. These patterns can then be seen as a kind of summary of the input data, and may be used in further analysis or, for example, in machine learning and predictive analytics. For example, the data mining step might identify multiple groups in the data, which can then be used to obtain more accurate prediction results by a decision support system. Neither the data collection, data preparation, nor result interpretation and reporting are part of the data mining step, but do belong to the overall KDD process as additional steps.

The term "data mining" is often misused for any kind of data collection or selection, but one should call these tasks "data collection" and "database query" instead of pulling up random buzzwords. Data mining is the intersection of statistics, AI, machine learning, and databases. If some of these components are missing (and except for databases, I don't see any of them in your query), the work should be named after what it actually uses, e.g. "databases", "machine learning" or "statistics".
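To make the contrast concrete, here is a minimal sketch of what a very simple sequential pattern miner might look like. It is not one of the published algorithms (such as GSP or PrefixSpan), only an illustration of the idea: the program discovers which ordered pairs are frequent, rather than checking a condition that was written into the query beforehand. The class name, the `frequentPairs` method, the course names and the support threshold are all hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class SequentialPatternSketch {

    // For every ordered pair "a -> b", counts the number of sequences in which
    // a occurs somewhere before b, then keeps the pairs that reach minSupport.
    public static Map<String, Integer> frequentPairs(List<List<String>> sequences, int minSupport) {
        Map<String, Integer> support = new TreeMap<>();
        for (List<String> seq : sequences) {
            // Collect each pair once per sequence, so that support counts
            // sequences rather than individual occurrences.
            Set<String> seen = new HashSet<>();
            for (int i = 0; i < seq.size(); i++) {
                for (int j = i + 1; j < seq.size(); j++) {
                    seen.add(seq.get(i) + " -> " + seq.get(j));
                }
            }
            for (String pair : seen) {
                support.merge(pair, 1, Integer::sum);
            }
        }
        // Drop every pair that is not frequent enough.
        support.values().removeIf(count -> count < minSupport);
        return support;
    }

    public static void main(String[] args) {
        // Hypothetical data: the order in which four students took courses.
        List<List<String>> sequences = List.of(
            List.of("Calculus", "Physics", "Statistics"),
            List.of("Calculus", "Statistics"),
            List.of("Physics", "Calculus", "Statistics"),
            List.of("Statistics", "Physics"));
        // Prints {Calculus -> Statistics=3}
        System.out.println(frequentPairs(sequences, 3));
    }
}
```

Nothing in the code names "Calculus" or "Statistics" as something to look for; the pattern comes out of the data. This brute-force version enumerates every pair in every sequence, which is why real algorithms prune candidates by support before extending them to longer patterns.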

In general (and keep in mind that this is inherently opinion-based), data mining refers to the process of taking data that is in a relatively unusable format and converting it into a format that is more usable.

For instance, if I have a huge .txt dump of unstructured text and I then extract relevant portions (according to some formal definition of relevant) and place it into a .bson store or something similar, that would be data mining, regardless of exactly how I do the extraction.

However, since your data is already in a SQL database, I wouldn't consider this data mining. I would consider it SQL development, though again, this is largely opinion-based. A SQL database is already a highly useful way of storing data, so accessing that data isn't introducing a level of functionality that wasn't already present.

tl;dr: I wouldn't say this counts as data mining, but it's a gray area.

In the field of data mining, running SQL queries is not considered data mining.
