Data Warehouse and Data Mining
5000+ students enrolled
Last updated: 2025/09/20
Course dates: 2025/08/21 - 2025/12/31
Course duration: 19 weeks
Status: In session
Weekly hours: -
Course Description

The online course "Data Warehouse and Data Mining" connects theory with practice, with theory as the warp and application as the weft. Grounded in data, it introduces data warehouse and data mining technology within a unified framework, covering data concepts, data warehouse models, knowledge types, data preprocessing, classification, regression, association mining, clustering, anomaly detection, and data visualization, as well as the design and implementation of a big data mining platform. Through the course, students can master the basic principles of storing and mining massive data in data warehouses, and apply algorithms for data preprocessing, association rule mining, cluster analysis, classification, and anomaly detection to develop software tools that solve problems of efficient management and in-depth utilization of massive data in real engineering projects. The course provides the basic theory and skills students need for future scientific research or other data-related work.

Course Outline

1 Introduction

1. What Is Data Mining and Why Data Mining

2. Data Mining Process

3. Data to be Mined

4. Data Mining Tasks

5. Evaluation of Knowledge

Test 1

2 Data

Data Objects and Attribute Types

Basic Statistical Descriptions of Data

Measuring Data Similarity and Dissimilarity

Test 2

3 Data Preprocessing

Overview

Data Cleaning

Data Integration

Data Transformation

Data Reduction

Test 3

4 Association Rule Mining

Basic Concept

Frequent Itemset Generation

Rule Generation

Factors Affecting Complexity of Apriori

Compact Representation of Frequent Itemsets

Pattern Evaluation

Test 4

5 Classification

Classification: Basic Concepts

Decision Tree Induction

Bayes Classification Methods

Techniques to Improve Classification Accuracy: Ensemble Methods

Classification of Class-Imbalanced Data Sets

Model Evaluation and Selection

Test 5

6 Cluster Analysis

An Introduction

Partitioning Methods

Hierarchical Methods

Density- and Grid-Based Methods

Evaluation of Clustering

Test 6

7 Outlier Analysis

Outlier and Outlier Analysis

Outlier Detection Methods

Statistical Approaches

Proximity-Based Approaches

Clustering-Based and Classification-Based Approaches

Test 7

8 Data Visualization

Introduction

Function of Data Visualization

Data Visualization Methods

Tools of Data Visualization

Test 8

9 Data Warehouse

An Introduction

Test 9

10 Perspective

10.1 Data Resources

10.2 Data Usage

10.3 Data Ecosystem

Test 10