- Assignment title: CSCI 4144 - Data Mining and Data Warehousing Assignment 2 - Cube Computation
- Course name: Dalhousie University CSCI 4144 Data Mining and Data Warehousing
- Completion period: 4 days
Section 1 - Discretization
In this section, you will first minimally preprocess some data in order to make all inputs nominal.
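As a rough illustration of what making a numeric input nominal can look like, here is a minimal sketch of equal-width binning using only the Python standard library. The helper name `equal_width_bins`, the `bin_<k>` labels, the choice of three bins, and the sample values are all assumptions made for illustration; the assignment may call for a different discretization scheme or for a library routine instead.

```python
from typing import List, Sequence


def equal_width_bins(values: Sequence[float], n_bins: int = 3) -> List[str]:
    """Map each numeric value to a nominal label such as 'bin_0'."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values identical: a single bin is the only sensible answer.
        return ["bin_0" for _ in values]
    width = (hi - lo) / n_bins
    labels = []
    for v in values:
        # Clamp so the maximum value falls in the last bin, not one past it.
        idx = min(int((v - lo) / width), n_bins - 1)
        labels.append(f"bin_{idx}")
    return labels


# Hypothetical numeric column; these values are made up for the example.
counts = [0, 1, 2, 5, 9, 12]
nominal_counts = equal_width_bins(counts, n_bins=3)
print(nominal_counts)
```

Equal-width binning is only one option; equal-frequency bins or hand-picked thresholds may suit skewed columns better.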
Dataset
Anti-democratic political movements, of various sizes, are springing up across many democratic nations, including Brazil. Your data is from the Brazil Conflict Tracker (https://www.kaggle.com/datasets/justin2028/brazil-conflict-tracker-20182023), a dataset that tracks both non-violent and violent conflicts in Brazil since 2018. The 8 January 2023 invasion of Brazil’s National Congress by Jair Bolsonaro supporters served as inspiration for this dataset. All data are official figures from the Armed Conflict Location & Event Data Project (ACLED) that have been compiled and structured by Justin Oh and released under the CC BY-NC-SA 4.0 (https://creativecommons.org/licenses/by-nc-sa/4.0/) license.
The single file is in the CSV (https://www.w3schools.com/python/pandas/pandas_csv.asp) file format, with a single header row and several fields, including:
。。。
Section 2 - Bottom-up computation
Here, you will implement the ‘bottom-up computation’ method described in class (see Lecture 5, Slides 45-54). In particular, you have pseudocode (in Lecture 5, Slides 50-51) and extra detail at these locations:
。。。
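For orientation only, the following is a minimal sketch of a bottom-up cube computation, assuming the method from class resembles the well-known BUC (Bottom-Up Computation) algorithm for iceberg cubes. The lecture pseudocode is not reproduced here, so details may differ. The function name `buc`, the `"*"` placeholder for an aggregated dimension, the use of row count as the only measure, and the toy rows are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

ALL = "*"  # placeholder meaning "aggregated over this dimension"

Row = Tuple[str, ...]


def buc(rows: Sequence[Row], n_dims: int, min_sup: int = 1) -> Dict[Row, int]:
    """Return {cell: count} for every cube cell with count >= min_sup."""
    cube: Dict[Row, int] = {}

    def recurse(part: Sequence[Row], start_dim: int, cell: List[str]) -> None:
        # Emit the aggregate for the current cell before expanding it further.
        cube[tuple(cell)] = len(part)
        for d in range(start_dim, n_dims):
            # Partition the current rows on dimension d.
            groups: Dict[str, List[Row]] = defaultdict(list)
            for row in part:
                groups[row[d]].append(row)
            for value, group in groups.items():
                # Iceberg pruning: a partition below min_sup cannot contain
                # a qualifying descendant cell, so skip the whole subtree.
                if len(group) >= min_sup:
                    cell[d] = value
                    recurse(group, d + 1, cell)
            cell[d] = ALL  # restore before moving to the next dimension

    if len(rows) >= min_sup:
        recurse(rows, 0, [ALL] * n_dims)
    return cube


# Toy input with two nominal dimensions (made-up values).
toy_rows = [("a1", "b1"), ("a1", "b2"), ("a2", "b1")]
full_cube = buc(toy_rows, n_dims=2, min_sup=1)
iceberg_cube = buc(toy_rows, n_dims=2, min_sup=2)
print(iceberg_cube)
```

With `min_sup=1` every non-empty cell is produced, which gives the full cube; raising `min_sup` prunes sparse partitions early, which is the main appeal of working bottom-up.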
Bonus [5 Marks]
- We will give up to 5 bonus marks for innovative work going substantially beyond the minimal requirements.
- These marks can make up for marks lost in other sections of the assignment, but your overall mark for this assignment cannot exceed 100%.
- You may decide to pursue any number of tasks of your own design related to this assignment, although you should consult with the instructor or the lead TA before embarking on such exploration, and the value of bonus work is left to the discretion of the markers.
- Be sure to document your work sufficiently for the markers to understand what you’re doing. You can add additional Code or Markdown cells below, as necessary.
- Of course, the rest of the assignment takes higher priority than bonus work.