Browsing by Author "Torrebruno, Aldo"
- Data mining tool for academic data exploitation: graphical data analysis and visualization
Publication. Prada, Miguel Angel; Dominguez, Manuel; Morán, Antonio; Vilanova, Ramon; Vicario, José; Pereira, Maria João; Alves, Paulo; Podpora, Michal; Barbu, Marian; Torrebruno, Aldo; Spagnolini, Umberto; Paganoni, Anna
The vast amount of data collected by higher education institutions and the growing availability of analytic tools make it increasingly interesting to apply data mining in order to support educational or managerial goals. The SPEET (Student Profile for Enhancing Engineering Tutoring) project aims to determine and categorize the different profiles of engineering students across Europe, in order to improve tutoring actions so that they help students achieve better results and complete their degrees successfully. For that purpose, it is proposed to analyze student record data obtained from the academic offices of the Engineering Schools/Faculties of the institutions. The application of machine learning techniques to provide an automatic analysis of academic data is a common approach in the fields of Educational Data Mining (EDM) and Learning Analytics (LA). Nevertheless, it is often interesting to involve the human analyst in the task of knowledge discovery. Visual analytics, understood as a blend of information visualization and advanced computational methods, is useful for the analysis and understanding of complex processes, especially when data are nonhomogeneous or noisy. The reason is that taking advantage of the ability of humans to detect structure in complex visual presentations, as well as their flexibility and ability to apply prior knowledge, facilitates the process of understanding the data, identifying their nature, and creating hypotheses. For that purpose, visual analytics uses several strategies, such as preattentive processing and visual recall, that reduce cognitive load.
A key feature, however, is the interactive manipulation of resources, which is used to drive a semi-automated analytical process that enables a dialog between the human and the tool. During this human-in-the-loop process, analysts iteratively update their understanding of the data to match the evidence discovered through exploration. This report documents the steps conducted to design and develop an IT Tool for Graphical Data Analysis Visualization within the SPEET ERASMUS+ project. The proposed goals are aligned with those of the project, i.e., to provide insight into student behaviors, to identify patterns and relevant factors of academic success, to facilitate the discovery and understanding of profiles of engineering students, and to analyze the differences across European institutions. The intended use of the tool is to provide support to tutoring. For that purpose, the concepts and methods used for the visual analysis of educational data are reviewed and a tool is proposed, which implements approaches based on interaction and the integration of machine learning. For the implementation details and validation of the tool, a data set has been proposed. It only includes variables present in a typical student record, such as the details of the student (age, geographical information, previous studies and family background), school, degree, courses undertaken, scores, etc. Although the scope of this data set is limited, similar data structures have recently been used in developments oriented to the prediction of performance and the detection of drop-outs or students at risk. In the third chapter, the report presents, describes and structures the academic data set which is used as a basis for the visual analysis. Chapter 4 reviews the concepts, goals and applications of visual data exploration, specifically of interactive visual analytics in the framework of educational data mining.
Chapter 5 discusses visual analysis methods that are interesting for the proposed goals, which include providing insight into behaviors, patterns and factors of success, both locally and across European institutions. The proposed methods are initially presented and, later, applied to the subject of study. The last chapter describes the tool implementation. For that purpose, the design and the technologies used for its implementation are presented, the availability of the tool is discussed, and a short user guide is included.
- Data mining tool for academic data exploitation: selection of most suitable algorithms
Publication. Vicario, José; Vilanova, Ramon; Bazzarelli, M.; Paganoni, Anna; Spagnolini, Umberto; Torrebruno, Aldo; Prada, Miguel Angel; Morán, Antonio; Dominguez, Manuel; Pereira, Maria João; Alves, Paulo; Podpora, Michal; Barbu, Marian
The SPEET project aims to exploit the potential synergy between the huge amount of academic data currently existing at universities and the maturity of data science, in order to provide tools to extract information from students’ data. A rich picture can be extracted from this data if it is suitably processed. The purpose of this project is to apply data mining algorithms to process this data in order to extract information about, and to identify, student profiles. In this document, the results obtained in the SPEET project during the development of the data mining tools are presented. More specifically, two mechanisms have been developed: a clustering/classification scheme of students in terms of academic performance and a drop-out prediction system. The document starts by addressing the motivation for the development of data mining tools, along with the considerations taken into account for academic data gathering. These considerations include the proposed unified dataset format and some details about confidentiality issues. Next, the students’ clustering and classification schemes are presented in detail, including a description of the machine learning algorithms considered. In addition, the results obtained with data belonging to the different SPEET project partners are discussed. Results show how groups of clusters can be automatically identified and how new students can be classified into existing groups with high accuracy. Finally, the implemented drop-out prediction system is considered by presenting several alternative algorithms.
In this case, the evaluation of the drop-out mechanism focuses on one institution, showing a prediction accuracy of around 91%. The algorithms presented in this document are available in repositories or as inline code, as indicated.