Publication

Big data and analytics: the case study of students mobility

Name: Dmitrii Mukhin.pdf
Size: 21.99 MB
Format: Adobe PDF

Abstract(s)

Information is arguably the most valuable resource in today's world. Almost every action now leaves a trace, and when this data is used correctly it yields new knowledge and predictions. Doing so requires specialized technologies such as Big Data. The work described in this dissertation focuses on three methods of Big Data analysis: descriptive analysis, correlation analysis, and predictive analysis. Its purpose is to explore the practical application of these methods to a dataset containing information about IPB and Erasmus students. The following tasks were performed: collecting data from international students about their university practices and mobility, and conducting a descriptive analysis of general characteristics by year, course, gender, place of residence, degree, number of subjects studied, and grade point average. Correlation heatmaps were constructed for the variables in the dataset, and the dependencies among them were analyzed. The most important contribution of this work is the practical application of three machine learning algorithms (linear regression, ridge regression, and random forest) to predict the number of Erasmus students for the following year. Machine learning algorithms build a model from sample data, known as "training data," to make predictions or decisions without being explicitly programmed to do so.
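
As a rough illustration of the predictive step, the sketch below fits a one-feature linear regression and a ridge regression to yearly student counts and extrapolates one year ahead. It is only a minimal sketch under stated assumptions: the function names `fit_line` and `predict`, the penalty value `alpha=10.0`, and the yearly counts are all hypothetical and are not taken from the dissertation, whose actual features, data, and tooling are not described in this record. Random forest is omitted because it does not reduce to a short closed form.

```python
from statistics import mean


def fit_line(xs, ys, alpha=0.0):
    """Fit y = intercept + slope * x by least squares.

    alpha = 0 gives ordinary linear regression; alpha > 0 adds an L2
    (ridge) penalty on the slope only, leaving the intercept unpenalized.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # the penalty shrinks the slope toward zero
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(model, x):
    """Evaluate a fitted (intercept, slope) model at x."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical training data: made-up yearly counts, NOT from the dissertation.
years = [2015, 2016, 2017, 2018, 2019]
counts = [100, 110, 120, 130, 140]

linear = fit_line(years, counts)             # ordinary least squares
ridge = fit_line(years, counts, alpha=10.0)  # assumed penalty strength

linear_2020 = predict(linear, 2020)
ridge_2020 = predict(ridge, 2020)
print(f"linear: {linear_2020:.1f}, ridge: {ridge_2020:.1f}")
```

With so few data points, the ridge penalty visibly flattens the fitted trend, so the ridge forecast sits closer to the historical mean than the unpenalized one. In practice the penalty strength would be chosen by validation on held-out data rather than fixed by hand.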

Description

Master's in Informatics

Keywords

Big data, Machine learning, Descriptive analysis, Graph, Table, Correlation, Predictive analysis, Linear regression, Ridge regression, Random forest
