Statistical Representations of a Dashboard to Monitor Educational Videogames in Natural Language

This paper explains how Natural Language (NL) processing by computers, through smart programs as a form of Machine Learning (ML), can represent large sets of quantitative data as written statements. The study recognized the need to improve an implemented web platform whose dashboard collects extensive data to measure assessment factors of children’s use of educational games. In this case, applying NL is a strategy to build and display more precise written assessments that enhance the understanding of children’s gaming behavior. We propose the development of a new tool that presents feedback information as written explanations rather than statistical representations, aimed at parents and teachers who lack basic knowledge of statistics. Applying fuzzy logic theory, we present verbal explanations of children’s behavior when playing educational videogames as NL interpretations instead of statistical representations. A series of educational digital game applications for mobile devices, identified as MIDI (Spanish acronym of “Interactive Didactic Multimedia for Children”) and linked to a dashboard in the cloud, is evaluated using the dashboard metrics. MIDI games tested in local primary schools help to evaluate the results of using the proposed tool. The results allow analyzing the degrees of playability and usability obtained from the data produced when children play a MIDI game. They are presented to parents and teachers in a comprehensive evaluation report written in NL. These evaluations are useful to enhance the understanding of children’s learning in relation to the school curricula applied in ludic digital games.


INTRODUCTION
In the past two decades, several studies have identified the importance of using certain types of videogames (defined as serious games) or digital games for mobile applications as educational tools for children and adolescents. The development of this type of game has been supported by the advancement of computer and mobile technologies, with dramatic growth in the field of learning games [1,2]. The design of games with meaningful content and challenges to accomplish an objective has an impact on social interactions, learning communities, and culture [3]. Some authors explain that educational digital games have great positive potential to address specific problems, to educate about global issues, or to teach particular skills [1,2]. This potential comes in addition to their entertainment value and the mental challenge they pose to any player [1,2,4]. Furthermore, observational studies have revealed that videogames are a preferred learning tool among children, effectively motivating them and encouraging proactivity [1]. However, there seems to be a gap in the market for games with educational values and academic content that follow children's school curricula. Moreover, very few applications allow data to be exported to track gaming results for measuring children's behavior and game playability. Measuring the academic value of videogames used as serious games has the potential to provide useful information demonstrating how educational digital games can stimulate the cognitive development of children [5]. On the other hand, academic experts in Learning Analytics (LA) describe applications that build dashboards from the data or traces left by a user on the network [6]. These applications monitor users to gather data on their interests, obtaining indicators that allow planning interactive games used in classes, with content based on what is stored in a cloud and shown on the dashboard.
Usually, the results obtained from the stored data through the dashboard administration option are shown as descriptive statistics, graphics, and tables [7]. However, a problem arises when the people who must interpret the generated results have limited knowledge of statistics and of the interpretation of graphs. Computers can acquire the cognitive structure that is expressed in Natural Language (NL). According to Powers and Turk [8], for more than thirty years there has been confidence that NL processing by machines through smart programs can emulate, or even imitate, the actual process of understanding a language, as a Machine Learning (ML) method. NL as an ML process can be applied using artificial intelligence to interpret any quantitative data produced and stored in a system and to express it as a textual explanation suitable for any audience.

Local background of educational game use
In Ecuador, there are some initiatives driven by university researchers to design and encourage the use of "serious games" with academic content for primary school. They have developed mobile videogame applications such as the series under study, titled MIDI (acronym of the Spanish translation of "Children's Interactive Didactic Multimedia") [7,9]. These games generate usability control data in JavaScript Object Notation (JSON) and playability measurements that are stored in the cloud and managed in a dashboard [7,9]. The dashboard application provides statistical feedback to parents and teachers about the progress through the content and the levels of the games used [9]. However, teachers and parents who can use the dashboard as a control system for the MIDI series videogame applications may not have enough technical knowledge of statistics to interpret the results of the metrics used to analyze the playability and usefulness of these applications. In some developing countries such as Ecuador, many low-income primary schools subsidized by the government are staffed with few teachers who have a higher level of education, hence the gap in knowledge regarding statistical interpretation among the population [10].

The research objective
This study aims to propose an alternative method of presenting the quantitative data managed in a dashboard as qualitative assessments expressed in NL, which are more understandable for any user. Overall, the focus of this work involves the use of artificial intelligence through fuzzy logic to explain in NL the results of the statistical graphs shown in a dashboard. The metrics used allow analyzing the degrees of playability and usability of the MIDI games series. The process developed and the application designed are parts of an innovative module to be included in an existing cloud control system.

THEORETICAL FRAMEWORK
This study requires the application of LA and the use of Artificial Intelligence (AI) through fuzzy logic to explain in NL the numeric values previously shown as statistical graphs in a dashboard. LA is a growing area of Technology-Enhanced Learning (TEL) research that emerged in the last decade, with a strong connection to a variety of fields, including business intelligence, educational data mining, recommender systems, and web analytics [11]. LA is the process of measuring, collecting, analyzing, and presenting data collected in learning sessions to understand and improve learning from a range of perspectives. According to Elias [12], LA typically has four phases. First, raw data is obtained. Second, the data becomes meaningful information. Third, the information becomes knowledge, which finally serves to achieve objectives (see Figure 1). The dashboard technology used is the rebirth of Executive Information Systems (EIS), which enhance a manager's ability to process information and to act [13]. Its usefulness is regarded as an aspect of business intelligence [14]. With a dashboard, specific data can be analyzed and measured to provide a clear and rapid overview of what is going on [14,15]. Dashboards are rising in popularity and are structured into different types according to their functional and visual features. They face the challenge of providing the right information to the right people at the right moment [12]. The dashboard structure chosen for the developed module is similar to the Klipfolio type. The Klipfolio dashboard structure performs readings and comparisons of large monthly or time-period data. It helps to identify things that can and should be corrected, as part of an analysis that visualizes all open issues for a specific project [16,17].
Fuzzy logic is applied in this study to provide an inference mechanism that simulates human reasoning, which is imprecise by nature, based on a knowledge system [18]. The objective is to handle fuzzy concepts systematically, given that some elements of human thought are expressed not as numbers but as concepts. Examples of these concepts, expressed as fuzzy sets, are "extremely intelligent," "more or less successful," "very attractive," and intermediate ranges such as "little less than hot," "too cold," or "more or less warm" [18,19]. As Figure 2 shows, the process starts with crisp numeric values entered as discrete input. These values pass through the fuzzifier, which is responsible for giving them a qualitative sense. Through the inference engine and the fuzzy or linguistic rules that make up the knowledge base, the qualitative values are processed into another qualitative value, which then goes through the defuzzification process that converts qualitative values into quantitative ones [20]. Fuzzy logic is related to the theory of fuzzy sets, where a membership function gives the degree to which an element belongs to a set, following a pattern of human reasoning [18,20]. In this sense, it is possible to assign to an element not only the conditions "low" or "high" but also "very low," "relatively high," "slightly low," among others [7]. Furthermore, different ranges of values between 0 and 1 can be assigned to identify, for example, satisfactory, unique, or necessary conditions of membership functions in a selection case [21]. Fuzzy logic also allows knowledge to be represented in a Fuzzy Inference System (FIS) as human thought does. This system defines a non-linear correspondence between one or more input variables and one output variable from the determined fuzzy sets; in this way, there is a basis from which decisions can be made [18,22].
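The fuzzifier stage described above can be illustrated with a minimal sketch. It assumes three fuzzy sets (Low, Medium, High) with simple linear membership functions over a known minimum and maximum; the function names and the shape of the membership functions are assumptions made for illustration and are not taken from the implemented module.

```python
def fuzzify(value, minimum, maximum):
    """Return the membership degrees of a crisp value in Low, Medium and High.

    Assumed shapes: Low falls from 1 to 0 over the first half of the range,
    Medium is a triangle peaking at the midpoint, and High rises from 0 to 1
    over the second half.
    """
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    # Normalise to [0, 1] and clamp values that fall outside the range.
    t = (value - minimum) / (maximum - minimum)
    t = min(1.0, max(0.0, t))
    return {
        "Low": max(0.0, 1.0 - 2.0 * t),
        "Medium": 1.0 - abs(2.0 * t - 1.0),
        "High": max(0.0, 2.0 * t - 1.0),
    }


def linguistic_value(memberships):
    """Pick the label with the highest membership degree."""
    return max(memberships, key=memberships.get)


degrees = fuzzify(80, 0, 100)
label = linguistic_value(degrees)
```

In this sketch a crisp value of 80 on a 0 to 100 scale belongs mostly to High and partly to Medium, which is the kind of graded membership that a purely "low" or "high" classification cannot express.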
SHS Web of Conferences 77, 05003 (2020), ETLTC2020, https://doi.org/10.1051/shsconf/20207705003

The steps that make up a fuzzy inference system are:
- The input variables and their output variable, their linguistic values, and their membership functions are defined.
- The rules that specify the relationship between the input variables and the output variables are defined. The If-Then rules "specify the relationship between the variables of system entry and output. The relationships diffuse determine the degree of presence or absence of association or interaction between elements of 2 or more sets" [22]. The interpretation of If-Then rules, usually defined from the knowledge of experts through interviews, involves two steps. The first step is to evaluate the antecedent by applying any fuzzy operator. The second step is the implication, or application of the result of the antecedent to the consequent [22]. A rule takes the form:
  If X1 is A1 and X2 is A2 and ... and Xk is Ak Then Y is B
  where A1, A2, ..., Ak, and B are linguistic values defined in fuzzy sets for the linguistic variables X1, X2, ..., Xk, and Y, respectively.
- The outputs of each rule are combined to obtain a single fuzzy set with determined ranges. This process is commutative, which means that the order in which the rule outputs are aggregated does not matter.
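The steps above can be sketched in a few lines. The source only states that "any" fuzzy operator may be applied to the antecedent, so the choice of the minimum for "and" and the maximum for aggregation is an assumption (a common Mamdani-style convention); the function names and the sample membership degrees are hypothetical.

```python
def fire_rule(antecedent, memberships):
    """Firing strength of one If-Then rule, using min as the 'and' operator.

    antecedent maps each input variable to the linguistic value it must take;
    memberships maps each input variable to its fuzzified degrees.
    """
    return min(memberships[var][label] for var, label in antecedent.items())


def infer(rules, memberships):
    """Aggregate the outputs of all rules into one fuzzy set (max aggregation)."""
    output = {}
    for antecedent, consequent in rules:
        strength = fire_rule(antecedent, memberships)
        output[consequent] = max(output.get(consequent, 0.0), strength)
    return output


# Hypothetical rules of the form "If X1 is A1 and X2 is A2 Then Y is B".
rules = [
    ({"X1": "Low", "X2": "High"}, "High"),
    ({"X1": "High", "X2": "Low"}, "Low"),
    ({"X1": "Medium", "X2": "Medium"}, "Medium"),
]

# Hypothetical fuzzified inputs.
memberships = {
    "X1": {"Low": 0.7, "Medium": 0.3, "High": 0.0},
    "X2": {"Low": 0.0, "Medium": 0.2, "High": 0.8},
}

result = infer(rules, memberships)
```

Because the maximum is commutative, reversing the order of the rules yields the same aggregated fuzzy set, which matches the commutativity noted in the last step.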

METHODOLOGY
We use a pragmatic approach, a pluralistic methodology in which all methods and research tools are essential for guiding multidisciplinary research to solve a problem without committing to a specific method [23,24]. The focus of this work involves the use of artificial intelligence through fuzzy logic to explain in NL the statistical graphs of a dashboard that show results on the levels of efficiency, effectivity, flexibility, satisfaction, and playability of digital educational games. For the testing process, we obtained the data from MIDI games played by a group of children from primary schools in Guayaquil, with the corresponding ethical consent and permission obtained from the school authorities and parents. The data is sent as JSON to the database tables operated by a dashboard stored in the cloud. The quantitative data stored and monitored by the dashboard is tested and alternatively processed using fuzzy analysis applied to NL. The outcomes, written as simple statements, are qualitative assessments that are more understandable for any user.

The data collection and analysis processes
With the data obtained as JSON from four games of the MIDI series created as beta apps [7], the aim is to examine the degrees of efficiency, effectiveness, flexibility, and satisfaction used to measure the levels of the digital games and their playability. The data produced is stored in a cloud database linked to an implemented dashboard [7,9]. The digital game apps are available through the MIDI webpage or by downloading them directly from the Google Play Store. When a child plays with these apps, information about the child's behavior and performance in the videogame is stored in a database. Figure 3 shows the data reception structure designed for the different stages of the app's use. The structure shows the level of content by themes (chapter), animated content (story), and testing games (level). The data recorded as JSON is uploaded to a database implemented in PostgreSQL and linked to the dashboard [7].
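To make the chapter, story, and level structure more concrete, the following sketch parses one session record and derives two simple indicators. The payload is purely illustrative: every field name and value is an assumption, since the actual structure is the one defined in Figure 3.

```python
import json

# Illustrative payload only: all field names and values are assumed.
SAMPLE_PAYLOAD = """
{
  "player_id": "anon-001",
  "chapter": "example-chapter",
  "story": {"viewed": true},
  "level": {
    "number": 1,
    "completed": true,
    "seconds_played": 140,
    "correct_answers": 8,
    "incorrect_answers": 2
  }
}
"""


def summarize_session(raw_json):
    """Parse one game session and compute simple per-level indicators."""
    record = json.loads(raw_json)
    level = record["level"]
    answered = level["correct_answers"] + level["incorrect_answers"]
    minutes = level["seconds_played"] / 60
    return {
        "chapter": record["chapter"],
        "story_viewed": record["story"]["viewed"],
        "level": level["number"],
        "completed": level["completed"],
        # Share of correct answers among all answered questions.
        "correct_ratio": level["correct_answers"] / answered if answered else 0.0,
        # Questions answered per minute of play.
        "answers_per_minute": answered / minutes if minutes else 0.0,
    }


summary = summarize_session(SAMPLE_PAYLOAD)
```

Indicators of this kind would be the crisp numeric inputs that the fuzzy analysis later converts into linguistic valuations.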

The development processes
The required components for the implementation of a fuzzy interpretation system are the linguistic variables and the fuzzy sets. The chosen linguistic variables came from the table of metrics [9] determined by the group of technical researchers in an initial stage of this project. The quality table of metrics was obtained by combining the measure of quality in the development of videogames, entitled Playability Quality Model (PQM-Metrics), and expanding the concepts of Effectiveness and Efficiency with the Technology Acceptance Model (TAM) [7,25]. The module to be implemented uses the same platform and tools previously applied for the existing dashboard. The dashboard is developed in Sailsjs with a section in Angular. The language used by these frameworks is JavaScript [7]. The module converts the valuations of the metrics or linguistic variables (Low, Medium, High) into NL by using linguistic criteria. Each valuation is given a meaning (defined as a conclusion rule) for pedagogical review. These rules are added to a database.

Linguistic input / output variables
The implementation process of the system model, linked to the dashboard as a component resulting from the current research, consists of listing the input variables, passing them through the inference system, and obtaining the output variable results. The list of values determined for the fuzzy sets and the initial ranges of each variable, describing the maximums and minimums obtained, is shown in Table 1.
For each input and output variable, it is necessary to provide fuzzy sets such as low, medium, and high. These sets are used to give each variable its respective assessment value. The linguistic variables and their corresponding fuzzy sets determined for each case are shown in Table 1. Additionally, the fuzzy logic process needs the maximum and minimum of each linguistic variable to be defined as an initial range of values. These values change dynamically according to the data produced each time a user plays a game. Examples of the fuzzy set definitions and of the initial maximum and minimum ranges considered for each variable are also shown in Table 1.
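The dynamic behavior of the ranges can be sketched as follows. The class name, the initial range, and the split of the range into three equal parts are assumptions chosen for brevity; the actual ranges are those listed in Table 1.

```python
class LinguisticRange:
    """Tracks the observed minimum and maximum of one linguistic variable."""

    def __init__(self, minimum, maximum):
        self.minimum = minimum
        self.maximum = maximum

    def update(self, value):
        """Widen the range when a new play session produces an extreme value."""
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def valuation(self, value):
        """Map a crisp value to Low, Medium or High (assumed equal thirds)."""
        span = self.maximum - self.minimum
        if span == 0:
            return "Medium"
        position = (value - self.minimum) / span
        if position < 1 / 3:
            return "Low"
        if position < 2 / 3:
            return "Medium"
        return "High"


# Hypothetical initial range for a goal time measured in seconds.
goal_time = LinguisticRange(30, 90)
before = goal_time.valuation(80)   # valuation against the initial range
goal_time.update(150)              # a slower session widens the range
after = goal_time.valuation(80)    # the same value is now valued differently
```

The example shows why the ranges matter: the same crisp value can receive a different linguistic valuation once new sessions change the observed maximum or minimum.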

Linguistic and fuzzy rules
To complete all the components, the rules of the fuzzification process need to be designed, structured, and developed with the criterion determined for each linguistic variable. A database table with interpretation criteria for each linguistic variable is coded (with written statements for Spanish speakers) to link the rules for the combination of cases. The numeric equivalences of the fuzzy set used to code the linguistic rules are 0=Low, 1=Medium, 2=High, and 3=Inconsistent. For each combination of values taken by the linguistic variables, a conclusion rule is determined (see the example Efficiency and Playability rules in Table 2).
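A coded rule table of this kind can be sketched as a lookup from a combination of input codes to an output code and its written conclusion. The data structure and function name are assumptions; the only rule included is the single Efficiency row given in Table 2, and the real table is stored in the database with statements in Spanish.

```python
# Numeric equivalences stated in the text.
LABELS = {0: "Low", 1: "Medium", 2: "High", 3: "Inconsistent"}

# Key: codes of the inputs (a, b, c, d, e). Value: (code of A, conclusion).
# Only the row reproduced from Table 2 is included in this sketch.
EFFICIENCY_RULES = {
    (0, 0, 0, 0, 1): (
        3,
        "Inconsistent results because if the target efficiency for correct "
        "answers is low, the efficiency for incorrect answers should be "
        "high and not low.",
    ),
}


def conclude(valuations, rules=EFFICIENCY_RULES):
    """Return the output label and written conclusion for coded valuations."""
    key = tuple(valuations)
    if key not in rules:
        raise KeyError(f"no conclusion rule coded for {key}")
    code, statement = rules[key]
    return LABELS[code], statement


label, statement = conclude([0, 0, 0, 0, 1])
```

Raising an error for a missing combination is a design choice of this sketch: it makes gaps in the coded rule table visible instead of silently returning a default conclusion.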

Table 2. A fragment example of linguistic rules: Efficiency rules (linguistic criteria)
a | b | c | d | e | A | Conclusion
0 | 0 | 0 | 0 | 1 | 3 | Inconsistent results because if the target efficiency for correct answers is low, the efficiency for incorrect answers should be high and not low.

The value these variables take from their fuzzy set follows the defined rules, considering the metrics and the parameters involved in them. For example, if the valuation of "a (goal time)" is "low," it does not necessarily mean that this variable contributes to a "low" overall efficiency. Quite the opposite: it is desirable that the goal time, that is, the average time that children use to complete a level, is low. Therefore, to understand why the final efficiency assessment (A) receives the value it takes, it is first defined whether the valuations of a, b, c, d, and e mean something good or bad, as described in Table 2. The number of fuzzy rules depends on the number of input linguistic variables in each case: with three linguistic values per variable, n input variables yield 3^n rules. The total number of rules for A=Efficiency is 3^5 = 243. The total for B=Effectivity is 3^3 = 27; for C=Flexibility, 3^2 = 9; and for D=Satisfaction, 3^1 = 3. For Playability as the output variable, the total number of rules is 3^4 = 81. Table 3 shows four examples of fuzzy rules for efficiency.

Table 3. Examples of the fuzzy rules of Efficiency
a) Favorable case. If valuation (a) = "Low" and valuation (b) = "High" and valuation (c) = "Low" and valuation (d) = "High" and valuation (e) = "Low" then valuation (A) = "High"
b) Unfavorable case. If valuation (a) = "High" and valuation (b) = "Low" and valuation (c) = "High" and valuation (d) = "Low" and valuation (e) = "High" then valuation (A) = "Low"
c) Middle case. If valuation (a) = "Low" and valuation (b) = "Medium" and valuation (c) = "Medium" and valuation (d) = "Medium" and valuation (e) = "Medium" then valuation (A) = "Medium"
d) Inconsistent case. If valuation (a) = "High" and valuation (b) = "Low" and valuation (c) = "Low" and valuation (d) = "Low" and valuation (e) = "High" then valuation (A) = "Inconsistent"
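The rule totals quoted above follow from counting every combination of the three linguistic values over the inputs of each metric. The short check below enumerates the antecedents explicitly; the dictionary of input counts is taken from the totals in the text, while the function names are assumptions.

```python
from itertools import product

VALUES = ("Low", "Medium", "High")

# Number of input linguistic variables per output metric, as in the text.
INPUT_COUNTS = {
    "Efficiency": 5,
    "Effectivity": 3,
    "Flexibility": 2,
    "Satisfaction": 1,
    "Playability": 4,
}


def rule_count(n_inputs, n_values=len(VALUES)):
    """Number of distinct antecedents: n_values raised to n_inputs."""
    return n_values ** n_inputs


def enumerate_antecedents(n_inputs):
    """List every combination of linguistic values for n_inputs variables."""
    return list(product(VALUES, repeat=n_inputs))


totals = {metric: rule_count(n) for metric, n in INPUT_COUNTS.items()}
efficiency_antecedents = enumerate_antecedents(INPUT_COUNTS["Efficiency"])
```

The favorable case of Table 3, ("Low", "High", "Low", "High", "Low"), is one of the 243 antecedents enumerated for Efficiency.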

Evaluation test and results
After implementing the proposed new modules, the integration with the dashboard and the database was carried out. The application contained records stored in a previous stage of this study as well as newly collected data. The testing examined the dashboard section "En mi Entorno Natural" (In my Natural Environment), which corresponds to level 1 of the game "Seres" (Beings) of the MIDI app available in the Google Play Store. With the existing data, we undertook several tests to analyze the favorable and unfavorable cases obtained. Table 4 shows a translation of the original Spanish table, presenting the cases for the assessments of the Efficiency metric for the input values obtained. Finally, Table 5 and Table 6 show examples of the favorable and unfavorable cases obtained for the Playability metrics.

No. | Metric | Valuation | Interpretation
2 | Effectivity | High | Almost all the children completed the level. Those who completed the level and made at least one attempt answered practically all the questions correctly.
3 | Flexibility | High | By playing the story previously, the child obtains a high goal efficiency at the levels used. This performance indicates that the rate of correct answers is more than half of the total number of questions at that level. They also obtain a high percentage of goal efficiency, which indicates they answered many questions per minute.
4 | Satisfaction | High | Children prefer that level over others; as proof of this, many of them completed it.
5 | Playability | High | All the previous parameters lead to the conclusion that this level meets all the playability metrics.

For the test, we invited some teachers and parents (using the MIDI games series in rural schools) to evaluate the outcomes shown in the dashboard. Applying the new module, we detected and removed some inconsistencies in the data. We identified a better way to show results to end-users by presenting more precise results based on the metrics used. Therefore, we can conclude that the game tested and used by children presents useful results as feedback for parents and teachers.

Results analysis
For the test, we invited some teachers and parents (using the MIDI games series in rural schools) to evaluate the outcomes shown in the dashboard. Thanks to the application, we detected some inconsistencies in the data that need to be removed periodically. Then, we presented several NL reports containing written information about children's game usage and playability. Teachers and parents clearly understood the information provided in the dashboard report or on the screen using written sentences instead of statistical graphics. Therefore, we can confirm that this is a better way to show the outcomes of ludic game use to end-users, presenting results that are more precise and straightforward to read, based on the metrics used.

Conclusions
The results of the proposed solution meet the objective of converting the crisp numeric values of metrics obtained from game apps into a narrated common language, known as NL. By using the developed module as part of the dashboard, we carried out several tests analyzing a variety of cases. We extracted from the data the minimum and maximum of each of the metrics used, identifying the level and number of players who completed the different games by level. Favorable and unfavorable cases, showing relevant or inconsistent information, are presented in NL as written sentences for the understanding of parents and primary school teachers. Besides, any unusual data uploaded and monitored in a dashboard can now be debugged and corrected after any inconsistency review period. The outcomes of this study and the developed tools help obtain more reliable results from educational games monitored in a dashboard, with a focus on people without statistical knowledge. We have expanded the scalability of the application so that it can be understood by a variety of users. The dashboard for MIDI apps, which was previously used by people who had statistical knowledge, can now be used directly by parents and primary school teachers who might not have technical skills.

Future work
The linguistic rules used for this project are static for three levels (low, medium, high). Thus, based on the results, we recommend a new development or an improvement of the implemented module for NL interpretation of metrics. The new module, as a programming tool for NL, should be able to collect and construct intermediate conclusion rules for the metrics evaluated with the dashboard, doing so dynamically according to the results in an expanded set of fuzzy inferences. In this sense, it is also essential to consider the professional criteria of experts in psychology and pedagogy, and of copywriters, to obtain sound linguistic rules for the results.