Rasch Model: Quality of Test Instruments to Measure Students’ Ability in Algebra

This research uses a quantitative approach to analyze the quality of a test instrument with the Rasch model, measuring the ability of students in junior high school mathematics courses, especially in algebra material. The sample consisted of 37 students. The instrument was a multiple-choice test of 30 questions with four answer choices. The analysis found 27 valid items and 3 invalid items. The reliability estimates obtained with the Rasch model fall into the good category, with a person reliability of 0.74, an item reliability of 0.84, and a Cronbach's Alpha of 0.85. Regarding item difficulty, 7 items are in the very easy category, 7 items in the easy category, 11 items in the difficult category, and 5 items in the very difficult category. Of the 30 questions, 1 item must be discarded because it is too easy. Students with serial numbers 18 and 4 are classified as person misfits in the Rasch model, because they show an unusual answer pattern: they fail to answer correctly on items classified as easy, yet answer correctly on items classified as difficult.


Introduction
Each learning process requires an evaluation process in order to determine the abilities students possess and the results educators achieve in the learning process. This process cannot be separated from the role of educators, including at the university level: lecturers evaluate and assess each student's ability in every subject taught. The evaluation process needs to follow good and correct procedures so that each student's ability is measured according to the actual situation [1]. A good assessment requires an equally good instrument [2]. Assessment in the learning process is a step toward improving the learning process and the quality of student learning, and its results become material for reflection for lecturers and students. For the purpose of assessment to be achieved, a planned, gradual, and continuous evaluation process is needed to obtain a picture of each student's learning development.
Assessment in the learning process is a process of collecting, analyzing, and interpreting information to determine the level of achievement of learning objectives over time [3][4]. The assessment carried out so far aims to gather information that serves as the basis for knowing students' progress, development, and learning achievement, and for knowing a lecturer's success in the learning process. Students' cognitive abilities can be measured using the test method; selecting the right test method is expected to measure each student's ability well.
* Corresponding author: natsir_fkip@unmus.ac.id
In this study, measurement was carried out using a test instrument designed to measure students' ability in junior high school mathematics, especially algebra. The instrument was designed with attention to the construct component and the descriptive component. The construct component comprises items that indicate the answers to be obtained, from the low stage to the high stage. The questions in this study produce answers over a certain qualitative range, namely two alternatives, true or false, known as a dichotomous response pattern [5]. The descriptive component explains several points on certain aspects; the aspect considered in this study is students' ability in working on algebra problems.
The quality of the test instrument used can be analyzed using the Rasch model. Fundamentally, the Rasch model is a measurement model based on modern test theory that relates the respondent's ability to answer the test or questionnaire to the level of difficulty of each item making up the instrument [5][6][7]. As a newer measurement system, the Rasch model aims to overcome the limitations of the classical measurement system, or Classical Test Theory (CTT), that has been used so far [8][9][10]. In classical measurement, item parameters in the form of item difficulty level and item discrimination index are group dependent [11]. In terms of item difficulty, the classification of an item's difficulty level will change when the item is given to a different sample group [12], whereas in terms of the discrimination index, higher scores tend to be obtained from heterogeneous samples and lower scores from homogeneous samples [13]. In modern test theory, by contrast, item parameters do not change even though they are estimated from different sample groups [14]. This means that modern test theory provides a uniform measurement scale [12], so that sample groups can be tested with different sets of items according to their ability level, and the scores obtained can be compared directly [15].
The Rasch model has the advantage of predicting missing data based on a systematic response pattern, which cannot be done in classical theory [16]. The Rasch model is based on two principles, namely the ability of the subject and the relationship between that ability and the items. The ability of the subject in this case is the student's ability on a question, which can be predicted using a set of factors called traits. A trait is a dimension of individual ability consisting of cognitive, psychomotor, and verbal abilities. The relationship between the subject's ability on a question and other abilities can be described in an item characteristic curve [17]. In the Rasch model, students with high ability have a greater chance of answering a question correctly than other students. Conversely, students with low ability have a smaller chance of answering correctly on questions with a higher level of difficulty [18].
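The relationship described above, between a student's ability and an item's difficulty, can be illustrated with the dichotomous Rasch model's logistic function. The sketch below is illustrative only (the function name and variables are ours, not part of the study's analysis): the probability of a correct answer depends solely on the difference between ability and difficulty, both on the logit scale.

```python
import math

def rasch_probability(theta: float, delta: float) -> float:
    """Probability that a person with ability theta (logits) answers an
    item of difficulty delta (logits) correctly, under the dichotomous
    Rasch model: P = exp(theta - delta) / (1 + exp(theta - delta))."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

# When ability equals difficulty, the probability of success is 0.5;
# higher ability (relative to difficulty) raises it, lower ability reduces it.
```

This makes the statement in the text concrete: a high-ability student (large theta) has a greater chance of success on any given item than a lower-ability student facing the same item difficulty.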
In the Rasch model, the scores generated from the test instrument are used to determine the Outfit MNSQ, Outfit ZSTD, Point Measure Correlation, item reliability, and Cronbach's Alpha. The Outfit MNSQ is useful for examining the fit of the data to the model. The expected mean square value is 1 (one). If the infit mean square value is greater than one, the instrument shows more variation than the Rasch model predicts; if it is less than one, the instrument shows less variation than the model predicts [1].
Referring to the explanation above, the purpose of this study is to determine the quality of the instruments used to measure students' abilities on the competencies tested on algebraic material using the Rasch model approach.

Research Methods
The approach used in this study is a quantitative approach, with a total sample of 37 students who have attended Junior High School Mathematics courses. The instrument used in this study was a multiple-choice test with four answer choices, covering algebraic material that the students had studied at the junior secondary level. The number of test items used was 30, with easy, medium, and difficult item difficulty levels. The test results are scores that were analyzed using the Winsteps Rasch software [19] to determine fit and misfit items, and to obtain the Cronbach's Alpha value resulting from the overall item reliability test. The Outfit MNSQ, Outfit ZSTD, and overall item correlation values set the limits for items declared fit to the model: an Outfit MNSQ value of 0.5 < MNSQ < 1.5, an Outfit ZSTD value of -2.0 < ZSTD < 2.0, and a correlation with the total score (Point Measure Correlation) of 0.4 < Pt Meas Corr < 0.85 [18][20].

Results and Discussion

Validity Test Result
The quality of the ability test instrument on algebraic material is analyzed from the aspects of validity, reliability, item difficulty level, and ability. In the Rasch model, an item is judged valid if it meets the Outfit MNSQ, Outfit ZSTD, and Point Measure Correlation criteria given above [18].
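The three criteria stated in the Research Methods can be expressed as a single check per item. The following is a minimal sketch of that decision rule (the function name and example values are ours, not Winsteps output); an item failing any one of the three ranges is flagged as invalid, as happened for 3 of the 30 items in this study.

```python
def item_is_fit(outfit_mnsq: float, outfit_zstd: float, pt_meas_corr: float) -> bool:
    """Apply the fit criteria used in the study:
    0.5 < Outfit MNSQ < 1.5, -2.0 < Outfit ZSTD < 2.0,
    0.4 < Point Measure Correlation < 0.85."""
    return (0.5 < outfit_mnsq < 1.5
            and -2.0 < outfit_zstd < 2.0
            and 0.4 < pt_meas_corr < 0.85)

# Hypothetical item statistics: the first satisfies all three ranges,
# the second has an Outfit MNSQ outside 0.5-1.5 and is flagged invalid.
fit_item = item_is_fit(1.02, 0.3, 0.55)
misfit_item = item_is_fit(1.82, 2.4, 0.21)
```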
Based on the item validity analysis of the students' ability test instrument on algebraic material using the Rasch model, 27 items were found valid and 3 items invalid, because the latter did not meet the requirements for Outfit MNSQ, Outfit ZSTD, and Point Measure Correlation. The results of the reliability analysis of the instrument can be seen in Table 3. The estimated reliability of the test scores falls into the good category. This judgment is based on two reliability coefficients, item reliability and person reliability. Both coefficients are interpreted with the same provisions used for reliability in classical measurement, namely by looking at the Cronbach's Alpha value [19]. The closer the obtained reliability value is to 1, the more internally consistent the measure [20]. In this study, the person reliability and item reliability are 0.74 and 0.84, respectively.

Item Difficulty Level
The difficulty level of an item indicates how many respondents are likely to answer the item correctly. In the Rasch model, item difficulty is categorized based on the logit measure and the standard deviation (SD) of the item logit values, divided into four categories [18] as follows.
Measure logit < -SD logit : very easy items
-SD logit ≤ Measure logit ≤ 0 : easy items
0 < Measure logit ≤ SD logit : difficult items
Measure logit > SD logit : very difficult items

The results of the analysis of the students' ability test instrument on algebraic material, for the difficulty of each item, can be seen in Table 4 and Table 5. Based on Tables 4 and 5, the difficulty level of the items in the Rasch model is read from the measure value, in logit units, of each item. A total of 7 questions are in the very easy category, namely items 18, 11, 03, 01, 12, 08, and 28. A total of 7 questions are in the easy category, namely items 10, 20, 19, 30, 16, 21, and 4. A total of 11 items are in the difficult category, namely items 2, 7, 9, 24, 26, 14, 23, 15, 22, 25, and 27. A total of 5 items are in the very difficult category, namely items 6, 17, 5, 13, and 29.
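The four-way classification above can be written directly as a decision rule. The sketch below is illustrative (the function name is ours, and the thresholds assume the SD of the item logit measures is supplied, as described in the text):

```python
def difficulty_category(measure: float, sd: float) -> str:
    """Classify an item's difficulty from its logit measure using the
    four categories defined by the SD of the item logit values."""
    if measure < -sd:
        return "very easy"
    if measure <= 0:
        return "easy"
    if measure <= sd:
        return "difficult"
    return "very difficult"

# With a hypothetical SD of 1.0 logit, an item at -1.5 logits is very easy
# and an item at +1.5 logits is very difficult.
```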

Ability Level
In the Rasch model, an analysis of the ability level is carried out to distinguish students who are able to answer the questions from those who are not. The results of the ability-level analysis can be seen in Figure 1. Figure 1 shows that students with serial numbers 18 and 4 are classified as person misfits. This is because these students have an unusual answer pattern: they are unable to answer correctly on items classified as easy, yet answer correctly on items classified as difficult. Students with lower ability levels should have little chance of solving problems classified as difficult, so the correct answers given by students 18 and 4 are probably just guesses that happen to be true, known as lucky guessing; alternatively, those answers may be the result of cheating. This is in line with [21] and [22], who state that there are five causes of person misfit. Cheating, or copying answers from friends or other exam participants, refers to unfair behavior in which answers are obtained even though the items cannot actually be answered correctly. Careless responding occurs when someone answers correctly on items classified as difficult yet fails to answer correctly on items classified as easy. Lucky guessing occurs when someone works on a given problem by guessing and the answer is correct, without knowing why it is correct. Creative responding happens only to high-ability examinees who respond incorrectly to an item that is actually quite easy, because they interpret the item in a unique and creative way. Finally, random responding refers to a situation where someone chooses answer options at random, regardless of whether the selected option is true or false.
Furthermore, person misfit can also occur when someone answers all items correctly, producing an extreme score so that the fit statistics cannot be measured (overfit). According to [23], person fit measurement not only identifies improbable response patterns but also identifies highly probable ones, so that if too much uncertainty is predicted, it indicates a constraint on the responses.

Conclusion
The results showed that the quality of the questions measuring students' abilities in algebraic material, analyzed through the Rasch model approach, was generally good, with 27 valid items out of 30. The reliability analysis is in the good category, with a person reliability of 0.74, an item reliability of 0.84, and a Cronbach's Alpha of 0.85. The difficulty analysis showed 7 items in the very easy category, 7 items in the easy category, 11 items in the difficult category, and 5 items in the very difficult category. Of the 30 questions, 1 item must be discarded because it is too easy. Students with serial numbers 18 and 4 are classified as person misfits in the Rasch model, because they show an unusual response pattern: answering correctly on items classified as difficult while answering incorrectly on items classified as easy.