An examination of descriptive statistical knowledge of 12th-grade secondary school students – comparing and analysing their answers to closed and open questions

In this article, we examine the conceptual knowledge of 12th-grade students in the field of descriptive statistics (hereafter statistics), how their knowledge is aligned with the output requirements, and how they can apply their conceptual knowledge in terms of means, graphs, and dispersion indicators. What is the proportion and the result of their answers to (semi-)open questions for which they have the necessary conceptual knowledge, but which they encounter less frequently (or not at all) in the classroom and during questioning? In spring 2020, before the outbreak of the pandemic in Hungary, a traditional-classroom, "paper-based" survey was conducted with 159 graduating students and their teachers from 3 secondary schools. According to the results of the survey, the majority of students have no difficulties in solving the type of tasks included in the final exam. Solving more complex, open-ended tasks with longer texts is more challenging, even though the students have all the tools to solve them, based on their conceptual knowledge and comprehension skills. A valuable supplement to the analysis and interpretation of the results is the student attitudes test, also included in the questionnaire.


Introduction
In Hungary, descriptive statistics has been included in curricula since the early 2000s (for public education in general). The first curricula required a low level of statistical literacy and basic conceptual knowledge. The curricular changes in the 2020s aimed at an expansion of the statistics curriculum and a higher level of statistical literacy development, adjusting to international trends (see the theoretical background section for more detail). In parallel, there has also been a shift in this direction in terms of output requirements. Related to this process, I aimed to investigate the descriptive statistical conceptual knowledge of grade 12 students and the extent to which they are able to apply it, in relation to the requirements valid at the given time. My motivation is to identify probable gaps and their possible solutions, and to make it possible to design development more effectively. In the long term, my aim is to adapt methods that enhance the development of statistical literacy and critical thinking, rather than a formula-based, calculation-focused approach.

Brief international overview
Since the early 2000s, there has been a remarkable amount of research on statistics education. In 2002, the first independent journal on statistical research was published, which facilitated further work for researchers in the field (Garfield et al., 2008). Until recently, statistics in most school curricula was limited to a formula-based approach. Students were not adequately prepared for their higher statistics studies and became adults with an almost complete lack of statistical literacy (Batanero, 2012). Statistical literacy and reasoning have been introduced into school and university curricula in a number of countries, underlining the increasing need to develop students' statistical literacy (Ben-Zvi & Makar, 2016). With a greater emphasis on teaching statistics and exploratory data analysis activities in recent curricula, more research is needed on students' reasoning about solutions to open-ended statistical problems (Batanero et al., 2003). Watson's 2003 study, which uses developmental levels of statistical literacy and observed components (text engagement, numeracy, statistical skills, and language use) to illustrate possible lesson planning activities, is fundamental to the study of teaching practice in secondary schools (Watson, 2003). Current international guidelines for the teaching of statistics focus on the development of students' literacy and thinking, and for this purpose, there is ongoing research on how to create the necessary classroom environment (Oliveira & Henriques, 2019).

Some early research from Hungary
The study of the relationship between academic and applied knowledge is not a new field in Hungary; it was the subject of numerous studies even before the 2000s. The central question about knowledge and its use is how well students can interpret the phenomena around them on the basis of what they have learned (B. Németh, 1998). Another important aspect is the extent to which the theoretical knowledge acquired will be used in the future and the extent to which it is content-related (Csapó & B. Németh, 1995). The answers to these questions are mostly not positive. Extensive research in the late 1990s indicated that schools do not seem to be sufficiently effective in developing thinking and that students do not perform well in solving practical problems. One explanation is that solving and answering routine tasks in school often provides little or no understanding (Korom, 1998). During this period, a major change took place affecting education in our country: the introduction of NAT (the National Core Curriculum, which is the top level of regulation) (OFI, 2009). As part of this process, statistics was introduced into primary and secondary education and has been present ever since.

Mathematics and reading skills - an overview of 2003-2018 tendencies through the PISA 2018 assessment
Considering the specifics of the subject, research on the teaching of statistics places a strong emphasis on reading comprehension skills in addition to mathematical skills. Therefore, before we discuss the current situation in Hungary further, there follows a brief overview of the last almost two decades, using the results of the 2018 PISA assessment.
One of the relevant findings of the 2018 PISA assessment is that, over a longer period, OECD countries' scores in mathematics have been declining since 2003, and reading literacy in these countries has also been on a decreasing trend since 2000 (Crato, 2021). Hungarian students also perform below the OECD average in reading, mathematics, and science. There is also a negative trend in Hungary's performance in science and (to a lesser extent) mathematics, and an increase in the proportion of low-achieving students (OECD, 2019). Hungary scores above average on TIMSS assessments, but TIMSS receives less attention in our country, and its results are less well-known and promoted (Kelly et al., 2020).

The current situation in Hungary
In Hungary, the position of descriptive statistics is best summed up by the following quote: "The requirements of this chapter are a compromise between two opposing trends: the fundamental social need for the subject, while its place in teaching is still very peripheral" (Tóth, 2006). The majority of the output requirements for statistics (valid until the spring examination period of 2023) require conceptual knowledge of graphs, means, and dispersion indicators, and knowledge of how to calculate and define them. They also include the visualization and comparison of data sets and the ability to read information from charts. Most of the statistics questions in the final examination are simple, reproductive, and require basic conceptual knowledge. Students solve these types of tasks effectively and with good results (Csapodi, 2016; Csapodi & Koncz, 2016). Teachers are basically satisfied with the current requirements and are willing to teach statistics at this level. As far as classroom work in general is concerned, the influence of the output requirements is above average in statistics education (Csapodi & Jánvári, 2022). The present survey was designed to obtain information on the extent to which the descriptive statistics curriculum is mastered and understood by the end of grade 12. As a tool for this purpose, the tasks included open (or semi-closed) problem sets requiring the students to formulate their own opinions, which is unusual compared to the everyday classroom routine. This type of task, and those aimed at formulating one's thoughts, are mostly used orally in the classroom as a supplementary activity.
In general, it can be stated that in Hungarian public education, the proportion of closed or apparently closed tasks is significantly higher than that of open tasks (Ambrus, 2004).
In all three tasks of the series analysed in detail below, there were open-type tasks requiring reasoning and statements, as well as simple, reproductive tasks similar to those in the matriculation examination, requiring basic conceptual knowledge. The tasks are open in the sense that they require a less structured, (partly) textual answer, with no opportunity for giving a simple "yes" or "no" answer (Capraro, 2012). Answering some of the sub-questions in the worksheet requires backward thinking and the analysis and comparison of two options. Solving these problems highlights the cognitive abilities of individuals. However, the problem-solving process is influenced not only by cognitive abilities, but also by the learners' self-control and self-regulatory processes (Schoenfeld, 1985). Metacognition, in terms of reading comprehension, attention, and problem-solving strategies, plays a major role in solving open-ended tasks (Flavell, 1979). When assessing the effectiveness of problem-solving, it is important to consider the individual's own belief system and experiences regarding him/herself, mathematics, and problem-solving itself (Schoenfeld, 2013).
In assessing and coding the questionnaires, we seek answers to the research questions outlined below.

Research questions
(RQ1) Do 12th-grade students leaving secondary school have adequate knowledge of the statistical content required by the curriculum, measured against the output requirements?
(RQ2) Are the same students able to apply the knowledge they have? Can they make decisions based on their conceptual knowledge and justify their decisions?

Methodology
Sampling, data collection
Related to my research topic, I planned to have the compiled worksheet completed by 12th-grade students at the end of the spring semester.
As originally planned and prepared, we contacted four secondary schools, with a target population of about 200 students. However, due to the closures imposed by the pandemic in 2020, the worksheet was finally completed in only three secondary schools (located in Budapest, Debrecen, and Gödöllő). In total, 159 students from the three secondary schools involved participated in the survey. Although the sample is not representative, the diversity of the schools' geographical locations and the heterogeneous composition of the groups make it suitable for drawing general conclusions. The questionnaires were completed on paper by students in a mathematics class with their teachers, under conditions similar to those for writing an essay. The completion took place in February-March 2020, which for grade 12 students is usually already a time of revision. In the period immediately before the survey, students had neither learned any new descriptive statistics content nor spent time on thematic revision.
In addition to the survey, the teachers who participated in the implementation of the survey (all the teachers involved in the preparation and implementation have at least 10-15 years of teaching experience) also completed an attitude test and a series of questions requiring detailed answers. Thus, the background information allows more relevant conclusions about the results and the issues raised. Also based on the experience of this small-scale teacher completion, an online questionnaire survey involving 181 teachers was designed and subsequently conducted.
According to the original plans, after the survey had been conducted I wanted to interview some of the participating students in each of the three secondary schools. However, due to the circumstances, this was possible only at my own workplace. The interview experience will be used to support the final evaluation of the results in the conclusions section.

Structure of the worksheet
The worksheet is based on descriptive statistical concepts and their application as required by the intermediate-level matriculation requirements. It consists of three tasks (with several parts within each task) and a short attitude test, which together take 45 minutes to complete. Before the full-scale administration, a mock test was carried out with some students; based on the experience gained, some minor wording corrections were made, and we limited the time needed to complete the tasks to 35-40 minutes. The students were asked to fill out the attitude test only if they had enough time left after finishing the exercises.
The three tasks of the worksheet are structured around three thematic concepts: means, charts, and dispersion indicators. The structure of the exercises is based on the framework of my previous work (Jánvári, 2020). The tasks marked "A" are very simple, short-worded tasks within the given concepts. These tasks do not require complex thinking or advanced reading comprehension skills. They are built on basic conceptual knowledge and involve routine tasks at a level corresponding to the output requirements of the subject.
The figure below shows part "A" of the task concerning dispersion indicators. Tasks labelled "B" are more complex, may require an intermediate step, and sometimes require comparison. The text of these problem parts is longer than that of the problems marked "A", and the wording is more practical. Solving these sections may require backward thinking and the linking of several basic concepts. Figure 4 shows part "B" of the dispersion indicators task.
The third group comprises the "C" class tasks. These are complex tasks, and most of them are phrased with longer wording. Their solution requires a higher, application-level knowledge of the elements of the conceptual framework of the given question. Calculating the given mean or dispersion indicators, or interpreting a given diagram or graph (or comparing graphs), is only part of the solution: an intermediate step, but not the answer itself. The wording of the tasks marked "C" usually requires a statement, reasoning, deduction, and justification (based on calculation and/or conceptual knowledge). Figure 5 shows part "C" of the task concerning dispersion indicators. The expected knowledge and activities according to the task parts are summarised in Table 1 ('T' means task part, 'C' means classification according to the framework, and 'K/A' means knowledge and ability).
Table 1. Description of the knowledge and skills required in the different parts of the tasks

Aspects of the analysis
The evaluation and analysis of students' solutions involve both a classic scoring system and coding. I have also considered whether the task is closed or (semi-)open and at which level of the framework (A, B, C) the knowledge is required; a split of the above is shown in Table 2. Task parts 1(c), 3(a), and 3(c) have been further subcategorized. To develop the categorization and coding, I drew on a 2018 work on the analysis of mathematical reasoning arguments (Sukirwan, 2018). The subcategories were developed considering the following criteria: Did the student answer the question? Was the answer correct? If so, was it well supported? The categories were assigned codes 0-3 according to the criteria in Table 3.

Results
Table 4 summarises the scores for open and closed items per task ('TP' means total points, 'AP' means average points, and 'AP' in % means the average as a percentage of the total score available; 'SD' denotes the standard deviation of the points, and 'RSD' denotes the relative standard deviation in %). The average score for closed questions related to Tasks 1 and 2 is generally very high (with relatively low variance), which suggests that the majority of students have no difficulty in solving tasks requiring them to calculate means, write suitable series of data, and interpret simple diagrams. One possible explanation is that the solution of these types of tasks is included in the detailed matura (school-leaving exam) requirements, so that, given the specificities mentioned above, the greatest emphasis is placed on practising them. It is a typical characteristic of public education that classroom work is strongly determined by the output requirements; this phenomenon is strongly present in descriptive statistics (Csapodi & Jánvári, 2022). Most of the closed task parts in Task 3 require the calculation of standard deviation. The average result of 56% indicates that even some of the basic calculations are problematic. The output requirements include the calculation of variance, but experience has shown that it is difficult for students to understand dispersion indicators and to interpret the available formulae (a collection of formulae is available and can be used in the written examination). The use of calculators is also allowed, subject to certain type restrictions, which further complicates the picture. Calculator use raises several issues: not all students have access to calculators of the same quality; calculators do not all work in the same way; and students are not always able to use them properly or are not aware of the functions available.
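The indicators named above (AP, AP in %, SD, RSD) can be illustrated with a minimal sketch. The scores below are hypothetical and not taken from the survey, the function name `summarise` is illustrative only, and the use of the population standard deviation is an assumption, since the text does not state which variant was used.

```python
from statistics import mean, pstdev


def summarise(scores, total_points):
    """Return AP, AP in %, SD and RSD in % for one group of items.

    Assumption: SD is the population standard deviation (pstdev);
    the article does not specify which variant was used.
    """
    ap = mean(scores)
    sd = pstdev(scores)
    return {
        "AP": ap,                                   # average points
        "AP%": 100 * ap / total_points,             # average as % of total points
        "SD": sd,                                   # standard deviation of points
        "RSD%": 100 * sd / ap if ap else float("nan"),  # SD relative to the mean
    }


# Hypothetical scores of five students on items worth 4 points in total.
example = summarise([4, 3, 4, 2, 2], total_points=4)
print(example)
```

A high RSD for a group of items signals heterogeneous results even when the average itself looks acceptable, which is the sense in which the open items are compared with the closed ones.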
The average score of the open-ended parts of the tasks is significantly lower, and their standard deviation and relative standard deviation are significantly higher, indicating the variability and heterogeneity of the results. The use of these types of tasks is very marginal, almost non-existent, in the classroom (Csapodi & Jánvári, 2022). In addition, understanding longer-form tasks requires more advanced reading comprehension skills and a more complex conceptual understanding. All these factors might be behind the results obtained.
Table 5 shows the total scores that can be obtained in the worksheet for categories A, B, and C of the framework ('TP' means total points, 'AP' means average points, and 'AP' in % means the average as a percentage of the total score available; 'SD' denotes the standard deviation of the points, and 'RSD' denotes the relative standard deviation in %). The data suggest that more complex tasks, which require more autonomy and are more unusual in their wording, are solved less successfully and with significantly more heterogeneous results. The lower mean and higher standard deviation for C-type tasks may indicate that students do not have the same level of basic knowledge and routine for these task types as for the task types that are closer to the output requirements. The varying level of performance may reflect the need for greater autonomy and the diversity of students' reading comprehension and other competencies.
In this section, we examine the proportion of students who are successful in A-type and/or B-type tasks, and the proportion who are successful in C-type tasks as well. This analysis is divided into two parts. Table 6 shows the type and results of Exercises 1 and 2. For these tasks, the mean score was relatively high, and the standard deviation was low for A-type. In Table 6, the split was based in each case on scores of at least 80% versus less than 80%. (The lower limit for grade five on the final examination is 80%; groups are divided into two sub-groups in each case according to this value.) Let 'p' denote the points and 'nr' the number of students.
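The grouping by threshold described above can be sketched as follows. The maximum scores and the student records are hypothetical, the function names are illustrative only, and integer arithmetic is used so that a score exactly at 80% is counted as successful without rounding issues.

```python
def is_successful(points, max_points, threshold_percent=80):
    """True if the score reaches the threshold (at least 80% by default)."""
    return points * 100 >= threshold_percent * max_points


def group_counts(students, max_points, threshold_percent=80):
    """Count students by their success pattern over A-, B- and C-type items.

    Each key is a tuple (A successful, B successful, C successful).
    """
    counts = {}
    for scores in students:
        key = tuple(
            is_successful(scores[t], max_points[t], threshold_percent)
            for t in "ABC"
        )
        counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical maximum scores and student results (not the survey data).
max_points = {"A": 5, "B": 5, "C": 5}
students = [
    {"A": 5, "B": 4, "C": 2},
    {"A": 4, "B": 3, "C": 1},
    {"A": 2, "B": 1, "C": 0},
    {"A": 5, "B": 5, "C": 4},
]
counts = group_counts(students, max_points)
print(counts)
```

Passing `threshold_percent=75` would reproduce the alternative split used where the maximum score is 4 points, so that 3 out of 4 points still counts as successful.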
Taking the results into account, we can conclude that the majority of students are successful in the basic tasks on means and graphs.However, only about half of the students who performed well on the basic A-type tasks performed at a similar level on more complex B-type tasks, and less than half of these students also performed well on the C-type subitem.
Table 6. Data on Exercises 1 and 2 - dividing students into groups according to their success in sub-items A-, B-, and C-type
Exercise 3 is examined separately in Table 7 because, on the one hand, the standard deviation of the sub-scores of the task is high and, on the other hand, based on my own experience, I assumed more errors and point losses in the parts of the task that require basic calculation. In this case, I set the limit of the split at 75% instead of 80%, so that, with a maximum score of 4 points, not only a completely correct solution counts as successful.
Table 7. Data on Exercise 3 - dividing students into groups according to their success in sub-items A-, B-, and C-type
In the A-type tasks, just over 50% of students scored 4-5 points, a significantly lower proportion than in Tasks 1-2. For the B-type tasks, just over a third of students scored highly. For C-type, the previous trend can be observed, with about half of the students scoring well in this category as well. More than half of the students who score low in A do not perform well in either B- or C-type. The value 23 in the last column is an interesting, perhaps surprising result. These 23 students perform at an average or weak level in A- and B-type, while they solve the sub-question that requires no direct calculation, is based on conceptual knowledge, and asks for a decision and justification without error or with minimal loss of points. This phenomenon suggests that many students have problems calculating the standard deviation, i.e., they have mathematical, technical (lacking a calculator or experience in using one), or formula-related difficulties, but they can still reason well based on their conceptual knowledge. Of course, the time factor must also be taken into account, since those who spent little time on calculations in Task 3 had more time to solve item c).
To get further information about the answers to C-type items, consider the analysis of some of the items requiring textual answers. A summary of the coding for items 1. c), 3. a), and 3. c) is given in Table 8. The proportion of omitted answers is in no case less than 10%. This is a high value considering the completion rate of the other parts of the question. Both items 1. c) and 3. c) are complex tasks requiring the expression of an individual opinion. Their coding results are similar. The proportion of incomplete and non-assessable answers is around 30% in both cases, while the proportion of correct but incomplete answers is around 50%. Item 3. a) required a short, one-sentence answer to the question of how well the calculated standard deviation (with a given mean) describes the data set. The question and the expected answer are neither extensive nor complex, but rather unusual. Although the non-response rate is still 12%, the surprising thing is the almost 50% incorrect response rate. Among other factors, this can be explained by the quality of the available task sets and the specific nature of classroom activities in statistics.
Among the colleagues involved in the completion of the questionnaire, Tasks 1 and 2 are the most popular, and there is general support for a greater emphasis on the interpretation of the calculated values in the classroom. In the longer, more complex tasks, they consider that reading comprehension carries too much weight, which can create additional difficulties for students. An interesting additional piece of information is that many disagree with teaching and assessing the calculation of standard deviation, both in terms of curriculum and output requirements.
The student attitudes test consisted of 9 statements, which were rated on a 5-point Likert scale according to how much the students agreed with them. The statements concerned their attitudes towards learning statistics, their self-assessment and achievement in relation to certain criteria, and their classroom habits related to learning statistical content. Based on the summary and evaluation of the tests, it can be concluded that the majority of students like mathematics lessons with statistics content and find them useful (S1, S2, S8), understand the types of problems they are asked to solve, and solve them with good results (S3, S4, S7). Nearly 40% of the students who responded said that they interpret and analyse the values calculated in class (S5, S6). A surprising result is that nearly 50% of the students said that they could apply statistics only to a limited extent or not at all in other lessons.
The interview, which was organized with a small number of students, showed that the students' overall impression of the worksheet was positive. They felt that they had all the necessary conceptual knowledge to answer the questions; however, it was still unusual for them that in a mathematics problem they not only had to do the arithmetic, but also had to give several textual answers and express their point of view. They also reported that they had problems with the calculator and the function table, and found it difficult to solve problems involving standard deviation. In conclusion, they said that they had prepared for the exam using the previous years' exam papers and that they had no problems solving the types of exercises they found therein.

Conclusions
To summarise the above findings, we can say that grade 12 students have the knowledge required in the curriculum to meet the outcome requirements (RQ1). They solve the basic type tasks with almost full completion and with very good results (RQ1). However, interpreting the concepts learned and formulating an opinion or reason based on the knowledge acquired is already a difficulty for many (RQ2). The student interview mentioned in the introductory section revealed that questions such as those of type C are not included in the lessons, and students neither have the opportunity to express their opinions and reasons nor are they expected to do so. Regarding the conceptual knowledge needed to fill in the worksheet, students said that they felt they had all the concepts and knowledge needed to solve the problems. With regard to standard deviation, several students reported difficulties in the calculation: they could not interpret and use the formulas (given in the formula collection) and had no simple method for doing the calculation. For these reasons, many do not even get to the point of understanding it. The omitted and incorrect answers in the case of tasks requiring textual answers also support the idea that the specific didactic design of descriptive statistics lessons, with their output focus, leaves less time for including, for example, interpretation and evaluation in the lesson (RQ2).
Both the short- and longer-term objective is to introduce and teach descriptive statistics through a significantly broader set of tasks beyond the output requirements, which will appear in the framework curriculum. Learning to calculate the standard deviation could be greatly facilitated by teaching additional indicators of dispersion, such as the mean absolute deviation, which is also required in the framework curriculum, and by exploring their relationship with the standard deviation. In the area of using tools, there are further questions about whether students can actually handle them properly, and whether access to the permitted tools can be ensured for all.
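The relationship between the two dispersion indicators can be illustrated with a small numerical sketch. The data set is hypothetical, the function name is illustrative only, and the population standard deviation is assumed. For any data set the mean absolute deviation cannot exceed the standard deviation, which is a general mathematical fact and may serve as a bridge between the two concepts in class.

```python
from statistics import mean, pstdev


def mean_absolute_deviation(data):
    """Average of the absolute deviations from the arithmetic mean."""
    m = mean(data)
    return mean([abs(x - m) for x in data])


# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mad = mean_absolute_deviation(data)  # 1.5
sd = pstdev(data)                    # 2.0 (population standard deviation)

print(f"mean absolute deviation: {mad}")
print(f"standard deviation:      {sd}")
```

Both indicators start from the same deviations from the mean; the mean absolute deviation averages their absolute values, while the standard deviation averages their squares and then takes the square root, which gives larger deviations more weight.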

Figure 1. An example of a task in part I of the intermediate level matura exam (2022)

Figure 3. Example of an "A" class task

Figure 4. Example of a "B" class task

Table 2. Distribution of the scores according to the A, B, and C classifications of the framework ('T' means task part, 'C/O' means closed or open; the columns below the classification show the available score for each class)

Figure 6. Total of the scores for the statements of the student attitudes test

Table 3. Criteria for grouping answers with reasoning

Table 4. Summary of the results of the open and closed items

Table 5. Data on the summary of the answers by categories A, B, and C

Table 8. Data on the summary of the coded answers