Chance and data: New opportunities provided by the graphics calculator
Barry Kissane
School of Education Murdoch University
kissane@murdoch.edu.au
Abstract
The graphics calculator has found many uses in secondary school algebra and calculus, but possibilities in relation to chance and data have been less well publicised. This paper provides an analysis of the potential for using graphics calculators in the chance and data components of school curricula. Differences between scientific and graphics calculators are highlighted, most notably concerned with an aspect of the storage of data. A number of kinds of graphics calculator use are described and exemplified: graphical and numerical data analysis, data transformations, simulation, statistical inference and automatic data collection by calculator. Implications of these for school curricula and for teaching practice are suggested.
Introduction
Although many more sophisticated forms of technology are available for mathematics, it has been argued that the graphics calculator is the most important in many school settings, because it is the most likely to be accessible to the great majority of students (Kissane 1995). Certainly this is the case in Australian secondary schools, in which computers are generally only available in laboratories. It similarly seems likely to be the case in many parts of South-East Asia, where resources for technology are stretched even more thinly than they are in affluent countries like Australia and the USA. Even where microcomputers are available in larger quantities, only rarely are they available in important examinations, so that curricula usually are not designed on an assumption that students will have access to them. This is not the case with graphics calculators, which are now officially sanctioned for use in examinations at the end of secondary school in a number of Western countries, including the UK, the USA and parts of Australia.
Perhaps because of the word used to describe them (whether 'graphics' or 'graphing'), graphics calculators have often been regarded mainly as devices that allow students to explore graphs of functions as well as engage in appropriate calculations. While they do serve such useful purposes, their capabilities are not restricted to these, and a major purpose of this paper is to survey the kinds of ways in which graphics calculators might be used to enhance, and even to influence, curriculum and teaching in the particular areas of chance and data.
The terms 'chance' and 'data' are carefully chosen, and have been in common usage in Australia for some years now. They are not quite synonyms for probability and statistics, the main difference being with the level of formality of approach expected. The formal study of probability is very difficult for many secondary school students, and all too frequently reduces to the use of an algebra of probabilities, with little intuitive feel for the content, and many misconceptions. The term 'chance' acknowledges the importance of students acquiring some understanding of chance processes and some experience with random phenomena, with a view to a more thorough study of formal probability possibly being undertaken at a later point. Similarly, the term 'data' reflects a view that collecting, summarising, displaying and evaluating data are increasingly important components of the school curriculum, but the more formal study of statistics (including mathematical statistics) occurs later.
Although students can (and do) encounter elements of chance and data elsewhere in their schooling, the most focussed attention occurs in the mathematics curriculum, reflected in the inclusion of a strand called 'Chance and data' in A national statement on mathematics for Australian schools (Australian Education Council 1990). The statistician David Moore has emphasised the importance of chance and data in school curricula:
Probability is a field within mathematics. Statistics, like physics or economics, is an independent discipline that makes heavy and essential use of mathematics.
Statistics has some claim to being a fundamental method of human inquiry, a general way of thinking that is more important than any of the specific facts or techniques that make up the discipline. If the purpose of education is to develop broad intellectual skills, statistics merits an essential place in teaching and learning. Education should introduce students to literary and historical methods; to the political and social analysis of human societies; to the probing of nature by experimental science; and to the power of abstraction and deduction in mathematics. Reasoning from uncertain empirical data is a similarly powerful and pervasive intellectual method.
Why teach about data and chance? Statistics and probability are useful in practice. Data analysis in particular helps the learning of basic mathematics. But, most important, it is because statistics is an independent and fundamental intellectual method that it deserves attention in the school curriculum. (1990 pp 134-5)
Merely a better scientific calculator?
At first sight, the most conspicuous difference between a scientific calculator and a graphics calculator as far as chance and data are concerned might be regarded as the graphics screen. This is understandable, and it is certainly the most visible difference. However, of more significance than the display capabilities of the screen is the fact that the data themselves are stored, rather than summaries of them. Scientific calculators use about five or six memory locations to store key data summaries (such as sums of squares and cross-products). If data are incorrectly entered into the calculator (which is quite easy to do, with a small keyboard and large fingers), they can be corrected by a complex process of the calculator removing the data from the relevant summaries. Of course, such errors can only be corrected if they are first detected; unless gross errors are made, they may well not be detected at all. Thus, a scientific calculator has a separate process for removing data as well as one for entering data.
In contrast, the graphics calculator stores all data, with profound implications for data analysis. In the first place, data entered can be checked against original data, and readily altered if errors are detected. Secondly, data for which significant doubt is raised (such as outliers, which may be original recording errors) can easily be identified and removed. But most importantly, the graphics calculator can be used to undertake a variety of analyses of data, rather than being restricted to a single analysis, which is characteristic of a scientific calculator. That is, a graphics calculator allows (and expects) the user to make decisions about the most appropriate form of analysis.
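The contrast between the two kinds of storage can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how any particular calculator is implemented; the class name SummaryOnlyStats and the data values are hypothetical.

```python
import math


class SummaryOnlyStats:
    """Mimics a scientific calculator: only running totals are kept."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_x2 = 0.0

    def enter(self, x):
        self.n += 1
        self.sum_x += x
        self.sum_x2 += x * x

    def remove(self, x):
        # A separate removal process: the value is subtracted from each total.
        self.n -= 1
        self.sum_x -= x
        self.sum_x2 -= x * x

    def mean(self):
        return self.sum_x / self.n

    def sample_sd(self):
        return math.sqrt((self.sum_x2 - self.sum_x ** 2 / self.n) / (self.n - 1))


# Hypothetical data in which 15 was mistyped as 51.
entered = [12, 15, 51, 14]

summary = SummaryOnlyStats()
for value in entered:
    summary.enter(value)
summary.remove(51)  # the error has to be noticed without seeing the data
summary.enter(15)

stored = list(entered)  # graphics calculator: the list itself is kept
stored[2] = 15          # the error is visible in the list and edited in place

print(summary.mean(), sum(stored) / len(stored))
```

Both routes give the same mean once the error is corrected, but only the stored list can be inspected, plotted or analysed again in a different way.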
With respect to activities concerned with chance phenomena, there may also not appear to be much difference at first between the capabilities of (most) graphics calculators and their scientific calculator ancestors. However, the critical difference is the easy way in which random numbers can be transformed to suit particular purposes, and the fact that a lot of data can be generated fairly quickly.
So, the graphics calculator is much more than a better version of the scientific calculator - it is really a different species of device. The next section contains some descriptions of how the more powerful capabilities can be used.
New possibilities
This section contains brief descriptions of the main ways in which graphics calculators can be used to enhance the teaching and learning of data analysis and chance. Space precludes an exhaustive analysis of these or the provision of many examples, but it is hoped that the main ideas are catalogued.
Numerical and graphical data analysis
The first benefit of the graphics calculator over its predecessors is that data entry can be checked, and thus outliers can be more easily detected and corrected. Whether on a histogram, a box plot, a line graph or a scatter plot, outliers by their nature are physically obvious - they lie 'outside' the pattern formed by the rest of the data. Having a graphical representation of data may encourage students to question their data more carefully, and check to see whether the outliers appear to be genuine data that are surprisingly different from other data for an unknown reason, or whether they are merely measurement, recording or data entry errors, which should be corrected before any analysis of the data is undertaken.
As already noted, a graphics calculator can exploit its graphics screen to represent data graphically as well as numerically. In some cases, the most appropriate form of analysis involves a graphic representation, such as when the shape of a distribution is of interest (with, say, a histogram being used) or the shape of several distributions being compared (with, say, a number of box plots on the screen at once).
In other cases, the graphics representation is used to help guide the analysis. The best example involves a scatter plot for bivariate data. The major value of a scatter plot is to allow the user to see what kind of relationship, if any, there appears to be between two variables, so that an appropriate representation of the data can be sought. Although students have long been advised to draw a scatter plot of data, in order to check linearity before fitting a least squares regression line, for example, only rarely will they do so by hand, since the process is too tedious. On a graphics calculator, in contrast, the scatter plot can be obtained extremely quickly, and so is likely to become part of the information used to decide on the next step in the analysis. Many graphics calculators also allow for more than one regression line to be drawn at once, so that the visual fit of data to a curve can be compared more easily. For example, the data below clearly fit the exponential model graphed better than the linear one:
Students relying on the numerical value of the correlation coefficient (r = 0.982 in this case) may be misled into thinking that the linear model is a good fit, without access to the alternative model drawn on a scatter plot.
The storage of data also allows for data transformations to be undertaken, for a variety of purposes. One useful kind of transformation may involve taking logarithms of one set of data from a bivariate pair, to linearise an exponential relationship. Modern graphics calculators have easy procedures available for effecting such transformations, without the need to tediously extract the logarithm of each data point. The screen below, taken from Kissane (1997), shows an example of this, for the Casio fx-7400G:
In the case above, a new transformed variable, List 1, has been created by the command log List 2. (Note that a new variable is created rather than replacing the original data with their logarithms. Although the latter option is also available, it is rarely wise to lose one's original data.) The effect of such a transformation is to reduce an exponential relationship to an apparently linear one, as shown below:
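For readers without the calculator to hand, the same kind of transformation can be sketched in Python. The data are hypothetical (chosen to roughly double at each step) and the helper function correlation is written out only for illustration. The sketch also shows how a fairly high correlation for the untransformed data can sit alongside a clearly better linear fit after taking logarithms.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data that roughly double at each step.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 4.3, 8.2, 15.8, 33.0, 63.5]

# A new list is created; the original y values are kept.
log_y = [math.log10(value) for value in y]

r_raw = correlation(x, y)
r_log = correlation(x, log_y)
print(round(r_raw, 3), round(r_log, 3))
```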
Other transformations may be germane to the situation to be analysed, such as adding scores on two or more variables to give a total score, rather than analysing each variable separately, or transforming data from one metric to another (say, from temperature measured in Fahrenheit degrees to temperature in Celsius degrees). A particularly useful transformation involves the production of residuals from a regression model. Once a regression model is chosen, predicted scores can be obtained with a suitable transformation, and residuals obtained by finding differences between observed and predicted scores. The resulting analysis of residuals can help decide whether a model appears to account for the data, or whether there are other effects operating.
For example, the screen above, taken from Kissane (1997), shows a plot of original scores against residuals from a regression model, exposing a clear pattern in the residuals, and thus clear evidence that the model does not yet fit the data well. This has both practical and conceptual importance for students.
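A minimal sketch of a residual analysis, in Python, may make the steps concrete. The data are hypothetical: they follow the curve y = x^2 exactly, and a straight line is deliberately fitted to them. The function name least_squares_line is an assumption made for this illustration.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data following a curve (y = x^2), fitted with a line anyway.
x = [0, 1, 2, 3, 4, 5, 6]
y = [0, 1, 4, 9, 16, 25, 36]

slope, intercept = least_squares_line(x, y)
predicted = [slope * xi + intercept for xi in x]

# Residual = observed score minus predicted score.
residuals = [yi - pi for yi, pi in zip(y, predicted)]
print(residuals)
```

The residuals are positive at both ends and negative in the middle. A systematic pattern of this kind, rather than a random scatter about zero, is the signal that a linear model is not adequate.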
Data loggers
As well as analysing data, modern graphics calculators from each of Casio, Hewlett Packard and Texas Instruments can be used to automatically collect and store physical data, using electronic probes connected to separate data collection units, which can in turn be connected to the calculators. Various kinds of probes are available, both from calculator manufacturers themselves and from third parties; common ones include those for measuring temperature, voltage, resistance, light intensity and motion, but there are many others as well. The screen below shows temperature data collected (by a Casio EA-100) from a hot cup of coffee cooling through contact with the ambient air.
The characteristic exponential shape provides some support for Newton's Law of Cooling, and students can undertake the necessary transformation to study more carefully the way in which the coffee cools. (The transformation involved requires calculating the difference between the temperature of the coffee and the ambient air temperature at each point.)
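The transformation can be sketched in Python as follows. The readings here are synthetic, generated from an assumed model T = 22 + 60 exp(-0.05t) rather than collected by a data logger, and the ambient temperature of 22 degrees is likewise an assumption. With real data the recovered constant would only be approximate.

```python
import math

AMBIENT = 22.0  # assumed room temperature, degrees Celsius

# Synthetic readings from the assumed model T = 22 + 60 * exp(-0.05 * t).
times = [0, 5, 10, 15, 20, 25, 30]  # minutes
temps = [AMBIENT + 60.0 * math.exp(-0.05 * t) for t in times]

# Transformation: excess over ambient temperature, then natural logarithm.
log_excess = [math.log(temp - AMBIENT) for temp in temps]

# Least squares slope of log(excess temperature) against time.
n = len(times)
mean_t = sum(times) / n
mean_l = sum(log_excess) / n
slope = (sum((t - mean_t) * (l - mean_l) for t, l in zip(times, log_excess))
         / sum((t - mean_t) ** 2 for t in times))

cooling_constant = -slope
print(round(cooling_constant, 4))
```

If the law holds, the transformed data lie on a straight line whose slope is the negative of the cooling constant.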
The significance of these data loggers (as they are sometimes called) is that students can be involved in setting up experiments to collect data, rather than merely being involved in analysing existing data. The context of the physical sciences is an obvious one for such data collection activities, and links between mathematics and science can be strengthened by activities that clearly highlight the significance of each. A popular use of this sort of equipment involves collecting data to verify physical science laws with real data; in many cases, the realities of data collection are such that data will not faithfully reflect the physical laws, of course, thus encouraging students to consider more critically than might otherwise have been the case, the design of their experiment, the data collection procedures, the nature of random errors, and so on. Such things are very important for students to learn in the physical sciences, and will help also in the mathematics teacher's quest to encourage students to develop a healthy scepticism for any data.
As well as collecting and analysing data in the classroom, data loggers can be used in two other ways. One involves classroom demonstrations, useful either when equipment or time is in short supply or when a teacher wants to deal with the whole class as a unit. Some of the demonstrations using motion detectors are especially effective in this regard, allowing students to see relationships between velocity, time, location, distance travelled and graphical representations of these. Both mathematics and science learning can be enhanced by such experiences. The demonstrations are most effective when an overhead projector attachment is used, but not all calculator models allow for this.
The other potential use of the data logger is outside the classroom, as a stand-alone unit for data collection, with the possibility of downloading the data into a calculator for later analysis. This is most clearly useful for biological and ecological studies, where field measurements (such as the pH of a local river) are needed; the data logger has clear advantages over computer versions of the same idea, which are more expensive, and much less portable. The facility to store data for downloading means that data can be easily shared among students in a class, with each person having the same raw data available for analysis.
Statistical inference
In the secondary school curriculum of most countries, data analysis rarely involves statistical inference, but is rather restricted to descriptive statistics. However, this is not always the case, and certainly the curriculum of the lower undergraduate years routinely involves statistical inference as a major organising idea. Once appropriate analysis of data has been undertaken, it is usually possible to extract enough information from the calculator to perform statistical tests by hand, together with suitable statistical tables. For example, statistical information such as means, standard deviations and sums of squares are always retrievable for this purpose. Alternatively, short programs can be written to use this kind of statistical information in the calculator to construct test statistics for comparison with tabulated values.
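As an indication of how short such a program can be, the following Python sketch builds a one-sample t statistic from the kind of summary information a calculator provides (mean, standard deviation and sample size). The sample values and the hypothesised mean of 5.0 are hypothetical, and the function name one_sample_t is chosen only for this illustration.

```python
import math
import statistics


def one_sample_t(mean, sd, n, mu0):
    """t statistic for testing a population mean of mu0."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error


# Hypothetical sample; the summaries are what a calculator would report.
sample = [5.1, 4.9, 5.3, 5.2, 5.0]
t = one_sample_t(statistics.mean(sample), statistics.stdev(sample),
                 len(sample), mu0=5.0)
print(round(t, 3))
```

The resulting statistic would then be compared with a tabulated value for n - 1 = 4 degrees of freedom.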
Some scientific calculators provide access to various points of the standard normal probability distribution. Graphics calculators provide access to some probability distributions too. Two examples are shown below for the Hewlett Packard HP-38G, which provides for upper tail probabilities of particular normal, Student's t and chi-squared probability distributions.
There are graphical versions of distributions available, too, such as those shown below from a Casio cfx-9850:
A bigger range of distributions is provided by the Texas Instruments TI-83, both numerically and graphically:
With the probability distributions available in the calculator itself, a limited amount of statistical inference can be undertaken without reference to anything other than the data and the calculator. Short programs could be written to do this, or the analysis can be conducted with the summary data in the calculator in normal computation mode.
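The kind of upper tail probability involved can be sketched in Python for the standard normal distribution, using the complementary error function from the standard library. This is an illustration of the calculation only, not the method used inside any calculator; the function name normal_upper_tail is hypothetical.

```python
import math


def normal_upper_tail(z):
    """P(Z > z) for a standard normal random variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Upper tail beyond z = 1.96, the familiar two-sided 5% critical value.
print(round(normal_upper_tail(1.96), 4))
```

A test statistic computed from the summary data can be passed directly to such a function, so that no printed table is needed.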
A very interesting recent development of these ideas is the incorporation by Texas Instruments of some hypothesis testing procedures into a calculator menu, linked with the generation of relevant probability distributions for use in conducting the tests. The most used parametric tests, involving inferences about means, standard deviations and contingency tables, as well as ANOVA are included. Results can be provided either with reference to a graphical display of the probability distribution or numerically, as shown below:
These impressive additions represent an important step forward for the graphics calculator, and it is hard to continue to defend the many courses in elementary statistics, especially in the early undergraduate years, which place so much emphasis on computation of statistics from raw data, and all too frequently leave too little learning time for the statistical ideas involved to be developed or understood. The personal technology of the graphics calculator can now deal with most of the computations expected of beginning students, in a manner that is frequently more accessible than that provided by microcomputers or mainframe terminals in a computer laboratory.
Of course, a calculator will not deal with the important conceptual ideas underlying inferential statistics. At some point, these must be developed and understood theoretically by students. What the calculator might do, however, is to remove the tedium of arithmetical calculation that has dogged the study of elementary statistics for too long. For example, it is difficult to defend the viewpoint that the calculation and manipulation of sums of squares adds much, except frustration, to a student learning about the analysis of variance. It seems better to leave some of the drudgery to the calculator, and hence free up some student time to think about the meaning of the statistical tests involved and the interpretation of their results.
Simulation
Although there are elements of probability associated with statistical testing, the major contribution of the graphics calculator to the study of random phenomena comes about through the use of simulation. All graphics calculators have commands that enable (pseudo-) random numbers to be generated. Adroit use of this kind of command can help students acquire important practical experience with chance phenomena and even address problems in new ways.
The critical attributes of the calculator here are the capacity to transform uniform random numbers on the interval (0,1) to more useful data, and the capacity to generate many repetitions quickly. For example, a calculator command such as Int (6Ran# + 1) can be used to transform a random number (Ran#) into a random integer in the set {1,2,3,4,5,6} and thus to simulate a toss of a standard die. With each press of an execute key, a new toss is simulated, so that a student can quickly produce data for analysis purposes. The screens below show an example of this, using the Casio cfx-9850 calculator to simulate six successive dice tosses.
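The transformation in Int (6Ran# + 1) carries over directly to other settings. In the Python sketch below, random.random() plays the role of Ran# and int() the role of Int; the function name toss_die and the seed are arbitrary choices for the illustration.

```python
import random


def toss_die(rng=random):
    """Simulate one toss of a standard die, as in Int (6Ran# + 1)."""
    # rng.random() lies in [0, 1), so 6 * r + 1 lies in [1, 7);
    # taking the integer part gives one of 1, 2, 3, 4, 5, 6.
    return int(6 * rng.random() + 1)


random.seed(1997)  # arbitrary seed, so that a run can be repeated
tosses = [toss_die() for _ in range(6)]
print(tosses)
```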
One powerful use of such a process in a classroom is for individual students to compare their results with each other, and for the whole class's data to be aggregated. Such practices provide the potential for students to see both the unpredictable nature of a random process and also the sense in which such data are quite predictable in the aggregate. For example, a teacher can confidently 'predict' that the mean dice score for the whole class put together will be close to 3.5, even though individual student distributions are not quite as they might expect.
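The effect of aggregation can be imitated in a short Python sketch. The class size of 30 and the 20 tosses per student are assumptions made for the illustration, as is the seed.

```python
import random
import statistics

random.seed(30)  # arbitrary seed, so that a run can be repeated

CLASS_SIZE = 30    # assumed number of students
TOSSES_EACH = 20   # assumed number of tosses per student

class_results = [[random.randint(1, 6) for _ in range(TOSSES_EACH)]
                 for _ in range(CLASS_SIZE)]

# Individual means vary noticeably from student to student ...
student_means = [statistics.mean(tosses) for tosses in class_results]

# ... while the mean of all 600 tosses is much more stable.
all_tosses = [toss for tosses in class_results for toss in tosses]
class_mean = statistics.mean(all_tosses)

print(min(student_means), max(student_means), class_mean)
```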
With only a small change to the command, different data can be generated. For example, the first command on the screen below shows how to simulate a random digit in the range from 0 to 9, while the second command shows how to simulate tossing a pair of coins and counting how many heads appear.
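The calculator screen is not reproduced here, so the Python sketch below shows one plausible form of each command; both are assumptions in the style of Int (10Ran#) and Int (2Ran#) + Int (2Ran#), and are not necessarily the commands on the original screen.

```python
import random


def random_digit(rng=random):
    """A random digit from 0 to 9, in the style of Int (10Ran#)."""
    return int(10 * rng.random())


def heads_in_two_coins(rng=random):
    """Number of heads (0, 1 or 2) when a pair of coins is tossed."""
    # Each coin is simulated separately: 0 for a tail, 1 for a head.
    return int(2 * rng.random()) + int(2 * rng.random())


random.seed(2)  # arbitrary seed
print([random_digit() for _ in range(10)])
print([heads_in_two_coins() for _ in range(10)])
```

Simulating each coin separately matters: a single command producing 0, 1 or 2 with equal chances would not model the coins correctly, since one head is twice as likely as either of the other outcomes.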
In recent years, there has been some emphasis on practical activity when learning about elementary notions of chance, prior to more formal treatments of probability as students become older and mathematically more sophisticated. The use of graphics calculators as one of the ways to collect information will assist this trend, without some of the practical difficulties of providing students with physical materials to use.
The simulation capabilities of calculators also permit Monte Carlo procedures for addressing mathematical problems to be used, at least in simple cases. For example, if cards of six pop stars are randomly distributed inside boxes of chocolates, students can simulate the number of boxes that need to be purchased in order to collect a complete set of cards, by randomly simulating dice rolls until all six scores have appeared, as shown in the two screens below:
In this case, 13 tosses were needed. Experiments of this kind can be undertaken easily by all students in a class, and a whole class distribution of results constructed. The formal analysis of such problems is well beyond most high school curricula, but plausible answers to such interesting questions can be readily suggested in this way.
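A whole-class experiment of this kind can be imitated in Python. The sketch below is a hypothetical program (the function name boxes_needed, the seed and the 2000 repetitions are arbitrary choices), repeating the simulated purchase many times and averaging the results.

```python
import random


def boxes_needed(cards=6, rng=random):
    """Simulate buying boxes until every one of the cards has appeared."""
    seen = set()
    boxes = 0
    while len(seen) < cards:
        seen.add(rng.randint(1, cards))  # the card found in this box
        boxes += 1
    return boxes


random.seed(6)  # arbitrary seed
results = [boxes_needed() for _ in range(2000)]
print(min(results), max(results), sum(results) / len(results))
```

For comparison, the standard theoretical result for this problem is an expected 6(1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6) = 14.7 boxes, so a single result of 13 is quite typical, and the simulated average should settle near 14.7.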
More sophisticated simulation capabilities have been provided on Texas Instruments' TI-83, including generating random integers in a certain range, as well as observations that are normally or binomially distributed, rather than being restricted to uniform distributions. The calculator also contains a command for generating sets of observations. For example, the screen below shows sets of six observations from a binomial distribution with n = 10 and p = 0.5. Together the screen provides a sample of 30 observations from this distribution, to help students get a feel for the likely results.
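A comparable sample can be produced in Python by building each binomial observation from n simulated trials. This is a sketch of the idea rather than of the calculator's own method; the function names and the seed are hypothetical.

```python
import random


def binomial_observation(n, p, rng=random):
    """Count the successes in n independent trials, each with probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)


def binomial_sample(size, n, p, rng=random):
    """A set of `size` observations from a binomial(n, p) distribution."""
    return [binomial_observation(n, p, rng) for _ in range(size)]


random.seed(83)  # arbitrary seed
# Five sets of six observations with n = 10 and p = 0.5: 30 in all.
sets = [binomial_sample(6, 10, 0.5) for _ in range(5)]
for observations in sets:
    print(observations)
```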
These capabilities extend still further the potential of the calculator to provide students with an experiential basis for their study of random phenomena, as well as the means to simulate more complex events fairly easily.
Finally, all of the simulation capabilities of calculators can be readily used in calculator programs, so that students are not restricted to repeated key pressing to generate data for analysis. The features of the graphics calculator as data analysis device and as simulation device can be combined together: random data can be generated and stored for later analysis in a few seconds, so that each student can obtain sufficient data for important trends to become clear. Much can be gained by a class of students each having access to a personal program of this kind.
Analysis of sequences
Although most data dealt with by calculators will consist of either real data or simulated data, it is worth noting that there is another use of inbuilt data analysis capabilities, to analyse data generated for a mathematical purpose. One example involves dealing with sequences and series. The screen below shows in the first column the positive integers, and in the second column the sequence of their partial sums - the corresponding series, automatically generated by the Casio cfx-9850:
The data analysis capabilities of the calculator can be used to study the relationships between these two sequences, as shown below:
It seems from the graph that there is a quadratic pattern involved, and the coefficients of the quadratic function that models these data (perfectly) allow students to see that the formula for the nth term of the series is given by S(n) = (n^2)/2 + n/2.
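The formula can be checked against the generated data in a few lines of Python. The sketch below does not carry out a quadratic regression; it simply builds the two columns and confirms that the quadratic reproduces every partial sum in the (arbitrarily chosen) range n = 1 to 20.

```python
from itertools import accumulate

# First column: the positive integers; second column: their partial sums.
n_values = list(range(1, 21))
partial_sums = list(accumulate(n_values))


def s(n):
    """The quadratic suggested by the regression: n^2/2 + n/2."""
    return n * n / 2 + n / 2


matches = all(s(n) == total for n, total in zip(n_values, partial_sums))
print(partial_sums[:6], matches)
```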
While this certainly does not constitute a proof of the relationship, it does serve the useful purpose of letting students see the connection in a new way. Indeed, from a practical point of view, the same idea allows students to find the (unique) parabola that fits three points, as shown below.
The screens show that the points (1,6), (2,11) and (-1,8) all lie on the parabola given by y = 2x^2 - x + 5.
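The coefficients can be confirmed independently. The Python sketch below does not use regression, as the calculator does, but finds the parabola directly from the three points by divided differences, with exact fractions to avoid rounding; the function name parabola_through is hypothetical.

```python
from fractions import Fraction


def parabola_through(p0, p1, p2):
    """Return (a, b, c) with y = a*x^2 + b*x + c through three points.

    The three x-values must be distinct.
    """
    (x0, y0), (x1, y1), (x2, y2) = [
        (Fraction(x), Fraction(y)) for x, y in (p0, p1, p2)
    ]
    slope_01 = (y1 - y0) / (x1 - x0)
    slope_02 = (y2 - y0) / (x2 - x0)
    a = (slope_02 - slope_01) / (x2 - x1)  # second divided difference
    b = slope_01 - a * (x0 + x1)
    c = y0 - a * x0 ** 2 - b * x0
    return a, b, c


a, b, c = parabola_through((1, 6), (2, 11), (-1, 8))
print(a, b, c)  # 2 -1 5
```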
Implications
These examples establish that the graphics calculator has the potential for use in the mathematics curriculum for more purposes than dealing with elementary algebra and calculus, which are the areas most commonly discussed. In the case of data analysis, the calculator may help to shift the focus of student attention towards understanding and analysing data, rather than more mechanistic treatments of them which have all too often predominated in the past. One useful consequence of such a change may be that students can collect their own data more often than they tend to do at present, and use them to address issues and questions that they find to be important.
As far as chance phenomena are concerned, the graphics calculator may open a new realm of possible kinds of activities in schools, focussing on the generation and careful inspection of random data of various kinds. The present focus in many school curricula on the algebra of probabilities and on elementary combinatorics has frequently been found wanting. It seems opportune to take advantage of the graphics calculator at least to enhance these aspects of the mathematics curriculum, if not to encourage more substantial changes of curriculum focus.
Any mathematics curriculum at any level of education must rest upon assumptions about the technology that is available to all students concerned. However, it is only fairly recently that school mathematics curriculum statements have begun to make explicit references to technology. For curricula in which the graphics calculator is assumed to be accessible to students, we may adopt some fresh approaches to the introductory study of both chance and data, of the kinds outlined here. It may even be that the curriculum itself is influenced by these possibilities, with a view to strengthening the mathematics education students receive. One example of this might be the acknowledgment that many relationships between variables are not linear, so that alternative models need to be explored.
Some years ago, one of the characters in Charles Schulz's Peanuts cartoons observed that 'There is no heavier burden than a great potential'. The graphics calculator seems to have great potential to influence curricula in chance and data and to improve the experiences to which students have access. Realising that potential will not be easy, however, since it will require some fresh thinking by both classroom practitioners and also by curriculum developers, and real change is never an easy matter. Despite the difficulties, it is important to reconsider our educational practices in the light of the available technologies.
References
Australian Education Council 1990. A national statement on mathematics for Australian schools, Melbourne: Curriculum Corporation.
Kissane, B. 1997. Mathematics with a graphics calculator: Casio fx-7400G, Perth: Mathematical Association of Western Australia.
Kissane, B. 1995. 'The importance of being accessible: The graphics calculator in mathematics education', Proceedings of the First Asian Technology Conference in Mathematics, Association of Mathematics Educators: Singapore, 161-170.
Moore, D. 1990. 'Uncertainty', in L. A. Steen (Ed) On the shoulders of giants, Washington: National Research Council, 95-137.
© Asian Technology Conference in Mathematics, 1997.