The Importance of the History of Evaluation
When we think about evaluation, the words testing and examination come to mind. The tradition of linking evaluation with examination dates back to the seventh century, when China introduced a practice of evaluating people's knowledge of the writings of Confucius. This evaluation format, an examination or test, would have appeared within the hierarchy of mandarins1 in order to select the state bureaucrats. The practice of imperial examinations would have started in the year 605, in the Sui dynasty, and lasted until 1905, in the final years of the Qing dynasty. The tests were depicted in period prints showing the individual cells in which they were taken and the figure of mandarins delivering the approval document to each distinguished candidate. Since then, evaluation has carried in its luggage the character of a merit examination, which, in turn, would lead to social mobility. According to Judges (2000, p. 34), it was Confucianism, with its moral and political perspective, that spread the testing program to ensure stability among members of the hierarchy of power “while maintaining a tradition of scholarship of Chinese bureaucrats for many centuries.”
In the sixteenth century, the word evaluations (aualiações) and the verb to evaluate (aualiar) appear in the text of the Ordenações Manuelinas, Carta Régia2, in “Livro I, Titulo LXVII, Do Juiz dos órfãos, e cousas que a seu Officio pertencem,” which deals with goods or things that should be valued for purposes of inventory and division of property, so that at the time of delivery to the rightful owner they would not have suffered damage due to “loss or mistake” (Leite, 2011, p. 271). An example of the use of evaluation linked to justice and clear procedures is the case of the sixteenth-century orphans' court judge (Royal Charter, 1521), who was to appraise the goods and carry out the division in the company of an actuary and two or three people sworn to follow the procedure (the committee).
From this document, it appears that the evaluation performed carried the sense of giving faith (making public and attesting to credit) in the presence of witnesses (the commission) and the sense of dividing with justice (law and value).
The painting Le prêteur et sa femme by the Flemish painter Quentin Metsys, on exhibition at the Louvre Museum, conveys the same idea of evaluation. This painting shows a scale (trébuchet) that establishes a kind of equivalence with a range of similar objects to assess the exact weight of some pieces (coins). The objects are being evaluated, compared, measured, and weighed for an exchange. In the scene evoked by the painter, there is material wealth and, at the same time, an allusion to spiritual wealth: souls will also be evaluated upon arrival at heaven for the final judgment. The frame of the picture once bore a biblical quote, no longer visible today, which read: Statera justa et aequa sint pondera, meaning “the scale must be fair and the weights equal.”
The same values would also be the basis for the school evaluation introduced by the Jesuit Order in the European colonies. Still in the sixteenth century, around 1599, the order set out its rules for educational practice in the Ratio Studiorum. This document, which governed education in the American colonies, regulated and detailed the rules for organizing oral exams, whose text was to be written in Latin by the pupil. The regulation, the Ratio Atque Institutio Studiorum Societatis Jesu, corresponds to what can be found and understood today (except for the Latin) as student evaluation and discipline. Students who did not adapt to the rules were physically punished, while the most skillful were encouraged and rewarded. We face, then, even in the sixteenth century, the sense of punishment, reward, and competitiveness linked to evaluation; the judgment of the preceptors; and the mix of moral, spiritual, religious, and cognitive review through normative evaluation.
Next, we will refer to Comenius's position on evaluation. In Didactica Magna, 1657, Comenius drew up a treatise on how a teacher should behave to get the attention of students. He suggested warning and punishing students who did not pay attention and maintaining constant vigilance to ensure that duties were fulfilled. Being selective, Comenius saw that examining some students was enough to know what the others would know. With the examination of some, the preceptor would know the result of many. Reviewing individual books was also part of the advice of Didactica Magna.
To admonish defaulters and to praise the good students in public would also be stimuli of evaluative value. It would be didactic to warn, to punish, and to praise in public! Such were the values linked to evaluation.
In the nineteenth century, in 1836, evaluation, namely the establishment of a body of examiners, became the reason for granting a license (charter) to the institution that we know today as the University of London (then without teachers or students). Evaluation through exams had also been held at Oxford (Examination Statute) and Cambridge since the previous century for all who had studied methodically under supervision and had submitted evidence. It seems that such evidence could even be understood as a standardization of performance measurement, a new mechanism of education for the time, says Judges (2000). In this case, the tests became part of the studies and were applied in front of examiners (the commission).
In America, in the late nineteenth century, the written test, rather than the oral examination that had been in force until then, acquired the connotation of scientific evaluation. Evaluation began to be understood as a scientific procedure from 1894, when Joseph Rice, an American professor who had studied in Germany, applied the first large-scale tests.
The first half of the twentieth century was replete with studies on evaluation and measurement, especially research conducted in North America. At the beginning of that century, the subject of evaluation was brought to prominence by names still famous today, such as Edward Thorndike, who in 1903 wrote a book on Educational Measures. Since then, Thorndike has been called the father of the modern science of educational measurement, and he gained numerous followers. The so-called scientific evaluative measures began to be widely used by professionals, mainly from the area of Psychology. Several names stand out, among them Alfred Binet, with the creation of intelligence tests and the construction of scales.
The evaluation field was growing. We can cite the work of Starch and Elliott on test reliability studies, 1912; factor analysis and correlation tests by Spearman in 1914; the development of objective tests by McCall in 1920; the degree of ability for entrance examinations by Brigham in 1926; measurement techniques (selection, purchase, and distribution of tests) by the Educational Records Bureau from 1927; and educational performance studies and attitudinal scales by Thurstone in 1929. In 1931, Ralph Tyler emerged in the field and structured what even today is known as objectives-based evaluation, that is, the evaluation of objectives, goals, or purposes. If evaluation was by goals, behaviors were measured, so the scoring of responses should be automated, and IBM entered the field. In 1935, IBM launched a machine that scored students' answer sheets for tests and exams. From then on, the electronic processing of tests in evaluation became a reality, allowing measurement on a mass scale. In the same perspective, in 1953, the work of Lindquist at IBM stood out.
Another aspect needs to be mentioned when speaking about markets and companies. Since the early twentieth century, with emphasis on the end of the century and advancing into the twenty-first century, evaluation, said to be scientific, began to serve the purposes of accrediting institutions, especially universities and programs from all fields of knowledge, with emphasis on education and health. While it is known that accreditation had been occurring since the beginning of the twentieth century, the year 1950 is taken as the mark of the emergence of accreditation agencies in North America.3
Still in the middle of the century, in 1956, it is worth mentioning the well-known figure of Benjamin Bloom, with his famous Taxonomy of Educational Objectives. Working alongside Tyler, Bloom developed the rational sequence of cognitive objectives and later worked with objectives in the affective and psychomotor domains. This led to new progress in drafting questions for tests and exams, which gave the pedagogical field a new understanding of student learning. Since then, educational evaluation has been guided by objectives previously determined and described. Later on, Bloom, Hastings, and Madaus wrote the Evaluation Manual, which would fix a central meaning of evaluation, that “evaluation is a method of collecting and processing the necessary data to improve learning” (1983, p. 8).
The development of the field of evaluation intensified from the second half of the twentieth century. New studies and research were funded and allowed considerable advances. Comparative studies at the international level and the implementation of large-scale assessments at the national level, with a view to improving schools, curricula, and student learning, came about all around the world. Several names could be cited as highlights in the field of knowledge and research on evaluation. One must remember that in 1967 Robert Stake coordinated the Monograph Series on Curriculum Evaluation. It was, in a way, criticized by Bloom, who said that evaluation should, indeed, be about discrimination among students! One must also remember the opposite: Bloom himself used the term formative assessment to differentiate it from discriminatory or competitive evaluation.
Further historical analysis has shown that evaluation as a science had reached maturity by the end of the twentieth century. Knowledge on evaluation has grown remarkably in its scientific scope, the number of researchers in the field, the number of specialized agencies funding research on evaluation, the number of associations of professional evaluators (from 1976 in the USA), and the number of journals on the subject. On the other hand, the advancement of knowledge served to strengthen values for selecting people, classifying them, and also excluding them. To use Bourdieu’s expression, evaluation has become a piece of symbolic violence because it also served to eliminate the weakest and encouraged the rise of those who enjoy social positions of power or special talents. Interestingly, at the end of the twentieth century, while Bourdieu was discussing symbolic violence in France, Bernstein in England studied the contradictory implications between micro and macro social processes and noted the role of evaluation in the reproduction of the principles of distribution of power. According to Basil Bernstein, evaluation rules operate as a function of pedagogical discourse to demonstrate which knowledge is valid. Their position differentially positions the subject toward his/her consciousness of social class. That is, evaluation interferes with the production, reproduction, and transmission of cultures and reinforces social positions.