
Standards, Accountability, and Student Assessment Systems

An international overview

The development and implementation of accountability systems have, arguably, been the most powerful trend in educational policy in the last 20 years.[i] The setting of academic standards for what students should know and be able to do can be traced to Prime Minister Margaret Thatcher’s British government during the 1980s. A national curriculum was adopted in 1988 that outlined core competencies that students should master in areas such as reading, writing, and arithmetic. Through Standard Assessment Tests (SATs), students’ and schools’ achievement results could be compared. Naturally, teachers and school administrators would also be judged on the performance of their students. The underlying message conveyed to parents was that they should be relatively satisfied with schools that improve their test performance from year to year and begin to question the quality of instruction at those that perform poorly.

This type of educational reform model and corresponding zeitgeist spread very quickly to other parts of the world including the rest of the United Kingdom, Europe, North America, Australasia, as well as parts of Asia. Policies mandating the institution of curriculum requirements and standardized tests are often associated with both neoconservative and neoliberal ideologies that apply market logic to the realm of social institutions such as schools.[ii]

This brief survey of assessment systems across various industrialized nations is meant to provide the reader with a general understanding of how standards are assessed in parts of Europe, North America, Australasia, and Asia. In some cases, standards are assessed in relation to national/regional external tests, while in others schools rely on internal assessment methods to reach judgments of educational quality. The reader should take note of the diversity in assessment systems, since different models place unique demands and expectations on school leaders.

North America

External testing has spurred considerable debate in Canada. In Ontario, Canada’s largest province, testing is conducted under the direction of the Education Quality and Accountability Office (EQAO). Results are disseminated in a manner that invites comparisons across schools and districts. Parents are able to check their school’s performance relative to other schools, districts, and the provincial average. Similar standardized testing programs operate throughout Canada’s ten provinces, each garnering media attention. At the national level, external agencies such as the Fraser Institute publish report cards that rank individual schools according to their performance on provincially administered tests. Despite the publication of test results, it is important to note that the Canadian landscape is markedly different from that of its American neighbour. For the most part, external test results are used to facilitate school improvement and do not carry high-stakes consequences for teachers or students in Canada.

In the United States, the federal No Child Left Behind Act (NCLB) requires every state to develop standards, standardized tests, and accountability systems. In addition, by mandating the option for students to transfer from schools with low test performance to those with higher performance, NCLB promotes competition between schools. Not surprisingly, the expansion of the testing industry has continued unabated in the U.S. Although the current federal government has signaled its desire to reauthorize and strengthen NCLB, the initiative has provoked a high degree of controversy and has resulted in countless legislative debates and criticisms from parents, teachers, and academics. Overall, proponents and critics of NCLB have debated the appropriateness of high-stakes testing in the American education system – tests that are used for important decisions such as promotion to the next grade, graduation, merit pay for teachers, and/or school rankings reported in the popular media.

United Kingdom

In England, the trend since the late 1980s has been toward total accountability in the education system.[iii] England measures progress against national standards when students reach the ages of 11, 14, and 16 years. League tables that summarize the performance of schools are published by local and national newspapers, attracting a considerable amount of political and public attention. This testing and accountability framework has undergone significant revisions in recent years. For example, England’s national tests for 14-year-old students were dissolved and replaced by a system of assessment by teachers in 2008. This decision was announced by the Children’s Secretary Edward Balls, who was quick to point out that the decision was not a “U-turn” and would not affect the tests taken by 11-year-olds, which continue to be used for the accountability system.

Other parts of the United Kingdom have also seen significant changes to their assessment and accountability frameworks. For example, Scotland in 2003, followed by Wales in 2007, abolished national testing for five-to-14-year-olds and replaced it with teacher assessments. At that time, the Scottish Education Minister, Peter Peacock, said the change was precipitated by the desire to create a “seamless” curriculum with an emphasis on teaching rather than testing. Collectively, these changes suggest a fundamental shift in the policy and practice of assessment that is taking root in the United Kingdom. The implications of these changes for school leaders are profound and remain an ongoing area for research and focused study.

Europe


It is no small task to describe the diversity in assessment systems across continental Europe, given the large number of countries that occupy this continent. Fortunately, an important European organization named Eurydice provides information on and analyzes European education systems and policies. Currently, 31 countries fall within the Eurydice Network, including the previously discussed United Kingdom (England, Wales, Scotland, and Northern Ireland). Overall, testing has become a common practice across Europe since the early 1990s. Assessment methods may be internal or external, formative or summative, and are assigned various levels of importance.[iv] (In this article we are primarily concerned with assessment methods used to assess progress against preset standards.)

Countries such as Sweden, France, Ireland, Hungary, and the United Kingdom have a long history of national testing to monitor and evaluate the quality of public education, particularly in relation to standards. Presently, Eurydice reports that most European countries have introduced and implemented national testing in relation to education standards. In some cases, the legal basis for the inclusion of standards and standardized tests has been established through legislative acts. While for the most part national testing continues unabated in Europe, it is also important to note that some countries have taken steps to limit and/or abolish external summative assessments. For example, in four countries – Belgium (Dutch-speaking community), Czech Republic, Greece, and Liechtenstein – schools carry out assessments internally and rely on formative and summative measures on a continuous basis. Nevertheless, the Eurydice Network is quick to point out that despite the variations in approaches to pupil assessment, the process of assessing learning outcomes is an instrumental factor in improving the quality of education in all European nations.

Australasia


Australasia comprises Australia, New Zealand, the island of New Guinea, and neighbouring islands in the Pacific Ocean. This section summarizes standards-based reform in the two largest nations – Australia and New Zealand. Australia has six states and two major mainland territories, each developing and administering its own achievement tests to monitor educational progress. Although there was a fair degree of diversity in assessment approaches, national tests were recently introduced so that each state and territory could be judged against common criteria. As with assessment results in North America and parts of Europe, these national test results are published in a way that invites comparisons between schools.

New Zealand is divided into two main islands (North and South). Like Australia, New Zealand has a national curriculum that sets a direction for what students should know and be able to do in reading, writing, and arithmetic at different points of compulsory schooling. Interestingly, New Zealand relies on Overall Teacher Judgments to determine the degree of progress toward national standards. Observations and examples of students’ classroom work are very important in forming Overall Teacher Judgments. Popular assessment tools in reading, writing, and mathematics are also recommended to teachers to improve the reliability of their Overall Teacher Judgments. The Ministry of Education also makes it abundantly clear that no one assessment tool is sufficient to make a definitive judgment against a standard. Thus, the New Zealand model advances the use of a range of student assessment methods for accountability purposes.

Asia


Asia comprises a diverse range of assessment and accountability frameworks. We have chosen to highlight two educational jurisdictions in this region – Japan and Hong Kong. In Japan, standards-based reforms and a national curriculum have a well-established tradition. Assessments have particularly important consequences as a student progresses through the system. For example, high-stakes examinations determine student suitability for particular high schools and later for higher education institutions. Nevertheless, it is important to acknowledge that the only national examinations in the Japanese public system are those used for college entrance admissions.

In Hong Kong, educational standards are implemented through both self-evaluations and external reviews. Self-evaluations are based on key performance measures in the following areas: management and organization, learning and teaching, student support and school ethos, and student performance. The latter element, student performance, includes external measures such as the Hong Kong Attainment Test and the Territory-wide System Assessment (TSA). Collectively, external assessments, such as the TSA, provide the government and school management with information on school standards. TSA results are meant to inform teaching and learning and ultimately facilitate school improvement planning.

Distinguishing Features

A review of the various international jurisdictions suggests that no particular model of assessment is dominating the standards-based landscape. Rather, diversity exists with respect to a variety of interrelated features, such as whether student assessments are:

  • Low versus high stakes for students;
  • Low versus high stakes for schools (teachers and principals/school administrators);
  • Internally versus externally developed;
  • Nationally versus regionally oriented;
  • Geared toward all ages versus key developmental points;
  • Geared toward a variety of subject areas or a select few;
  • Geared toward academic versus non-academic domains;
  • Traditional paper-based modes versus technology-enhanced delivery modes;
  • Reported at the student, school, and/or district level;
  • Focused on assessment of learning versus assessment for learning.

Most systems have diversity in relation to each of these elements. For example, some systems use a combination of internally developed teacher assessments as well as more centralized external assessments. Other systems might reserve low-stakes consequences for students in elementary grades but have more pronounced high-stakes consequences for students in the senior grades – as evidenced through graduation examinations.

The most contentious issue related to high or low stakes is not associated directly with students, although student results are the measure. In some jurisdictions, schools are judged on the basis of student achievement on large-scale tests and receive sanctions or rewards on this basis. Thus, no particular system can or should be classified according to a single feature. To do so would misrepresent the unique character of its standards-based assessment model. Instead, each jurisdiction has made choices on all of these dimensions and sometimes blended them to create its own unique assessment processes.


Accountability is a charged word that is deeply embedded in the history and culture of a nation. It carries with it expectations for action among various educational stakeholders. In 1994, Linda Darling-Hammond described two different views of educational change and accountability:

One view seeks to induce change through extrinsic rewards and sanctions for both schools and students, on the assumption that the fundamental problem is a lack of will to change on the part of educators. The other view seeks to induce change by building knowledge among school practitioners and parents about alternative methods and by stimulating organizational rethinking through opportunities to work together on the design of teaching and schooling and to experiment with new approaches. This view assumes that the fundamental problem is a lack of knowledge about the possibilities for teaching and learning, combined with lack of organizational capacity for change.[v]

The countries described in this paper provide nuance and shading to these polarized views and show the range of perspectives that standards, accountability, and student assessment systems can take.

First published in Education Canada, June 2013



This research was funded by the Social Sciences and Humanities Research Council of Canada (SSHRC).


Portions of this article have been adapted from L. Volante (Ed.), School Leadership in the Context of Standards-Based Reform: International perspectives (Dordrecht, Netherlands: Springer, 2012).

[i] M. Barber, “The Virtue of Accountability: System redesign, inspection, and incentives in the era of informed professionalism,” Journal of Education 185, no. 1 (2004): 7–38.

[ii] D. Hursh, “Neo-liberalism, Markets, and Accountability: Transforming education and undermining education in the United States and England,” Policy Futures in Education 3, no. 1 (2005): 3–15.

[iii] W. Harlen, Assessment of Learning (Thousand Oaks: Sage, 2007); and C. Whetton, E. Twist and M. Sainsbury, “National Tests and Target Setting: Maintaining consistent standards,” Paper presented at the Annual Meeting of the American Educational Research Association (New Orleans: April, 2000).

[iv] Eurydice. National Testing of Pupils in Europe: Objectives, organisation and use of results (Brussels: Education, Audiovisual & Cultural Executive Agency, 2009). http://eacea.ec.europa.eu/.

[v] L. Darling-Hammond, “Performance-based Assessment and Educational Equity,” Harvard Educational Review 64, no. 1 (1994): 23.

Meet the Expert(s)

Dr. Louis Volante

Professor, Brock University & UNU-MERIT

Louis Volante, PhD, is a Professor at Brock University and a Professorial Fellow at UNU-MERIT/Maastricht Graduate School of Governance. His current research is focused on multi-level education governance, comparative policy analysis, impact evaluation of policies and programs, politics of education reform, international large-scale assessments and transnational governance, and cross-national educational inequalities.


Lorna Earl

Retired Associate Professor from the Ontario Institute for Studies in Education at the University of Toronto

Lorna Earl, PhD, is a retired Associate Professor from the Ontario Institute for Studies in Education at the University of Toronto. Her work has focused on leveraging policy and program evaluations as a vehicle to enhance learning for pupils and for organizations. Her research also examines assessment for and as learning in the classroom.
