Assessing Learning

One of the most challenging tasks for language instructors is finding effective ways to determine what and how much their students are actually learning. Instructors need to think carefully about what kinds of knowledge their tests allow students to demonstrate.

This section provides guidance on ways of using traditional tests and alternative forms of assessment. Popup windows on the Alternative Assessment page illustrate the use of checklists and rubrics for evaluation. The page on the ACTFL Guidelines includes popup windows on specific languages.

Peer and Self Assessment

Peer Assessment

One of the ways in which students internalize the characteristics of quality work is by evaluating the work of their peers. However, if they are to offer helpful feedback, students must have a clear understanding of what they are to look for in their peers' work. The instructor must explain expectations clearly to them before they begin.

One way to make sure students understand this type of evaluation is to give students a practice session with it. The instructor provides a sample writing or speaking assignment. As a group, students determine what should be assessed and how criteria for successful completion of the communication task should be defined. Then the instructor gives students a sample completed assignment. Students assess this using the criteria they have developed, and determine how to convey feedback clearly to the fictitious student.

Students can also benefit from using rubrics or checklists to guide their assessments. At first these can be provided by the instructor; once students have more experience, they can develop their own. An example of a peer editing checklist for a writing assignment is given in the popup window. Notice that the checklist asks the peer evaluator to comment primarily on the content and organization of the essay. It helps the peer evaluator focus on these areas by asking questions about specific points, such as the presence of examples to support the ideas discussed.

For peer evaluation to work effectively, the learning environment in the classroom must be supportive. Students must feel comfortable and trust one another in order to provide honest and constructive feedback. Instructors who use group work and peer assessment frequently can help students develop trust by forming them into small groups early in the semester and having them work in the same groups throughout the term. This allows them to become more comfortable with each other and leads to better peer feedback.

Self Assessment

Students can become better language learners when they engage in deliberate thought about what they are learning and how they are learning it. In this kind of reflection, students step back from the learning process to think about their language learning strategies and their progress as language learners. Such self assessment encourages students to become independent learners and can increase their motivation.

The successful use of student self assessment depends on three key elements:

Goal setting
Guided practice with assessment tools
Portfolios

Traditional Tests

Traditional pencil-and-paper tests ask students to read or listen to a selection and then answer questions about it, or to choose or produce a correct grammatical form or vocabulary item. Such tests can be helpful as measures of students' knowledge of language forms and their listening and reading comprehension ability.

However, instructors need to consider whether these tests are accurate reflections of authentic language use. The tests usually do not present reading comprehension and listening comprehension questions until after students have read or listened to the selection. In real life, however, people know what information they are seeking before they read or listen. That is, they have specific information gaps in mind as they begin, and those gaps define the purpose for reading or listening.

To make language tests more like authentic listening and reading activities, instructors can give students the comprehension questions before they listen to or read the selection. This procedure sets up the information gaps that students will then seek to fill as they listen or read.

Instructors also need to be careful about what pencil-and-paper tests are actually testing. A quiz on which students listen to a selection and then respond to written questions tests reading ability as well as listening skills, so it will underrate students whose oral comprehension is stronger than their reading comprehension. A test on which students read a selection and then answer multiple-choice questions tests their knowledge of the language used in the questions as well as that used in the selection itself. If the language used in the questions is not keyed to students' proficiency level, the test will not reflect their ability accurately.

Language instructors also encounter students who do well on pencil-and-paper tests of grammar and sentence structure but make mistakes when using the same forms in oral interaction. In such cases, the test indicates what students know about the language, but it does not provide an accurate measure of what they can actually do with it.

When the goal of language instruction is the development of communicative competence, instructors can supplement (or, in some cases, replace) traditional tests with alternative assessment methods that provide more accurate measures of progress toward communication proficiency goals. This can be done by combining formative and summative types of assessment.

Summative assessment

Takes place at the end of a predetermined period of instruction (for example, mid-term, final)
Rates the student in relation to an external standard of correctness (how many right answers are given)
Is the approach taken by most traditional and standardized tests

Formative assessment

Takes place on an ongoing basis as instruction is proceeding
Rates the student in terms of functional ability to communicate, using criteria that the student has helped to identify
Helps students recognize ways of improving their learning
Is the approach taken by alternative assessment methods

Alternative Assessment

Alternative assessment uses activities that reveal what students can do with language, emphasizing their strengths instead of their weaknesses. Alternative assessment instruments are not only designed and structured differently from traditional tests, but are also graded or scored differently. Because alternative assessment is performance based, it helps instructors emphasize that the point of language learning is communication for meaningful purposes.

Alternative assessment methods work well in learner-centered classrooms because they are based on the idea that students can evaluate their own learning and learn from the evaluation process. These methods give learners opportunities to reflect on both their linguistic development and their learning processes (what helps them learn and what might help them learn better). Alternative assessment thus gives instructors a way to connect assessment with review of learning strategies.

Features of alternative assessment:

Assessment is based on authentic tasks that demonstrate learners' ability to accomplish communication goals
Instructor and learners focus on communication, not on right and wrong answers
Learners help to set the criteria for successful completion of communication tasks
Learners have opportunities to assess themselves and their peers

Designing tasks for alternative assessment

Successful use of alternative assessment depends on using performance tasks that let students demonstrate what they can actually do with language. Fortunately, many of the activities that take place in communicative classrooms lend themselves to this type of assessment. These activities replicate the kinds of challenges, and allow for the kinds of solutions, that learners would encounter in communication outside the classroom.

The following criteria define authentic assessment activities:

They are built around topics or issues of interest to the students
They replicate real-world communication contexts and situations
They involve multi-stage tasks and real problems that require creative use of language rather than simple repetition
They require learners to produce a quality product or performance
Their evaluation criteria and standards are known to the student
They involve interaction between assessor (instructor, peers, self) and person assessed
They allow for self-evaluation and self-correction as they proceed

Introducing alternative assessment

With alternative assessment, students are expected to participate actively in evaluating themselves and one another. Learners who are used to traditional teacher-centered classrooms have not been expected to take responsibility for assessment before and may need time to adjust to this new role. They also may be skeptical that peers can provide them with feedback that will enhance their learning.

Instructors need to prepare students for alternative assessments and set aside time to teach them how to use these tools, so that alternative assessment contributes effectively to the learning process.

Introduce alternative assessment gradually while continuing to use more traditional forms of assessment. Begin by using checklists and rubrics yourself; move to self and peer evaluation later.
Create a supportive classroom environment in which students feel comfortable with one another (see Teaching Goals and Methods).
Explain the rationale for alternative assessment.
Engage students in a discussion of assessment. Elicit their thoughts on the values and limitations of traditional forms of assessment and help them see ways that alternative assessment can enhance evaluation of what learners can do with language.
Give students guidance on how to reflect on and evaluate their own performance and that of others (see specifics in sections on peer and self evaluation).

As students find they benefit from evaluating themselves and their peers, the instructor can expand the amount of alternative assessment used in the classroom.

POPUP: SAMPLE PERFORMANCE TASKS FOR ALTERNATIVE ASSESSMENT

Alternative assessment methods

Effective alternative assessment relies on observations that are recorded using checklists and rubrics.

Checklists

Checklists are often used for observing performance in order to keep track of a student's progress or work over time. They can also be used to determine whether students have met established criteria on a task.

To construct a checklist, identify the different parts of a specific communication task and any other requirements associated with it. Create a list of these with columns for marking yes and no.

For example, using a resource list provided by the instructor, students contact and interview a native speaker of the language they are studying, then report back to the class. In the report, they are to

Briefly describe the interviewee (gender, place of birth, occupation, family)
Explain when and why the interviewee came to the United States
Describe a challenge the person has faced as an immigrant
Describe how the person maintains a connection with his/her heritage

Students are told that they will need to speak for a minimum of three minutes and that they may refer only to minimal notes while presenting. A checklist for assessing students' completion of the task is shown in the popup window.

Checklists can be useful for classroom assessment because they are easy to construct and use, and they align closely with tasks. At the same time, they are limited in that they do not provide an assessment of the relative quality of a student's performance on a particular task.
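
The construction steps above (list the parts of the task, then mark each one yes or no) can be sketched in a few lines of Python. This is only an illustration, not the checklist shown in the popup window: the item wording is paraphrased from the interview task described above, and the names ChecklistItem, build_interview_checklist, and render are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    description: str
    met: bool = False  # True goes in the "yes" column, False in the "no" column


def build_interview_checklist():
    """Return one unchecked item per requirement of the interview report."""
    requirements = [
        "Briefly describes the interviewee",
        "Explains when and why the interviewee came to the United States",
        "Describes a challenge the person has faced as an immigrant",
        "Describes how the person maintains a connection with his/her heritage",
        "Speaks for at least three minutes",
        "Refers only to minimal notes",
    ]
    return [ChecklistItem(text) for text in requirements]


def render(checklist):
    """Format the checklist as text with a yes/no mark in front of each item."""
    lines = []
    for item in checklist:
        mark = "yes" if item.met else "no"
        lines.append(f"[{mark:>3}] {item.description}")
    return "\n".join(lines)


if __name__ == "__main__":
    checklist = build_interview_checklist()
    checklist[0].met = True  # hypothetical observations made during a presentation
    checklist[4].met = True
    print(render(checklist))
```

The result records only whether each requirement was met, which mirrors the limitation noted above: nothing in it says how well the student met a requirement.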

POPUP: CHECKLIST FOR ORAL PRESENTATION OF INTERVIEW

Rubrics

Whereas a checklist simply provides an indication of whether a specific criterion, characteristic, or behavior is present, a rubric provides a measure of quality of performance on the basis of established criteria. Rubrics are often used with benchmarks or samples that serve as standards against which student performance is judged.

Rubrics are primarily used for language tasks that involve some kind of oral or written production on the part of the student. It is possible to create a generic rubric that can be used with multiple speaking or writing tasks, but assessment is more accurate when the instructor uses rubrics that are fitted to the task and the goals of instruction.

There are four main types of rubrics.

1. Holistic rubrics

Holistic scales or rubrics respond to language performance as a whole. Each score on a holistic scale represents an overall impression; one integrated score is assigned to a performance. The emphasis in holistic scoring is on what a student does well.

Holistic rubrics commonly have four or six points. The popup window shows a sample four-point holistic scale created for the purposes of assessing writing performance.

A well-known example of a holistic scale is the American Council on the Teaching of Foreign Languages (ACTFL) Proficiency Guidelines (1986). However, the ACTFL guidelines are not appropriate for classroom use, because they are intended for large-scale assessment of overall proficiency and are not necessarily designed to align with curricular objectives or classroom instruction.

Holistic scoring is primarily used for large-scale assessment when a relatively quick yet consistent approach to scoring is necessary. It is less useful for classroom purposes because it provides little information to students about their performance.

2. Analytic rubrics

Analytic scales are divided into separate categories representing different aspects or dimensions of performance. For example, dimensions for writing performance might include content, organization, vocabulary, grammar, and mechanics. Each dimension is scored separately, and the dimension scores are then added to determine an overall score.

Analytic rubrics have two advantages:

The instructor can give different weights to different dimensions. This allows the instructor to give more credit for dimensions that are more important to the overall success of the communication task. For example, in a writing rubric, the dimension of content might have a total point range of 30, whereas the range for mechanics might be only 10.
They provide more information to students about the strengths and weaknesses of various aspects of their language performance.
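
The weighting idea can be illustrated with a short Python sketch. Only the point ranges of 30 for content and 10 for mechanics come from the example above; the ranges for the other three dimensions, the 0 to 4 rating scale, and the function name analytic_score are assumptions made for illustration.

```python
# Maximum points per dimension. Only the 30 for content and the 10 for
# mechanics come from the text; the other three values are assumed.
MAX_POINTS = {
    "content": 30,
    "organization": 20,
    "vocabulary": 20,
    "grammar": 20,
    "mechanics": 10,
}

TOP_LEVEL = 4  # assumed rating scale: 0 (lowest) to 4 (highest)


def analytic_score(levels):
    """Convert per-dimension levels into weighted points and a total."""
    if set(levels) != set(MAX_POINTS):
        raise ValueError("rate every dimension exactly once")
    points = {}
    for dimension, level in levels.items():
        if not 0 <= level <= TOP_LEVEL:
            raise ValueError(f"level for {dimension} must be 0..{TOP_LEVEL}")
        # Scale the level to the dimension's own point range.
        points[dimension] = MAX_POINTS[dimension] * level / TOP_LEVEL
    return points, sum(points.values())


if __name__ == "__main__":
    # Hypothetical ratings for one essay
    ratings = {"content": 4, "organization": 3, "vocabulary": 3,
               "grammar": 2, "mechanics": 2}
    points, total = analytic_score(ratings)
    print(points)
    print(total)
```

Under these assumed ranges, content carries three times the points of mechanics, so a one-level change in the content rating moves the total by 7.5 points, while the same change in mechanics moves it by only 2.5.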

However, analytic scoring has also been criticized because the parts do not necessarily add up to the whole. Separate scores for different dimensions of a student's writing or speaking performance do not necessarily give the teacher or the student a good assessment of the performance as a whole.

POPUP: HOLISTIC SCALE FOR ASSESSING WRITING

POPUP: ANALYTIC SCALE FOR ASSESSING SPEAKING

3. Primary trait rubrics

In primary trait scoring, the instructor predetermines the main criterion or primary trait for successful performance of a task. This approach thus involves narrowing the criteria for judging performance to one main dimension.

For example, consider a task that requires that a student write a persuasive letter to an editor of the school newspaper. A possible primary trait rubric for this task is shown in the popup window.

This kind of rubric has the advantage of allowing teachers and students to focus on one aspect or dimension of language performance. It is also a relatively quick and easy way to score writing or speaking performance, especially when a teacher wants to emphasize one specific aspect of that performance.

4. Multitrait rubrics

The multitrait approach is similar to the primary trait approach but allows for rating performance on three or four dimensions rather than just one. Multitrait rubrics resemble analytic rubrics in that several aspects are scored individually. However, where an analytic scale includes traditional dimensions such as content, organization, and grammar, a multitrait rubric involves dimensions that are more closely aligned with features of the task.

For example, on an information-gap speaking task where students are asked to describe a picture in enough detail for a listener to choose it from a set of similar pictures, a multitrait rubric would include dimensions such as quality of description, fluency, and language control, as the example in the popup window shows.

POPUP: PRIMARY TRAIT RUBRIC

POPUP: MULTITRAIT RUBRIC

Incorporating alternative assessment into classroom activities

Instructors should plan to introduce alternative forms of assessment gradually, in conjunction with traditional forms of testing. Using a combination of alternative assessments and more traditional measures allows the instructor to compare results and obtain a more comprehensive picture of students' language performance than either alternative or traditional measures alone would provide.

At first, the instructor should use checklists and rubrics to evaluate student performance but not ask students to do self and peer evaluation. When creating checklists and rubrics, instructors can ask students to provide input on the criteria that should be included in each. This approach gives the instructor time to become more comfortable with the use of alternative assessments, while modeling their use for students. The process helps students understand how they will benefit from alternative assessment and how they can use it effectively.

Because alternative assessment depends on direct observation, instructors can most easily begin to use it when evaluating students' writing assignments and individual speaking tasks such as presentations. Once an instructor is comfortable with checklists and rubrics, these tools can also be used when observing students interacting in small groups. When doing this, however, the instructor needs to be aware that group dynamics will affect the performance of each individual.

Once students are familiar with the use of checklists and rubrics for evaluation, they can gradually begin to assess their own learning and provide feedback to their peers. This aspect of alternative assessment can easily be included in the evaluation segment of a lesson (see Planning a Lesson). In classrooms where traditional forms of assessment are required, this gives the instructor multiple ways of measuring progress without increasing the time students spend taking traditional tests.