Category Archives: Assessment

Assignments and Accountability

I’ve just been reading Howard Rheingold’s case study on the Connected Learning website on Jim Groom’s “DS106” course, and was inspired to write down a couple of ideas. Coincidentally, one of the central elements to the success of the DS106 course (and Connected Courses in general) seems to be blogging – or maybe simply writing. Or maybe even more simply, producing materials/texts/ideas to share with the world – regardless of whether anyone is going to read them. I was particularly inspired to write down these thoughts when I got to Rheingold’s section on the “Assignment Bank” – a repository of various assignment types from which students could select to “[model] their learning for others.” What I found interesting about this wasn’t just that Groom had handed logistic (and epistemic?) authority over to his students by letting them come up with their own assignments (and assignment genres), but that the purpose of so doing was to encourage students to be accountable for their own learning, as well as to the larger learning community (i.e., the course). In my own teaching and research experience at the K-12 level, and perhaps even more so in higher ed, assignments seem to take on a weird role that straddles 1) the maintenance of a tradition of rigor (sometimes for rigor’s sake), and 2) getting more stuff “into the heads” of individuals (this is, presumably, important in formal educational contexts because of limited class periods – or in other words, limited access to “instruction”). But in the context of DS106, assignments seem at least to have a different, and arguably more impactful, purpose. Assignments are meant to draw upon relevant themes and the production of digital artifacts, and additionally, to serve as content/material for exploring the ideas and concepts that are central to the course. It’s kind of meta, but it’s also an insanely awesome feedback loop, where the topics of the course are explored through student-produced artifacts. The success of the course as a learning experience is therefore dependent on the participation of those taking the course. In other words, the students are accountable for making the course what it is, and what it can be.

Stepping back a bit, it seems to me that one huge advantage of this is that the purpose of the assignment is to create and maintain two levels of accountability. Assignments that are interest-driven in this way are a vehicle for encouraging students (maybe we should just call them “participants”?) to be accountable for their own learning (i.e., they learn by participating in the creation of a digital artifact), as well as to be accountable to the knowledge community – their digital artifacts are, in a way, levers for collective knowledge construction. They support the group’s learning discourse. This seems particularly difficult to do in learning settings where all the decision making regarding assignments, assessments, and activities lies with a privileged authority or institution. Just some thoughts…


Why Massachusetts Should Waive No Child Left Behind

According to a report recently released by the Boston Globe, Massachusetts will be applying for a waiver from the No Child Left Behind law provision that requires that 100 percent of public school students test at the “proficient” level or above on state exams by 2014.  According to the Globe, 80 percent of Massachusetts’ schools (and 90 percent of its districts) “missed proficiency targets” on this past year’s MCAS exams.  To the casual reader, these numbers seem astronomically high.  Absurd even.  How can 80 percent of the schools in Massachusetts – considered one of the “smartest” states in the nation – be underperforming?  Perhaps the answer lies in our definition of “performance”.

From the Mass DOE’s website, you can review the entire MCAS report by grade level and student group.  If you spend some time slogging through all of the data and sorting options, you’ll notice that although there is a wide range of proficiency levels reported across the various academic areas tested (English language, Mathematics, Biology, Physics, Chemistry, and Technology), the results do seem overwhelmingly positive, until you compare them to the bar set by NCLB.

Navigating the DOE’s Adequate Yearly Progress (AYP) summary, one gets a very interesting picture.  Most noticeably, a quick glance down the “Met Target” columns for performance and improvement reveals a long series of “No”s.  Not one single group met proficiency or improvement targets in 2011.  More striking are the scores of non-dominant students (i.e., limited English proficiency, non-White, low income, special education, etc.).  The Asian/Pacific Islander group is the only non-dominant demographic to have met annual performance and improvement targets since 2007.  So, what’s going on here?

Definitions of Terms

Let’s consider some of the definitions of terms on the DOE’s website.

Students and schools are measured against a “Composite Performance Index” (CPI), which the DOE defines as:

A 100-point index that assigns 100, 75, 50, 25, or 0 points to each student participating in MCAS and MCAS-Alternate Assessment tests based on their performance. The total points assigned to each student are added together and the sum is divided by the total number of students assessed. The result is a number between 0 and 100, which constitutes a district, school or group’s CPI for that subject and student group. The CPI is a measure of the extent to which students are progressing toward proficiency (a CPI of 100) in ELA and mathematics. CPIs are generated separately for ELA and mathematics, and at all levels – state, district, school, and student group.
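To make that definition a little more concrete, here is a minimal sketch of the arithmetic in Python. It is only my reading of the DOE’s description, not an official calculation: the function name and the five-student group are made up for illustration, and the code simply averages the per-student point awards.

```python
# Hypothetical helper: a sketch of the CPI arithmetic as described above,
# not an official DOE tool.
ALLOWED_POINTS = {0, 25, 50, 75, 100}


def composite_performance_index(student_points):
    """Average the per-student point awards (0, 25, 50, 75, or 100)."""
    if not student_points:
        raise ValueError("need at least one assessed student")
    for points in student_points:
        if points not in ALLOWED_POINTS:
            raise ValueError(f"unexpected point value: {points}")
    # Sum of all points divided by the number of students assessed.
    return sum(student_points) / len(student_points)


# Made-up group of five students, for illustration only.
example_group = [100, 100, 75, 50, 25]
print(composite_performance_index(example_group))  # 70.0
```

Notice that a group only reaches a CPI of 100 when every single student in it earns the full 100 points, which is worth keeping in mind for what follows.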

A school’s (or subgroup’s) CPI is used to indicate not just performance, but improvement as well, which the website describes as:

Descriptive term corresponding to the amount of aggregate CPI gain a school or district achieved in one year, from 2010 to 2011. The improvement that a school or district is expected to make from one year to the next is expressed not as a single numeric target, but as a target range, also called the “on target range.” The size of the target range varies depending on the size and score distribution of the particular group being measured. (The standard target range is plus or minus 2.5 CPI points, but may be as large as plus or minus 4.5 CPI points for groups smaller than 100.) The five improvement rating categories are: Above Target (improved above target range), On Target (improved within target range), Improved Below Target (improved above the baseline but below the target range), No Change (gain was equivalent to baseline plus or minus the target range), and Declined (gain was below baseline and below the target range).

Okay, so we have a quantified value for “improvement” – 2.5 points.  That makes sense, considering that school districts must have some measure of proof to bring to the state to demonstrate adequate yearly progress (another stipulation of NCLB).  Let’s withhold criticism of the 2.5 and 4.5 values for now, simply for argument’s sake.

So what are the performance targets for 2011?

The 2011 state performance target for ELA is a CPI of 95.1 points; for mathematics, 92.2.

Keeping in mind that NCLB requires that 100 percent of all student groups score at or above the “proficient” level, these numbers aren’t very surprising, and are in line with the high standards put forth by the law.  But here’s where NCLB and the Mass DOE’s program of accountability start to really break down.

Statistical Unlikelihood

If any single group had met the minimum target rate of performance for the English Language exam (95.1), a 1.63 point increase per year would place that group at 100 percent “proficient” for 2014. For Math, had any group met the target score (92.2), that group would need to improve at a rate of 2.6 points each year up to the 2014 deadline.  However, the highest performing groups for English and Math (White and Asian/Pacific Islander, respectively) scored 90.9 and 89.2 on those exams.  Since 2007, Whites have only improved a total of 1.2 points on English, while Asian/Pacific Islanders have demonstrated a 3.7 point gain – total.  The highest one-year increase in CPI for either English or Math was shown in the 2007 results. Hispanics experienced the most dramatic increase in both English and Math that year, with a 2.9 point increase and a 5.3 point increase, respectively.  Their performance scores?  English: 70.2, Math: 57.7.  On the 2008 exams, no groups showed an increase in English scores, and on the Math test, Hispanics improved the most, but only by 2.4 points (Asian/Pacific Islanders improved the least of any group in Math that year – 1.2 points).
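For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes, as the figures above do, that a group has three testing years (2011 to 2014) to close the gap to a CPI of 100 and that it improves by the same amount each year. The function name is hypothetical, and the CPI values are the ones quoted in this post.

```python
# Hypothetical helper: the even, year-over-year gain needed to reach a CPI
# of 100 by the deadline. The linear-gain assumption is mine, not the DOE's.
def required_annual_gain(current_cpi, target_cpi=100.0,
                         start_year=2011, deadline_year=2014):
    years_left = deadline_year - start_year
    if years_left <= 0:
        raise ValueError("deadline must come after the start year")
    return (target_cpi - current_cpi) / years_left


# CPI values quoted in this post.
scores = [
    ("ELA state target", 95.1),
    ("Math state target", 92.2),
    ("Top ELA group (White)", 90.9),
    ("Top Math group (Asian/Pacific Islander)", 89.2),
]

for label, cpi in scores:
    gain = required_annual_gain(cpi)
    print(f"{label}: {gain:.2f} CPI points per year")
# ELA state target: 1.63 CPI points per year
# Math state target: 2.60 CPI points per year
# Top ELA group (White): 3.03 CPI points per year
# Top Math group (Asian/Pacific Islander): 3.60 CPI points per year
```

Under those assumptions, even the highest performing groups would need to gain more than 3 CPI points every year through 2014, and that is before looking at any of the groups further down the list.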

On its face, this is reason enough for Massachusetts to apply for a waiver from the 2014 proficiency deadline: the prospect of not only the highest performing groups, but every segmented student group, achieving the mandated increase in performance by 2014 just isn’t realistic.  With schools at risk of facing sanctions that could further eliminate federal funding, mandate school restructuring, or threaten the progress of some districts in achieving legitimacy in the eyes of NCLB, it is clear that the state needs to step up to protect its schools.

Unrealistic Assumptions

Advocates of the ‘No Child’ law frequently argue that the high standards set forth by the provision are benchmarks toward which schools should strive, regardless of their unattainability.  But these same advocates seem to miss the forest for the trees, or in this case, the numbers for the assessment.  The assumptions put forth by the NCLB provisions are simply unrealistic.  100% proficiency implies an almost Orwellian vision of unquestioning obedience and devotion to a system that clearly struggles to serve the interests of a large portion of its non-dominant constituency.  It assumes that on the day of any given MCAS exam, every student is present and able to give 100 percent of his or her own attention and effort to an exam that still suffers from issues of cultural bias, and is still being tweaked by test makers.  It assumes that those students, despite any experience of economic hardship, difficulties in family relationships, or cognitive/neurological challenges, are able to demonstrate an arbitrary measure of “competence” for one week out of the school year.

High standards are fine, as long as those same standards are not used to punish the same people they are meant to lift up.  A provision of 100% proficiency assumes a “buy in” to the law by the same communities and cultures that the education system continuously underserves.  Even without considering the issues that arise from trying to convince a disenfranchised population to embrace a system carved not from its own hand, but rather from a social institution that promotes the same elitism and stratification of social class that led to its disenfranchisement in the first place, a 100% proficiency provision simply ignores the historical factors that contributed to the “need” for a law like No Child Left Behind at all.  By threatening to revoke the funding that these schools need to recruit and retain teachers, provide access to the same resources that their more fortunate peer communities enjoy, and build or refurbish crumbling infrastructure, NCLB does more to raise the hairs on the necks of our educators (and students) than it does to empower the heads on top of those necks.

It’s Not About the Numbers

If the 100% proficiency clause was intentionally rhetorical, simply a political gesture meant to signal “We’re serious about high standards”, then advocates of NCLB have interpreted its meaning too literally.  Numbers hold no sacred symbolic value in our educational culture.  We have developed innumerable ways to game the system, from blatant cheating to test prep programs.  If NCLB needs to portray an image of what “success” looks like for our schools, using numbers clearly isn’t doing the job.  If we consider what 100% proficiency means holistically – that all of our students are able to achieve in school – it becomes a little clearer that schools and students alone aren’t the problem.  The problem, rather, is how we define what the “problem” is.  That our students are performing better each year is a good sign, regardless of whether that improvement comes at a rate of 2.5 CPI points or not.  That some Massachusetts communities still struggle to retain their best teachers is more of an issue than a test score.  So, instead of telling schools “Do this, or else”, lawmakers should be asking schools “What do you need from us so that you can do this?”.

Alternative Assessments and Multimedia

I’ve been lucky to witness a new wave of inspired, progressive (or perhaps simply bored) instructors experiment with multimedia production as a form of summative assessment over the past couple of years.  For a long time, it seemed multimedia-production-as-assessment was only a viable option for technology-based courses with both access to media production tools and skills-driven curricula.  Perhaps due in part to the consumer electronics industry’s penchant for continuously rolling out newer, shinier, cheaper toys and software specifically targeting what was once considered a population of technology illiterates, or perhaps because consumer technologies seem to be ever occupying every possible space in our professional and personal lives and the lives of our children, or even PERHAPS because education and psychology research has bent enough policy makers’ ears towards the theory that people actually learn differently, there seems now to be a critical momentum of instructors who are at least open to the idea of moving away from traditional forms of assessment and embarking into the vast abyss that is creative expression.  This is a good thing.  For many reasons.  Not least of which is that, despite what some complain is a lack of emphasis on content, media production can touch on a number of learning objectives (e.g., collaboration, demonstration of competence in a specific domain, skills-development and acquisition, higher order thinking, etc.) while enabling students to try on new identities, explore different voices, and connect to their socio-historical selves.  Of course, media production doesn’t cause these outcomes. But it can do a lot to make these potential outcomes accessible to students.

All that being said, even the most gung-ho teacher will face possibly the biggest challenge standing in the way of instructional media production’s ubiquity in the education toolkit: assessment.  If the point of alternative assessment in general, and media production in particular, is to offer students a flexible, adaptive way of engaging with, and demonstrating knowledge of, a topic, then one must often come to grips with a grading system that is rigid, high-stakes, and completely unindividualized.  Creative expression is, almost by definition, the antithesis of such a system.  Hence, the bane of all my instructional experience: the rubric.  Rubrics are complicated and time-consuming to draft (especially if you plan to have a different rubric for each assignment), and can be confusing – even abstract – for students.  Actually, I’m sure they are confusing and abstract for most teachers.  But simply put, if media production is a valuable activity for students’ learning development, equally important is providing detailed, explicit feedback on students’ work.  I found a link to this rubric for a digital storytelling project at the University of Houston (it wasn’t hard to find – it was the first link that came up in a Google search for “digital storytelling rubric”), which stands out as an excellent example of what I’m talking about.  On its face, the rubric seems vast, with several criteria for evaluation, and a sophisticated checklist detailing everything from the project’s purpose to the student’s choice of production software.  But herein lies the value of this particular rubric: it leaves room for flexibility, decision making, and expression, yet clearly outlines any expectations, assumptions, and learning objectives.  Attributing a numeric value to each of the categories (though unsavory to those of us already opposed to the rigid constraints of the score-based grading system) both maintains the integrity of the evaluation and lends an element of objectivity to a subjective exercise.

Of course, the use of a rubric doesn’t eliminate any of the other hiccups and roadblocks that disrupt the always smooth process of integrating technology into instruction, but in my experience, it has been a good touchstone for those teachers who might feel a little uneasy about taking their first steps into media production.