Friday, February 10, 2017

The Problem With Average

One of the topics we have discussed a lot in our work moving from a traditional grading system to a standards-based one is the use of an average.  For as long as I can remember, teachers have used averaging to determine students’ grades.  There have been different methods for doing that, but the most common is one you will likely recognize, at least from high school classes.  Each assignment would be given a point value, and over the course of a grading term the teacher would record the total number of points possible as well as the total number of points that each student accumulated.  When it came time to determine a grade, the teacher would divide the number the student accumulated by the number possible (the average) to determine a percentage.  This percentage would then be compared to a grading scale to determine what letter grade the student had earned.
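To make that arithmetic concrete, here is a minimal sketch in Python. The assignment scores and the 90/80/70/60 grading scale are made up for illustration; every teacher and school would have their own numbers.

```python
# Hypothetical grading scale: lowest percentage needed for each letter grade.
GRADING_SCALE = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def percentage(points_earned, points_possible):
    """Divide the points accumulated by the points possible, as a percentage."""
    return 100 * points_earned / points_possible


def letter_grade(pct):
    """Compare a percentage to the grading scale and return a letter grade."""
    for cutoff, letter in GRADING_SCALE:
        if pct >= cutoff:
            return letter
    return "F"


# Hypothetical grading term with four assignments for one student.
earned = [18, 45, 9, 88]
possible = [20, 50, 10, 100]

pct = percentage(sum(earned), sum(possible))
print(f"{pct:.1f}% -> {letter_grade(pct)}")
```

Notice that the only thing this calculation knows about the student is a pile of points. It has no idea which skills those points came from.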

Years ago some teachers would adhere to “the curve,” which was also based on the average and took the shape of a bell.  A close friend and mentor of mine once told me that most of life is based on the bell curve, everything from intelligence to height.  There are more people in the middle (average) than at either extreme.  I once heard that 5 feet 10 inches is the average height for American men, meaning most of us are somewhere close to that.  In the classroom it was said for years that a letter grade of C is average, and in the bell curve classroom, regardless of what all of the students scored, the instructor would assign most students a C.  Eventually, however, educators tended to move away from this toward what is called a criterion-referenced system, which basically meant a criterion was established and students would get a grade based on how they scored compared to that criterion.

While using the average has been common practice for years, we will for the most part be setting it aside as we go forward with standards-based grading.  A major reason is that there are problems with the average.  Rick Wormeli, one of the leaders in the grading reform movement, points out a number of them.  One of the most obvious is that society’s definition of normal or average changes over time.  I mentioned above that in the traditional A-B-C-D-F grading system, C was considered average.  However, talk to teachers who have been in the classroom for thirty years and they will tell you that their “average” has shifted.  In that system the most common grade should be C; however, many teachers saw an inversion, with more high grades and more low grades than C’s.  And if you ask those same teachers whether the quality of a C was the same thirty years ago as it is today, to a person they will tell you “No.”  The quality has declined.  In reality, today’s A looks a lot more like yesterday’s C.  There are numerous reasons for this grade inflation, but the fundamental one is tremendous pressure on teachers from various forces.  Another example of this, in a non-school setting, is ribbon placings at the county fair.  In the Danish ribbon system of blue, red, and white, red was average and white was below average.  However, to avoid hurting young people’s feelings, judges started giving virtually no white ribbons, and the curve shifted with the introduction of the purple ribbon, which ranks above a blue.  Technically white ribbons can still be awarded, but they are not given unless a project fails to meet clearly defined minimum standards.  Today a blue ribbon is average, whereas blue ribbons were once awarded only to the best.

A very important reason that the average is no longer of use is that standards-based grading is criterion-referenced.  Averaging is used to rank and separate people.  We have taken steps to eliminate that in other ways over the past six years because fundamentally our goal is to assess students on what they have learned and how well they have learned it, not to rank them.  Our goal is that everyone completes the standards, and for that reason it makes no sense to average and rank students.  Learning is not a competition.  Some will learn at a higher level than others, and that is fine.  Some will learn faster than others, and that is fine.

Wormeli also points out that averaging was invented in statistics to get rid of the influence of any one sample error in experimental design.  It is a way to reduce the effect of an outlier result in order to get an overall general idea about performance.  When it comes to learning, and more specifically measuring learning, it is more important to look along a continuum and determine what a student ultimately learns.  For example, suppose that on a four-point scale, with one being low, a student scores 1-1-2-4-4 over five assessments on a learning target.  Would you say that the student has successfully learned the target?  Would you give them a score of 4?  Or does the average of 2.4 better reflect the student’s learning?  With two consecutive scores of 4, I would say that the student has learned it quite well.  If we were to award the average, it would not indicate how well the student has learned.  It would be misinformation.
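For anyone who wants to check the numbers, here is a small Python sketch comparing the two ways of reading those five scores. The "lowest of the two most recent scores" rule is only an assumption I am using for illustration; it is one possible way to weigh recent evidence, not an official rule of our system.

```python
from statistics import mean

# The five scores from the example, on a four-point scale, oldest first.
scores = [1, 1, 2, 4, 4]

# Traditional approach: every attempt counts equally, including the early ones.
average = mean(scores)

# Hypothetical alternative: look only at the two most recent assessments
# and report the lower of them, so one lucky score is not enough.
recent_evidence = min(scores[-2:])

print(f"Average of all five scores: {average}")
print(f"Lowest of the two most recent scores: {recent_evidence}")
```

The early 1s drag the average down even though they describe where the student started, not where the student ended up.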

A common application of the average that many of us tune in to is sports, and no sport is more obsessed with numbers and statistics than baseball.  Isn’t it interesting that the most common average used in baseball is batting average, which is basically the number of hits a player gets divided by the number of official at-bats?  Even at the highest level of play, Major League Baseball, the very best hitters have an average a bit over 30%, or in baseball terms, .300.  As of September 12, 2016, of the hundreds of players on major league rosters, 22 had an average of .300 or better, with Daniel Murphy of the Washington Nationals leading at .344.  Now certainly baseball uses the average in a different manner than we typically do in the classroom, but in recent years the decision-makers in baseball have decided that this statistic does not tell us enough about the value of a player.  General managers, managers, and coaches have started paying a lot more attention to other statistics, and many teams have hired experts in “sabermetrics” to crunch data and help make decisions that improve the team’s odds of winning.  OPS, or “On-base Plus Slugging,” is a statistic that many believe is more important and accurate in determining a player’s value to the team.  The same goes for WAR, or Wins Above Replacement, which summarizes a player’s total contributions to a team in one statistic.  Would it surprise you that Daniel Murphy ranks quite a bit lower in both of those last two statistics than he does in batting average?  The point is that there is a lot more to the story than an average, which is very limiting and captures just a small snapshot of the big picture, not only on the baseball field but in the classroom as well.
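The batting average calculation itself is simple enough to sketch in a few lines of Python. The hit and at-bat totals below are made up for illustration (chosen so the result comes out to .344); they are not any real player's season totals.

```python
def batting_average(hits, at_bats):
    """Hits divided by official at-bats."""
    return hits / at_bats


# Hypothetical season totals, not real statistics for any player.
hits = 172
at_bats = 500

avg = batting_average(hits, at_bats)

# Baseball convention: three decimal places with no leading zero.
display = f"{avg:.3f}".lstrip("0")
print(display)
```

Nothing in that one number says whether the hits were singles or home runs, or how often the player reached base by a walk, which is exactly why teams look past it.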



As we move into this new phase of grading at NFVHS, our goal is to provide a more comprehensive view of what a student has learned and what they can do.  There is a lot more to it than looking at a grade average in a class, and as we start to use this system I am sure that we will be much better prepared to focus on the things a student does not know, so that they have the knowledge base they need for the next stage of their life after high school.
