Computer Rankings

March 27, 2016

While this site contains a lot of reference material and a small bit of commentary, its main purpose is to support analysis of rating systems. We're interested in objective (necessarily computer-based) systems, and in particular advanced systems.

Computer-based ratings are necessarily objective in that they treat all input equally. Their only problem is that until there is enough input to be treated at all, their results are as unreliable as those of subjective humans, who can only process the smaller subset of data available to them. We are nearly at the point in the 2016 season where the computer rankings start "making sense," so I am now including the meta-ranking analysis of the computer rankings Dr. Massey reports at http://www.masseyratings.com/cbase/compare.htm.

Through week six all but 6 team-pairs are connected by no worse than an "A played B played C played D played E" chain. By week seven all team-pairs will be connected by no worse than an opponents' opponents' opponents' opponent relationship. To see how far we've come in terms of inter-regional scheduling, note that as late as 1999 more team-pairs were at distance five after the season was over than there are now, less than halfway through the season.
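The chain lengths described above can be computed with a breadth-first search over the schedule graph. The following is a minimal sketch, not the code behind this site: the function name schedule_distances and the sample games are hypothetical, and it assumes that "distance" means the number of games in the shortest chain linking two teams, which may differ from the counting convention used in the text.

```python
from collections import deque

def schedule_distances(games):
    """Return {frozenset({a, b}): distance} for every connected team pair."""
    # Build an undirected graph: teams are nodes, an edge means the teams have played.
    neighbors = {}
    for a, b in games:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    distances = {}
    for start in neighbors:
        # Breadth-first search finds the shortest chain of games from `start`.
        seen = {start: 0}
        queue = deque([start])
        while queue:
            team = queue.popleft()
            for opp in neighbors[team]:
                if opp not in seen:
                    seen[opp] = seen[team] + 1
                    queue.append(opp)
        for other, d in seen.items():
            if other != start:
                distances[frozenset((start, other))] = d
    return distances

# Hypothetical schedule: A played B played C played D played E.
games = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
dist = schedule_distances(games)
longest = max(dist.values())  # length of the worst connection in this schedule
```

Team pairs that are not connected at all simply do not appear in the result, so comparing the number of entries with the total number of possible pairs shows how many pairs are still unconnected.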

It was Boyd Nation's Breadcrumbs Back to Omaha column "A Look At the Distance Matrix" that piqued my interest in schedule topology and its impact on advanced rating systems.

Composite Rankings

Dr. Massey orders the teams in his list by a "consensus average" defined as:

The "average" or "consensus" ranking for each team is determined using a least squares fit based on paired comparisons between teams for each of the listed ranking systems. If a team is ranked by all systems, the consensus is equal to the arithmetic average ranking. When a team is not ranked by a particular system, its consensus will be lowered accordingly.

and also reports the median team rank.

There are many ways to combine ordinal ranks into one, and there's a famous theorem that proves that which one is "right" depends upon your definition of "right." In other words, there is no unambiguous way to combine lists of ordered ranks into a composite list that will meet every definition of "right." So to supplement Dr. Massey's composite I calculate these meta-rankings:

Buck: This is the Bucklin Rank. It is the best rank for which more than half of the ratings rank the team at that rank or better. When there is an odd number of computer rankings it will be the same as the median.
Borda: This is the sum of Borda Counts for each rating. The Borda Count for each rating is the number of teams ranked below the team for that rating. For a given team, the ranking by Borda Count will usually be the same as the ranking by average of all ratings' ranks.
GMean: This is the geometric mean of the ordinal computer rankings. This essentially gives more weight to better ranks.
IRV: Instant Runoff Voting. Suggested by Dr. Massey, this implementation is essentially Bucklin with a tiebreaker.
Pairwise: This is the Condorcet approach to counting ranked ballots. For each team pair count the number of ratings in which one is ranked better than the other.
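
The arithmetic behind some of these definitions can be sketched as follows. This is an illustration under stated assumptions, not the site's implementation: the function names and the three-system sample data are hypothetical, every system is assumed to rank every team, and rank 1 is best. IRV is omitted because its tiebreaker is not specified here, and the pairwise function only produces the per-pair counts, not a final ordering.

```python
import math

def team_ranks(rankings, team):
    """Collect one team's ordinal rank from every system."""
    return [system[team] for system in rankings.values()]

def bucklin(ranks):
    """Best rank r such that more than half of the systems rank the team r or better."""
    ordered = sorted(ranks)
    majority = len(ordered) // 2 + 1  # smallest count that is more than half
    return ordered[majority - 1]

def borda(ranks, n_teams):
    """Sum over systems of the number of teams ranked below this team."""
    return sum(n_teams - r for r in ranks)

def gmean(ranks):
    """Geometric mean of the ordinal ranks."""
    return math.exp(sum(math.log(r) for r in ranks) / len(ranks))

def pairwise_wins(rankings, a, b):
    """Number of systems that rank team a better (lower number) than team b."""
    return sum(1 for system in rankings.values() if system[a] < system[b])

# Hypothetical data: three systems ranking three teams.
rankings = {
    "S1": {"X": 1, "Y": 2, "Z": 3},
    "S2": {"X": 2, "Y": 1, "Z": 3},
    "S3": {"X": 1, "Y": 3, "Z": 2},
}
x = team_ranks(rankings, "X")      # [1, 2, 1]
x_bucklin = bucklin(x)             # 1, the median of an odd number of ranks
x_borda = borda(x, n_teams=3)      # 2 + 1 + 2 = 5
x_gmean = gmean(x)                 # cube root of 2
x_over_y = pairwise_wins(rankings, "X", "Y")  # 2 of the 3 systems
```

Because the geometric mean works on logarithms, a move from rank 1 to rank 2 changes it as much as a move from rank 10 to rank 20, which is the sense in which better ranks carry more weight.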

These composites are displayed along with constituent computer ratings as Computer Rankings Summary by... pages, each of which has a link to a "details" page that displays the values used to assign the ranks. The link from the home page is to the Pairwise comparison ranking.

These composite "meta-rankings" are included (with the computers from Dr. Massey's list) in the Computer Ratings by Retrodictive Ranking Violations report. For each team this report lists:

#RV: the total number of series for which a ranking has the winning team ranked worse than the losing team. Split series with an even number of games do not count against the rating or team.
#Cons: the total number of series for which the results are consistent with the team ranks.
#Ser: the number of opponents with which the team has played at least one game.
team name: links to the team's schedule.
rankings: by all of the computer rankings from Dr. Massey's page that rank all teams (meta-rankings are italicized).
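
The #RV and #Cons counts can be sketched for a single ranking as below. The function name series_record and the sample series are hypothetical, and the sketch makes two assumptions that the definitions above do not settle: a split series is counted in neither total, and no two teams share the same rank. Per-team counts would follow by restricting the input to the series involving that team.

```python
def series_record(series, rank):
    """Count (violations, consistent) series for one ranking.

    series: iterable of (team_a, team_b, wins_a, wins_b)
    rank:   dict mapping team -> ordinal rank (1 is best)
    """
    violations = consistent = 0
    for a, b, wins_a, wins_b in series:
        if wins_a == wins_b:
            continue  # split series: assumed to count neither for nor against
        winner, loser = (a, b) if wins_a > wins_b else (b, a)
        if rank[winner] > rank[loser]:
            violations += 1  # the series winner is ranked worse than the loser
        else:
            consistent += 1
    return violations, consistent

# Hypothetical results: A took 2 of 3 from B, C took 2 of 3 from B, A and C split.
series = [("A", "B", 2, 1), ("B", "C", 1, 2), ("A", "C", 1, 1)]
rank = {"A": 1, "B": 2, "C": 3}
record = series_record(series, rank)  # (1, 1)
```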

One can compare two ordinal lists by counting the number of team-pairs that are in the same order in both lists ("concordant" pairs) and the number that are in opposite order ("discordant" pairs). The number of discordant pairs is called the distance between the two lists. I report the distance from each of the composite ranks to each computer ranking, and from each computer ranking to every other. To give perspective, I also include the complement of that metric, expressed as the percentage of team-pairs that are concordant in the two rankings. See Computer Ranking Correlations.
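
A minimal sketch of this comparison follows. The function names and the two four-team lists are hypothetical, and the sketch assumes only teams present in both lists are compared; a pair tied in either list is counted as neither concordant nor discordant.

```python
from itertools import combinations

def rank_distance(rank_a, rank_b):
    """Return (discordant, concordant) pair counts for two ordinal rankings."""
    teams = sorted(set(rank_a) & set(rank_b))  # compare only teams in both lists
    discordant = concordant = 0
    for s, t in combinations(teams, 2):
        # A positive product means both lists put the pair in the same order.
        product = (rank_a[s] - rank_a[t]) * (rank_b[s] - rank_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return discordant, concordant

def percent_concordant(rank_a, rank_b):
    """Concordant pairs as a percentage of all ordered (untied) pairs."""
    discordant, concordant = rank_distance(rank_a, rank_b)
    total = discordant + concordant
    return 100.0 * concordant / total if total else 0.0

# Hypothetical lists that differ only in the order of A and B.
list_1 = {"A": 1, "B": 2, "C": 3, "D": 4}
list_2 = {"A": 2, "B": 1, "C": 3, "D": 4}
distance, agree = rank_distance(list_1, list_2)  # 1 discordant, 5 concordant
pct = percent_concordant(list_1, list_2)         # 5 of 6 pairs agree
```

Checking every pair is quadratic in the number of teams, which is not a concern for lists of a few hundred teams.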

In memory of
SEBaseball.com

Paul Kislanko