The Directed Games Graph

March 15, 2014

Imagine that each team is a point, and that for each game played you draw an arrow from the winner's point to the loser's. With 302 teams and thousands of games you wind up with something really, really complicated. This is the raw material for any advanced rating system, so it is worth some attention.

In Too Early Ratings I described a "second order winning percentage" that is partially based upon the Directed Games Graph (DGG). Only partially, because it considers just the shortest path(s) from one team to the other and defines "winning paths" based upon series results. Now we're going to consider every game and all paths between teams.

Well, not literally all: you can always find more paths between teams that are longer than those already counted. I chose to stop counting at path length n once every team pair that can be connected by an "A beat B beat …" chain has been connected. As of this writing, the last such team pair is connected at 10 "beats."
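The stopping rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `count_chains` and the toy schedule are hypothetical, and the sketch counts every "A beat B beat …" chain, including chains that pass through the same team more than once, which may or may not match how the actual counts were produced.

```python
from collections import defaultdict

def count_chains(games, max_len=50):
    """Count "A beat B beat ..." chains by length (hypothetical helper).

    games: list of (winner, loser) pairs, one per game.
    Returns (counts, radius), where counts[k][(a, b)] is the number of
    k-game chains leading from a to b, and radius is the last length
    that connected a previously unconnected pair of distinct teams.
    Assumption: chains may revisit a team.
    """
    wins = defaultdict(list)
    for winner, loser in games:
        wins[winner].append(loser)

    # Chains of length 1 are just the games themselves.
    current = defaultdict(int)
    for winner, loser in games:
        current[(winner, loser)] += 1

    counts = {}
    connected = set()
    length = 1
    while current and length <= max_len:
        # Distinct-team pairs that this length connects for the first time.
        new_pairs = {p for p in current if p[0] != p[1]} - connected
        if not new_pairs:
            break  # every connectable pair is already connected
        counts[length] = dict(current)
        connected |= new_pairs
        # Extend every chain by one more game won by its last team.
        longer = defaultdict(int)
        for (a, b), n in current.items():
            for c in wins[b]:
                longer[(a, c)] += n
        current = longer
        length += 1
    return counts, max(counts, default=0)

# Toy schedule: A beat B, B beat C, C beat D, A beat C.
toy_games = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]
toy_counts, toy_radius = count_chains(toy_games)
print(toy_radius)     # 2: the pairs A-D and B-D are first connected at two "beats"
print(toy_counts[2])  # the two-game chains
```

Once a length adds no new pair, no longer length can add one either, because any longer chain to a new pair would contain a shorter chain to a pair that is already connected.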

The 2182 edges of the graph (games) form about 40.5 billion paths up to 10 "beats" long, so it's pretty much impossible to visualize the entire graph, but there are lots of ways to look at parts of it. To begin with, we'll try to see how a vertex (team) looks from the perspective of all teams and how the collection of teams looks from the perspective of a vertex. We can summarize those in several ways, too.

Having counted the paths, we make 302 lists, one for each team A, ordering all teams B by #paths A⇒B − #paths A⇐B in ascending order (so teams with more paths to A than from A come first). Treating these lists as "votes," we list the teams based upon the average ranking by all teams (actually, the "Borda Count," which is a little better measure than the mean), noting along the way the best rank at which at least 152 teams, a majority of the 302, rank the team that highly or better (the "Bucklin rank").
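As a rough sketch of how such "votes" could be tallied, the Python below builds one ballot per team from the path-count differences and then computes a Borda count and a Bucklin rank. The names `make_ballot` and `borda_and_bucklin` are hypothetical, and two details are assumptions rather than anything stated here: tied teams share the better rank, and the Borda count awards N − r points for rank r among N teams.

```python
def make_ballot(diffs):
    """Turn one team's differences into ranks (hypothetical helper).

    diffs: {team B: #paths A=>B - #paths A<=B} for a single voter A,
    including A itself with a difference of 0.
    Rank 1 goes to the lowest difference; tied teams share a rank (assumption).
    """
    return {b: 1 + sum(1 for d in diffs.values() if d < diffs[b]) for b in diffs}

def borda_and_bucklin(ballots):
    """ballots: {voter: {team: rank}}, with every team voting on every team.

    Returns {team: (borda_count, bucklin_rank)}.
    Assumed Borda scoring: rank r among n teams earns n - r points.
    Bucklin rank: the best rank r such that a majority of voters
    (n // 2 + 1, which is 152 when n is 302) rank the team r or better.
    """
    teams = list(ballots)
    n = len(teams)
    majority = n // 2 + 1
    result = {}
    for team in teams:
        ranks = sorted(ballots[voter][team] for voter in teams)
        borda = sum(n - r for r in ranks)
        bucklin = ranks[majority - 1]  # the majority-th best rank received
        result[team] = (borda, bucklin)
    return result

# Made-up differences for three teams (A's value for B is minus B's value for A).
toy_diffs = {
    "A": {"A": 0, "B": 2, "C": 5},
    "B": {"A": -2, "B": 0, "C": 1},
    "C": {"A": -5, "B": -1, "C": 0},
}
toy_ballots = {voter: make_ballot(d) for voter, d in toy_diffs.items()}
print(borda_and_bucklin(toy_ballots))
```

Sorting each team's received ranks makes the Bucklin rank a simple lookup: it is the rank in the majority-th position of the sorted list.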

Of more interest is the rank each team assigns to itself. Since A⇒A − A⇐A is zero, this tells us how many teams have a better position in the graph than the team, as viewed from the team's perspective.
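To make the self-ranking concrete, here is a minimal Python sketch. The function name `self_rank` and the numbers are made up for illustration, and it assumes rank 1 goes to the lowest difference, as in the ascending lists described above.

```python
def self_rank(team, diffs):
    """Rank a team assigns to itself (hypothetical helper).

    diffs: {team B: #paths A=>B - #paths A<=B} as seen by `team` (A).
    The team's own difference is zero, so every opponent with a negative
    difference (more paths leading to A than from A) is listed ahead of it.
    """
    better = sum(1 for other, d in diffs.items() if other != team and d < 0)
    return better + 1

# Made-up differences as seen by team "B".
example = {"A": -2, "B": 0, "C": 1, "D": -7}
print(self_rank("B", example))  # two opponents have negative differences
```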

The Directed Games Graph Analysis includes the (Borda) rank, Borda count, best, Bucklin, and worst ranks, the rank the team assigns itself, and the count of #1 ranks, #2 ranks, and so on through #302 ranks. The team name links to a page for each team that includes the list of teams voting the team each rank, along with how the team ranked all the teams.

The team's "ballot" shows the (#paths A⇒B − #paths A⇐B) value used to assign the ranking, along with the counts for each component. Then the individual counts in each direction for PathLengths from 1 to the radius of the graph are shown.

These are not ratings in the usual sense, but as the season progresses the comparisons between teams provide insight into how advanced ratings like the ISR are affected by game results.


The exact path count is 40,556,633,859. After the games of 15 March, the radius had been reduced to 9 "beats" and there were 13,251,861,917 paths between distinct teams.

In memory of
SEBaseball.com

Paul Kislanko