A Statistical Analysis of Opening Influence Upon Game Result

With Emphasis Upon The Scandinavian Defense

 

By Mark Moore

 

Rigorous accounting reveals an unambiguous pattern in the influence of the Opening upon the game Result.  Interpretation of the statistics should compel the scientifically motivated tournament player to embrace specific Openings for their numerically affirmed superiority and to discard others entirely.  Results show, for example, that Black should never play the Scandinavian Defense, at least not in a serious game with a similarly rated opponent.  Because it is fully Black's choice to play the Scandinavian or to avoid the temptation, and because the Scandinavian performs worst of all among the 20 most frequently played Openings, Black's confidence in this Opening is hollow.  Nevertheless, the Scandinavian remains irrationally popular: it is currently the seventh most frequently played Opening in the significant games of the past year contested exclusively by players rated 2400 and above.

 

The choice of Opening and Variation may alternate between White and Black several times before the appropriate ECO code can be applied.  White plays 1. e4, forcing Black to choose a reply from among the Sicilian, Scandinavian, Caro-Kann, Petroff, and French, to list a few examples in order of frequency.  After Black replies with 1. ... c5, by far the most frequent reply, the choice returns to White, who often decides upon the Alapin Variation of the Sicilian.  But 2. c3 is yet another confounding choice according to the numbers.  The Alapin Variation of the Sicilian is nearly as bad for White as the Scandinavian is for Black.  Indeed, of the 20 most frequently played Openings, the Alapin is the worst for White among those wherein White controls the decision of Opening or Variation.  Astonishingly, the Alapin is the fourth most frequently played Opening.  Again, perhaps the most curious aspect is that this decision is purely White's.  Black may know that the Alapin scores well for Black and, although unable to force White to play it, can see from the statistics that it is reasonable to expect the Alapin and easy to be well prepared for it.

 

The elements of choice and volition contain the paradox at the crux of the present exploration.  The mystery is this:  why does Black play the Scandinavian when six primary alternatives in the top 20 are statistically superior?  White's inclination to the Alapin is equally baffling.  While the present purpose does not include an exploration of players' motivations, there shall be a brief glance at two specific players who escaped the odds.  Did they succeed by specializing in these otherwise losing Openings?  We shall see a bit later.  Now, let us look at some of the figures.

 

Two particularly abysmal Openings are in focus at the moment.  The worst in Black's control is the Scandinavian, also called the Center Counter, whose ECO code is B01 in the tables that follow.  The worst Opening under White's control, the Alapin Variation of the Sicilian, is B22 in the ECO column of the tables.  Thus, we begin with the worst of both worlds.  The comparative analysis starts with a look at the 20 most frequently played Openings, shown in Table 1.

 

 

Count  ECO  Opening                                        White  Black  Draw
  368  B90  Sicilian Defense Najdorf Variation               124     71   173
  348  E15  Queen's Indian Defense Accelerated Fianchetto    108     44   196
  318  B33  Sicilian Defense Sveshnikov Variation             95     55   168
  314  B22  Sicilian Defense Alapin Variation                 72     65   177
  291  D15  Queen's Gambit Slav Defense Geller Gambit         89     41   161
  274  D45  Queen's Gambit Declined Anti-Meran Defense        81     42   151
  237  B01  Scandinavian Defense                             101     44    92
  235  E11  Bogo-Indian Defense                               71     41   123
  232  B12  Caro-Kann Defense 3.c5 Attack                     76     47   109
  230  E12  Queen's Indian Defense                            57     47   126
  227  C42  Petroff's Defense                                 52     19   156
  220  C10  French Defense Rubinstein Variation               50     34   136
  203  B07  Pirc Defense                                      60     47    96
  194  E32  Nimzo-Indian Defense Classical Variation          56     43    95
  193  B30  Sicilian Defense Rossolimo Variation              67     44    82
  192  D11  Queen's Gambit Slav Defense                       50     40   102
  191  B42  Sicilian Defense Paulsen Variation Kan System     64     46    81
  168  C78  Ruy Lopez Moeller Attack                          57     36    75
  168  D27  Queen's Gambit Accepted                           47     18   103
  167  C45  Scotch Game                                       39     33    95

 

Table 1.

 

Here, the Openings are sorted by the number of games played, thus revealing the most popular Openings.  The weight of these 20 Openings is substantial: of the 500 primary Openings in the ECO list, these 20 (4% of the Openings) account for over 22% of the 21,293 games in the dataset.  Nearly everyone will recognize a variety of her or his favorite Opening in this list.  Notice the Alapin at fourth from the top of the list, and the Scandinavian at seventh.
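The proportions quoted above can be checked with a few lines of arithmetic.  The sketch below is written in Python purely for illustration (the study's own calculations were made with a Visual Basic program that is not reproduced here); the counts are copied from Table 1, and the variable names are hypothetical.

```python
# Game counts for the 20 most frequently played ECO codes, copied from Table 1.
TOP20_COUNTS = [368, 348, 318, 314, 291, 274, 237, 235, 232, 230,
                227, 220, 203, 194, 193, 192, 191, 168, 168, 167]
TOTAL_GAMES = 21293     # size of the dataset, as stated in the text
TOTAL_ECO_CODES = 500   # primary Openings in the ECO list, as stated in the text

# Share of all games that fall in the top 20, and share of codes they represent.
games_share = sum(TOP20_COUNTS) / TOTAL_GAMES
codes_share = len(TOP20_COUNTS) / TOTAL_ECO_CODES

print(f"{sum(TOP20_COUNTS)} games, {games_share:.1%} of the dataset, "
      f"in {codes_share:.0%} of the ECO codes")
```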

 

Despondent Black finds upon this chart no vestige of supremacy over White.  In none of the preferred Openings does Black win more games than White.  The half-move tempo gains the expert White player a significant numerical advantage in the game.  For this reason, Black must consider success to include both wins and draws.  Perhaps this is well known and commonly practiced by the expert.  However, if Black's wins and draws are added together and Table 1 is re-sorted by the numerical success of Black, a completely new order arises in the hierarchy of Openings.

 

 

Count  ECO  Opening                                        White  Black  Draw  Black %
  220  C10  French Defense Rubinstein Variation               50     34   136     0.77
  227  C42  Petroff's Defense                                 52     19   156     0.77
  314  B22  Sicilian Defense Alapin Variation                 72     65   177     0.77
  167  C45  Scotch Game                                       39     33    95     0.77
  230  E12  Queen's Indian Defense                            57     47   126     0.75
  192  D11  Queen's Gambit Slav Defense                       50     40   102     0.74
  168  D27  Queen's Gambit Accepted                           47     18   103     0.72
  194  E32  Nimzo-Indian Defense Classical Variation          56     43    95     0.71
  203  B07  Pirc Defense                                      60     47    96     0.70
  274  D45  Queen's Gambit Declined Anti-Meran Defense        81     42   151     0.70
  318  B33  Sicilian Defense Sveshnikov Variation             95     55   168     0.70
  235  E11  Bogo-Indian Defense                               71     41   123     0.70
  291  D15  Queen's Gambit Slav Defense Geller Gambit         89     41   161     0.69
  348  E15  Queen's Indian Defense Accelerated Fianchetto    108     44   196     0.69
  232  B12  Caro-Kann Defense 3.c5 Attack                     76     47   109     0.67
  191  B42  Sicilian Defense Paulsen Variation Kan System     64     46    81     0.66
  368  B90  Sicilian Defense Najdorf Variation               124     71   173     0.66
  168  C78  Ruy Lopez Moeller Attack                          57     36    75     0.66
  193  B30  Sicilian Defense Rossolimo Variation              67     44    82     0.65
  237  B01  Scandinavian Defense                             101     44    92     0.57

 

Table 2.
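The re-sorting that yields Table 2 can be sketched as follows.  This is an illustrative Python reconstruction from the tallies in Table 1, not the author's Visual Basic program; the function name black_success is an assumption introduced here, and success for Black is taken, as in the text, to mean wins plus draws.

```python
# (ECO, White wins, Black wins, draws) copied from Table 1.
TABLE_1 = [
    ("B90", 124, 71, 173), ("E15", 108, 44, 196), ("B33", 95, 55, 168),
    ("B22", 72, 65, 177),  ("D15", 89, 41, 161),  ("D45", 81, 42, 151),
    ("B01", 101, 44, 92),  ("E11", 71, 41, 123),  ("B12", 76, 47, 109),
    ("E12", 57, 47, 126),  ("C42", 52, 19, 156),  ("C10", 50, 34, 136),
    ("B07", 60, 47, 96),   ("E32", 56, 43, 95),   ("B30", 67, 44, 82),
    ("D11", 50, 40, 102),  ("B42", 64, 46, 81),   ("C78", 57, 36, 75),
    ("D27", 47, 18, 103),  ("C45", 39, 33, 95),
]

def black_success(white, black, draw):
    """Share of games Black did not lose: (Black wins + draws) / all games."""
    return (black + draw) / (white + black + draw)

# Sort from the best to the worst Opening for Black.
ranked = sorted(TABLE_1, key=lambda row: black_success(*row[1:]), reverse=True)

for eco, white, black, draw in ranked:
    print(f"{eco}  {black_success(white, black, draw):.2f}")
```

Sorting on the unrounded ratio rather than on the two-decimal figure keeps the order stable among the Openings that all display as 0.77.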

 

In Table 2, the desolate Scandinavian Defense falls from its popularity ranking of number seven to its success ranking of number twenty.  Meanwhile, notice that the Alapin rises a notch on the Black list, from fourth to third!  But let us not dwell entirely in the negative space.  What optimism does the chart contain for Black?  What Openings in the control of Black, even for an instant, offer hope?

 

In the top four, a French, a Petroff, a Sicilian, and a Scotch are appealing, but somewhat deceptive, because Black must depend upon White to play into these Openings.  Black is not in full control of the helm when the Rubicon is crossed.  The Pirc Defense, however, affords Black great liberty and choice in the first few developing moves, and is very appealing with a 70% success rate.  Over 20 games with the Black pieces, playing the Pirc according to these numbers and with accurate Opening knowledge, Black could reasonably expect to earn about 9.4 points, counting a win as one point and a draw as half a point (47 wins and 96 draws in 203 games).  If Black applies the same to the Scandinavian Defense (44 wins and 92 draws in 237 games), the expected score falls to about 7.6 points, nearly two points less, which could easily make the difference between first and second in the standings.
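The expected score over a run of games with Black follows directly from the tallies in Table 1.  The sketch below assumes the conventional scoring of one point for a win and half a point for a draw, and assumes that a player's results simply mirror the dataset averages; the function name expected_points is hypothetical.

```python
def expected_points(white, black, draw, games):
    """Expected score for Black over `games` games: 1 per win, 0.5 per draw."""
    per_game = (black + 0.5 * draw) / (white + black + draw)
    return per_game * games

# Tallies (White wins, Black wins, draws) copied from Table 1.
pirc = expected_points(white=60, black=47, draw=96, games=20)           # B07
scandinavian = expected_points(white=101, black=44, draw=92, games=20)  # B01

print(f"Pirc: {pirc:.1f}  Scandinavian: {scandinavian:.1f}  "
      f"difference: {pirc - scandinavian:.1f}")
```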

 

Now let us provide an equal and opposite treatment of the issue for White.  Greater strictness must be demanded of the definition of success for White, owing to the aforementioned half-move tempo advantage.  Thus, success shall be limited to wins, except where White's opponent is rated more than 200 points higher.  To obtain a chart for White we simply re-sort Table 1 by White's share of wins to produce Table 3.

 

 

Count  ECO  Opening                                        White  Black  Draw  White %
  237  B01  Scandinavian Defense                             101     44    92     0.43
  193  B30  Sicilian Defense Rossolimo Variation              67     44    82     0.35
  168  C78  Ruy Lopez Moeller Attack                          57     36    75     0.34
  368  B90  Sicilian Defense Najdorf Variation               124     71   173     0.34
  191  B42  Sicilian Defense Paulsen Variation Kan System     64     46    81     0.34
  232  B12  Caro-Kann Defense 3.c5 Attack                     76     47   109     0.33
  348  E15  Queen's Indian Defense Accelerated Fianchetto    108     44   196     0.31
  291  D15  Queen's Gambit Slav Defense Geller Gambit         89     41   161     0.31
  235  E11  Bogo-Indian Defense                               71     41   123     0.30
  318  B33  Sicilian Defense Sveshnikov Variation             95     55   168     0.30
  274  D45  Queen's Gambit Declined Anti-Meran Defense        81     42   151     0.30
  203  B07  Pirc Defense                                      60     47    96     0.30
  194  E32  Nimzo-Indian Defense Classical Variation          56     43    95     0.29
  168  D27  Queen's Gambit Accepted                           47     18   103     0.28
  192  D11  Queen's Gambit Slav Defense                       50     40   102     0.26
  230  E12  Queen's Indian Defense                            57     47   126     0.25
  167  C45  Scotch Game                                       39     33    95     0.23
  314  B22  Sicilian Defense Alapin Variation                 72     65   177     0.23
  227  C42  Petroff's Defense                                 52     19   156     0.23
  220  C10  French Defense Rubinstein Variation               50     34   136     0.23

 

Table 3.
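The ordering of Table 3 rests on White's share of wins rather than on the raw number of wins, and the two measures need not agree.  The following Python sketch illustrates the distinction on a handful of rows copied from Table 1.  It is an illustration only: the function name white_win_rate is an assumption, and the rating-based exception mentioned in the text is not modeled because ratings do not appear in the tables.

```python
# (ECO, White wins, Black wins, draws) for a handful of rows copied from Table 1.
ROWS = [
    ("B90", 124, 71, 173),  # Sicilian Defense Najdorf Variation
    ("B22", 72, 65, 177),   # Sicilian Defense Alapin Variation
    ("B01", 101, 44, 92),   # Scandinavian Defense
    ("B07", 60, 47, 96),    # Pirc Defense
    ("C45", 39, 33, 95),    # Scotch Game
]

def white_win_rate(white, black, draw):
    """White's success counted strictly: wins as a share of all games."""
    return white / (white + black + draw)

# Order by win rate (as in Table 3), and separately find the most raw wins.
by_rate = sorted(ROWS, key=lambda row: white_win_rate(*row[1:]), reverse=True)
most_wins = max(ROWS, key=lambda row: row[1])

for eco, white, black, draw in by_rate:
    print(f"{eco}  {white_win_rate(white, black, draw):.2f}")
print("most White wins in absolute terms:", most_wins[0])
```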

 

The foundation for the title of this study appears in position number one of Table 3.  Of all the Openings that Black might play, it is the Scandinavian Defense that White hopes for most, has statistical reason to expect, wins the largest share of games in, and is best prepared for.  Who plays the Scandinavian?

 

Since we have dwelt in darkness and doom, let us begin the answer with a success story.  Tiviakov played the largest number of games in B01 and, in a field of 117 players of the Scandinavian, was one of only two with 5 or more games in B01 who scored more points than their White opponents overall (the second being Milanovic).  Tiviakov scored 5 wins, 7 draws, and 3 losses with the Black pieces, for a total point score of 8.5 to 6.5.  However, all of the wins were scored against opponents of considerably lower rating.  Sadly, several of the defeats were also delivered by lower rated players.  Yet play of B01 appears to be aberrant even for Tiviakov, for chessgames.com reports over 200 games in the Sicilian for Tiviakov and only 19 in the Scandinavian.  Perhaps it is a surprise defense reserved for certain opponents.  This conditional success is swamped dramatically by the appalling performance of many of the other 115 players of the Scandinavian.

 

The most stubborn advocates of this Opening are Muse and Tomczak, who each have seven games played in this dataset.  Muse scored one point of a possible seven, while Tomczak scored 2.  The middle ground was held by Laylo, Prie, and Savic who scored nearly even in seven games.  The vast majority played fewer than 4 games in B01 in this dataset representing the most recent 13 months of important games.  What exactly is the problem with the Scandinavian Defense?

 

Black is rarely more than a point behind in the first 15 moves, even in games that ended 1-0, which suggests that Black's problems emerge in a recurring pattern of middlegame positions, perhaps as a residue of the reversal of the natural development of the pieces.  The broadest generalization is that Black is annoyed throughout the game by the lurching, vulnerable Queen and by the deferral of normal piece development resulting from the early deployment of her majesty in the center of the board on move two.

 

Suppose we presume that the opening predisposes the middlegame to problems.  Next, we feed the games to Fritz for analysis, and according to our presumption, count the positions where Black is rated more than a point behind White by the computer.  If a positional pattern emerges, we may be on the way to linking the tempting Opening to its farther flung ailments.  But alas, that endeavor is for another time and another study.  Until such time as exact analysis is revealed, the statistics bid you look to the Pirc.
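The counting step proposed above might be sketched as follows.  This Python fragment does not call Fritz or any other engine; it assumes that each game has already been reduced to a list of evaluations in pawns, one per move, with positive values favoring White.  The sample evaluations are invented solely to exercise the code and are not results from the dataset, and all function names are hypothetical.

```python
from collections import Counter

def moves_where_black_trails(evals, threshold=1.0):
    """Return the move numbers at which White's evaluation exceeds `threshold`."""
    return [move for move, score in enumerate(evals, start=1) if score > threshold]

def first_trouble_histogram(games, threshold=1.0):
    """Count, per move number, how many games first cross the threshold there."""
    firsts = Counter()
    for evals in games:
        trailing = moves_where_black_trails(evals, threshold)
        if trailing:
            firsts[trailing[0]] += 1
    return firsts

# Invented evaluations (pawns, positive favors White), one list per game.
SAMPLE_GAMES = [
    [0.3, 0.4, 0.6, 1.2, 1.5],
    [0.2, 0.3, 0.2, 0.1, 0.0],
    [0.3, 0.5, 0.9, 1.1, 2.0],
]

print(first_trouble_histogram(SAMPLE_GAMES))
```

A cluster of games that first cross the threshold at about the same move number would be the kind of positional pattern the text has in mind.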

 

 

 


 

 

 

 

Notes

 

The dataset for the present study consists of 21,293 games representing most of the important results from the last 13 months, including the calendar months April, 2006 through May, 2007.  Constraints were applied to the data such that only those games wherein both players had obtained ratings of greater than or equal to 2400 Elo were included.  Games qualifying for inclusion in the set contained sufficient moves to achieve an ECO code as an Opening descriptor, and thus no BYEs or forfeitures could influence the tallies.  Game data was obtained in PGN format and downloaded from Internet sites.  The calculation of results was carried out by a Visual Basic program designed by the author of the study.
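The selection and tallying described above might look roughly like the following.  This is a Python sketch and not the author's Visual Basic program, which is not reproduced in the article.  It assumes that each game carries the customary PGN tag pairs WhiteElo, BlackElo, ECO, and Result, and it treats the presence of an ECO tag as a stand-in for "sufficient moves to achieve an ECO code".  The sample games and all function names are hypothetical.

```python
import re

TAG_PAIR = re.compile(r'\[(\w+)\s+"([^"]*)"\]')  # matches lines like [ECO "B01"]
MIN_ELO = 2400
RESULTS = ("1-0", "0-1", "1/2-1/2")

def parse_headers(game_text):
    """Collect the PGN tag pairs of one game into a dictionary."""
    return dict(TAG_PAIR.findall(game_text))

def qualifies(headers):
    """True when both players are rated at least MIN_ELO and an ECO code is present."""
    try:
        white_elo = int(headers["WhiteElo"])
        black_elo = int(headers["BlackElo"])
    except (KeyError, ValueError):
        return False
    return white_elo >= MIN_ELO and black_elo >= MIN_ELO and bool(headers.get("ECO"))

def tally(games):
    """Map each ECO code to its counts of White wins, Black wins, and draws."""
    table = {}
    for game_text in games:
        headers = parse_headers(game_text)
        result = headers.get("Result")
        if not qualifies(headers) or result not in RESULTS:
            continue
        row = table.setdefault(headers["ECO"], dict.fromkeys(RESULTS, 0))
        row[result] += 1
    return table

# Two invented games; the second is excluded because White is rated below 2400.
SAMPLE_GAMES = [
    '[Result "1-0"]\n[WhiteElo "2510"]\n[BlackElo "2455"]\n[ECO "B01"]\n\n1. e4 d5 1-0',
    '[Result "1/2-1/2"]\n[WhiteElo "2380"]\n[BlackElo "2600"]\n[ECO "B22"]\n\n1. e4 c5 2. c3 1/2-1/2',
]

print(tally(SAMPLE_GAMES))
```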

 

Various sources, especially Internet sites, publish statistical results on Openings.  However, these are less focused.  There is little interpretation of the statistics, no analysis of a particularly problematic, or anomalous Opening.  These sources present statistics from a dataset with too many games ranging over a time period that is too lengthy, and thus cannot be especially germane to recent trends.  It is not clear that these sources are in use by individuals in development of Opening repertoire for tournament play.  The vast body of data requires bounded scope, focus, and an interpretation if any of the statistical results are to be useful in development of play.

 

The use of statistics presumes that the most highly rated players are acutely aware of the most recent theory for the openings they regularly practice, and furthermore, that they are using computers to accurately appraise the Openings into the middlegame.  Thus, the statistics of wins, losses, and draws represent the best current evaluation of the merit of these Openings in the field.  The interpretation of the author is that practitioners will remove unsuccessful Openings from their repertoires and those remaining will be played at their best theoretical level.  This constant pruning will affect the statistics, which will require recalculation, giving rise to a feedback loop. 

 

As for the relation between statistics and the games, do these results confirm theoretical analysis of the individual Openings?  Are the results the same if the games are played only by computers rated 2400 and over?  Can statistical analysis be applied to the game score calculated by chess engines at key points, such as when one player goes a point behind, as a means of searching for standard points where something goes wrong or right in an Opening?  These are questions which the author will explore in future work with the program developed for the present article.

 

 

Keywords

 

Chess Theory

Chess Statistics

Opening Theory

Openings Theory

Chess Opening

Chess Openings

Opening Statistics

Openings Statistics

Opening analysis

Openings analysis

Scandinavian Defence

Encyclopedia of Chess Openings

 

 

 

 

 

 

 
