The Application of Technology to the Judging Process

Part 2 - Point Model Examples

This is the second of three articles on the use of computer technology in the judging of figure skating competitions.  Part 1 discusses a rigorous approach to investigating the potential role of computer technology in the process of judging competitions, and describes the development of a computer model of the judging process as it currently exists.  Part 2 describes many of the specific details of this computer model for the various figure skating events.  Part 3 describes a software implementation of the model and includes a downloadable version of the annotation software developed with the model.

This article further describes a research activity begun in 1997 to understand the way judges use the marks and the point model they apply when assigning marks to a performance.

The purpose of this research was to understand the judging process as it currently exists and to determine the extent to which it can be captured in a computer algorithm.  Comparing this research effort to the ISU's proposed judging system is instructive because it highlights the many deficiencies in the ISU's effort.  I view successfully modeling the current rules of skating in a computer program as a necessary first step in determining whether a computer based point model should be considered for adoption in figure skating; a step the ISU has not taken.  A successful model of the current rules would provide an established foundation from which one could more confidently develop new views of evaluating skating, were that one's goal.

The point model and software coming out of this effort I term Computer Assisted Judging™, or CAJ™ for short.  CAJ is not a new method of evaluating or judging competition.  It is a computer program that models the current judging process in accordance with the current rules and standards of skating; i.e., it is the name I use to refer to the current judging process coded into a computer program.

In researching the CAJ point model, it was found that the current judging process, and the rules and standards of skating, can be captured almost entirely in a computer based point model with only minor departures from current standards and practices.  In this article the details of the point model specific to each skating event are discussed to show to what extent the current skating standards can be implemented in software.  Places where the point model departs from current practices are noted.

Point Model Basics

Constructing a point model involves two major issues.  First, the correct mathematical form of the point model must be deduced; and second, the numerical values of the parameters that specify the point model must be determined.  If the wrong mathematical form is chosen, then no choice of parameter values, no matter how hard one tries, will produce a completely accurate representation of how performances should be rated.

The proposed ISU point model is a linear point model that makes use of linear point increments for quality assessments.  Applying this approach to programs that span a large range of difficulty, or attempting to extrapolate it to significantly lower level events, indicates it is not an accurate representation of the point model judges actually use.

In attempting to deduce a point model that is valid from the juvenile through senior levels it was found that a non-linear point model with fractional increments for quality assessments better represented how judges actually evaluate performances.  That approach, however, is not the only choice one might consider.  Another approach would be to use both the quantity and quality of elements within a class of elements to determine a point value for each class of element, and then to combine the point values for the classes according to fixed weighting factors.  In other words, come up with a jump score, a spin score, and a footwork score, and then combine the three scores with fixed weights reflecting the relative contribution of each type of element to the total score.  This article describes a point model based on the first approach.  It is hoped that at some point in the future the second approach can be investigated in more detail and compared to the one described here.
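As a sketch only, the second approach might be structured as follows.  The class weights and the per-class scoring rule here are invented for illustration; they are not values from the CAJ model or any other real point model.

```python
# Illustrative sketch of the "class score" approach: score each class
# of elements separately, then combine the class scores with fixed
# weights.  All numeric values here are made up for illustration.

def class_score(assessments):
    """Collapse the quality assessments (-3..+3) for one class of
    elements into a single 0-6 score.  A real model would also fold
    in element difficulty; this toy version uses quality alone."""
    if not assessments:
        return 0.0
    return 3.0 + sum(assessments) / len(assessments)

# Hypothetical fixed weights for the relative contribution of each
# class of element to the total score.
WEIGHTS = {"jumps": 0.5, "spins": 0.3, "footwork": 0.2}

def total_score(program):
    """program maps a class name to the list of quality assessments
    for the elements of that class."""
    return sum(w * class_score(program.get(cls, []))
               for cls, w in WEIGHTS.items())

print(total_score({"jumps": [1, 0, 2], "spins": [0, 1], "footwork": [0]}))
```

The open question with this structure is whether fixed weights can reproduce actual placements across events of widely varying content; that comparison has not yet been done.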

Software

The CAJ software began its life in 1998 as a program to allow a user to annotate a program as a judge would in marking an event.  The user can enter the element identifications and a quality assessment for each element.  For short programs and other programs with deductions for required elements the deduction values can also be entered.  Originally, the user could annotate specific quality factors and errors, but when the program began to evolve into a research tool this capability was eliminated to simplify the interface, which had become somewhat overwhelming in its detail.  In its original form, the CAJ software was simply a tool to replace the handwritten notes judges make during an event.  In its current form it can also be used as a scoring tool or to do statistical analysis of program content to investigate the mathematical properties of a point model, and the mathematical properties of combining the assessments from a panel of judges.

The user interface consists of a screen on which the user enters the identification of each element in a performance.  Depending on the event, there are five to seventeen possible elements.  Next to each element label there is a text box where the user can enter a quality assessment for the element.  Quality assessments are labeled +3 through -3 and Failed.  Under the proposed ISU system, even an element on which the skater falls receives points -- in some cases, many points.  Under current standards such an element is considered failed and gets no credit.  The CAJ software implements the current practice.  Any element with a major error is marked Failed and gets no credit.

For programs in which there are required elements, there is a deduction text box next to the assessment text box in which deduction values are entered.

The upper right third of the screen is devoted to a text box that is currently used for diagnostic purposes.  Were CAJ ever converted to a scoring tool, this would be replaced with real time video of the performance and replay of the elements captured by the replay technician.  In that application, there would be a replay button associated with each element on the main screen.

Below the diagnostic text box, the user can enter two additional technical merit marks.  The identities of these marks depend on the event, and correspond to aspects of a program beyond the elements alone that the rules currently say are to be judged in the first mark.  For example, in singles free skating, the element list is for tabulating all jumps of more than one rotation, all spins of more than three rotations, and the two required footwork sequences.  The first of the two auxiliary technical marks is for connecting steps, all other content, and basic skating ability.  The second is for the overall speed of the performance.

From the main screen the user calls up a screen to enter the presentation marks.   Depending on the event, the user enters two or three marks that include all the factors that are currently considered in the second mark.  Under the CAJ point model the assessment of the factors in the second mark does not change except that the assessment is captured in two or three numbers instead of just one.

The two auxiliary technical marks and the presentation marks are entered on the standard 6.0 scale in 0.1 point increments.  This provides a precision of about 1.7% in specifying the marks.  Statistical considerations indicate this is the minimum level of precision needed to separate skaters in successive places in the results.  The proposed ISU system, in contrast, uses a 10.0 scale in steps of 0.5, providing a precision of only 5%, which is inadequate for the task at hand.
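The precision figures quoted above are just the smallest mark increment divided by the full scale, which can be checked directly:

```python
# Precision of a marking scale = smallest increment / full scale.
caj_precision = 0.1 / 6.0     # 6.0 scale in 0.1 steps -> about 1.7%
isu_precision = 0.5 / 10.0    # 10.0 scale in 0.5 steps -> 5%

print(f"CAJ precision: {caj_precision:.1%}")
print(f"ISU precision: {isu_precision:.1%}")
```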

The user interface is best used with a touch screen, but can be operated with a mouse.  A Windows version of the CAJ software, limited to its annotation functions, can be downloaded from this site in the near future.  Playing with the program will make its operation and features far clearer than any description provided here.  It is the intent of the CAJ software to allow the user to enter all the factors that are currently considered in the judging of a program.  If it does not, you are welcome to send us a flaming e-mail with suggested improvements.

For now, this version of the CAJ software will be made available free of charge to interested users who would like a software tool to enter program content and assessments.  You can use this tool to record assessments at home in front of the TV or on a laptop at a competition if you are really hardcore.  A more detailed description of how to use the tool will be provided in Part 3, when the software becomes available for distribution.

Spotters

In the proposed ISU system spotters identify all the elements.  For jumps they identify the jumps and any major jump errors.  For spins and footwork they identify the element and decide its base point value.  The spotters also capture a digital video clip of each element for the judges to replay at the end of each performance.  The name "spotter" is misleading because the spotter is in effect another category of judge.  The decisions of the spotters are as important to determining the placements of the skaters as the assessments of the traditional judges -- and in some respects perhaps more important.

The spotters add unnecessary complexity to the judging process and introduce additional opportunities for someone to misbehave.  Because the goal of CAJ was to model the current judging process, the CAJ software has the user identify each element in a performance, as is the practice in competitions today.

Were CAJ used as a scoring tool, it would, of course, make use of the current replay technicians who capture the video clips used in the current replay system.  The replay technician would capture the video clips and cue the judges for the start and end of each trick, and no more.  Each judge would identify the elements, assess their quality and note any deductions.  If there were disagreement among the judges as to the identification of an element, then the majority identification of the panel would be used.
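The majority-identification rule is simple to state in code.  The sketch below is illustrative; the element labels ("3Lz", "3F") are just example identifications, not a notation taken from the CAJ software.

```python
from collections import Counter

def majority_identification(calls):
    """Return the identification used for scoring: the one named by
    the largest number of judges.  'calls' holds the per-judge
    identifications of a single element."""
    ident, _count = Counter(calls).most_common(1)[0]
    return ident

# Seven judges, five of whom call the element a triple Lutz:
panel = ["3Lz", "3Lz", "3F", "3Lz", "3F", "3Lz", "3Lz"]
print(majority_identification(panel))   # prints "3Lz"
```

A real implementation would also need a tie-breaking rule for even splits on large panels, which the text does not specify.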

Point Model

The specific numerical values determined for the CAJ point model will not be provided here, but the basic characteristics of the model are as follows.

Superficially, the CAJ software bears a passing resemblance to the ISU user interface.  It differs in several details, most of which relate to the amount of detail the user can enter for consideration in determining the score.  The main difference between CAJ and the proposed ISU system is the point model.  The CAJ point model is fundamentally different from the ISU point model in both the mathematics employed and the underlying philosophy of the point model.

The ISU point model greatly oversimplifies the complexity and variety of content in a skating program.  Some movements have ceased to exist in the ISU system.  All spins are lumped into three levels of difficulty with base values all less than the simplest triple jump.  Footwork sequences are also lumped into three levels of difficulty that contribute only at a minor level to the point total.

This oversimplification is unnecessary and is detrimental to the sport.  One ISU official has defended the points assigned to the spin elements, saying that the spins the skaters currently execute are mundane and do not deserve a large number of points.   In the CAJ point model ordinary spins receive fewer points, but the point model leaves open the possibility of very difficult spins that would receive much greater credit.  The ISU point model closes the door on spin complexity.  No matter how difficult a spin, it still will receive only a trivial number of points.   Consequently there is no motivation in the proposed ISU system to execute difficult spins or to invent new ones.  The same holds true for footwork sequences.

There are hundreds of possible jump combinations, tens of thousands of combination spins, and footwork sequences come in multiple levels of difficulty.  Rather than try to specify every possible combination and assign a base mark to every detailed element, the CAJ software interface takes the following approach.

For jump combinations, spin combinations, lift combinations and footwork sequences the user inputs a description of the element using a building block, or keyword, approach.   For example, a spin could be specified "camel position, sit position, change of foot, sit position."  Using the same building blocks one could just as well have entered "sit position, camel position, change of foot, camel position, sit position."  In each case the CAJ software decodes the description and applies an algorithm that determines the base difficulty for the element based on the description of the element.  Algorithms have been developed for the four types of elements noted above.
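A toy version of such a decoder is sketched below.  The keyword list and the difficulty arithmetic are invented for illustration; the actual CAJ algorithms and base values are not published here.

```python
# Toy decoder for building-block spin descriptions.  The per-block
# contributions are made-up placeholder values.

BLOCK_VALUES = {
    "camel position":   1.0,
    "sit position":     1.0,
    "upright position": 0.8,
    "change of foot":   0.5,
}

def spin_base_difficulty(description):
    """Decode a comma-separated building-block description into a
    base difficulty by summing the per-block contributions."""
    blocks = [b.strip() for b in description.lower().split(",")]
    unknown = [b for b in blocks if b not in BLOCK_VALUES]
    if unknown:
        raise ValueError(f"unknown building block(s): {unknown}")
    return sum(BLOCK_VALUES[b] for b in blocks)

print(spin_base_difficulty(
    "camel position, sit position, change of foot, sit position"))  # 3.5
```

Note that a purely additive toy like this values the two example descriptions from the text differently only if they contain different blocks; a real algorithm would also account for position order, entries, and other features of the description.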

With this approach there is complete flexibility in annotating the elements: there are no restrictions limiting the skaters to a handful of elements programmed into the computer, the ability to adapt to advancements in the sport without having to revise the entire point model is built in, and base point values that truly reflect the range of difficulty possible for each element are included.  The spin algorithm, for example, assigns base values that span the same point range as the simplest single jump through the most difficult triple jump.

A second major difference between the CAJ point model and the ISU point model is that the ISU point model seems to have no basis in reality compared to the marks skaters currently get for programs of varying content.  The CAJ point model was normalized so that the base values for all tricks within a given type of element are self consistent, so that the base values for tricks of one element type compared to another are consistent, and so that the point model will reproduce to the greatest extent possible the marks skaters currently receive for programs of content spanning the difficulty of juvenile through senior level programs.  In other words, the CAJ point model was normalized so that the base values for all single through quad jumps make sense relative to each other, all spin values make sense relative to each other, all jump values make sense relative to all spin values, etc., for all types of elements.  The ISU point model does not do this and does not attempt to do this.

The third major difference between the CAJ point model and the proposed ISU point model is the application of quality factors.

The ISU point model assigns quality factors of +3 through -3 points added to the element base values.  In this approach several squirrelly things happen.  A trick with a base value of less than 3 points ends up with negative points for a -3 quality factor.  A triple Axel or quad on which the skater falls ends up with more points than a successful lesser triple.  A poorly executed triple may earn more points than a well executed double, contrary to the current rules.  A simple trick with a base value of 1 executed at quality level +3 is increased in value by 300%, while a difficult trick also done at quality level +3 is increased in value by only 33% to 66%.  Finally, the relative value of elements varies with the quality factor.  For example, if one jump has a base value of 8 points and another of 4 points, then the more difficult jump is worth 100% more than the easier jump if both jumps are executed with a quality rating of 0.  If both jumps are executed with a quality rating of +3, then the more difficult jump is worth only 57% more, while if both are rated -3 the more difficult jump is worth 400% more.  This makes no sense and is contrary to the way judges currently evaluate program content.
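The jump example above can be reproduced in a few lines, using the additive rule of the proposed ISU model (quality factor added directly to the base value):

```python
# Additive quality factors: value = base + quality.
def additive_value(base, quality):
    return base + quality

# The 8-point vs 4-point jump example from the text:
for q in (0, +3, -3):
    hard, easy = additive_value(8, q), additive_value(4, q)
    print(f"quality {q:+d}: {hard} vs {easy} points; "
          f"harder jump worth {(hard - easy) / easy:.0%} more")

# And a trick with a small base value can go negative:
print(additive_value(2, -3))   # prints -1
```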

The CAJ software designates quality assessments of +3 through -3, and also includes a Failed designation, in which case the element earns no points.  In the CAJ point model, however, the quality factors are not additive but multiplicative.  For a quality factor of 0 the multiplier is 1.0 (i.e., just the base value).  For a quality factor of +1 the element has X% more value, for +2 Y% more value, and for +3 Z% more value.  Similarly, a -1 quality factor results in X% less value, -2 in Y% less value, and -3 in Z% less value.

In this approach, an element can never produce negative points regardless of the base point value.  Further, two elements of different base value have the same point ratio when rated at the same quality level, for all quality levels.  In the jump example above, the more difficult jump is always twice as valuable as the lesser jump when the two are rated at the same quality level.  This approach also prevents triple Axels, quad jumps and poorly executed triples from dominating the point model.  Further, it prevents the very simplest elements of low base value from earning excessive points if done at the highest quality level.
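The multiplicative structure can be sketched as follows.  The multiplier values here are invented placeholders (the real X/Y/Z percentages of the CAJ model are not published here); only the structure, value = base × multiplier(quality), follows the text.

```python
# Multiplicative quality factors with placeholder multipliers.
MULTIPLIER = {+3: 1.5, +2: 1.3, +1: 1.15, 0: 1.0,
              -1: 0.85, -2: 0.7, -3: 0.5, "Failed": 0.0}

def caj_value(base, quality):
    return base * MULTIPLIER[quality]

# The ratio between two elements is the same at every quality level,
for q in (+3, 0, -3):
    assert caj_value(8, q) == 2 * caj_value(4, q)

# no element can produce negative points,
assert all(caj_value(1, q) >= 0 for q in MULTIPLIER)

# and a failed element earns nothing regardless of base value.
assert caj_value(8, "Failed") == 0.0
```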

The values of the quality factors must be (and were) determined carefully.  These factors must be normalized consistently with the overall calibration of the point model.  For example, not only must the base values of all jumps be self consistent with each other for average execution, they must be self consistent for all possible quality assessments of each jump.  It is not enough that the point value for the average triple toe loop make sense compared to the point value for the average triple loop.  The point values must also make sense for a +3 triple toe loop compared to a -3 triple loop, and every other possible combination.  Experience researching CAJ suggests this may not be possible, but at least the CAJ point model takes a reasonable stab at it, as opposed to the ISU point model which doesn't even try.

The problem of trying to maintain complete consistency with respect to all quality factors suggests that the quality factor values judges intuitively use may be different for different elements.  Both the proposed ISU system and the CAJ point model assume the same numeric values may be used for all elements; but this is just an assumption, made in the case of CAJ for the sake of simplicity.  If the quality factors are instead element dependent, that greatly increases the complexity of the analysis.  Further research is required in this area.

One final aspect of the overall calibration of the CAJ point model is that the 6.0 scale is retained to the greatest extent possible.  Retaining the 6.0 scale makes the CAJ results easier to understand, allows for historical continuity with past results, and in fact was necessary to allow calibration of the point model by comparison with recent past results.  In the CAJ point model, however, the scores are calculated to a finer precision than the current 0.1 point difference between successive possible scores.  Scores in the CAJ point model are calculated to three decimal places.

In the first mark, the score is calculated from the base value of the identified elements, the quality assessments of each element, a weighting factor for each element, two auxiliary technical merit marks, and any required deductions.  The point model is normalized so that the most difficult programs currently executed will result in first marks of 6.000 or greater.  The point model is open ended above 6.000 but is currently calibrated only for jumps through quad Axel (something I never expect to see).   In singles, the point model is normalized to the difficulty level of the men, so first mark scores for the ladies in the CAJ point model will be less than 6.000 until such time the ladies start doing triple Axels and quad jumps.
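The structure of the first-mark calculation can be sketched as below.  The normalization constant and the auxiliary-mark weight are placeholders, not CAJ's actual calibration values; only the ingredients (weighted element values, two auxiliary technical marks, deductions, three decimal places) follow the description above.

```python
# Skeleton of the first-mark calculation.  All numeric constants are
# placeholder values for illustration.

def first_mark(elements, aux_a, aux_b, deductions=0.0,
               norm=0.05, aux_weight=0.1):
    """elements: list of (base_value, quality_multiplier, weight)
    triples; aux_a, aux_b: the two auxiliary technical marks on the
    6.0 scale; deductions: total required deductions."""
    element_points = sum(b * q * w for b, q, w in elements)
    mark = norm * element_points + aux_weight * (aux_a + aux_b) - deductions
    return round(mark, 3)   # scores carried to three decimal places
```

Note the formula has no ceiling: a program whose content and execution push the sum past 6.000 is simply scored above 6.000, matching the open-ended calibration described in the text.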

The CAJ point model breaks the second mark up into two or three sub-marks, depending on the event.  Each sub-mark is given its own weighting factor and the maximum combination of the sub-marks is 6.000, with the smallest increment between any two successive second marks of 0.025 or 0.050.  The CAJ point model makes no changes to the criteria for judging of the second mark in any of the events.

The total mark for each skater is the sum of the first and second marks, as is currently the case.  Either mark can continue to break ties, but in a point system calculated to three decimal places, ties are highly unlikely.  For most levels of most events the 50-50 balance between the first and second mark is retained.   However, if the level of difficulty in some senior events increases significantly beyond what already exists, first marks greater than 6.000 will become more common and the emphasis in those events will shift to the first mark compared to the second mark, which remains limited to a maximum value of 6.000.

Perfect Program

The proposed ISU judging system abandons the 6.0 scale and abandons the idea of a perfect program.

In the CAJ point model, the concept of perfect sixes still exists for the second mark.  For the first mark, however, the concept of perfect sixes is replaced with the concept of a Technically Perfect Program.  A technically perfect program is one in which there are no failed elements, no deductions, all elements assessed as +3 quality, and the two auxiliary technical merit marks both assessed as 6.0.  A program of any base difficulty can potentially be scored as a technically perfect program.
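The definition above translates directly into a predicate.  The data layout (a list of assessments, a deduction total, and the two auxiliary marks) is an assumed representation, not the CAJ software's internal format.

```python
# Technically Perfect Program test, straight from the definition:
# no failed elements, no deductions, all elements +3, both auxiliary
# technical marks 6.0.

def technically_perfect(assessments, deductions, aux_marks):
    """assessments: per-element quality assessments (+3..-3 or
    "Failed"); deductions: total deductions; aux_marks: the two
    auxiliary technical marks."""
    return (all(a == 3 for a in assessments)      # no Failed, all +3
            and deductions == 0
            and all(m == 6.0 for m in aux_marks))

print(technically_perfect([3] * 8, 0, (6.0, 6.0)))     # prints True
print(technically_perfect([3, 2, 3], 0, (6.0, 6.0)))   # prints False
```

Because the test depends only on assessments and not on base values, a program of any base difficulty can qualify, exactly as the text states.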

Program Content

The proposed ISU judging system heavily rewards jumps at the expense of all other skating skills, and encourages the skaters to perform lots of them.  The CAJ point model was constructed to model the current mix of skating skills included in programs, and to allow the skaters the current level of flexibility in determining the content of their programs without penalizing their chance to place well.

The use of a point model with unrestricted program content encourages skaters to include as many tricks as possible, and allows skaters to advance in the standings by including many mediocre or poorly executed elements.  In order that quantity not overwhelm quality, the simplest way to maintain a level playing field in a point based system is to restrict the content in some way so that all the competitors have an equal chance to score points.  The proposed ISU system attempts to do this by developing new "well balanced" program requirements that have yet to be revealed.

The CAJ point model takes a similar approach, but within the context of the current rules.  With but one exception, discussed below, the current well balanced program requirements together with a limit to the total number of elements a skater can attempt creates the desired level playing field without modifying the current standards of skating in a significant way.  The ISU approach has been to redefine the program requirements in secret.  The approach described here has been to code into the software the current standards for program content.

In each event, using the CAJ point model, the skaters are permitted to attempt a maximum number of elements in accordance with the current well balanced program rules.  The number chosen is based on an analysis of program content going back several years.  There is basically a stock formula for program content that skaters and choreographers use in each event.  In constructing the CAJ point model, these standard program types were simply adopted as a uniform standard for the number of elements a skater can attempt.  In this point model, falls and other major failures count as attempts, and count towards the maximum limit on the number of elements that are scored.

The current well balanced program requirements allow the skaters more choices of elements than the maximum number they can (and currently do) attempt.  In this way the CAJ point model conforms to the way skaters currently choreograph their programs in accordance with current standards and maintains a level playing field, where skaters will not be blind-sided by a competitor attempting 24 elements with 10 double Axels, or throwing themselves all over the ice till they land something.

The following table summarizes the way CAJ models the first and second marks in each of the events.  For the first mark, "Elements" is the combination of the individual element assessments, and A and B are the two auxiliary technical marks; for the second mark, the criteria are divided among two or three sub-marks, depending on the event.

Singles SP and Singles FS
    First Mark   Elements, plus
                 A: Connecting Moves, Other Content, Basic Skating
                 B: Speed
    Second Mark  Originality, Expression of Music, Composition,
                 Carriage and Style, Timing, Variation of Speed,
                 Use of the Ice  (three sub-marks)

Pairs SP and Pairs FS
    First Mark   Elements, plus
                 A: Connecting Moves, Other Content, Basic Skating
                 B: Speed
    Second Mark  Originality, Expression of Music, Composition,
                 Carriage and Style, Timing, Unison, Variation of
                 Speed, Use of the Ice  (three sub-marks)

Dance CD
    First Mark   Elements, plus
                 Conformity of Steps; Placement of Steps; Carriage,
                 Style and Form; Edges and Turns; Unison
    Second Mark  Correct Timing; Skating to Beat of Music; Expression
                 of Character of Music  (two sub-marks; third not used)

Dance OD
    First Mark   Elements, plus
                 Connecting Steps; Originality and Variety; Cleanness,
                 Sureness and Edges; Speed; Pattern; Use of Ice
    Second Mark  Selection of Music; Choreography; Expression; Timing;
                 Carriage, Style and Unison  (two sub-marks; third not used)

Dance FD
    First Mark   Elements, plus
                 A: Connecting Steps, Other Content
                 B: Cleanness, Sureness and Edges; Speed
    Second Mark  Selection of Music; Choreography; Expression; Timing;
                 Carriage, Style and Unison; Variation in Speed;
                 Use of the Ice  (three sub-marks)


Singles Short Program

The CAJ point model makes no changes to the current required elements in the ladies and men's short programs.  The point model is calibrated so that the most difficult short program currently attempted by the men, if executed perfectly, receives a first mark of 6.000.  A program of lesser content will have a lesser maximum score, and a more difficult program will have a greater possible maximum score. 

The ladies short program will be marked using the same point model as the men, and consequently a championship ladies short program with the content currently performed by the ladies will not receive a first mark of 6.000 even if perfectly executed.

Regardless of the point total possible for the content chosen, however, the concept of a technically perfect program is retained in the short program, as described above.

Pairs Short Program

The CAJ point model makes no changes to the current required elements in the pairs short programs.   The point model is calibrated so that the most difficult short program currently attempted by the pairs, if executed perfectly, receives a first mark of 6.000.  A program of lesser content will have a lesser maximum possible score, and a more difficult program will have a greater maximum possible score. 

Compulsory Dance

The marking of compulsory dances by a point model is currently more of a concept than a final implementation, and would require the input of persons more expert in dance than myself for completion of the point model.  As a preliminary implementation, the CAJ software handles each compulsory dance like a short program in which there are a number of required elements to be assessed, and deductions taken for errors.  For each compulsory dance the required elements would consist of a number of steps, step segments or sequences appropriate to each dance, as determined by the appropriate dance gurus.  It is currently assumed this would consist of five to seven "elements" for each dance, each of equal importance.

Original Dance

CAJ makes no changes to the current required elements in the original dance.  The point model is calibrated so that a dance that includes the most difficult dance elements currently attempted in the original dance, if executed perfectly, receives a first mark of 6.000.  A program of lesser content will have a lesser maximum score, and a more difficult program will have a greater possible maximum score.  The CAJ software basically handles the original dance in the same way it handles the short programs.

Single Free Skating

The CAJ point model implements the current well balanced program rules for singles free skating, with minor modification.  Currently there is no limit on the number of solo jumps or spins that can be included in a program, except for a limitation on the repetition of triple and quadruple jumps, and there is no limit on the repetition of double jumps.

ISU Well Balanced Program Requirements

            Jumps      Jump Comb./Seq.   Spins      Footwork
  Minimum   No limit   1                 4          2
  Optional  No limit   2                 No limit   0
  TOTAL     No limit   3                 No limit   2

Using these requirements without modification, a straight point accumulation scoring system pushes the skaters towards a program with as many elements as possible, and primarily as many jumps as possible.  This allows the possibility that a skater with a large number of lesser difficulty or mediocre elements can outscore a skater with a fewer number of better elements.

To maintain a level playing field, the CAJ point model uses the following modified program requirements, which are consistent with the content skaters currently include in their programs.  In addition, jump combinations/sequences are limited to no more than three jumps of one or more rotations.

CAJ Well Balanced Program Requirements

            Jumps   Jump Comb./Seq.   Spins   Footwork
  Minimum   4       1                 4       2
  Optional  3       2                 0       0
  TOTAL     7       3                 4       2

Currently, most singles programs include 12-14 elements.  To maintain a level playing field, the CAJ point model allows the skater to attempt up to 14 elements in accordance with the above table.

Unlike the pairs free skate and the free dance where improvisation in a program is rarely if ever attempted, singles skaters frequently hold in reserve an additional jump to attempt if they miss a jump earlier in the program.  In accordance with the long standing philosophy in the rules that a fall is no bar to winning, and consistent with the current skating standards, the CAJ point model allows singles skaters to attempt a 15th element.  That element must be a solo jump and is only included in the score if the skater fell on one of the earlier jump elements.
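The reserve-jump rule can be sketched in code.  The element representation here (a list of (type, fell) pairs in skating order) is an assumed layout for illustration.

```python
# Reserve-jump rule for singles free skating: a 15th element may be
# attempted, but it must be a solo jump and is scored only if the
# skater fell on an earlier jump element.

def scored_attempts(attempts):
    """attempts: list of (element_type, fell) pairs in skating order.
    Returns the attempts that count toward the score."""
    first_14 = attempts[:14]
    if len(attempts) <= 14:
        return first_14
    kind, _fell = attempts[14]
    fell_on_earlier_jump = any(fell for k, fell in first_14 if "jump" in k)
    if kind == "solo jump" and fell_on_earlier_jump:
        return attempts[:15]
    return first_14
```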

In the CAJ point model, the most difficult examples of each type of element have nearly equivalent values and contributions to the point total, so skaters can continue to show the current variety of content without being penalized in the standings.  In addition, the CAJ point model discourages the excessive repetition of double jumps.  In the CAJ point model, the first repetition of a double jump gets full credit if it is included in a combination or sequence, or half credit if not.  The third attempt gets one-third credit, and any further attempt gets no credit, though it still counts towards the program limit of 14 attempted elements.
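The credit schedule for repeated double jumps reduces to a small lookup:

```python
# Credit schedule for repeated double jumps, as described above.

def double_jump_credit(attempt_number, in_combination=False):
    """attempt_number counts attempts of the same double jump:
    1 = first attempt, 2 = first repetition, and so on.  Every
    attempt, credited or not, counts toward the 14-element limit."""
    if attempt_number == 1:
        return 1.0
    if attempt_number == 2:
        return 1.0 if in_combination else 0.5
    if attempt_number == 3:
        return 1.0 / 3.0
    return 0.0
```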

Unlike the proposed ISU system, which renders spins and footwork nearly worthless, spins and footwork of maximum difficulty have the same value as a difficult triple jump in the CAJ point model.  The placements of good spinners but weak jumpers in actual competition are modeled well by the CAJ point model.  On the other hand, it is expected that good spinners will see their placements decline under the proposed ISU system, which does not value those elements.

Nevertheless, as is currently the case, the CAJ point model is still weighted towards the jumps since more jump elements are permitted than spin or footwork elements.   With a maximum of 14 attempted elements, of which 6 must be spins and footwork elements, the skaters may attempt a total of 8 jumps and/or jump combinations/sequences, with those elements making up 57% of the element mix.  This conforms to the way skaters currently construct their programs.
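The 57% figure follows directly from the limits above:

```python
# Element mix in the singles free skate under the CAJ limits.
max_elements = 14
spins_and_footwork = 4 + 2                  # 4 spins, 2 footwork sequences
jump_slots = max_elements - spins_and_footwork
print(jump_slots, f"{jump_slots / max_elements:.0%}")   # prints "8 57%"
```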

Pairs Free Skating

The CAJ software implements the current well balanced program rules for pairs free skating.  In the pairs long program the teams are required to attempt a minimum of 11 elements, and may include a maximum of 17 elements according to the following table:

ISU Well Balanced Program Requirements

           Lifts  Throws  Pair Spin  Solo Jump  Jump Combination  SbS Spin  Death Spiral  Footwork
Minimum      3       1        1          1              1             1           1           2
Optional     2       1        1          1              0             0           1           0
TOTAL        5       2        2          2              1             1           2           2

Currently, most teams include no more than 14-15 elements in their routines, though a few sometimes include 16.  To maintain a level playing field, the CAJ point model allows the teams to attempt up to 15 elements in accordance with the above table to determine the team's score.  The most difficult examples of each type of element have nearly equivalent values and contribution to the point total so teams can continue to show the current variety of content without being penalized in the standings.
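A well balanced program check of the kind the table implies could be sketched as follows.  The element names, the limits dictionary and the function are illustrative assumptions on my part; the table's minimums and maximums and the 15-element CAJ cap are taken from the text.

```python
from collections import Counter

# Per-type (minimum, maximum) attempt limits, taken from the table above.
PAIRS_LIMITS = {
    "lift": (3, 5), "throw": (1, 2), "pair_spin": (1, 2), "solo_jump": (1, 2),
    "jump_combination": (1, 1), "sbs_spin": (1, 1), "death_spiral": (1, 2),
    "footwork": (2, 2),
}

def check_well_balanced(elements, limits=PAIRS_LIMITS, max_total=15):
    """Return a list of well balanced program violations for a list of
    attempted element types; an empty list means the program is legal."""
    counts = Counter(elements)
    problems = []
    if len(elements) > max_total:
        problems.append("too many elements: %d > %d" % (len(elements), max_total))
    for kind, (lo, hi) in limits.items():
        n = counts.get(kind, 0)
        if n < lo:
            problems.append("%s: %d attempted, minimum is %d" % (kind, n, lo))
        elif n > hi:
            problems.append("%s: %d attempted, maximum is %d" % (kind, n, hi))
    return problems
```

A minimum legal program (3 lifts, 1 throw, 1 pair spin, 1 solo jump, 1 jump combination, 1 side-by-side spin, 1 death spiral, 2 footwork sequences) passes with no violations; a sixth lift or a third footwork sequence would be flagged.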

Free Dance

The CAJ software implements the current well balanced program rules for free dance.  In the free dance the couples are required to attempt a minimum of 7 elements and may attempt a maximum of 14 elements according to the following table:

ISU Well Balanced Program Requirements

           Lifts  Spins  Twizzles  Footwork
Minimum      2      1        2         2
Optional     5      2        0         0
TOTAL        7      3        2         2

To maintain a level playing field, the CAJ point model allows the couples to attempt up to 12 elements in accordance with the above table to determine the couple's score.  The most difficult examples of each type of element have nearly equivalent values and contribution to the point total so couples can continue to show the current variety of content without being penalized in the standings.

The point model for the free dance is less well developed than for singles and pairs free skating.  The main problem is that, unlike for pairs, there is no formalized classification system and nomenclature for dance lifts, nor an agreed upon standard for difficulty.  Further, the marks dance judges currently assign and the placements given in international competition are statistically suspect.  A great deal of further effort is required to validate a realistic point model for dance.

Computer Vision

Several aspects of skating performances lend themselves to being judged in a completely automated way using machine vision.  These include the speed of the skating (the second auxiliary technical merit mark), and the variation in the speed of the skating and the use of the ice (the third presentation sub-mark).  The current version of the CAJ software has the user input assessments for these items; however, the assessment of these items would probably be improved if they were evaluated in a completely automated way.

To assess these two marks, a computer vision system could be used to analyze programs in real time.  The motion of the skaters on the ice could be captured by a machine vision system and various statistics related to speed, range of speed and use of the ice calculated.  An algorithm would then be applied to determine the two marks.  Algorithms have been developed to accomplish this task, normalized to current skating standards in these areas.  The details of these algorithms and their hardware implementation, however, will not be discussed here.  One current area of research is the study of ways in which other aspects of a skating performance can be evaluated in an automated way without the use of human judges.
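To illustrate the kind of statistics such a vision system might compute, the sketch below derives mean speed, speed range, and a crude use-of-ice measure (bounding-box area) from a track of rink positions.  The fixed sampling interval and the bounding-box measure are my assumptions; the article does not describe the actual algorithms.

```python
import math

def speed_statistics(positions, dt):
    """Given (x, y) rink positions in meters sampled every dt seconds,
    compute mean speed, range of speed, and a simple use-of-ice measure
    (the area of the bounding box the skater covered)."""
    # Speed between consecutive samples: distance traveled divided by dt.
    speeds = [
        math.hypot(x2 - x1, y2 - y1) / dt
        for (x1, y1), (x2, y2) in zip(positions, positions[1:])
    ]
    xs = [p[0] for p in positions]
    ys = [p[1] for p in positions]
    return {
        "mean_speed": sum(speeds) / len(speeds),
        "speed_range": max(speeds) - min(speeds),
        "ice_coverage": (max(xs) - min(xs)) * (max(ys) - min(ys)),
    }
```

A scoring algorithm normalized to current skating standards would then map these statistics onto the two marks.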

Combining Results from a Panel of Judges

The CAJ point model was constructed to conform to the current judging process and the current rules and standards of skating, when applied to the assessments from a single judge.  In combining the results from a panel of judges, however, the CAJ software takes a new approach that makes use of all the tabulated assessments from all the judges, making use of the power of modern computers.

The assessments of the panel of judges are combined together in a way to minimize the impact of outliers (bogus marks) on the results.  A panel can consist of any number of judges, but statistically a minimum of nine judges is preferred.  A larger panel is desirable but at a certain point (around 15 judges) it becomes logistically cumbersome to have so many judges involved.  Statistics suggest that a panel of 11 to 15 judges is somewhat preferable to the current nine.

Prior to calculating a skater's score, the procedure is to first analyze the content of the program to apply the well balanced program rules, to eliminate illegal elements and to eliminate illegally repeated elements from consideration.  A computer recorded timing of the program duration is used to automatically determine if timing deductions must be taken.

The skater's point total is then calculated in a two-step and two-dimensional approach. 

    Step 1

For each judge, the point model is applied to the judge's assessments and a score is calculated for that judge's assessments alone.  The assessments are also checked to see if the judge has awarded a technically perfect program.

For each element, auxiliary mark and all deductions, the inputs of all the judges are combined together to obtain a panel-wide assessment for each element, auxiliary mark and deduction.  In this process the assessments are filtered to eliminate outliers.  The panel-wide assessments, auxiliary marks and deductions are then combined according to the point model to determine a panel-wide score for the skater.

    Step 2

The scores determined from each individual judge are compared to the panel-wide score.  An algorithm is applied to determine if any of the judges are out of line with the rest of the panel by a statistically significant amount.  This is more complex than just throwing out the high or low marks, neither of which may be statistically out of line (in which case they would not be thrown out).  Any judge's score that is flagged as a statistical outlier is eliminated from the calculation, and the entire process is repeated without those judges' marks.  The skater's score is the panel-wide score with the input of the outlier judges removed.
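The two-step filtering described above might look like the following sketch, which applies a simple z-score test to each judge's total.  The 2.0-standard-deviation threshold is an illustrative assumption, not the actual CAJ criterion, and recomputing a simple mean stands in for re-running the full panel-wide calculation.

```python
import statistics

def panel_score(judge_scores, z_cut=2.0):
    """Combine per-judge totals into a panel score, dropping judges whose
    totals deviate from the panel mean by more than z_cut standard
    deviations.  Returns (panel score, indices of flagged judges).
    The real process repeats the full panel-wide element calculation
    without the flagged judges; averaging here is a simplification."""
    mean = statistics.mean(judge_scores)
    sd = statistics.pstdev(judge_scores)
    if sd == 0:
        return mean, []   # perfect agreement: nothing to filter
    outliers = [i for i, s in enumerate(judge_scores)
                if abs(s - mean) / sd > z_cut]
    kept = [s for i, s in enumerate(judge_scores) if i not in outliers]
    return statistics.mean(kept), outliers
```

Note that a merely high or low mark inside the normal spread is not flagged; only a mark statistically out of line with the panel is removed, consistent with the description above.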

This approach has several benefits compared to the current scoring system and the proposed ISU judging system.

First, deductions are not uniformly applied in the current scoring system.  A skater may be penalized by a judge who takes a deduction no one else takes, or rewarded by a judge who fails to take a deduction taken by the other judges.  In the approach described here the value of each deduction is determined by a majority of the panel and applied uniformly.  The same concept holds true for the identification of the elements.
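One plausible reading of this majority rule is sketched below; the strict-majority threshold and the use of the median value are my assumptions, since the article does not give the exact combination rule.

```python
import statistics

def panel_deduction(judge_deductions):
    """Apply a deduction uniformly only when a strict majority of the
    panel took it, using the median of the values actually taken.
    A judge who took no deduction reports 0.0."""
    taken = [d for d in judge_deductions if d > 0]
    if len(taken) * 2 > len(judge_deductions):   # strict majority took it
        return statistics.median(taken)
    return 0.0
```

Under this rule a lone judge's deduction is discarded, so no skater is penalized by a deduction only one judge takes, and no skater escapes a deduction the majority of the panel agrees on.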

Second, outlier judges are identified and corrected for in real time.   Under the current system and under the proposed ISU system accountability begins after the event is over.  Perhaps long after the event is over, if ever.  A judge can misbehave or make a serious error and currently there is no way to correct for it after the fact.  In the approach just described, outlier judges are not identified after the fact with no corrective action then possible, but instead are identified in real time and their marks omitted from the calculation of the scores in real time.  Under the current system and proposed ISU system, judges know that if they misbehave and get caught their misdeeds will still stand.  Using the approach described here, judges would know that they will be caught and their misdeeds will be eliminated from the results even as they are sitting on the panel during the event.

The Fatal Flaw?

The CAJ point model and software were constructed to evaluate skating performances in accordance with the current rules and the current skating standards, for the juvenile through senior levels.  It attempts to take into account all the factors human judges currently consider (or are supposed to consider) in placing the skaters in accordance with the current rules.  Tests show that CAJ performs fairly well in meeting these goals, and is a reasonably accurate representation of the judging process in use today.  Analysis shows that the CAJ point model is less sensitive to identification errors and judgement errors than the proposed ISU system, is superior at filtering the effects of misconduct, and provides a higher level of confidence in separating successive places in an event.  Nevertheless, whether this model is the best representation of how judges actually evaluate skating performances is still open to discussion.

In constructing the CAJ software some consideration was, of course, given to how it might be used as a scoring tool.  This should not be taken to mean, however, that it is being advocated that a CAJ-like system be used for judging competitions at this time.  Whether or not a CAJ-like system is a viable scoring system remains an open question.  In several places it has been noted that further research is needed.  Some uncertainty remains in the final calibration of the point model that only a test program involving a large number of judges can resolve, something that would require the cooperation of the ISU or the USFSA.

It is strongly advocated, however, that a CAJ-like annotation system be adopted to replace the hardcopy notes judges make during a competition.  This approach would allow the collection of the amount of data needed to unambiguously determine whether a point model system is a viable approach to judging competitions, in real competition situations.  More practically, it would provide the foundation for a realistic accountability system; and if properly used, prove to be a powerful educational tool for the training of judges.  It is something the USFSA, for example, could implement in a relatively brief time period at only modest cost for use at sectional competitions, national championships and judges schools.

Just as it has been discussed elsewhere for the proposed ISU judging system, it must be shown for the CAJ point model, or any point model based system for that matter, that judges can mark with sufficient accuracy and consistency for the results to be statistically meaningful.  Any system that does not provide a confidence level of 95% or greater for every placement in an event cannot be considered an improvement, and is not worth implementing.  Some of the design decisions in the CAJ user interface were made to assist the judges in marking on an absolute and consistent scale to the degree required.  Nevertheless, it still remains as much a question for the CAJ point model as it does for the ISU point model whether human judges have the capacity to judge with the accuracy and consistency required for a point model based system to work as needed.


Copyright 2003 by George S. Rossano