SAT

analysis of the test

colin fahey

1. abstract

The SAT is a test designed to predict how well a high school student would perform as a freshman at a college or university in the United States of America (USA).  More than 1000 colleges and universities in the USA (80%) accept or require SAT scores from people applying for admission.  The SAT was first administered on 1926 June 23 to 8040 people.  In the 2003-2004 SAT testing year, 1419007 high school seniors took the SAT.  On 2005 March 12, a new version of the SAT was administered for the first time to approximately 300000 people.

2. trust only the College Board for information about the SAT

The College Board creates all versions of the SAT, and is the only authority on all aspects of the SAT.  The College Board web site describes the content of the SAT and various testing administration conditions and policies.

http://www.collegeboard.com

3. SAT question types

This section describes all question types on the SAT, for each of the three divisions of the SAT: mathematics (M), critical reading (CR), and writing (W). All sample questions shown here appeared on the SAT version with form code BWBA that was administered on 2005 March 12th.  Most of the sample questions shown here have a difficulty rating of "5", the highest difficulty rating on a scale from 1 through 5.  Answers to the sample questions appear at the end of this section.



3.1 math (m)

3.1.1 introduction

This section describes question types appearing in the math (M) division of the SAT.



3.1.2 general instructions

This section shows the general math instructions, and the instructions for "student-produced response" grids.



sat_sample_01_m_instructions.gif

sat_sample_02_m_spr_instructions.gif

3.1.3 number and operations

The following question is an example of the "number and operations" type of math question.



sat_sample_03_m_n_s8q15.gif

3.1.4 algebra and functions

The following question is an example of the "algebra and functions" type of math question.



sat_sample_04_m_a_s2q18.gif

3.1.5 geometry and measurement

The following question is an example of the "geometry and measurement" type of math question.



sat_sample_05_m_g_s8q16.gif

3.1.6 data analysis, statistics, and probability

The following question is an example of the "data analysis, statistics, and probability" type of math question.



sat_sample_06_m_d_s2q7.gif

3.2 critical reading (cr)

3.2.1 introduction

This section describes question types appearing in the critical reading (CR) division of the SAT.



3.2.2 passage-based reading

The following question is an example of the "passage-based reading" type of critical reading question.



sat_sample_07_cr_r_s4q8.gif

3.2.3 sentence completion

The following question is an example of the "sentence completion" type of critical reading question.



sat_sample_08_cr_c_s4q4.gif

3.3 writing (w)

3.3.1 introduction

This section describes question types appearing in the writing (W) division of the SAT.



3.3.2 identifying sentence errors

The following question is an example of the "identifying sentence errors" type of writing question.



sat_sample_09_w_e_s3q20.gif

3.3.3 improving sentences

The following question is an example of the "improving sentences" type of writing question.



sat_sample_10_w_s_s10q14.gif

3.3.4 improving paragraphs

The following question is an example of the "improving paragraphs" type of writing question.



sat_sample_11_w_p_s3q30.gif

3.3.5 essay

The following question is an example of the "essay" type of writing question.



sat_sample_12_w_essay_s1q1.gif

3.4 answers to the sample questions



math (m)
--------
number and operations answer: C
algebra and functions answer: 5/2 or 2.5
geometry and measurement answer: A
data analysis, statistics, and probability answer: C

critical reading (cr)
---------------------
passage-based reading answer: B
sentence completion answer: E

writing (w)
-----------
identifying sentence errors answer: A
improving sentences answer: D
improving paragraphs answer: C
essay answer: Yes



4. SAT structure

4.1 SAT question raw score points by format

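=================================================================
question format    raw points    raw points    raw points
                   if wrong      if omitted    if correct
=================================================================
5-choice           (-1/4)        0             +1
12700-choice       0             0             +1
essay              0             0             +2 ... +12
=================================================================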

4.2 SAT score structure by division

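===========================================================================
division                question        total        minimum      maximum
                        format          questions    raw score    raw score
===========================================================================
math (M)                5-choice        44           -11          +44
                        12700-choice    10           0            +10
---------------------------------------------------------------------------
critical reading (CR)   5-choice        67           -17          +67
---------------------------------------------------------------------------
writing (W)             5-choice        49           -12          +49
                        essay           1            0            +12
===========================================================================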

4.3 SAT question totals by format

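===================================
question format    total questions
===================================
5-choice           160
12700-choice       10
essay              1
===================================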

4.4 SAT question subjects by division

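=========================================================================================
division                question subjects                             questions   division
                                                                                  questions
=========================================================================================
math (M)                number and operations                         11 ... 13   54
                        algebra and functions                         19 ... 21
                        geometry and measurement                      14 ... 16
                        data analysis, statistics, and probability    6 ... 7
-----------------------------------------------------------------------------------------
critical reading (CR)   passage-based reading (48 in total):                      67
                          extended reasoning                          36 ... 40
                          literal comprehension                       4 ... 6
                          vocabulary in context                       4 ... 6
                        sentence completion                           19
-----------------------------------------------------------------------------------------
writing (W)             identifying sentence errors                   18          49
                        improving sentences                           25
                        improving paragraphs                          6
                        essay                                         1           1
=========================================================================================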

4.5 SAT chronological structure



(1) There are ten, independently-timed sections, with the following sequence of durations in minutes: {25,25,25,25,25,25,25,20,20,10}, for a total testing duration of 225 minutes (3 hours, 45 minutes).

(2) There is a five-minute break (leaving the room to go to the bathroom is allowed) after section #2, and a one-minute "stretching break" (leaving the room is not allowed) after section #4, and another five-minute break (leaving the room to go to the bathroom is allowed) after section #6.

(3) Section #1 is always the essay section of the Writing (W) division.

(4) Section #10 is always a 14-question section of the Writing (W) division.

(5) Sections #8 and #9 always include a 16-question section of the Math (M) division, and a 19-question section of the Critical Reading (CR) division, but in either of the two possible orderings.

(6) Sections {2,3,4,5,6,7} always include: two 24-question sections from the Critical Reading (CR) division, one 20-question section from the Math (M) division, one 18-question section from the Math (M) division, one 35-question section from the Writing (W) division, and one "variable" section that has the same format as one of the other sections in the set of these six sections.  The order of the section kinds is "random", and the identity of the "variable" section is intended to not be discovered while taking the test.

I took the SAT on 2005 March 12th.  The following was the chronology of my particular test day experience:

===========================================================================================
section   duration    division                total        comments
          (minutes)                           questions
===========================================================================================
1         25 min      Writing (W)             1 (essay)    essay is always first
2         25 min      Math (M)                18           8(5-choice);10(12700-choice)
(BREAK)   5 min       ----                    ----         long/bathroom break
3         25 min      Writing (W)             35           sent. errors, imp. paragraphs
4         25 min      Critical Reading (CR)   23           passages and sentence comp.
(BREAK)   1 min       ----                    ----         short/stretch break
5         25 min      Math (M)                20           ----
6         25 min      Critical Reading (CR)   25           long reading passage!
(BREAK)   5 min       ----                    ----         long/bathroom break
7         25 min      ***VARIABLE***          ????         ----
8         20 min      Math (M)                16           geometry; number and op.
9         20 min      Critical Reading (CR)   19           (1-6;7-19)
10        10 min      Writing (W)             14           always last; improve sentences
===========================================================================================




The book entitled "The Official SAT STUDY GUIDE: For the New SAT", published by the College Board, copyright 2004, has eight practice SATs.  Here are the chronologies of those eight practice SATs:

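==========================================================================
practice    section number
SAT index   #1    #2    #3    #4    #5    #6    #7    #8    #9    #10
==========================================================================
#1          WE    CR24  M20   VAR   CR24  M18   W35   CR19  M16   W14
#2          WE    CR24  M20   VAR   CR24  M18   W35   CR19  M16   W14
#3          WE    M20   CR24  M18   VAR   W35   CR24  M16   CR19  W14
#4          WE    M20   CR24  M18   VAR   W35   CR24  M16   CR19  W14
#5          WE    CR24  M18   W35   CR24  VAR   M20   CR19  M16   W14
#6          WE    CR24  M18   W35   CR24  VAR   M20   CR19  M16   W14
#7          WE    M18   W35   CR24  M20   CR24  VAR   M16   CR19  W14
#8          WE    M18   W35   CR24  M20   CR24  VAR   M16   CR19  W14
==========================================================================

(WE = Writing essay; VAR = "variable" section; otherwise, the division abbreviation is followed by the number of questions in the section.)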




The chronologies of these eight practice SATs only illustrate possible orders of the sections, given the constraints.  One should not try to form other conclusions based on these chronologies of the practice tests; this sample size is very small relative to the large number of possible chronologies, and the College Board has no incentive to describe any additional constraints they might use to form an acceptable chronology.  For example, although the variable section appears in sections {4,5,6,7} in the practice tests listed above, there is no basis for concluding that it is not just as likely that the variable section can appear in section #2 or section #3.  Also, there is no basis for concluding that sections within the same division won't ever appear in consecutive sections in the chronology.  For example, there might be a version of the SAT with two consecutive sections in the Math (M) division.

4.6 determining the "variable" section while taking the SAT

One of the sections {2,3,4,5,6,7} will be for research purposes only and will not be given a score.  The section used for research purposes is called the "variable" section.

Consider the non-variable sections that must appear in the set of sections {2,3,4,5,6,7}:

=======================================================
division              questions       total by division
=======================================================
Math (M)                 20              38
Math (M)                 18
-------------------------------------------------------
Critical Reading (CR)    24 (+/-1)       48
Critical Reading (CR)    24 (+/-1)
-------------------------------------------------------
Writing (W)              35              35
=======================================================

        

Therefore, the division with the variable section will be known as soon as one encounters:

(1) a third section in the Math (M) division;

(2) a third section in the Critical Reading (CR) division;

or, (3) a second section in the Writing (W) division.

This will happen while taking the SAT, and thus, one will know, before starting work on a section that matches one of the three cases, that there is a (1/3) chance (cases 1 and 2) or a (1/2) chance (case 3) that the current section is the variable section and will not be given a score.

This holds regardless of the sequence of sections.  Before a section reveals which division has the variable section, each of the sections {2,3,4,5,6,7} has only a (1/6) chance of being the variable section.  Once the division is revealed, the probability for every section outside that division, including all subsequent sections, becomes zero.

Furthermore, if a math section with 20 questions has been encountered, and then another math section with 20 questions is encountered, then one knows, before starting work on the second math section with 20 questions, that the variable section is in the math division, and also that there is a (1/2) chance that the current section is the variable section and will not be given a score. Similarly, encountering a math section with 18 questions, and later encountering another math section with 18 questions, leads to the same conclusions.  The probability of any subsequent section being the variable section becomes zero.
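The reasoning above can be sketched in a few lines of Python.  This is only an illustration, not anything the College Board publishes: the function name and the two example section orders are hypothetical, and the sketch assumes the test taker uses no clue other than the division of each section (it ignores the question-count refinement just described).

```python
from collections import Counter
from fractions import Fraction

# Scored sections per division within sections {2,...,7}, from the table above.
SCORED_QUOTA = {"M": 2, "CR": 2, "W": 1}

def reveal_variable_division(divisions):
    """Find the first section that proves which division has the variable section.

    `divisions` lists the division of sections #2 through #7, in order.
    Returns (section number, division, chance the current section is variable),
    or None if no section exceeds its division's quota.
    """
    seen = Counter()
    for section_number, division in enumerate(divisions, start=2):
        seen[division] += 1
        if seen[division] > SCORED_QUOTA[division]:
            # Assumption: every section seen so far in this division is
            # equally likely to be the variable one.
            return section_number, division, Fraction(1, seen[division])
    return None

# Hypothetical order: the third Math section does not appear until section #7.
print(reveal_variable_division(["CR", "M", "W", "CR", "M", "M"]))
# Hypothetical order: a second Writing section appears at section #4.
print(reveal_variable_division(["W", "M", "W", "CR", "CR", "M"]))
```

The first call reports section #7, division "M", and a (1/3) chance; the second reports section #4, division "W", and a (1/2) chance.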

Okay, now consider this information leak from the perspective of the authors of the SAT.  The goal of the authors of the SAT is to have test takers work on the variable section with the same concern and effort given to all other sections of the SAT, giving the authors of the SAT a method of linking performance on the particular version of the SAT to performances on versions administered throughout the long history of the SAT.  Therefore, the authors of the SAT want to minimize the chance that a person taking the test will determine that a particular section is the variable section.

For example, the authors of the SAT would probably avoid having sections #2 and #3 be math sections with exactly 20 questions each, because the test taker would know, at the very beginning of section #3, that either section #2 or section #3 must be the variable section.  Also, the test taker would know that all subsequent sections will be scored.  In this hypothetical situation, the test taker gets information about the variable section at the earliest possible time in the sequence of sections in the SAT.  The test taker can use this information in a few ways.  If the test taker is consciously or tacitly "taking advantage" of the general (1/6) chance that each section, of the sections {2,3,4,5,6,7}, is the variable section, putting in only (5/6) of maximum personal effort to avoid wasting a full effort on the variable section, then the test taker would change this conservative strategy after learning that the sections {4,5,6,7} will be graded, investing full effort into those sections. The test taker might also take advantage of the (1/2) chance that section #3 is the variable section, putting in less effort, or skipping the section entirely and instead resting or doing work on another section (against SAT rules; don't cheat!).

It is my guess that the authors of the SAT delay conclusive evidence of the division containing the variable section until section #7.  This minimizes any advantage to the test taker.  Section #7 itself need not be the variable section, but I believe that delaying conclusive evidence of the division having the variable section until section #7 is best for the test authors. (NOTE: On the 2005 March 12th administration of the SAT, and form code BWBA, section #7 itself happened to be the variable section.)

In conclusion, there is a way to be certain which *division* has the variable section while taking the SAT -- and this information might, at the very least, offer some psychological relief (closure, or SATisfaction of morbid curiosity) for a person taking the SAT.

5. calculating SAT scores

This section describes how to convert total numbers of correct and incorrect responses to scaled scores, for each of the three divisions of the SAT: mathematics (M), critical reading (CR), and writing (W).

5.1 math (m)

5.1.1 introduction

This section describes the procedure to compute the raw and scaled scores for the math (M) division of the SAT.

5.1.2 calculations



INPUTS
======

m_mc_correct  = number of math multiple-choice questions answered correctly;
                [an integer from 0 through 44]

m_mc_wrong    = number of math multiple-choice questions answered incorrectly;
                [an integer from 0 through 44]
                [Answers left blank are neither counted as correct nor counted as wrong.]

m_spr_correct = number of correct math "student-produced responses";
                [an integer from 0 through 10]

CALCULATIONS
============

m_correct        = ( m_mc_correct + m_spr_correct );
                   [an integer from 0 through 54]

m_wrong          = ( m_mc_wrong );
                   [an integer from 0 through 44]

m_raw_fractional = (rational) m_correct - ( (rational) m_wrong / (rational) 4 );
                   [a rational number from -11 through +54]

m_raw_score      = nearest( m_raw_fractional );
                   [an integer from -11 through +54]

m_scaled_score   = m_raw_to_scaled( m_raw_score );
                   [an integer, multiple of 10, from 200 through 800]
                   [The m_raw_to_scaled() function is shown below, as a graph.]
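The raw score calculation above can be sketched in Python.  This is a minimal sketch, not an official formula: the function name is my own, and it assumes that "nearest" rounds an exact half upward, which the text does not state.  The conversion to a scaled score is not included, because it is available here only as a graph.

```python
from fractions import Fraction
from math import floor

def raw_score(correct, wrong_multiple_choice):
    """Raw score: +1 per correct answer, -1/4 per wrong 5-choice answer.

    Omitted answers and wrong "student-produced responses" cost nothing,
    so they are simply not counted in either argument.
    """
    fractional = Fraction(correct) - Fraction(wrong_multiple_choice, 4)
    # Assumption: halves round upward (for example, 27.5 becomes 28).
    return floor(fractional + Fraction(1, 2))

# Math: 40 multiple-choice + 8 student-produced correct, 4 multiple-choice wrong.
print(raw_score(40 + 8, 4))   # 47

# Minimum raw scores for the three divisions (every 5-choice answer wrong).
print(raw_score(0, 44), raw_score(0, 67), raw_score(0, 49))   # -11 -17 -12
```

The three minimum values agree with the minimum raw scores listed for the math, critical reading, and writing divisions in the score structure table.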

5.1.3 graph

The following graph shows the conversion from a raw score (-11 through +54) to a scaled score (200 through 800, in multiples of 10) for the Math (M) division of the SAT.

sat_m_raw_to_scaled_graph.gif

5.2 critical reading (cr)

5.2.1 introduction

This section describes the procedure to compute the raw and scaled scores for the critical reading (CR) division of the SAT.

5.2.2 calculations



INPUTS
======

cr_correct = number of critical reading multiple-choice questions answered correctly;
             [an integer from 0 through 67]

cr_wrong   = number of critical reading multiple-choice questions answered incorrectly;
             [an integer from 0 through 67]
             [Answers left blank are neither counted as correct nor counted as wrong.]

CALCULATIONS
============

cr_raw_fractional = (rational) cr_correct - ( (rational) cr_wrong / (rational) 4 );
                    [a rational number from -(67/4) through +67]

cr_raw_score      = nearest( cr_raw_fractional );
                    [an integer from -17 through +67]

cr_scaled_score   = cr_raw_to_scaled( cr_raw_score );
                    [an integer, multiple of 10, from 200 through 800]
                    [The cr_raw_to_scaled() function is shown below, as a graph.]

5.2.3 graph

The following graph shows the conversion from a multiple-choice raw score (-17 through +67) to a scaled score (200 through 800, in multiples of 10) for the critical reading (CR) division of the SAT.

sat_cr_raw_to_scaled_graph.gif

5.3 writing (w)

5.3.1 introduction

This section describes the procedure to compute the raw and scaled scores for the writing (W) division of the SAT.

5.3.2 calculations



INPUTS
======

w_mc_correct  = number of writing multiple-choice questions answered correctly;
                [an integer from 0 through 49]

w_mc_wrong    = number of writing multiple-choice questions answered incorrectly;
                [an integer from 0 through 49]
                [Answers left blank are neither counted as correct nor counted as wrong.]

w_essay_score = Essay score;
                [an integer; zero, or, 2 through 12; { 0, 2..12 }]

CALCULATIONS
============

w_mc_raw_fractional = (rational) w_mc_correct - ( (rational) w_mc_wrong / (rational) 4 );
                      [a rational number from -(49/4) through +49]

w_mc_raw_score      = nearest( w_mc_raw_fractional );
                      [an integer from -12 through +49]

w_mc_scaled_score   = w_mc_raw_to_scaled( w_mc_raw_score );
                      [an integer, from 20 through 80]
                      [The w_mc_raw_to_scaled() function is shown below, as a graph.]

w_cs_scaled_score   = w_cs_raw_to_scaled( w_mc_raw_score, w_essay_score );
                      [an integer, multiple of 10, from 200 through 800]
                      [The w_cs_raw_to_scaled() function is shown below, as a graph.]

5.3.3 graphs

The following graph shows the conversion from a multiple-choice raw score (-12 through +49) to a scaled score (20 through 80) for the writing (W) division of the SAT.

sat_w_mc_raw_to_scaled_graph.gif

The following graph shows the conversion from a multiple-choice raw score (-12 through +49), and the essay raw score (0, +2 ...  +12), to a combined scaled score (200 through 800, in multiples of 10) for the Writing (W) division of the SAT.

sat_w_c_raw_to_scaled_graph.gif

Notice that the graph is a family of curves.  Thus, one finds the proper horizontal coordinate using the multiple-choice raw score, and then selects the proper curve using the essay score.  The point on that curve at the proper horizontal coordinate is the composite scaled score.

The graph is missing data for a small region of score combinations, marked by the question mark ("?") on the graph.  The College Board did not provide data for this region in the table in the Question and Answer Service (QAS) report.  I suppose the College Board doesn't think there will be many people who write competent essays (with a pair of scores adding to "6" or higher), and, at the same time, get a multiple-choice raw score less than "-2".  But, hey, it could happen.

6. SAT scaled score distributions from 2004

Although the following graphs pertain to an *old* SAT format, the distributions of scaled scores are likely to be maintained for the *new* SAT format.

The following graphs show the percentages of graduating seniors in 2004 whose scaled scores were within particular ranges, for the math and verbal divisions of the old SAT format (prior to the introduction of the new SAT on 2005 March 12). These distributions were designed by the College Board, and achieved by computing, and using, appropriate "raw score to scaled score" conversion curves.

sat_math_score_population_2004.gif

For the 2004 SAT testing year, the mean math score was 518, with a standard deviation of 114.

sat_verbal_score_population_2004.gif

For the 2004 SAT testing year, the mean verbal score was 508, with a standard deviation of 112.

7. analysis of "student-produced response" grid encodings

7.1 introduction

The following image shows the format of the "student-produced response" grid as it appears on the SAT answer sheet.  Various consequences of this response grid format are also shown.

sat_spr_intro.gif

7.2 fractions may always be avoided

It is always possible to encode a correct response value using a decimal format on the SAT.  Fractions may always be avoided.

There are encodable fractions whose exact decimal format encodings cannot fit in the space provided, but it is always acceptable to encode the values in a decimal format, as long as the decimal encoding is as precise as possible, given the limited space.  It is acceptable to "truncate" the decimal encoding, which involves simply stopping the writing of digits beyond the most-significant digits that fit in the space provided.  It is also acceptable to "round" the value, choosing the decimal encoding, among all decimal encodings that will fit in the space provided, that has a value that is closest to the exact value of the correct result.

The following image shows how a fractional value without a corresponding exact decimal format encoding may nonetheless be correctly encoded on the SAT in a decimal format after a process of either truncation or rounding.

sat_spr_trunc_and_round.gif
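The truncation and rounding procedures described in this section can be sketched in Python.  This is a rough illustration under my own assumptions: the function names are hypothetical, the sketch always writes values below 1 with a leading point (".666") and uses every available column, and it only handles positive values below 9999.5.

```python
from fractions import Fraction
from math import floor

GRID_COLUMNS = 4

def render(units, places):
    """Format `units` (a count of 10**-places steps) as grid text."""
    whole, frac = divmod(units, 10 ** places)
    text = str(whole) if whole else ""
    if places:
        text += "." + str(frac).zfill(places)
    return text

def decimal_encodings(value):
    """Return (truncated, rounded) decimal encodings that fit in four columns."""
    whole_digits = len(str(floor(value))) if value >= 1 else 0
    # One column is spent on the point; the rest hold digits.
    places = max(0, GRID_COLUMNS - 1 - whole_digits)
    truncated = render(floor(value * 10 ** places), places)
    rounded = render(floor(value * 10 ** places + Fraction(1, 2)), places)
    while len(rounded) > GRID_COLUMNS and places > 0:
        # Rounding carried into an extra digit, so give up one decimal place.
        places -= 1
        rounded = render(floor(value * 10 ** places + Fraction(1, 2)), places)
    return truncated, rounded

print(decimal_encodings(Fraction(2, 3)))    # ('.666', '.667')
print(decimal_encodings(Fraction(1, 11)))   # ('.090', '.091')
print(decimal_encodings(Fraction(8, 3)))    # ('2.66', '2.67')
print(decimal_encodings(Fraction(100, 3)))  # ('33.3', '33.3')
```

For 100/3 the truncated and rounded encodings coincide, which is the "truncated is closest" situation; for 2/3 they differ.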



7.3 all 22308 encodings

7.3.1 introduction

The symbols available in the four columns of the "student-produced response" grid imply a total of ((11)*(13)*(13)*(12)) = 22308 encodings.  This section describes various classifications of the encodings.

7.3.2 hierarchy of classifications for all 22308 encodings

The following chart shows a hierarchy of classifications of subsets of all possible encodings.

All                                                    22308
  Valid                                                17936
    Blank (No Response; Skipped; Omitted)                  1
    No Fraction (implies Encoded Value is Exact)       15568
      No Decimal (implies Integer Value)               11229
      With Decimal                                      4339
        Integer Value                                   1243
        Non-integer Value                               3096
    With Fraction                                       2367
      Integer Value                                      474
      Non-integer Value                                 1893
        With Exact Encoding                              546
        Without Exact Encoding                          1347
          Truncated is Closest                           718
          Truncated is not Closest                       629
  Invalid                                               4372
    Syntax Error                                        4199
    Undefined Value                                      173
      Division-by-zero                                   171
      Zero-over-zero                                       2
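The top levels of this hierarchy can be reproduced by enumerating every encoding, as in the following Python sketch.  The grammar in the sketch is my reconstruction, not a published specification: it assumes the first column offers blank, point, and the digits 1 through 9; the two middle columns offer blank, slash, point, and all ten digits; and the last column offers blank, point, and all ten digits.  It also assumes that blanks are allowed only at the outside of a number (or next to the slash), that a number has at most one point and at least one digit, and that at most one slash appears.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

DIGITS = "0123456789"
# Assumed column alphabets (" " is a blank), chosen to give 11 * 13 * 13 * 12.
COLUMNS = [" ." + DIGITS[1:], " /." + DIGITS, " /." + DIGITS, " ." + DIGITS]

def parse_number(text):
    """Return the exact value of a digits-and-point token, or None if malformed."""
    token = text.strip(" ")
    whole, _, frac = token.partition(".")
    if not (whole + frac).isdigit():
        return None  # empty, internal blank, or more than one point
    return Fraction(int(whole + frac), 10 ** len(frac))

def classify(encoding):
    """Classify one four-character encoding."""
    if encoding.strip(" ") == "":
        return "blank"
    parts = encoding.split("/")
    values = [parse_number(part) for part in parts]
    if len(parts) > 2 or None in values:
        return "syntax error"
    if len(values) == 2 and values[1] == 0:
        return "zero-over-zero" if values[0] == 0 else "division-by-zero"
    return "valid (non-blank)"

counts = Counter(classify("".join(cells)) for cells in product(*COLUMNS))
for label, total in sorted(counts.items()):
    print(label, total)
print("all", sum(counts.values()))
```

Under these assumptions the sketch counts 1 blank encoding, 17935 other valid encodings, 4199 syntax errors, 171 division-by-zero cases, and 2 zero-over-zero cases, for a total of 22308.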



7.3.3 blank

If the entire response is blank, the response is considered "omitted"; i.e., the test-taker decided not to respond to the corresponding question.  The response is given a score of zero points.  (An incorrect "student-produced response" is also given a score of zero points.)



Blank response: ( blank, blank, blank, blank ) "    "

7.3.4 invalid syntax

Encoding the values of numbers requires a grammar, or syntax, so that a reader can unambiguously interpret the encoding as a specific numeric value.

The existence of a grammar implies the existence of encodings that are inconsistent with that grammar.  [This generalization is not true for the all-inclusive, non-constraining, trivial "non-grammar grammar", often exploited in advertising, online chat, and spam messages.]

The following encodings are examples of encodings that violate the implicit grammar of the "student-produced responses grid".



Examples of invalid syntax:

Involving only punctuation and blanks:
  ( blank, blank, blank, point )   "   ."
  ( point, point, point, point )   "...."
  ( blank, slash, blank, blank )   " /  "
  ( blank, slash, slash, blank )   " // "
  ( point, slash, slash, point )   ".//."
  ( blank, slash, point, blank )   " /. "
      --> News for nerds: slash-dot will never be a correct response on the SAT!

Involving only digits and blanks:
  ( blank, 1, blank, 2 )           " 1 2"
  ( 1, blank, blank, 2 )           "1  2"
  ( 1, 2, blank, 3 )               "12 3"

Involving only digits and points:
  ( 1, point, point, 2 )           "1..2"
  ( 1, point, 2, point )           "1.2."
  ( point, 1, 2, point )           ".12."

Involving only digits, slashes, and blanks:
  ( blank, slash, 2, blank )       " /2 "
  ( blank, slash, blank, 2 )       " / 2"
  ( blank, 1, slash, blank )       " 1/ "

Involving only digits, slashes, points, and blanks:
  ( point, slash, blank, 2 )       "./ 2"
  ( blank, slash, 2, point )       " /2."

7.3.5 division-by-zero error

Some encodings can be interpreted as requests to execute a procedure that has an ambiguous or undefined result.  Some encodings call for a value that results from dividing a number by zero.  There is no way, in commonplace mathematics, to assign a value to such an encoding without leading to some contradiction of mathematical axioms.

I did wonder what would happen if I wrote " 1/0" as a "student-produced response" on the SAT.  Would I crash the SAT-grading computer application?  I wanted to try this out, but I didn't want to jeopardize getting my score report.



Example of the division-by-zero error: ( blank, 1, slash, 0 ) " 1/0"

7.3.6 zero-over-zero error

Zero over zero is even more of a mathematical horror than division by zero. Zero over zero is at the impossible crossroads of zero, the finite, and infinity!



Example of the zero-over-zero error: ( blank, 0, slash, 0 ) " 0/0"

7.3.7 problem cases

Some encodings unambiguously specify numeric values, but might nonetheless be rejected by the computer software grading the SAT due to unconventional features.  In particular, the SAT grading software may not expect or allow for redundant or superfluous characteristics in an encoding.

The following encodings are unconventional, but unambiguous.  It seems reasonable to expect that the SAT grading software would interpret such encodings in a way that yields the numeric value intended by the human who did the encoding, but perhaps these encodings would be rejected.



Examples of unconventional encodings that are *probably* accepted:

Zero:
  ( blank, 0, 0, 0 )           " 000"
  ( blank, 0, 0, blank )       " 00 "
  ( point, 0, 0, 0 )           ".000"
  ( blank, point, 0, 0 )       " .00"
  ( blank, 0, point, 0 )       " 0.0"
  ( blank, 0, 0, point )       " 00."
  ( blank, 0, slash, 1 )       " 0/1"
  ( point, 0, slash, 1 )       ".0/1"
  ( blank, 0, slash, 9 )       " 0/9"

One:
  ( blank, 0, 0, 1 )           " 001"
  ( blank, 0, 1, blank )       " 01 "
  ( blank, 1, point, 0 )       " 1.0"
  ( blank, 0, 1, point )       " 01."
  ( 1, slash, 0, 1 )           "1/01"
  ( 1, slash, 1, point )       "1/1."
  ( 1, slash, blank, 1 )       "1/ 1"

7.3.8 various statistics

The following table contains various statistics relating to the 22308 encodings.



===========================================================================
Various subsets of the 22308 encodings:
===========================================================================

  Total integer encodings = (11229 + 1243 + 474) = 12946
    (Note that there are only 10000 distinct integer values.)

  Total non-integer encodings = (3096 + 1893) = 4989
    (Note that there are only 2700 non-integer values in a
    non-fraction encoding format.)

    (Note that total integer encodings, plus total non-integer encodings,
    plus the one blank encoding, adds to 17936, the total number of valid
    encodings.)

---------------------------------------------------------------------------

  Total distinct values that can be encoded (where values of fractions
  are exact) = 13526
    (Note that it is only necessary to use one of 12700 values to encode
    any response correctly (in a non-fraction format).  Therefore, there
    are (13526 - 12700) = 826 distinct encodable values (via fractions)
    that are not exactly encodable in non-fraction format.)

---------------------------------------------------------------------------

  Total number of encodings with fractions whose values cannot be encoded
  exactly in a non-fraction format in the available space = 1347
    (Note that these particular 1347 encodings with fractions only include
    826 distinct values.  For example, "1/11" and "2/22" have equal values.)

---------------------------------------------------------------------------

  Summary of the 12700 distinct values sufficient to respond correctly to
  any question requiring a "student-produced response":

  ----------------------------------------------------------------
  Sub-Range        Total Values   Total Integers   Total Non-Integers
  ----------------------------------------------------------------
  0                      1               1                  0
  .001 --> .999        999               0                999
  1.00 --> 9.99        900               9                891
  10.0 --> 99.9        900              90                810
  100. --> 999.        900             900                  0
  1000 --> 9999       9000            9000                  0
  ----------------------------------------------------------------
  0    --> 9999      12700           10000               2700
  ----------------------------------------------------------------

===========================================================================

7.3.9 file with all 22308 encodings

The following file contains a complete list of all 22308 encodings, along with the truncated values, truncation error values, closest decimal values, closest decimal value errors, and full classifications.

7.4 equivalent encodings

There are 22308 encodings, of which 17936 are acceptable. The test-taker need only consider 12700 distinct numerical values, because it is always acceptable to avoid the use of encodings with fractions (digits separated by exactly one slash).  Clearly, some of the 12700 encodable values must have more than one equivalent encoding.  The average, (17936 encodings/12700 values) = 1.412 encodings per value, is informative, but the actual distribution of the number of encodings per value is very uneven.

For example, there are 77 ways to encode the value "1" exactly:



(NOTE: "_" indicates a blank space in the response.)

___1  __1_  _1__  1___  __1.  _1._  1.__
_1.0  1.0_  1.00  __01  _01_  _001  _01.
_1/1  1_/1  1/_1  1/01  1/1_  1/1.  1./1
_2/2  2_/2  2/_2  2/02  2/2_  2/2.  2./2
_3/3  3_/3  3/_3  3/03  3/3_  3/3.  3./3
_4/4  4_/4  4/_4  4/04  4/4_  4/4.  4./4
_5/5  5_/5  5/_5  5/05  5/5_  5/5.  5./5
_6/6  6_/6  6/_6  6/06  6/6_  6/6.  6./6
_7/7  7_/7  7/_7  7/07  7/7_  7/7.  7./7
_8/8  8_/8  8/_8  8/08  8/8_  8/8.  8./8
_9/9  9_/9  9/_9  9/09  9/9_  9/9.  9./9

The challenge of finding all possible encodings for a specified value reminds me of the obscure game "EQUATIONS: The Game of Creative Mathematics", part of the "WFF 'N PROOF"(*) family of "well-formed formula" (wff) logic games developed by Dr.  Layman E.  Allen (currently a professor at Yale Law School).  In the EQUATIONS game, the challenge is to use numbers (e.g., "1", "2", "3", "4") and operators (e.g., "-" (subtract), "-" (subtract), "*" (multiply), "/" (divide), "/" (divide)), to form as many expressions as possible to yield a goal value (e.g., "6").

"WFF 'N PROOF" [www.wff-n-proof.com]



The high-intelligence group called the "Mega Society", which requires each member to have an IQ of at least 176, has a journal entitled "Noesis" (Greek: "understanding"; "to perceive"), which featured, in "Issue 169", the following text (in a section entitled "Part 2 - 'EQUATIONS' - continued"):

Dr.  Allen mentioned in a telephone conversation that NASA rocket scientists were unable to get all twelve solutions to Elementary problem E1.  Test yourself!  The youngsters who have mastered Equations, however, plow through problems like these quickly and decisively.  The problem with our normal mathematics education, Dr.  Allen explained, is we are taught to [...] reason abstractly.  The Equations game teaches the user to [...] to reason in reverse [...] conclusion [...] produce in, say, 15 minutes.  It [...] problem E1 we are only using two [...] and the numbers 1 through 4 (one each).  If you get all 12 correct, the online GUI-based system will let you know with a blinking [...]  For those readers that send me their 12 correct, [...] will send them an additional wallet-size Mega Society card with their next issue of Noesis.

Puzzle "E1" [http://cgi.wff-n-proof.com/msq-ind/I-1E.htm]



I think it would be funny to have standardized admission tests for religions, like Christianity, so that the next time I am accosted by someone proselytizing their faith, I can ask to see credentials (like a religious ID card).  Take a look at the quizzes at the "Landover Baptist Church" web site -- such as "The Bible Poop Quiz", "The Bible Logic Quiz", and the "God vs.  Allah Quiz".

"Landover Baptist Church" quiz list [www.landoverbaptist.org]



The Weekly World News, a tabloid, had a cover in 2005 June that had a drawing of Moses on Sinai mountain holding two stone tablets, and the associated headline was: "Ten More Commandments Found!  And you won't believe what they say!"

There are { 34, 77, 48, 41, 36, 33, 30, 29, 30, 29, 25 } ways to exactly encode the values { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }, respectively.
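These counts can be checked by brute force, as in the following Python sketch.  The grammar is my own reconstruction rather than anything published by the College Board: the column alphabets are assumed to be (blank, point, 1-9), (blank, slash, point, 0-9) twice, and (blank, point, 0-9); a number may have blanks only at its outside, at most one point, and at least one digit; and a fraction is two such numbers around a single slash, with a non-zero denominator.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

DIGITS = "0123456789"
# Assumed symbols available in each of the four columns (" " is a blank).
COLUMNS = [" ." + DIGITS[1:], " /." + DIGITS, " /." + DIGITS, " ." + DIGITS]

def number(text):
    """Exact value of a digits-and-point token, or None if it is malformed."""
    whole, _, frac = text.strip(" ").partition(".")
    if not (whole + frac).isdigit():
        return None
    return Fraction(int(whole + frac), 10 ** len(frac))

def value_of(encoding):
    """Exact value of a four-character encoding, or None if it has no value."""
    parts = [number(part) for part in encoding.split("/")]
    if len(parts) > 2 or None in parts:
        return None          # blank response or syntax error
    if len(parts) == 1:
        return parts[0]
    return parts[0] / parts[1] if parts[1] else None   # None: undefined value

aliases = Counter()
for cells in product(*COLUMNS):
    value = value_of("".join(cells))
    if value is not None:
        aliases[value] += 1

print([aliases[Fraction(n)] for n in range(11)])
# [34, 77, 48, 41, 36, 33, 30, 29, 30, 29, 25]
```

Here `sum(aliases.values())` is the number of valid, non-blank encodings (17935), and `len(aliases)` is the number of distinct encodable values under this reconstructed grammar, which can be compared with the total of 13526 reported earlier.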

The following graph shows the number of equivalent encodings as a function of the value to be encoded.  Notice the data point on the graph for the value "1", indicating 77 encodings, just as explicitly enumerated above.

sat_spr_alias_count_graph.gif

It is a beautiful pattern, indicating the greatest degeneracy of encodings centered about the value "1", and a decrease of both degeneracy and resolution for values requiring more precision (digits).  For each integer value from 1000 through 9999, there is obviously only one encoding that will fit in the space provided. For each integer value from 100 through 999, there are three equivalent encodings: "_###", "###_", "###."; where "_" represents a blank, and "#" represents a digit.  For each of the smallest, non-zero values, ".001" through ".009", there is only one encoding.

Of the 17935 valid, non-blank encodings, only 1347 lack an exact decimal encoding.  These 1347 encodings (which have only 826 distinct exact values) are the only encodings that lead to non-zero truncation (or rounding) errors when encoded instead using one of the 12700 non-fractional values (particularly, the subset of 2700 non-integer values).

Thus, if one insists on encoding one of the 1347 encodings mentioned above in a non-fraction encoding instead, then one must choose to truncate or round (to the nearest encodable value).  Truncation means chopping off less-significant digits, and so the error that results is always negative ( Observed = (Exact + Error); thus, Error = (Observed - Exact) ).
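As a small worked example of the error convention, consider the value 2/3, which has no exact decimal encoding.  The following Python lines compute both errors exactly; the choice of 2/3 is mine, for illustration only.

```python
from fractions import Fraction

exact = Fraction(2, 3)
truncated = Fraction("0.666")   # digits simply chopped off
closest = Fraction("0.667")     # nearest value that fits in the grid

# Error = (Observed - Exact)
truncation_error = truncated - exact
closest_error = closest - exact

print(truncation_error)   # -1/1500
print(closest_error)      # 1/3000
```

The truncation error is negative, as expected, and the rounded value is twice as close to the exact value as the truncated one, so 2/3 is a case where "truncated is not closest".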

7.4.1 truncation and closest errors

The following table contains various facts relating to the truncation and closest error values.

===============================================================================
About Error Values
===============================================================================

  Observed = (Actual + Error);   ==>   Error = (Observed - Actual);

  In this context, "Observed" corresponds to the value one interprets from
the encoding.

  In this context, "Actual" corresponds to the exact value one desires to
represent by an encoding.

  The only circumstance which leads to non-zero error is when one desires to
encode a value corresponding to an exactly encodable fraction but intentionally
avoids the use of fractions (which is always acceptable) and instead encodes
the value in a decimal format.   In this circumstance, one is forced to choose
either a TRUNCATED decimal value or CLOSEST decimal value to encode.   In this
circumstance, the truncated value (Observed) is alwa