Educating Children to Learn Out-of-Sample

March 20, 2017 - 5 minute read
ml data policy

Machine learning algorithms, supervised or not, struggle when trained on insufficient data or the wrong features. These methods still require expert guidance, appropriate feedback, and testing of the model on edge cases before you can expect reasonable out-of-sample validity. We can apply this same framework to our children to develop stronger critical thinking skills: give them more diverse, interesting and useful training data, appropriate expert guidance to prevent overfitting, and extensive validation of their mental models on out-of-sample experiences.

Machine and Human Learning

In supervised machine learning, we provide labeled training data with both correct and incorrect results for the model to learn from, and then apply that model to out-of-sample data. Our education systems fail at both of these steps; instead, we provide only true statements and then ask these young minds to recognize those items among a larger out-of-sample data set.

This leaves young minds unable to reason about new information. The positive-labels-only approach does not work for computers (simple minds) and fails to teach or measure critical thinking with our students (sophisticated minds).
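A toy sketch of why positive labels alone teach so little. It assumes the simplest possible learner, one that predicts whatever label it saw most often; the function names and the data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def train_majority(labels):
    """Return the label seen most often in training (a deliberately naive learner)."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(prediction, labels):
    """Fraction of labels matched when the same answer is given every time."""
    return sum(1 for label in labels if label == prediction) / len(labels)


# Hypothetical data: training contains only true statements,
# while the real world mixes true and false ones.
positive_only = [True] * 20
mixed_world = [True] * 10 + [False] * 10

model = train_majority(positive_only)          # learns to answer "True" to everything
in_sample = accuracy(model, positive_only)     # 1.0
out_of_sample = accuracy(model, mixed_world)   # 0.5, no better than a coin flip
```

The learner looks perfect on the material it was given and does no better than guessing once false statements appear, because it never saw a counterexample to learn from.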

Critical Thinking is the Most Valuable Skill

In lifelong learning, as in machine learning, we look for success on out-of-sample problems: is the next generation equipped to solve problems we could not? Do they have the skills and resources to tackle issues we have not anticipated, prepared for, or experienced previously? Critical thinking, given rapid changes in necessary skills and experience and increasingly powerful automation, is your most valuable characteristic.

Our current educational and motivational institutions fail to teach and reward critical thinking.

Societal Costs

Failure to prepare and validate each student’s mental model encourages low-aspiration education that poorly equips the next generation for making decisions. The well-off can easily overfit to the established metrics through preparation classes, test-taking training and additional rote practice. Compounding the advantage, the well-off get critical thinking interactions outside the classroom from involved parents, family, mentors and varied interactive experiences.

As a sad example, “kindergarten readiness” rubrics show low aspirations and little emphasis on out-of-sample performance.

sources: Leap Frog: Kindergarten Skills Checklist & TeachingMama: Kindergarten Readiness Checklist (pdf)

These lists emphasize:

  • Obedience to the schedule and tasks
  • Memorizing and regurgitating letters, numbers, colors, songs, plant/animal facts
  • Having typical motor and social skills commensurate with age
  • Remixing basic patterns and relative comparisons (Leap Frog calls this “reasoning”)

MARGinal Education

In-sample testing rewards overfitting to standardized measurement. The quiz below, which is used at many preschools to test “letter knowledge”, is a classic example of the key failures highlighted in the mnemonic MARGinal: Memorize, Accept, Remix and reGurgitate.

Letter Quiz

Strict memorization and repetition of in-sample alphabet letters is sufficient to pass this quiz. The child is asked to recognize previously seen input and reply with the memorized, rules-based label that was provided. You would not successfully teach a machine effective optical character recognition in this way. You would not maximize the future potential of the human mind with such narrow-minded efforts.
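To make the comparison with a machine concrete, here is a minimal sketch of a learner that only memorizes. It is an assumption-laden toy, not real character recognition: the "glyphs" are plain strings, lowercase letters stand in for an unfamiliar rendering of the same letter, and all names are hypothetical.

```python
def train_memorizer(examples):
    """Store every (glyph, label) pair exactly as it was shown."""
    return dict(examples)


def predict(model, glyph):
    """Answer from memory; anything never seen before gets '?'."""
    return model.get(glyph, "?")


def score(model, examples):
    """Fraction of examples labeled correctly."""
    correct = sum(1 for glyph, label in examples if predict(model, glyph) == label)
    return correct / len(examples)


# Hypothetical quiz material: the exact glyphs shown in class.
training = [("A", "letter A"), ("B", "letter B"), ("C", "letter C")]
# The same letters in a form the learner has not seen (a stand-in for a new typeface).
unseen = [("a", "letter A"), ("b", "letter B"), ("c", "letter C")]

model = train_memorizer(training)
in_sample_score = score(model, training)     # 1.0
out_of_sample_score = score(model, unseen)   # 0.0
```

A perfect score on the quiz says nothing about whether the idea of the letter was learned; the memorizer passes in-sample and fails completely on anything new.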


Through extensive repetition and inconsequential details (such as remembering a specific date in history), we require our children to memorize details rather than process the implications and validate the mental model underneath. Does filling out 40 addition math problems demonstrate mastery of the relationship between real numbers? Does recounting the events in the Middle East in 1967 give you the right context(1)?


Students are given poor training data, limited up front to only those features and aspects that will be tested later, with coverage and emphasis presented rather than discovered. Worse, we skew feature selection to the test, or test in detail on clearly less-critical items. Precise measurement of exact details rewards overfitting to the training data.

What items were taxed by the Townshend Acts of 1767?(2)

What letter is this: A(3)

Accept-and-memorize denies children exploration, discovery and investigation into what the best thinking is. It means messy data is not encountered in practice, leaving them poorly prepared for pattern recognition and equipped to handle only the provided training set.


Having memorized the accepted facts, we reward continued overfitting to the training set through homework, assignments and tests that merely expect a light remixing into incremental forms: nothing new is built or created. The students are not challenged to add ideas and concepts of their own origination.


The lowest level of education and learning is regurgitation. Just like the letter quiz, the training data is again tested for exact recognition. There is no application of the concepts to parallel, unfamiliar, out-of-sample situations. There is little commingling of the concepts with other, untrained data sources that are potentially unconnected. Sometimes a concept is foundational for the next in-sample memorization, but lacks the cross-mixing of ideas and concepts needed for multiplicative critical thinking.

A Better Letter Quiz

The letter quiz is easy to improve by introducing variable, out-of-sample data and adding context.

  1. Add symbols that are not English letters – does the child guess, pick a similar letter, or admit not knowing? Add á, ñ, ☃, ש, Б and a seated pharaoh
  2. Repeat letters to reduce process-of-elimination and reliance on in-sample-quiz taking skills
  3. Put the letters into words and have the child identify the first, last, or any letter. This adds and alters context. Does the child understand the letter is the same, even when the sound may be different? Does the child recognize the b in bee, bologna and comb?
  4. Have the student attempt to pronounce phonetic (easiest), not phonetic (okay to be wrong–does she/he have a good mental model?) and even nonsense words (fun–why not yell it?) BLERP!
  5. Vary the typeface so letters are not presented the exact same way every time, both by mixing typefaces in the training material and by introducing slightly different ones on the quiz.
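The first two items in the list above can be sketched as a small quiz generator. This is one possible reading, not a tested classroom tool: the function name, the quiz sizes and the expected answer text are assumptions, and the distractor symbols are taken from item 1.

```python
import random
import string

# Symbols from item 1 that are not English letters.
DISTRACTORS = ["á", "ñ", "☃", "ש", "Б"]
NOT_A_LETTER = "not an English letter"


def build_quiz(num_letters=8, num_distractors=3, seed=0):
    """Return a shuffled list of (prompt, expected_answer) pairs."""
    rng = random.Random(seed)
    # Sample WITH replacement so a letter may repeat (item 2),
    # which weakens process-of-elimination guessing.
    letters = rng.choices(string.ascii_uppercase, k=num_letters)
    # Mix in out-of-sample symbols (item 1); the right answer is to say so.
    symbols = rng.sample(DISTRACTORS, k=num_distractors)
    items = [(ch, ch) for ch in letters]
    items += [(sym, NOT_A_LETTER) for sym in symbols]
    rng.shuffle(items)
    return items


quiz = build_quiz()
for prompt, expected in quiz:
    print(f"What letter is this: {prompt}  (expected: {expected})")
```

The seed is fixed only so the same quiz can be regenerated; changing it gives a different mix. Context and typeface (items 3 to 5) need words, sound and rendering, so they are left out of this text-only sketch.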


(1) Several events led up to the Six-Day War of 1967

(2) For the curious who like useless facts, see Townshend Acts: Raising Revenue

(3) This is the Latin Alphabet A