The data for this analysis is a set of crossword clues and answers from New York Times crossword puzzles from 1996-2012, courtesy of Michael Donohoe:
Crosswordiness is calculated as a function of both how often a word appears as a crossword answer and how often it appears elsewhere. The "elsewhere" is the word's Google Books Ngram percentage from the same period, 1996-2012. For data quality reasons, crosswordiness is calculated only for recognized dictionary words (see Flaws below). Sorry, ARLO. The crosswordiness percentile of any answer word can be looked up from the Details page.
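The exact function is not spelled out here, so the following is only a minimal sketch of one plausible version: the word's share of all crossword answers divided by its Ngram share, followed by a percentile ranking. The ratio itself, the function names, and the sample numbers are all assumptions made up for illustration, not the real data or the actual calculation.

```python
from bisect import bisect_right


def crosswordiness(answer_counts, ngram_share):
    """Score each word as (share of crossword answers) / (share of book text).

    The ratio is an assumed scoring function, chosen for illustration only.
    """
    total = sum(answer_counts.values())
    scores = {}
    for word, count in answer_counts.items():
        share = ngram_share.get(word)
        if not share:  # no usable n-gram figure, so no score
            continue
        scores[word] = (count / total) / share
    return scores


def percentiles(scores):
    """Map each word to the percentage of scored words it ties or beats."""
    ranked = sorted(scores.values())
    return {
        word: 100.0 * bisect_right(ranked, score) / len(ranked)
        for word, score in scores.items()
    }


# Hypothetical counts and n-gram shares, invented for this example.
answer_counts = {"OLEO": 30, "APSE": 20, "TIME": 50}
ngram_share = {"OLEO": 1e-7, "APSE": 2e-7, "TIME": 1e-3}

scores = crosswordiness(answer_counts, ngram_share)
ranks = percentiles(scores)
```

With these invented numbers, TIME is the most frequent answer but also very common in books, so it ranks below OLEO and APSE, which is the behavior the measure is meant to capture.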
To calculate how strongly a given word in a clue points to a particular answer, I use term frequency-inverse document frequency (TF-IDF). This measures not just how often a word appears in the clues for a particular answer, but how often it appears in those clues relative to all clues. Chopping crossword clues into individual words has a significant margin of error (see Flaws below).
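As a rough sketch of the idea, the snippet below treats all the clues for one answer as a single "document" and scores each clue word with a textbook TF-IDF formula. The tokenizer, the choice of document, the function names, and the sample clues are assumptions for illustration; the actual analysis may differ in any of them.

```python
import math
import re
from collections import Counter


def tokenize(clue):
    """Split a clue into lowercase words; deliberately crude."""
    return re.findall(r"[a-z']+", clue.lower())


def top_clue_words(clues_by_answer, answer, k=3):
    """Return the k clue words with the highest TF-IDF score for an answer."""
    # One bag of words per answer, pooled over all of that answer's clues.
    docs = {
        ans: Counter(word for clue in clues for word in tokenize(clue))
        for ans, clues in clues_by_answer.items()
    }
    # Document frequency: in how many answers' clues does each word occur?
    doc_freq = Counter()
    for counts in docs.values():
        doc_freq.update(counts.keys())

    counts = docs[answer]
    total = sum(counts.values())
    scores = {
        word: (count / total) * math.log(len(docs) / doc_freq[word])
        for word, count in counts.items()
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


# Hypothetical clues, invented for this example.
clues = {
    "OLEO": ["Butter substitute", "Margarine", "Butter alternative"],
    "APSE": ["Church recess", "Part of a church"],
    "EWER": ["Water pitcher", "Part of a wash set"],
}

oleo_words = top_clue_words(clues, "OLEO")
apse_words = top_clue_words(clues, "APSE")
```

In this toy data, "part", "of", and "a" show up in clues for more than one answer, so the inverse document frequency term pushes them below words like "church" that appear only in clues for APSE.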
To see the three clue words that most strongly point to a given answer, check out the Details page. Examples: OLEO, APSE, EWER