Stamped, Standardized, Sorted
A busy mail room sorts parcels with KNN using two features: size
in centimeters and mass
in grams. But grams overwhelm centimeters, so KNN won’t judge fairly unless all features are put on the same footing. The sorter decides to standardize (with z-scores) before letting neighbors vote.
You are asked to read some preexisting data, standardize the values, then classify new parcels with k-NN. For each feature column, subtract the column mean and divide by the column standard deviation taken from the preexisting data. If a column is flat - zero standard deviation - its z-score is 0.0 for every row. After scaling, use Euclidean distance to find the k nearest preexisting rows to each new row and predict the majority label among those neighbors. If several labels tie, print the alphabetically smallest one.
The first line of the input contains three integers n d k
representing how many rows of preexisting data there are, how many features each row has, and the number of neighbors. Each of the next n
lines contains d
floating-point numbers followed by a label (with no spaces).
The next line contains a single integer q
representing how many new rows to classify. Each of the next q
lines contains d
floating-point numbers to classify using the z-score statistics computed from the preexisting data.
The program should print q
lines. Each line should contain the predicted label for the corresponding new row.
Input | Output |
---|---|
6 2 3 | LETTER |
9 2 3 | LETTER |
4 2 4 | BOX |
Constraints
Time limit: 2 seconds
Memory limit: 512 MB
Output limit: 1 MB