==== Experiment Setup ====
Initially, 13 object surfaces were recorded using the fingertip sensor. These became the classes to be recognized by an SVM (Support Vector Machine). The recording time varied per object, depending mainly on the size and heterogeneity of its surface. The list below shows the recorded objects and how many frames were taken for each of them:
{{ :projects:objects_surface_recognition.jpg?nolink&300|Recorded objects}}
^ # ^ Object ^ Total frames ^ Training frames ^ Testing frames ^
| 1 | table | 200 | 160 | 40 |
| 2 | wood | 200 | 160 | 40 |
| 3 | air | 200 | 160 | 40 |
| 4 | shampoo | 400 | 320 | 80 |
| 5 | lego | 200 | 160 | 40 |
| 6 | cup | 600 | 480 | 120 |
| 7 | tomatosoup | 600 | 480 | 120 |
| 8 | breadboard | 200 | 160 | 40 |
| 9 | pancake_label | 400 | 320 | 80 |
| 10 | pancake_side | 400 | 320 | 80 |
| 11 | yogurt | 600 | 480 | 120 |
| 12 | ketchup | 600 | 480 | 120 |
| 13 | icedtea | 600 | 480 | 120 |
When running the classification experiments, 80% of the frames of each class were used for training and the remaining 20% for testing. The assignment of frames to the training or testing set was random.
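The per-class 80/20 split could be sketched as follows. This is only an illustration: the names ''Split'' and ''split_frames'' are hypothetical, and the use of ''std::mt19937'' is an assumption, so seeding it with the seed shown in the logs below will not reproduce the original split.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Frame indices of one class, divided into a training and a testing part.
// Hypothetical type, not taken from the project sources.
struct Split {
    std::vector<std::size_t> train;
    std::vector<std::size_t> test;
};

// Shuffle the frame indices of one class and assign the first
// `train_fraction` of them to training, the rest to testing.
// The generator is passed in so that a fixed seed gives a repeatable split.
Split split_frames(std::size_t frame_count, double train_fraction, std::mt19937& rng) {
    std::vector<std::size_t> indices(frame_count);
    std::iota(indices.begin(), indices.end(), std::size_t{0});  // 0, 1, ..., n-1
    std::shuffle(indices.begin(), indices.end(), rng);

    // Round to the nearest whole frame and never exceed the frame count.
    auto train_count = static_cast<std::size_t>(frame_count * train_fraction + 0.5);
    train_count = std::min(train_count, frame_count);
    const auto cut = indices.begin() + static_cast<std::ptrdiff_t>(train_count);

    Split split;
    split.train.assign(indices.begin(), cut);
    split.test.assign(cut, indices.end());
    return split;
}
```

Calling ''split_frames(200, 0.8, rng)'' yields 160 training and 40 testing indices, matching the rows with 200 frames in the table above.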
==== First Experiments ====
At first, the SVM was trained on the frames themselves (a 900-element array) plus the normalized Frame Period and Shutter parameters from the sensor, giving a feature array of length 902. Several SVM configurations were tested (RBF and LINEAR kernels; C_SVC and NU_SVC svm_types; various cost, gamma and nu values). The best results were achieved with the LINEAR kernel, the C_SVC svm_type and a cost of 4.0; gamma and nu have no effect in this setup. The Frame Period and Shutter were normalized over their whole range:
Frame Period normalized = (Frame Period - 4000) / 21000
Shutter normalized = Shutter / 20000
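As a minimal sketch, the 902-element feature array could be assembled as shown below. The function names are hypothetical, and passing the 900 pixel values through unchanged is an assumption, since any pixel scaling is not described here.

```cpp
#include <cstddef>
#include <vector>

// Frame Period normalized = (Frame Period - 4000) / 21000
double normalize_frame_period(double frame_period) {
    return (frame_period - 4000.0) / 21000.0;
}

// Shutter normalized = Shutter / 20000
double normalize_shutter(double shutter) {
    return shutter / 20000.0;
}

// Append the two normalized sensor parameters to the pixel values.
// With 900 pixels this gives the 902-element array used for training.
// Assumption: the pixel values are used as they are, without rescaling.
std::vector<double> build_feature_vector(const std::vector<double>& pixels,
                                         double frame_period,
                                         double shutter) {
    std::vector<double> features(pixels);
    features.push_back(normalize_frame_period(frame_period));
    features.push_back(normalize_shutter(shutter));
    return features;
}
```

With these formulas a Frame Period of 4000 maps to 0 and a Frame Period of 25000 maps to 1.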
Sample result:
table: right:13 wrong:27
table 13 icedtea 13 tomatosoup 5 ketchup 3 yogurt 2
wood: right:22 wrong:18
wood 22 tomatosoup 8 icedtea 3 table 2 shampoo 2
air: right:40 wrong:0
air 40 table 0 wood 0 shampoo 0 lego 0
shampoo: right:13 wrong:67
pancake_label 20 shampoo 13 ketchup 12 tomatosoup 11 cup 10
lego: right:29 wrong:11
lego 29 ketchup 5 yogurt 4 icedtea 2 table 0
cup: right:55 wrong:65
cup 55 shampoo 15 tomatosoup 8 pancake_side 7 yogurt 7
tomatosoup: right:38 wrong:82
tomatosoup 38 cup 22 shampoo 12 pancake_side 11 yogurt 8
breadboard: right:33 wrong:7
breadboard 33 icedtea 3 lego 2 shampoo 1 pancake_label 1
pancake_label: right:21 wrong:59
pancake_label 21 shampoo 18 cup 10 tomatosoup 9 pancake_side 6
pancake_side: right:23 wrong:57
pancake_side 23 tomatosoup 15 cup 9 yogurt 9 ketchup 9
yogurt: right:40 wrong:80
yogurt 40 ketchup 29 shampoo 11 icedtea 10 lego 7
ketchup: right:58 wrong:62
ketchup 58 yogurt 19 icedtea 10 pancake_side 6 lego 5
icedtea: right:44 wrong:76
icedtea 44 ketchup 23 tomatosoup 15 yogurt 13 cup 8
seed: 1318520272
cost: 4.000000
kernel: LINEAR
svm_type: C_SVC
totalright:429 totalwrong:611
good classes: 12 bad classes: 1
Time taken:
real 1m20.288s
user 1m19.880s
sys 0m0.210s
After this, we realized that the Frame Period parameter is always 4000 when the sensor is touching a surface, so it adds no information for surface recognition. The Shutter parameter, on the other hand, varies only slightly, between 30 and 150, so we decided to normalize it by dividing by 200 (instead of 20000). Values greater than 200 are clamped to 1 after normalization. These are the results:
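The revised Shutter normalization amounts to a division followed by a clamp. A minimal sketch (the function name is hypothetical):

```cpp
#include <algorithm>

// Shutter normalized = min(Shutter / 200, 1)
// Typical values (30 to 150) map to 0.15 to 0.75; anything above 200 maps to 1.
double normalize_shutter_clamped(double shutter) {
    return std::min(shutter / 200.0, 1.0);
}
```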
table: right:15 wrong:25
table 15 icedtea 11 tomatosoup 6 yogurt 3 ketchup 3
wood: right:25 wrong:15
wood 25 tomatosoup 10 shampoo 2 lego 2 yogurt 1
air: right:40 wrong:0
air 40 table 0 wood 0 shampoo 0 lego 0
shampoo: right:18 wrong:62
shampoo 18 pancake_label 15 tomatosoup 12 ketchup 10 cup 9
lego: right:27 wrong:13
lego 27 ketchup 8 yogurt 5 table 0 wood 0
cup: right:53 wrong:67
cup 53 shampoo 14 tomatosoup 10 icedtea 9 yogurt 8
tomatosoup: right:42 wrong:78
tomatosoup 42 cup 24 shampoo 16 yogurt 6 icedtea 6
breadboard: right:35 wrong:5
breadboard 35 icedtea 3 shampoo 1 yogurt 1 table 0
pancake_label: right:27 wrong:53
pancake_label 27 shampoo 14 cup 11 tomatosoup 10 yogurt 7
pancake_side: right:64 wrong:16
pancake_side 64 tomatosoup 8 yogurt 3 shampoo 2 ketchup 2
yogurt: right:37 wrong:83
yogurt 37 ketchup 23 icedtea 19 pancake_side 10 shampoo 8
ketchup: right:56 wrong:64
ketchup 56 yogurt 18 icedtea 11 tomatosoup 9 cup 6
icedtea: right:57 wrong:63
icedtea 57 tomatosoup 16 yogurt 15 cup 10 ketchup 9
seed: 1318520272
cost: 4.000000
kernel: LINEAR
svm_type: C_SVC
totalright:496 totalwrong:544
good classes: 13 bad classes: 0
Time taken:
real 1m14.695s
user 1m14.110s
sys 0m0.360s
==== Final Setup (using GLCM: Gray-Level Co-occurrence Matrix) ====
[[http://www.mathworks.de/help/toolbox/images/ref/graycomatrix.html|GLCM]]
Although the results so far were fine, some image processing was added in order to extract more relevant features from the images and to reduce the length of the feature array.
Very good results were obtained using a GLCM with 16 shades of gray, counting co-occurrences between each pixel and all of its immediate neighbors (8 pixels). The implementation is Sample::glmc() in Sample.cpp.
The SVM was now trained on arrays of length 257: the 256 elements (16 x 16) of the GLCM plus the Shutter value.
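The following sketch shows one way such a feature array could be computed. It is not the code of Sample::glmc(): the function names are hypothetical, and several details are assumptions, namely 8-bit input pixels quantized by integer division by 16, neighbors outside the image being skipped, and the matrix being normalized so that its entries sum to 1.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kLevels = 16;  // shades of gray after quantization

// Count how often gray level i has gray level j among its 8 immediate
// neighbors. Returns a flattened 16 x 16 matrix (row = center pixel level,
// column = neighbor level). Assumption: entries are divided by the number
// of counted pairs so that they sum to 1.
std::array<double, kLevels * kLevels> glcm_8_neighbors(
        const std::vector<std::uint8_t>& pixels, int width, int height) {
    std::array<double, kLevels * kLevels> glcm{};  // zero-initialized
    double pair_count = 0.0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const int level = pixels[y * width + x] / 16;  // 0..255 -> 0..15
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    if (dx == 0 && dy == 0) continue;  // skip the pixel itself
                    const int nx = x + dx;
                    const int ny = y + dy;
                    if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;
                    const int neighbor = pixels[ny * width + nx] / 16;
                    glcm[level * kLevels + neighbor] += 1.0;
                    pair_count += 1.0;
                }
            }
        }
    }
    if (pair_count > 0.0) {
        for (double& value : glcm) value /= pair_count;
    }
    return glcm;
}

// 256 GLCM elements followed by the already normalized Shutter value.
std::vector<double> build_glcm_features(const std::vector<std::uint8_t>& pixels,
                                        int width, int height,
                                        double shutter_normalized) {
    const auto glcm = glcm_8_neighbors(pixels, width, height);
    std::vector<double> features(glcm.begin(), glcm.end());
    features.push_back(shutter_normalized);
    return features;
}
```

Because the matrix size depends only on the number of gray levels, the feature array has 257 elements regardless of the frame size.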
The best results were achieved using the RBF kernel, the C_SVC svm_type, gamma = 1 and cost = 6. See the following file for details:
{{:projects:svm_results.ods|SVM GLCM experiments}}
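For reference, the role of gamma can be illustrated with the usual definition of the RBF kernel, K(a, b) = exp(-gamma * ||a - b||^2). The sketch below assumes that this standard definition applies to the SVM library used here; the function name is hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// RBF kernel: exp(-gamma * squared Euclidean distance between a and b).
// Identical arrays give 1; the value decays towards 0 as they move apart,
// and a larger gamma makes the decay faster.
double rbf_kernel(const std::vector<double>& a,
                  const std::vector<double>& b,
                  double gamma) {
    const std::size_t n = std::min(a.size(), b.size());
    double squared_distance = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double diff = a[i] - b[i];
        squared_distance += diff * diff;
    }
    return std::exp(-gamma * squared_distance);
}
```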
Here are the results:
table: right:39 wrong:1
table 39 icedtea 1 wood 0 air 0 shampoo 0
wood: right:33 wrong:7
wood 33 lego 3 table 1 tomatosoup 1 pancake_side 1
air: right:40 wrong:0
air 40 table 0 wood 0 shampoo 0 lego 0
shampoo: right:51 wrong:29
shampoo 51 cup 10 yogurt 8 pancake_label 7 tomatosoup 2
lego: right:36 wrong:4
lego 36 wood 1 breadboard 1 yogurt 1 ketchup 1
cup: right:89 wrong:31
cup 89 yogurt 12 shampoo 7 pancake_label 4 breadboard 2
tomatosoup: right:79 wrong:41
tomatosoup 79 pancake_label 7 yogurt 6 ketchup 6 wood 5
breadboard: right:33 wrong:7
breadboard 33 ketchup 4 wood 1 lego 1 yogurt 1
pancake_label: right:66 wrong:14
pancake_label 66 tomatosoup 10 shampoo 3 cup 1 table 0
pancake_side: right:72 wrong:8
pancake_side 72 wood 2 yogurt 2 ketchup 2 shampoo 1
yogurt: right:66 wrong:54
yogurt 66 ketchup 23 cup 12 icedtea 7 shampoo 5
ketchup: right:79 wrong:41
ketchup 79 yogurt 15 icedtea 8 cup 4 wood 3
icedtea: right:93 wrong:27
icedtea 93 ketchup 9 yogurt 6 table 3 wood 3
seed: 1318520272
kernel: RBF (Radial Basis Function)
svm_type: C_SVC
cost: 6.000000
gamma: 1.000000
totalright:776 totalwrong:264
good classes: 13 bad classes: 0
real 0m14.728s
user 0m14.550s
sys 0m0.130s
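To compare the three runs, the totalright and totalwrong counts can be turned into an overall accuracy. A small sketch (the function name is hypothetical):

```cpp
// Overall accuracy in percent from the totalright / totalwrong counts.
double accuracy_percent(int right, int wrong) {
    return 100.0 * right / (right + wrong);
}
```

Applied to the logs above, ''accuracy_percent(429, 611)'' gives 41.25% for the first run, ''accuracy_percent(496, 544)'' about 47.7% for the run with the revised Shutter normalization, and ''accuracy_percent(776, 264)'' about 74.6% for the GLCM setup.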