Experiment Setup

Initially, the surfaces of 13 objects were recorded using the fingertip sensor. These became the classes to be recognized by an SVM (Support Vector Machine). The recording time varied with the object, especially with the size and heterogeneity of its surface. The objects recorded and the number of frames taken for each are listed below:

Recorded objects

#   Object         Total frames  Training frames  Testing frames
1   table          200           160              40
2   wood           200           160              40
3   air            200           160              40
4   shampoo        400           320              80
5   lego           200           160              40
6   cup            600           480              120
7   tomatosoup     600           480              120
8   breadboard     200           160              40
9   pancake_label  400           320              80
10  pancake_side   400           320              80
11  yogurt         600           480              120
12  ketchup        600           480              120
13  icedtea        600           480              120

When running the classification experiments, 80% of the frames of each class were used for training and the remaining 20% for testing. The assignment of frames to the training and testing sets was random.
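A minimal sketch of such a per-class random split, written in Python for illustration. The function name split_frames, the seed value and the use of Python's random module are assumptions; the original program may shuffle and split differently.

```python
import random

def split_frames(frames_by_class, train_fraction=0.8, seed=None):
    """Randomly split each class's frames into training and testing lists."""
    rng = random.Random(seed)
    train, test = {}, {}
    for label, frames in frames_by_class.items():
        shuffled = list(frames)
        rng.shuffle(shuffled)                      # random assignment per class
        cut = round(len(shuffled) * train_fraction)
        train[label] = shuffled[:cut]
        test[label] = shuffled[cut:]
    return train, test

# Frame indices stand in for the recorded sensor frames (illustrative data).
frames_by_class = {"table": list(range(200)), "cup": list(range(600))}
train, test = split_frames(frames_by_class, seed=0)
print(len(train["table"]), len(test["table"]))  # 160 40
print(len(train["cup"]), len(test["cup"]))      # 480 120
```

Splitting each class separately keeps the 80/20 ratio within every class, which matches the per-object frame counts in the table.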

First experiments

At first, the SVM was trained on the frames themselves (900-element arrays) plus the normalized Frame Period and Shutter parameters from the sensor, giving a feature vector of length 902. Several SVM configurations were tested (RBF and LINEAR kernels; C_SVC and NU_SVC svm_types; various cost, gamma and nu values). The best results were achieved with the LINEAR kernel, the C_SVC svm_type and a cost of 4.0; gamma and nu have no effect in this setup. The Frame Period and Shutter were normalized over their whole range, as follows:

Frame Period normalized = (Frame Period - 4000) / 21000

Shutter normalized = Shutter / 20000
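The two formulas above, and the way the 902-element vector is put together, can be sketched in Python as follows. The function name build_feature_vector is hypothetical, and passing the pixel values through unchanged is an assumption, since the page does not say whether the pixels were rescaled.

```python
FRAME_PERIOD_OFFSET = 4000   # subtracted in the Frame Period formula
FRAME_PERIOD_SPAN = 21000    # divisor in the Frame Period formula
SHUTTER_SPAN = 20000         # divisor in the Shutter formula

def build_feature_vector(pixels, frame_period, shutter):
    """Return the 900 pixel values followed by the two normalized parameters."""
    if len(pixels) != 900:
        raise ValueError("expected a 900-element frame")
    frame_period_norm = (frame_period - FRAME_PERIOD_OFFSET) / FRAME_PERIOD_SPAN
    shutter_norm = shutter / SHUTTER_SPAN
    # Assumption: pixel values are used as-is, without rescaling.
    return list(pixels) + [frame_period_norm, shutter_norm]

features = build_feature_vector([0] * 900, frame_period=4000, shutter=100)
print(len(features))   # 902
print(features[-2:])   # [0.0, 0.005]
```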

Sample result:

table:	right:13	wrong:27
table 13	icedtea 13	tomatosoup 5	ketchup 3	yogurt 2	

wood:	right:22	wrong:18
wood 22	tomatosoup 8	icedtea 3	table 2	shampoo 2	

air:	right:40	wrong:0
air 40	table 0	wood 0	shampoo 0	lego 0	

shampoo:	right:13	wrong:67
pancake_label 20	shampoo 13	ketchup 12	tomatosoup 11	cup 10	

lego:	right:29	wrong:11
lego 29	ketchup 5	yogurt 4	icedtea 2	table 0	

cup:	right:55	wrong:65
cup 55	shampoo 15	tomatosoup 8	pancake_side 7	yogurt 7	

tomatosoup:	right:38	wrong:82
tomatosoup 38	cup 22	shampoo 12	pancake_side 11	yogurt 8	

breadboard:	right:33	wrong:7
breadboard 33	icedtea 3	lego 2	shampoo 1	pancake_label 1	

pancake_label:	right:21	wrong:59
pancake_label 21	shampoo 18	cup 10	tomatosoup 9	pancake_side 6	

pancake_side:	right:23	wrong:57
pancake_side 23	tomatosoup 15	cup 9	yogurt 9	ketchup 9	

yogurt:	right:40	wrong:80
yogurt 40	ketchup 29	shampoo 11	icedtea 10	lego 7	

ketchup:	right:58	wrong:62
ketchup 58	yogurt 19	icedtea 10	pancake_side 6	lego 5	

icedtea:	right:44	wrong:76
icedtea 44	ketchup 23	tomatosoup 15	yogurt 13	cup 8	

seed: 1318520272
cost: 4.000000
kernel: LINEAR
svm_type: C_SVC
totalright:429	totalwrong:611
good classes: 12	bad classes: 1

Time taken:
real	1m20.288s
user	1m19.880s
sys	0m0.210s

After this, we realized that the Frame Period parameter is always 4000 when the sensor is touching a surface, and therefore adds no information for surface recognition. The Shutter parameter, on the other hand, varies only slightly, between 30 and 150, so we decided to normalize it by dividing by 200 (instead of 20000). Values greater than 200 are clamped to 1 after normalization. These are the results:

table:	right:15	wrong:25
table 15	icedtea 11	tomatosoup 6	yogurt 3	ketchup 3	

wood:	right:25	wrong:15
wood 25	tomatosoup 10	shampoo 2	lego 2	yogurt 1	

air:	right:40	wrong:0
air 40	table 0	wood 0	shampoo 0	lego 0	

shampoo:	right:18	wrong:62
shampoo 18	pancake_label 15	tomatosoup 12	ketchup 10	cup 9	

lego:	right:27	wrong:13
lego 27	ketchup 8	yogurt 5	table 0	wood 0	

cup:	right:53	wrong:67
cup 53	shampoo 14	tomatosoup 10	icedtea 9	yogurt 8	

tomatosoup:	right:42	wrong:78
tomatosoup 42	cup 24	shampoo 16	yogurt 6	icedtea 6	

breadboard:	right:35	wrong:5
breadboard 35	icedtea 3	shampoo 1	yogurt 1	table 0	

pancake_label:	right:27	wrong:53
pancake_label 27	shampoo 14	cup 11	tomatosoup 10	yogurt 7	

pancake_side:	right:64	wrong:16
pancake_side 64	tomatosoup 8	yogurt 3	shampoo 2	ketchup 2	

yogurt:	right:37	wrong:83
yogurt 37	ketchup 23	icedtea 19	pancake_side 10	shampoo 8	

ketchup:	right:56	wrong:64
ketchup 56	yogurt 18	icedtea 11	tomatosoup 9	cup 6	

icedtea:	right:57	wrong:63
icedtea 57	tomatosoup 16	yogurt 15	cup 10	ketchup 9	

seed: 1318520272
cost: 4.000000
kernel: LINEAR
svm_type: C_SVC
totalright:496	totalwrong:544
good classes: 13	bad classes: 0

Time taken:
real	1m14.695s
user	1m14.110s
sys	0m0.360s
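The revised Shutter normalization used for this second run (division by 200, with larger values mapped to 1) can be sketched as follows. The function name normalize_shutter and the sample values are illustrative assumptions.

```python
def normalize_shutter(shutter, upper=200):
    """Scale Shutter by `upper` and clamp anything above it to 1.0."""
    return min(shutter / upper, 1.0)

for shutter in (30, 150, 200, 450):
    print(shutter, normalize_shutter(shutter))
# 30 0.15
# 150 0.75
# 200 1.0
# 450 1.0
```

With the observed range of 30 to 150, the normalized value now spans roughly 0.15 to 0.75 instead of 0.0015 to 0.0075 under the old divisor.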

Final Setup (using GLCM: Gray-Level Co-occurrence Matrix)

GLCM

Although these results were acceptable, some image processing was applied to extract more relevant features from the images and to reduce the length of the feature vector.

Very good results were obtained using a GLCM with 16 shades of gray, computing the co-occurrence over all 8 immediate neighbors of each pixel. The implementation is Sample::glmc() in Sample.cpp.

The SVM was then trained on arrays of length 257: the 256 elements (16 x 16) of the GLCM plus the Shutter value.
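To illustrate how a 16-level GLCM over the 8 immediate neighbors yields 256 values, here is a small Python sketch. It is not the Sample::glmc() implementation. The function name glcm_features, the 8-bit pixel range, the linear quantization, the normalization to relative frequencies and the 30 x 30 frame layout are all assumptions made for the example.

```python
def glcm_features(image, levels=16, max_value=255):
    """Flattened levels x levels co-occurrence matrix over the 8 neighbors."""
    # Quantize each pixel to one of `levels` gray shades (assumed linear).
    quantized = [[min(p * levels // (max_value + 1), levels - 1) for p in row]
                 for row in image]
    rows, cols = len(quantized), len(quantized[0])
    counts = [[0] * levels for _ in range(levels)]
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in offsets:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:   # skip pairs off the edge
                    counts[quantized[r][c]][quantized[nr][nc]] += 1
    total = sum(map(sum, counts)) or 1
    # Assumption: counts are normalized to relative frequencies.
    return [value / total for row in counts for value in row]

# Synthetic 30 x 30 frame (the 900 pixels arranged as a square is an assumption).
image = [[(r * 30 + c) % 256 for c in range(30)] for r in range(30)]
shutter_norm = min(120 / 200, 1.0)           # Shutter divided by 200, clamped to 1
features = glcm_features(image) + [shutter_norm]
print(len(features))  # 257
```

Because every pixel pair is counted in both directions, the resulting matrix is symmetric, and its size depends only on the number of gray levels, not on the frame size.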

The best results were achieved using the RBF kernel, the C_SVC svm_type, gamma = 1 and cost = 6. See the following file for details:

SVM GLCM experiments
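For reference, the RBF kernel in its common form is K(u, v) = exp(-gamma * |u - v|^2), so gamma = 1 controls how quickly similarity falls off with distance between two feature vectors. The Python sketch below only illustrates that formula; the function name rbf_kernel and the sample vectors are assumptions, and the cost parameter belongs to the SVM training step, which is not shown here.

```python
import math

def rbf_kernel(u, v, gamma=1.0):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

same = rbf_kernel([0.1, 0.2], [0.1, 0.2])      # identical vectors
far = rbf_kernel([0.0, 0.0], [1.0, 1.0])       # squared distance of 2
print(same)                    # 1.0
print(far == math.exp(-2.0))   # True
```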

Here are the results:

table:	right:39	wrong:1
table 39	icedtea 1	wood 0	air 0	shampoo 0	

wood:	right:33	wrong:7
wood 33	lego 3	table 1	tomatosoup 1	pancake_side 1	

air:	right:40	wrong:0
air 40	table 0	wood 0	shampoo 0	lego 0	

shampoo:	right:51	wrong:29
shampoo 51	cup 10	yogurt 8	pancake_label 7	tomatosoup 2	

lego:	right:36	wrong:4
lego 36	wood 1	breadboard 1	yogurt 1	ketchup 1	

cup:	right:89	wrong:31
cup 89	yogurt 12	shampoo 7	pancake_label 4	breadboard 2	

tomatosoup:	right:79	wrong:41
tomatosoup 79	pancake_label 7	yogurt 6	ketchup 6	wood 5	

breadboard:	right:33	wrong:7
breadboard 33	ketchup 4	wood 1	lego 1	yogurt 1	

pancake_label:	right:66	wrong:14
pancake_label 66	tomatosoup 10	shampoo 3	cup 1	table 0	

pancake_side:	right:72	wrong:8
pancake_side 72	wood 2	yogurt 2	ketchup 2	shampoo 1	

yogurt:	right:66	wrong:54
yogurt 66	ketchup 23	cup 12	icedtea 7	shampoo 5	

ketchup:	right:79	wrong:41
ketchup 79	yogurt 15	icedtea 8	cup 4	wood 3	

icedtea:	right:93	wrong:27
icedtea 93	ketchup 9	yogurt 6	table 3	wood 3	

seed: 1318520272
kernel: RBF (Radial Basis Function)
svm_type: C_SVC
cost: 6.000000
gamma: 1.000000
totalright:776	totalwrong:264
good classes: 13	bad classes: 0

real	0m14.728s
user	0m14.550s
sys	0m0.130s
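To compare the three runs, the overall accuracy can be computed from the totalright and totalwrong values reported above. The short Python sketch below does only that arithmetic; the run labels and the function name accuracy are chosen for this example.

```python
def accuracy(right, wrong):
    """Fraction of test frames classified correctly."""
    return right / (right + wrong)

# (totalright, totalwrong) as reported in the three result listings.
runs = {
    "raw frames, Shutter / 20000, LINEAR": (429, 611),
    "raw frames, Shutter / 200, LINEAR": (496, 544),
    "GLCM + Shutter, RBF": (776, 264),
}
for name, (right, wrong) in runs.items():
    print(f"{name}: {accuracy(right, wrong):.2%}")
# raw frames, Shutter / 20000, LINEAR: 41.25%
# raw frames, Shutter / 200, LINEAR: 47.69%
# GLCM + Shutter, RBF: 74.62%
```

Each run classifies the same 1040 test frames, so the totals are directly comparable.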