Fast Sparse Coding Features
The sparse coding approach is based on the estimation of dictionary words (bases) $\vec{\phi}_j$, such that an arbitrary data vector (sample) $\vec{x}^{(i)}$ can be represented via a linear combination:

$$\vec{x}^{(i)} \approx \sum_{j=1}^{k} a_j^{(i)} \vec{\phi}_j,$$

where $\Phi = \{\vec{\phi}_j\}_{j=1}^{k}$ is the sparse dictionary and $a_j^{(i)}$ are the weighting coefficients (features) for every word $j$ and sample $i$.
The core of the sparse coding approach is the estimation of the sparse dictionary $\Phi$, such that we can use a linear combination of its words to represent any data vector $\vec{x}$. To better capture the structures and patterns inherent in the data vectors, we use an over-complete dictionary, i.e. one with more words than the dimensionality of the data; moreover, we want the linear combination to be sparse, which means that only a very small fraction of its weighting coefficients are non-zero.
We turn the problem of sparse dictionary learning into the following minimization problem:

$$\min_{A,\,\Phi}\; \|X - A\Phi\|_2^2 \;+\; \lambda \sum_{i,j} \big|a_j^{(i)}\big| \;+\; \gamma \|\Phi\|_2^2,$$

where, for convenience, we denote by $A$ the set of all weighting coefficients and by $X$ the set of all data vectors. The first term of our objective function is the error in reconstructing the data from the features using the basis; the second term is a sparsity penalty, which encourages the learned features to be sparse; and the third term is a weight decay term, designed to keep the entries of $\Phi$ small.
Please note that the sets $X$, $A$ and the dictionary $\Phi$ are represented as matrices with the following dimensions: $X \in \mathbb{R}^{n \times d}$, $A \in \mathbb{R}^{n \times k}$ and $\Phi \in \mathbb{R}^{k \times d}$, where $n$ is the number of samples, $d$ is the dimensionality of a data vector and $k$ is the number of dictionary words.
Since the minimization problem described above can be solved with respect to either the dictionary $\Phi$ or the features $A$ while the other of the two is fixed, most algorithms are based on the idea of iteratively updating one and then the other.
A number of algorithms have been developed to solve this problem, but let us concentrate on the gradient descent algorithm implemented in the DGM C++ library:
using namespace DirectGraphicalModels;
using namespace DirectGraphicalModels::fex;
...
Mat img = imread("D:\\Data\\SparseCoding\\test_img_org_00.jpg", 0);
CSparseCoding sc(img);

// Gather training samples from a series of grayscale images
Mat  X;
char fileName[256];
for (int i = 0; i < 100 /*438*/; i++) {
    printf("%d\n", i);
    sprintf(fileName, "D:\\Data\\SparseCoding\\test_img_org_%02d.jpg", i);
    Mat _img = imread(fileName, 0);
    if (_img.empty()) continue;

    // Extract 15 x 15 patches and flatten them into rows of the sample matrix X
    Mat _X = CSparseDictionary::img2data(_img, 15);
    if (!_X.empty()) {
        if (X.empty()) _X.copyTo(X);
        else           X.push_back(_X);     // else-branch avoids duplicating the first batch
    }
} // i
//X = CSparseDictionary::img2data(img, 7);
printf("nSamples = %d\n", X.rows);

parallel::shuffleRows(X);                   // randomize the order of the samples
sc.train(X, 1024, 2000, 1000);              // learn a dictionary of 1024 words
sc.save("D:\\Dictionaries\\dictionary_1024.dic");
...
char str[256];
CSparseDictionary sDictionary;
sDictionary.load("D:\\Projects\\DGM\\etc\\scdictionaries\\dictionary_256_7.dic");
Mat dictionary = sDictionary.getDictionary();
int r = (sDictionary.getBlockSize() - 1) / 2;   // radius of the square neighbourhood
for (int m = 20; m <= 21; m++) {
    for (int i = 1; i <= 10; i++) {
        for (int a = 1; a < 12; a++) {
            int angle = a * 30;
            sprintf(str, "D:\\Data\\EMDS4\\Original_EM_Images\\t4-g%02d-%02d-%d.png", m, i, angle);
            printf("%s\n", str);
            Mat img = imread(str);

            // Extract one feature map per dictionary word
            vec_mat_t vFeatures = CSparseCoding::get_v(img, dictionary, sqNeighbourhood(r));

            // Pad the feature maps with empty images until their number is divisible by 3
            Mat empty(img.size(), CV_8UC1, Scalar(0));
            while (vFeatures.size() % 3 != 0) vFeatures.push_back(empty);

            sprintf(str, "D:\\Data\\EMDS4\\Original_EM_Images\\t4-g%02d-%02d-%d_256", m, i, angle);
            _mkdir(str);

            // Merge the feature maps in triples into 3-channel images and save them
            int j = 0;
            for (size_t f = 0; f < vFeatures.size(); f += 3) {
                vec_mat_t::const_iterator first = vFeatures.begin() + f;
                vec_mat_t::const_iterator last  = vFeatures.begin() + f + 3;
                vec_mat_t tmpVec(first, last);
                Mat tmpImg;
                merge(tmpVec, tmpImg);
                sprintf(str, "D:\\Data\\EMDS4\\Original_EM_Images\\t4-g%02d-%02d-%d_256\\t4-g%02d-%02d-fv%04d.bmp", m, i, angle, m, i, j);
                imwrite(str, tmpImg);
                j++;
            }
        } // a
    } // i
} // m