BlockPCA Source Code

On this page you can download a sample implementation of our BlockPCA algorithm, published in the paper "Parallelized Matrix Factorization for fast BTF Compression".

This implementation of the BlockPCA algorithm should compile without changes with Visual Studio 2005. For other Visual Studio versions or other compilers, minor adaptations might be necessary.

Libraries

The following libraries have been used:

CUDA/CUBLAS 2.0: http://www.nvidia.com/object/cuda_get.html
boost 1.36: http://www.boost.org
clapack V3.0: http://www.netlib.org/clapack

Usage

To use the factorization algorithm, derive a class from CDataProviderT and override its size1(), size2(), getVector() and getVectors() methods to provide the input matrix. Simple implementations for matrices in memory and for flat input files are provided (CMatrixDataProviderT, CThreadedFileDataProviderT). The actual factorization can then be performed via:

NUMERICAL::CBlockPcaT<float> clBlockPCA;
clBlockPCA.compute(clDataProvider, k, uiNumIterations, uiBlockSize, progress);

Here, k is the number of singular values to keep, uiNumIterations is the number of iterations of the EM-PCA algorithm, and uiBlockSize is the number of columns in each block (set it to 0 to have this value determined automatically from the available GPU memory). progress is a pointer to a function that is called regularly with progress updates.
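The derive-and-override pattern described above can be sketched as follows. This is only an illustration under stated assumptions: SketchDataProviderT is a hypothetical stand-in for CDataProviderT, and the method signatures, the meaning of size1()/size2() as rows/columns, and the column-wise access are guesses, not the library's actual interface. Check the real header before deriving from CDataProviderT.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the library's CDataProviderT<T>.
// All signatures below are assumptions made for this sketch.
template <typename T>
class SketchDataProviderT {
public:
    virtual ~SketchDataProviderT() {}
    virtual std::size_t size1() const = 0;  // assumed: number of rows
    virtual std::size_t size2() const = 0;  // assumed: number of columns
    // assumed: copy column `col` (size1() values) into `out`
    virtual void getVector(std::size_t col, T* out) const = 0;
    // assumed: copy `count` consecutive columns starting at `first` into `out`
    virtual void getVectors(std::size_t first, std::size_t count, T* out) const = 0;
};

// Provider backed by a matrix held in memory, stored column by column.
template <typename T>
class InMemoryProviderT : public SketchDataProviderT<T> {
public:
    InMemoryProviderT(std::size_t rows, std::size_t cols,
                      const std::vector<T>& columnMajor)
        : m_rows(rows), m_cols(cols), m_data(columnMajor) {
        assert(m_data.size() == rows * cols);
    }

    std::size_t size1() const { return m_rows; }
    std::size_t size2() const { return m_cols; }

    void getVector(std::size_t col, T* out) const { getVectors(col, 1, out); }

    void getVectors(std::size_t first, std::size_t count, T* out) const {
        assert(first + count <= m_cols);
        // Columns are contiguous, so a block of columns is one flat range.
        std::copy(m_data.begin() + static_cast<std::ptrdiff_t>(first * m_rows),
                  m_data.begin() + static_cast<std::ptrdiff_t>((first + count) * m_rows),
                  out);
    }

private:
    std::size_t m_rows;
    std::size_t m_cols;
    std::vector<T> m_data;
};
```

For matrices that already fit in memory, the provided CMatrixDataProviderT is presumably the simpler choice; a custom provider is mainly useful when the data has to be streamed or generated on demand.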

We recommend using an x64 build of the program, because several large blocks of memory (each several hundred megabytes) are allocated, and this often fails on 32-bit operating systems due to the limited address space.
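To get a feeling for the sizes involved, the memory footprint of one block can be estimated as rows x block columns x sizeof(float). The helpers below are a minimal sketch of that arithmetic; the function names are hypothetical, and columnsThatFit() is not the heuristic the library applies when uiBlockSize is 0.

```cpp
#include <cassert>

// Bytes needed for one dense block of `rows` x `blockColumns` float values.
// A 64-bit type is used so the product cannot overflow on 32-bit builds.
unsigned long long blockBytes(unsigned long long rows,
                              unsigned long long blockColumns) {
    return rows * blockColumns * sizeof(float);
}

// Largest number of columns whose block still fits into `budgetBytes`.
// Illustrative only; the library's automatic choice may differ.
unsigned long long columnsThatFit(unsigned long long rows,
                                  unsigned long long budgetBytes) {
    assert(rows > 0);
    return budgetBytes / (rows * sizeof(float));
}
```

As a made-up example, a matrix with 100,000 rows and a block size of 1,024 columns needs 100,000 x 1,024 x 4 = 409,600,000 bytes (about 390 MiB) per block. A few such allocations already strain the default 2 GB user address space of a 32-bit Windows process.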

Example program

A simple example program is provided in BlockPCA.cpp. You have to specify an input matrix (parameter --in, simple raw float data) and its size (parameters --rows and --columns). The number of singular values to calculate can be specified via --components, the number of EM-PCA iterations via --iterations and the block size via --block-size. The result is then written to the file specified via --out. It has the following format:
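An input file for --in could be produced with a small helper like the one below. It is a sketch under assumptions: the function name is hypothetical, the file is written in the byte order of the host machine, and the element order (row-major or column-major) expected by the example program is not stated here, so the caller has to supply the values in the right order.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Writes rows * columns float values back to back, without any header.
// Assumption: `values` is already in the element order the program expects.
bool writeRawFloatMatrix(const std::string& path,
                         std::size_t rows, std::size_t columns,
                         const std::vector<float>& values) {
    assert(values.size() == rows * columns);
    std::ofstream out(path.c_str(), std::ios::binary);
    if (!out) return false;
    if (!values.empty()) {
        out.write(reinterpret_cast<const char*>(&values[0]),
                  static_cast<std::streamsize>(values.size() * sizeof(float)));
    }
    return out.good();
}
```

The resulting file would then be passed to the example program together with its dimensions, for instance --in matrix.raw --rows 2 --columns 3 --components 2 --iterations 10 --block-size 0 --out result.bin (file names and values are placeholders).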

Type                         Meaning
uint32                       Size of the scalar type = 4 (the sample program only supports float)
uint32                       Number of components = k
uint32                       Number of columns = n
uint32                       Number of rows = m
float vector, k values       Singular values S
float vector, m values       Data mean m
float matrix, m x k values   Matrix U
float matrix, m x k values   Matrix V'

All matrices and vectors are written as raw float values. For more details, take a look at CBlockPcaT::write().
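A reader for this format might look like the following sketch. It follows the table literally, including the m x k value count listed for V', and assumes the file uses the byte order of the host machine. BlockPcaResult and readBlockPcaResult are hypothetical names, not part of the library, and the storage order within each matrix is left uninterpreted; CBlockPcaT::write() remains the authoritative description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical container for the file contents; not part of the library.
struct BlockPcaResult {
    std::uint32_t scalarSize, components, columns, rows;
    std::vector<float> S, mean, U, Vt;  // matrices kept as flat arrays
};

// Reads `count` raw float values into `dst`; false if the file is too short.
static bool readFloats(std::istream& in, std::size_t count,
                       std::vector<float>& dst) {
    dst.resize(count);
    if (count == 0) return true;
    in.read(reinterpret_cast<char*>(&dst[0]),
            static_cast<std::streamsize>(count * sizeof(float)));
    return !in.fail();
}

bool readBlockPcaResult(const std::string& path, BlockPcaResult& r) {
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return false;

    // Header: scalar size, k, n, m (in this order, per the table).
    std::uint32_t header[4];
    in.read(reinterpret_cast<char*>(header), sizeof(header));
    if (in.fail() || header[0] != sizeof(float)) return false;

    r.scalarSize = header[0];
    r.components = header[1];
    r.columns    = header[2];
    r.rows       = header[3];

    const std::size_t k = r.components;
    const std::size_t m = r.rows;
    // Value counts are taken from the table; verify V' against write().
    return readFloats(in, k, r.S) &&
           readFloats(in, m, r.mean) &&
           readFloats(in, m * k, r.U) &&
           readFloats(in, m * k, r.Vt);
}
```

The function returns false when the file is missing, truncated, or written with a scalar type other than float, so a caller can check the result before using any of the arrays.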