# BlockPCA Source Code

On this page you can download a sample implementation of our BlockPCA algorithm, published in the paper *Parallelized Matrix Factorization for fast BTF Compression*:

- Source Code (Zip Archive, 387 KB)

This implementation of the BlockPCA algorithm should compile without changes with Visual Studio 2005. For other Visual Studio versions or other compilers, minor adaptations might be necessary.

## Libraries

The following libraries have been used:

Library | URL |
---|---|
CUDA/CUBLAS 2.0 | http://www.nvidia.com/object/cuda_get.html |
boost 1.36 | http://www.boost.org |
clapack V3.0 | http://www.netlib.org/clapack |

## Usage

To use the factorization algorithm, derive a class from CDataProviderT and override its size1(), size2(), getVector() and getVectors() methods to provide the input matrix. Simple implementations for matrices in memory and for flat input files are provided (CMatrixDataProviderT, CThreadedFileDataProviderT). The actual factorization can then be performed via:

    NUMERICAL::CBlockPcaT<float> clBlockPCA;
    clBlockPCA.compute(clDataProvider, k, uiNumIterations, uiBlockSize, progress);

Here k is the number of singular values to keep, uiNumIterations is the number of iterations of the EM-PCA algorithm, and uiBlockSize is the number of columns in each block (set it to 0 to have the value chosen automatically based on the available GPU memory). progress is a pointer to a function that is called regularly with progress updates.
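To give an idea of what one EM-PCA iteration does, the following CPU-only sketch runs the textbook EM-PCA update rules (as introduced by Roweis) for a single component, k = 1. It is not taken from the archive and does not reflect how BlockPCA distributes the work over blocks or the GPU. The function name, the `Matrix` type and the start vector are assumptions made for illustration, and the data is assumed to be mean-centred already.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy layout (assumption): data[i][j] is row i (dimension), column j (sample).
using Matrix = std::vector<std::vector<double>>;

// Runs EM-PCA for one component on mean-centred data and returns a
// unit-length estimate of the leading principal direction.
std::vector<double> emPcaFirstComponent(const Matrix& data, unsigned numIterations)
{
    assert(!data.empty() && !data[0].empty());
    const std::size_t m = data.size();      // rows
    const std::size_t n = data[0].size();   // columns

    std::vector<double> c(m, 1.0);          // arbitrary non-zero start vector
    std::vector<double> x(n, 0.0);

    for (unsigned it = 0; it < numIterations; ++it) {
        // E-step: x = (c^T c)^-1 c^T Y
        double cc = 0.0;
        for (std::size_t i = 0; i < m; ++i) cc += c[i] * c[i];
        assert(cc > 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            double dot = 0.0;
            for (std::size_t i = 0; i < m; ++i) dot += c[i] * data[i][j];
            x[j] = dot / cc;
        }

        // M-step: c = Y x^T (x x^T)^-1
        double xx = 0.0;
        for (std::size_t j = 0; j < n; ++j) xx += x[j] * x[j];
        assert(xx > 0.0);  // fails if the start vector is orthogonal to the data
        for (std::size_t i = 0; i < m; ++i) {
            double dot = 0.0;
            for (std::size_t j = 0; j < n; ++j) dot += data[i][j] * x[j];
            c[i] = dot / xx;
        }
    }

    // Normalise so the result can be compared with a unit-length direction.
    double norm = 0.0;
    for (double v : c) norm += v * v;
    norm = std::sqrt(norm);
    for (double& v : c) v /= norm;
    return c;
}
```

More iterations bring the estimate closer to the true principal direction, which is the trade-off the uiNumIterations parameter exposes.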

We recommend using an x64 build of the program: several large blocks of memory (several hundred megabytes each) are allocated, and these allocations often fail on 32-bit operating systems because of the limited address space.
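A rough way to judge whether a given block size is realistic in a 32-bit process is to compute the size of one block buffer. The helper names below are hypothetical and the matrix dimensions in the note are made-up illustration values, not figures from the paper. The sketch also ignores fragmentation and every other allocation the process makes.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Bytes occupied by one dense block of rows x blockColumns float values.
std::uint64_t blockBytes(std::uint64_t rows, std::uint64_t blockColumns)
{
    // Guard against 64-bit overflow of the product below.
    assert(blockColumns == 0 ||
           rows <= std::numeric_limits<std::uint64_t>::max() / sizeof(float) / blockColumns);
    return rows * blockColumns * sizeof(float);
}

// True if numBlocks buffers of that size fit into addressSpaceBytes,
// ignoring fragmentation and everything else the process has mapped.
bool blocksFit(std::uint64_t rows, std::uint64_t blockColumns,
               std::uint64_t numBlocks, std::uint64_t addressSpaceBytes)
{
    const std::uint64_t perBlock = blockBytes(rows, blockColumns);
    return perBlock == 0 || numBlocks <= addressSpaceBytes / perBlock;
}
```

For example, a block of 65,536 rows and 2,048 columns takes 512 MiB. A 32-bit Windows process typically has 2 GiB of user address space by default, so a handful of such buffers cannot be allocated contiguously alongside the program's other data.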

## Example program

BlockPCA.cpp contains a simple example program. You have to specify an input matrix (parameter *--in*, raw float data) and its size (parameters *--rows* and *--columns*). The number of singular values to calculate can be specified via *--components*, the number of EM-PCA iterations via *--iterations*, and the block size via *--block-size*. The result is written to the file specified via *--out*. It has the following format:

Type | Meaning | Value |
---|---|---|
uint32 | Size of the scalar type | = 4, the sample program only supports float |
uint32 | Number of components | = k |
uint32 | Number of columns | = n |
uint32 | Number of rows | = m |
float vector, k values | Singular values S | |
float vector, m values | Data mean m | |
float matrix, m x k values | Matrix U | |
float matrix, m x k values | Matrix V' | |

All matrices and vectors are written as raw float values. For more details, take a look at CBlockPcaT::write().
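As a starting point for consuming the result file, the following sketch reads the four header fields, the singular values and the data mean in the order given in the table. It is not part of the archive: the struct and function names are hypothetical, and it assumes the file was written in the byte order of the machine that reads it (little-endian on x86). The two matrices are deliberately left unread, because their storage order is not stated here and should be checked against CBlockPcaT::write().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <istream>
#include <stdexcept>
#include <vector>

// Leading part of a BlockPCA result file, in the order of the table above.
struct BlockPcaPrefix {
    std::uint32_t scalarSize = 0;       // size of the scalar type in bytes
    std::uint32_t numComponents = 0;    // k
    std::uint32_t numColumns = 0;       // n
    std::uint32_t numRows = 0;          // m
    std::vector<float> singularValues;  // k values
    std::vector<float> mean;            // m values
};

// Reads count raw values of type T; throws if the stream ends early.
template <typename T>
void readRaw(std::istream& in, T* dst, std::size_t count)
{
    assert(dst != nullptr || count == 0);
    in.read(reinterpret_cast<char*>(dst),
            static_cast<std::streamsize>(count * sizeof(T)));
    if (!in) throw std::runtime_error("unexpected end of BlockPCA result data");
}

// Reads header, singular values and mean; the stream is left positioned
// at the start of matrix U.
BlockPcaPrefix readBlockPcaPrefix(std::istream& in)
{
    static_assert(sizeof(float) == 4, "sketch assumes 4-byte float");

    BlockPcaPrefix p;
    readRaw(in, &p.scalarSize, 1);
    readRaw(in, &p.numComponents, 1);
    readRaw(in, &p.numColumns, 1);
    readRaw(in, &p.numRows, 1);

    if (p.scalarSize != sizeof(float))
        throw std::runtime_error("only float results are handled by this sketch");

    p.singularValues.resize(p.numComponents);
    p.mean.resize(p.numRows);
    readRaw(in, p.singularValues.data(), p.singularValues.size());
    readRaw(in, p.mean.data(), p.mean.size());
    return p;
}
```

To use it, open the file written via *--out* with `std::ifstream in(path, std::ios::binary);` and pass the stream to `readBlockPcaPrefix(in)`.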