Low-Rank Neural Network Training Framework
Configurable machine learning framework for training dense and low-rank neural networks in PyTorch. Built as part of INF202 at NMBU to investigate memory-efficient neural network architectures and software engineering best practices.
The project explored how low-rank matrix factorization can reduce the storage and computational requirements of neural networks while maintaining strong classification performance. Rather than building a single model, the goal was to develop an extensible framework capable of constructing, training, evaluating, and managing multiple network architectures through configuration files.

Low-Rank Neural Networks
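Instead of training a full weight matrix W, low-rank layers factorize it into smaller matrices (U, S, V) such that W ≈ U S Vᵀ, the same form as a truncated singular value decomposition (SVD) from linear algebra. For an m × n layer with rank r, this stores r(m + n + r) values instead of mn, which reduces memory usage and computational cost when r is much smaller than m and n. The aim is to achieve this with little loss in model performance.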
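To make the factorization concrete, here is a minimal PyTorch sketch of such a layer. It is an illustration, not the project's actual implementation: the class name `LowRankLinear`, the initialization scheme, and the sizes (784 inputs, 512 outputs, rank 20) are all assumptions.

```python
import torch
from torch import nn


class LowRankLinear(nn.Module):
    """Hypothetical low-rank layer computing y = x @ (U S V^T) + b."""

    def __init__(self, in_features: int, out_features: int, rank: int):
        super().__init__()
        # U: (in, r), S: (r, r), V: (out, r). The init scheme is assumed for illustration.
        self.U = nn.Parameter(torch.randn(in_features, rank) / in_features ** 0.5)
        self.S = nn.Parameter(torch.eye(rank))
        self.V = nn.Parameter(torch.randn(out_features, rank) / rank ** 0.5)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Apply the factors one at a time so the full (in, out) matrix is never formed.
        return ((x @ self.U) @ self.S) @ self.V.T + self.bias


def count_parameters(module: nn.Module) -> int:
    """Total number of trainable values in a module."""
    return sum(p.numel() for p in module.parameters())


dense = nn.Linear(784, 512)
low_rank = LowRankLinear(784, 512, rank=20)

batch = torch.randn(32, 784)
output = low_rank(batch)
dense_params = count_parameters(dense)        # 784 * 512 + 512
low_rank_params = count_parameters(low_rank)  # 20 * (784 + 512 + 20) + 512
print(output.shape, dense_params, low_rank_params)
```

With these assumed sizes the dense layer holds 401,920 parameters and the low-rank layer 26,832, which is where the memory saving comes from.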
Configuration-Driven Architecture
Neural networks are defined through TOML configuration files, so new architectures can be created without modifying source code. A configuration entry specifies the layer type, dimensions, activation function, and rank.
Features
- Dense, Vanilla Low-Rank, and Dynamic Low-Rank neural network implementations
- Configuration-driven architecture design using TOML files
- Automatic model construction from user-defined configurations
- MNIST training and evaluation pipeline using PyTorch
- Model checkpointing and parameter loading
- Batch execution of multiple experiments from a folder
- Training logs and reproducible experiment management
- Unit testing with PyTest
Technical Highlights
The framework implements custom low-rank layers that factorize weight matrices into smaller components, reducing parameter count and computational cost. Models are assembled dynamically from configuration files, enabling rapid experimentation without changing source code. The software was designed around object-oriented principles with separate modules for:
- Configuration parsing
- Neural network construction
- Custom layer implementations
- Activation functions
- Training and evaluation
- Experiment management
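One way dynamic model assembly can work is a pair of lookup tables that map configuration strings to constructors. The sketch below is an assumption about the general pattern, not the project's code: the registry names, the spec keys, and `build_model` are hypothetical, and the low-rank entry uses a simple two-factor stand-in (two stacked `nn.Linear` maps) rather than the U, S, V layers described above.

```python
import torch
from torch import nn

# Hypothetical registry mapping configuration names to activation classes.
ACTIVATIONS = {"relu": nn.ReLU, "tanh": nn.Tanh, "identity": nn.Identity}


def make_dense(spec: dict) -> nn.Module:
    return nn.Linear(spec["in_features"], spec["out_features"])


def make_low_rank(spec: dict) -> nn.Module:
    # Two-factor stand-in: project down to `rank` dimensions, then up to the output size.
    return nn.Sequential(
        nn.Linear(spec["in_features"], spec["rank"], bias=False),
        nn.Linear(spec["rank"], spec["out_features"]),
    )


LAYER_BUILDERS = {"dense": make_dense, "lowrank": make_low_rank}


def build_model(layer_specs: list[dict]) -> nn.Sequential:
    """Turn a list of layer specifications into a sequential model."""
    modules = []
    for spec in layer_specs:
        modules.append(LAYER_BUILDERS[spec["type"]](spec))  # KeyError on unknown type
        modules.append(ACTIVATIONS[spec["activation"]]())
    return nn.Sequential(*modules)


specs = [
    {"type": "lowrank", "in_features": 784, "out_features": 128, "rank": 20, "activation": "relu"},
    {"type": "dense", "in_features": 128, "out_features": 10, "activation": "identity"},
]
model = build_model(specs)
logits = model(torch.randn(4, 784))
print(logits.shape)
```

Adding a new layer type or activation under this pattern means registering one more entry in a table, without touching `build_model`.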
Results
Three architectures were evaluated on the MNIST handwritten digit dataset:
| Architecture | Accuracy |
|---|---|
| Dense Network | 91.12% |
| Vanilla Low-Rank Network | 95.75% |
| Dynamic Low-Rank Network | 96.23% |
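The Dynamic Low-Rank architecture achieved the highest accuracy of the three (96.23%), 0.48 percentage points above the Vanilla Low-Rank network and 5.11 points above the dense network, while keeping the parameter-efficiency advantages of low-rank factorization.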
What I Learned
- PyTorch internals and custom layer development
- Low-rank matrix factorization techniques
- Object-oriented software architecture
- Configuration-driven application design
- Automated testing with PyTest
- Experiment reproducibility and model management
- Machine learning software engineering