ML/Data · 2025

Low-Rank Neural Network Training Framework

Configurable machine learning framework for training dense and low-rank neural networks in PyTorch. Built as part of INF202 at NMBU to investigate memory-efficient neural network architectures and software engineering best practices.

GitHub

The project explored how low-rank matrix factorization can reduce the storage and computational requirements of neural networks while maintaining strong classification performance. Rather than building a single model, the goal was to develop an extensible framework capable of constructing, training, evaluating, and managing multiple network architectures through configuration files.

Low-Rank Neural Networks

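Instead of storing and training a full m × n weight matrix W, low-rank layers factorize it into three smaller matrices, W ≈ U S Vᵀ, where U is m × r, S is r × r, V is n × r, and r is the chosen rank. This cuts the parameter count of the weight from mn to r(m + n + r), a large saving in memory and computation when r is much smaller than m and n, while largely preserving model performance. The factorization has the same form as the singular value decomposition (SVD) from linear algebra.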
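The factorized layer described above can be sketched in PyTorch as follows. This is a minimal illustration, not the project's actual implementation: the class name LowRankLinear, the initialization scheme, and the layer sizes are all assumptions made for the example.

```python
import torch
from torch import nn


class LowRankLinear(nn.Module):
    """Linear layer that stores its weight as the factors U, S, V (hypothetical sketch)."""

    def __init__(self, in_features: int, out_features: int, rank: int):
        super().__init__()
        # Assumed initialization: scaled random factors and an identity core.
        self.U = nn.Parameter(torch.randn(in_features, rank) / in_features ** 0.5)
        self.S = nn.Parameter(torch.eye(rank))
        self.V = nn.Parameter(torch.randn(out_features, rank) / rank ** 0.5)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Apply the factors one at a time so the full weight matrix is never formed.
        return ((x @ self.U) @ self.S) @ self.V.T + self.bias


# Example sizes: 784 is the flattened MNIST image; 128 and 16 are arbitrary choices.
layer = LowRankLinear(in_features=784, out_features=128, rank=16)
batch = torch.randn(32, 784)
output = layer(batch)
print(output.shape)
```

Multiplying by U, S, and V in sequence costs on the order of r(m + n + r) operations per input instead of mn, which is where the computational saving comes from.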

Configuration-Driven Architecture

Neural networks are defined through TOML configuration files, allowing new architectures to be created without modifying source code. Each layer entry in a configuration specifies the layer type, dimensions, activation function, and, for low-rank layers, the rank.

Features

Dense, Vanilla Low-Rank, and Dynamic Low-Rank neural network implementations
Configuration-driven architecture design using TOML files
Automatic model construction from user-defined configurations
MNIST training and evaluation pipeline using PyTorch
Model checkpointing and parameter loading
Batch execution of multiple experiments from a folder
Training logs and reproducible experiment management
Unit testing with PyTest

Technical Highlights

The framework implements custom low-rank layers that factorize weight matrices into smaller components, reducing parameter count and computational cost. Models are assembled dynamically from configuration files, enabling rapid experimentation without changing source code. The software was designed around object-oriented principles with separate modules for:

Configuration parsing
Neural network construction
Custom layer implementations
Activation functions
Training and evaluation
Experiment management
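
One common way to assemble models dynamically from parsed configuration is a registry that maps type names to constructors. The sketch below assumes that pattern; the registry names, the spec keys, and build_network are all hypothetical and may differ from how the project's modules are organized.

```python
import torch
from torch import nn

# Hypothetical registry mapping configuration strings to activation classes.
ACTIVATIONS = {"relu": nn.ReLU, "tanh": nn.Tanh, "identity": nn.Identity}


def build_dense(spec: dict) -> nn.Module:
    """Create a standard fully connected layer from a layer spec."""
    in_features, out_features = spec["dims"]
    return nn.Linear(in_features, out_features)


# A low-rank layer type would be registered here in the same way.
LAYER_BUILDERS = {"dense": build_dense}


def build_network(layer_specs: list[dict]) -> nn.Sequential:
    """Turn a list of layer specs into a sequential PyTorch model."""
    modules = []
    for spec in layer_specs:
        if spec["type"] not in LAYER_BUILDERS:
            raise ValueError(f"unknown layer type: {spec['type']}")
        modules.append(LAYER_BUILDERS[spec["type"]](spec))
        modules.append(ACTIVATIONS[spec.get("activation", "identity")]())
    return nn.Sequential(*modules)


specs = [
    {"type": "dense", "dims": [784, 64], "activation": "relu"},
    {"type": "dense", "dims": [64, 10], "activation": "identity"},
]
model = build_network(specs)
logits = model(torch.randn(8, 784))
print(logits.shape)
```

With this layout, adding a new layer type means writing one builder function and one registry entry, without touching the construction loop.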

Results

Three architectures were evaluated on the MNIST handwritten digit dataset:

Architecture	Accuracy
Dense Network	91.12%
Vanilla Low-Rank Network	95.75%
Dynamic Low-Rank Network	96.23%

The Dynamic Low-Rank architecture achieved the highest accuracy (96.23%), 0.48 percentage points above the Vanilla Low-Rank network and 5.11 above the dense network, while keeping the parameter-efficiency advantages of low-rank factorization.
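
To give a sense of scale for the parameter-efficiency claim, the arithmetic below compares one dense layer with its rank-r factorized counterpart. The hidden width of 512 and the rank of 20 are hypothetical values chosen for illustration; they are not the layer sizes used in the experiments above.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def low_rank_params(n_in: int, n_out: int, rank: int) -> int:
    """Parameters of the factors U (n_in x r), S (r x r), V (n_out x r), plus biases."""
    return rank * (n_in + n_out + rank) + n_out


# 784 is the flattened 28 x 28 MNIST image; 512 and 20 are assumed for the example.
dense = dense_params(784, 512)
low_rank = low_rank_params(784, 512, rank=20)
ratio = low_rank / dense

print(dense, low_rank, f"{ratio:.1%}")
```

For these assumed sizes the factorized layer stores 26,832 parameters against 401,920 for the dense layer, or about 6.7% of the original.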

What I Learned

PyTorch internals and custom layer development
Low-rank matrix factorization techniques
Object-oriented software architecture
Configuration-driven application design
Automated testing with PyTest
Experiment reproducibility and model management
Machine learning software engineering

Stack

Python, PyTorch, Linear Algebra, Low-Rank Factorization, TOML, PyTest, OOP, Machine Learning, MNIST, Git
