Janea Systems enables the MKL Sparse Matrix Module on PyTorch for Windows

PyTorch is one of the most popular open-source deep learning frameworks, alongside TensorFlow. With a large, active community and many third-party tools built on top of it, the framework is used by the likes of Meta, Uber, and Microsoft, and it has been widely adopted in academic research and mobile/edge deployments.

For the past two years, Janea Systems has been maintaining PyTorch on Windows on behalf of Microsoft, ensuring parity across operating systems so that everyone can benefit from PyTorch. In this article, we’ll explain how we made a crucial module work on PyTorch for Windows, enabling sparse matrix operations and expanding the framework's capabilities for Windows developers.

What Are the Math Kernel Library (MKL) and Sparse Matrices?

MKL (Math Kernel Library) is a software library developed by Intel that provides optimized mathematical functions for various computational tasks.

The MKL Sparse Matrix module is a part of this library that specifically deals with sparse matrices. Sparse matrices are special types of matrices that contain mostly zero values, with only a few non-zero entries. They are commonly used in scientific computing and data analysis to represent large datasets efficiently.

The benefits of MKL Sparse:

Efficient storage: It stores only the non-zero values of a matrix, saving memory and computational resources.

Fast operations: It provides optimized algorithms for performing mathematical operations on sparse matrices, such as addition, multiplication, and solving linear equations.

Performance optimization: By using specialized techniques for sparse matrices, MKL can significantly speed up calculations compared to treating them as regular dense matrices.

Integration with other tools: Frameworks and scientific computing environments that call into MKL, PyTorch among them, inherit these optimizations without having to implement them themselves.
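
To make the storage and speed points concrete, here is a minimal pure-Python sketch of the compressed sparse row (CSR) layout, the format that the "csr" in ‘mkl_sparse_d_create_csr’ refers to. It is an illustration only, not MKL's or PyTorch's implementation, and the helper names dense_to_csr and csr_matvec are hypothetical. The matrix-vector product visits only the stored non-zero entries, which is where the savings over a dense loop come from.

```python
def dense_to_csr(dense):
    """Convert a row-major dense matrix into three CSR arrays.

    values      : the non-zero entries, row by row
    col_indices : the column of each stored entry
    row_ptr     : row r owns entries row_ptr[r] .. row_ptr[r + 1] - 1
    """
    values, col_indices, row_ptr = [], [], [0]
    for row in dense:
        for col, x in enumerate(row):
            if x != 0:
                values.append(x)
                col_indices.append(col)
        row_ptr.append(len(values))
    return values, col_indices, row_ptr


def csr_matvec(values, col_indices, row_ptr, vec):
    """Multiply a CSR matrix by a dense vector, touching only non-zeros."""
    result = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * vec[col_indices[k]]
        result.append(acc)
    return result


# A 4x4 matrix with 16 cells but only 4 non-zero entries.
dense = [
    [5.0, 0.0, 0.0, 0.0],
    [0.0, 8.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 0.0],
    [0.0, 6.0, 0.0, 0.0],
]

values, col_indices, row_ptr = dense_to_csr(dense)
y = csr_matvec(values, col_indices, row_ptr, [1.0, 2.0, 3.0, 4.0])

print(values)       # [5.0, 8.0, 3.0, 6.0]
print(col_indices)  # [0, 1, 2, 1]
print(row_ptr)      # [0, 1, 2, 3, 4]
print(y)            # [5.0, 16.0, 9.0, 12.0]
```

At this size the three arrays are no smaller than the dense matrix. The savings appear as matrices grow, because CSR storage scales with the number of non-zero entries plus the number of rows rather than with rows times columns.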

Where Are Sparse Matrix Computations Used?

Sparse matrix computations are widely used in various fields, particularly where large datasets with many zero elements are involved. Common applications include:

Scientific Computing and Simulations: In physics and engineering, sparse matrices are used to model complex systems with many variables but few interactions. For instance, simulating the structural properties of materials or the dynamics of fluid flows.
Graph Algorithms: In social network analysis, sparse matrices represent connections between users. In transportation networks, they model the routes and connections between different nodes, such as cities or intersections.
Machine Learning and AI: In natural language processing, sparse matrices are used to represent text data where most word pairs do not co-occur. In recommendation systems, they model user-item interactions where only a few items are relevant to each user.
Image and Signal Processing: Image compression algorithms like JPEG rely on sparse representations, reducing file sizes by keeping only the significant frequency coefficients of an image rather than every value. In signal processing, sparse representations are used to efficiently extract important features from large datasets.
Embedded Systems and AI on the Edge: Sparse matrix computations reduce memory usage and computational load, making them well suited to devices with limited resources. For example, AI model compression uses sparse matrices to shrink models and make them more efficient on edge devices such as smartphones and IoT devices. Sparse matrices also enable the fast computations that real-time applications need: autonomous vehicles use them for real-time object detection and path planning, where decisions must be made quickly.

Bringing MKL Sparse to PyTorch on Windows

MKL Sparse wasn't supported on Windows, so some sparse functions could not be used. Specifically, ‘mkl_sparse_d_create_csr’ and ‘mkl_sparse_destroy’ use ‘malloc/new’ and ‘free/delete’ for memory management, which requires the Universal C Runtime (UCRT). PyTorch links against the static variant of MKL and is built with the ‘/MD’ option, meaning it uses the version of the runtime library designed for dynamic linking. However, with the Microsoft Visual C++ compiler (MSVC), you cannot mix different runtime library options (/MD and /MT) in the same program. This conflict resulted in errors and prevented sparse matrix operations from working correctly. In simpler terms, two different versions of the same foundational software cannot work together inside one program.

Solution

We ensured that both PyTorch and MKL use the same C runtime library (/MD). This resolved the linker error "LNK2038: mismatch detected for 'RuntimeLibrary'", in which the value 'MT_StaticRelease' didn't match the value 'MD_DynamicRelease'. Using one runtime library consistently is crucial for stability and successful compilation, and it streamlines the build, making projects easier to compile and maintain on Windows.

By removing conditional compilation for Windows-specific cases related to sparse operations, our pull request makes the code more maintainable and reduces the need for platform-specific workarounds, enhancing the overall development workflow and cross-platform consistency.
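
The runtime library check behind LNK2038 can be pictured with a small toy model. The sketch below is not the MSVC linker. It only mimics the rule that every input to a link must carry the same 'RuntimeLibrary' value, and the file names, ObjectFile class, and check_runtime_library function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectFile:
    """Toy stand-in for a linker input tagged with its runtime library."""
    name: str
    runtime_library: str  # e.g. "MD_DynamicRelease" or "MT_StaticRelease"


class RuntimeLibraryMismatch(Exception):
    """Raised when linker inputs disagree on the runtime library."""


def check_runtime_library(objects):
    """Return the shared runtime library value, or raise on a mismatch."""
    if not objects:
        return None
    expected = objects[0]
    for obj in objects[1:]:
        if obj.runtime_library != expected.runtime_library:
            raise RuntimeLibraryMismatch(
                "mismatch detected for 'RuntimeLibrary': value "
                f"'{obj.runtime_library}' doesn't match value "
                f"'{expected.runtime_library}' in {expected.name}"
            )
    return expected.runtime_library


# Hypothetical inputs: PyTorch built with /MD, MKL in /MT and /MD flavours.
pytorch_obj = ObjectFile("hypothetical_pytorch.obj", "MD_DynamicRelease")
mkl_mt = ObjectFile("hypothetical_mkl_mt.lib", "MT_StaticRelease")
mkl_md = ObjectFile("hypothetical_mkl_md.lib", "MD_DynamicRelease")

# Mixing /MD and /MT inputs reproduces the shape of the LNK2038 failure.
try:
    check_runtime_library([pytorch_obj, mkl_mt])
    mixed_result = "linked"
except RuntimeLibraryMismatch as exc:
    mixed_result = str(exc)

# With both sides on /MD, the check passes.
consistent_result = check_runtime_library([pytorch_obj, mkl_md])

print(mixed_result)
print(consistent_result)
```

In the toy model, as in the real build, the problem goes away once every input agrees on a single runtime library rather than through a Windows-specific workaround in the code.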

Conclusion

We’re incredibly proud of our continued work maintaining PyTorch on Windows. By enabling sparse matrix operations, this fix unlocks new potential, from self-driving cars to advanced physics, enhancing computational efficiency and performance. As we continue to push the boundaries of software engineering and data science, such advancements are crucial in driving innovation and solving complex problems. The journey of integrating MKL sparse operations into PyTorch on Windows underscores the collaborative spirit of the open-source community and the relentless pursuit of excellence in software development.

Take a look at the original issue and the fix on GitHub here:

https://github.com/pytorch/pytorch/issues/97352

https://github.com/pytorch/pytorch/pull/102604

Ioan Manta is a Senior Software Engineer at Janea Systems.
