Accelerating R using oneMKL and Windows Subsystem for Linux
The default BLAS and LAPACK libraries used by R are rather slow, so installing and switching to alternative libraries can result in a significant increase in performance. One such alternative is Intel’s oneAPI Math Kernel Library (oneMKL), which is available as a standalone component or as part of the Intel oneAPI Base Toolkit. MKL was also part of Microsoft R Open, which has unfortunately been discontinued. As a result, installing and using oneMKL on Windows does not seem to be straightforward at this time. As a workaround, Windows 10/11 users can use Windows Subsystem for Linux (WSL) running Ubuntu and install oneMKL there. Here I have combined information provided by Carlos Santillan and Dirk Eddelbuettel into a procedure for setting up WSL with an R installation that uses Intel’s oneMKL BLAS and LAPACK libraries.
Setting up Windows Subsystem for Linux
- Click Start, then search for Command Prompt and open it as administrator.
- Run
wsl --install
- Reboot your device.
- After rebooting, Ubuntu will start in a Command Prompt window. Choose a username and password when asked.
- Once Ubuntu has started, the installed version can be checked by running
cat /etc/os-release
in the terminal.
- Update and upgrade all packages by running
sudo apt update
(enter your password when prompted), followed by
sudo apt upgrade -y
and
sudo apt install build-essential -y
Installing R and RStudio
- R can be installed by following the instructions at https://cloud.r-project.org/bin/linux/ubuntu/ under Install R (running the lines in the terminal).
- Follow the instructions at https://posit.co/download/rstudio-server/ to install RStudio Server (starting at step 3 because we already installed R itself). At the top of the page and under step 1, select Debian/Ubuntu for the operating system and Debian 12/Ubuntu 22 for the server version. RStudio Server should start automatically after the last step.
- On your Windows desktop, open a browser and navigate to
localhost:8787
- Log in using the username and password chosen in step 4 of the previous section.
- After opening RStudio Server and logging in, we can check which BLAS and LAPACK libraries we are using (the default ones at this point) by running
sessionInfo()
in the console.
- We can run
parallel::detectCores()
in the console and see that the full number of threads is available to RStudio Server in Ubuntu. The default BLAS and LAPACK libraries, however, are single-threaded and do not make full use of all threads.
- Quit RStudio Server by running
q()
in the console and close the browser tab.
- In the Ubuntu terminal, run
sudo rstudio-server stop
to stop RStudio Server. It can be started again later by running
sudo rstudio-server start
- Before installing Intel oneMKL we will change the session timeout value of RStudio Server so we can run long scripts without the session timing out. This can be done by running
sudo nano /etc/rstudio/rserver.conf
and adding the line
auth-timeout-minutes=6000
(or some other large number) before saving and closing the file.
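If the timeout line needs to be changed more than once, editing by hand can leave duplicate entries behind. The following is a minimal sketch of a helper that replaces an existing key=value line or appends it when missing. The function name set_conf_key is hypothetical, and the demonstration runs on a scratch file rather than on the real /etc/rstudio/rserver.conf. It assumes the value contains no `|` or `&` characters.

```shell
# set_conf_key FILE KEY VALUE   (hypothetical helper, not part of RStudio Server)
# Replaces an existing "KEY=..." line in FILE, or appends "KEY=VALUE" if absent.
set_conf_key() {
  local file="$1" key="$2" value="$3"
  local tmp
  touch "$file"
  if grep -q "^${key}=" "$file"; then
    # Key already present: rewrite that line, keeping all other lines as they are.
    tmp="$(mktemp)"
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
  else
    # Key missing: append it as a new line.
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Demonstration on a scratch file: set the key twice, end up with one line.
demo_conf="$(mktemp)"
set_conf_key "$demo_conf" auth-timeout-minutes 60
set_conf_key "$demo_conf" auth-timeout-minutes 6000
cat "$demo_conf"   # prints: auth-timeout-minutes=6000
```

Applying the same idea to the real configuration file would require root privileges, so editing with nano as described above remains the simplest route for a one-off change.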
Installing Intel oneMKL
As mentioned, oneMKL can be installed as part of the oneAPI Base Toolkit. However, this toolkit is quite large at ~20 GB. We only need oneMKL, and installing it as a standalone component (~3 GB) saves a lot of space.
- In the Ubuntu terminal, run the following lines (you might have to press Enter and enter your password after running the first line):
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt update
- We can now look at which versions of oneMKL are available by running the following line:
sudo -E apt-cache pkgnames intel | grep intel-oneapi-mkl-20
The latest version at the time of writing is
intel-oneapi-mkl-2024.1
- To install this version we run:
sudo apt install intel-oneapi-mkl-2024.1 -y
- Now run the following lines (note the oneMKL version numbers in these commands):
sudo update-alternatives --install /usr/lib/x86_64-linux-gnu/libblas.so libblas.so-x86_64-linux-gnu /opt/intel/oneapi/mkl/2024.1/lib/libmkl_rt.so 50
sudo update-alternatives --install /usr/lib/x86_64-linux-gnu/libblas.so.3 libblas.so.3-x86_64-linux-gnu /opt/intel/oneapi/mkl/2024.1/lib/libmkl_rt.so 50
sudo update-alternatives --install /usr/lib/x86_64-linux-gnu/liblapack.so liblapack.so-x86_64-linux-gnu /opt/intel/oneapi/mkl/2024.1/lib/libmkl_rt.so 50
sudo update-alternatives --install /usr/lib/x86_64-linux-gnu/liblapack.so.3 liblapack.so.3-x86_64-linux-gnu /opt/intel/oneapi/mkl/2024.1/lib/libmkl_rt.so 50
echo "/opt/intel/oneapi/mkl/2024.1/lib" | sudo tee /etc/ld.so.conf.d/mkl.conf
sudo ldconfig
echo "MKL_THREADING_LAYER=GNU" | sudo tee -a /etc/environment
- If we now start RStudio Server (by running
sudo rstudio-server start
and going to
localhost:8787
in a browser) and run
sessionInfo()
in the console, we should see that we are now using the Intel oneMKL BLAS and LAPACK libraries.
- Switching between the default and oneMKL libraries is possible using the following commands:
# For the BLAS library:
sudo update-alternatives --config libblas.so.3-x86_64-linux-gnu
# For the LAPACK library:
sudo update-alternatives --config liblapack.so.3-x86_64-linux-gnu
Running these lines opens a menu where a number can be entered to select the desired library.
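Because the oneMKL version number appears in every update-alternatives command, installing a different version means editing each line by hand. The sketch below prints (but does not run) the four registration commands for a given version so they can be reviewed first. The function name print_mkl_alternatives is hypothetical, and the sketch assumes that other versions use the same /opt/intel/oneapi/mkl/<version>/lib layout as 2024.1, which should be checked before running anything.

```shell
# print_mkl_alternatives VERSION   (hypothetical helper)
# Prints the four update-alternatives commands for one oneMKL version directory.
# Assumption: the library lives at /opt/intel/oneapi/mkl/VERSION/lib/libmkl_rt.so.
print_mkl_alternatives() {
  local version="$1"
  local libdir="/usr/lib/x86_64-linux-gnu"
  local mkl_rt="/opt/intel/oneapi/mkl/${version}/lib/libmkl_rt.so"
  local name
  # Same four link names as in the commands listed above.
  for name in libblas.so libblas.so.3 liblapack.so liblapack.so.3; do
    echo "sudo update-alternatives --install ${libdir}/${name} ${name}-x86_64-linux-gnu ${mkl_rt} 50"
  done
}

# Dry run for the version used in this post.
print_mkl_alternatives 2024.1
```

For 2024.1 the printed lines match the four update-alternatives commands given earlier. Once the output looks right, the lines can be copied into the terminal one by one.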
Setting the number of threads
While the default BLAS and LAPACK libraries are single-threaded, oneMKL can make use of multiple threads. By default it seems to be able to use as many threads as there are logical cores in the system, but the actual number of threads that is used is determined dynamically. Multithreading can be problematic if a script or function makes use of explicit parallelization using, for example, the foreach package. In that case the maximum number of threads to be used by oneMKL must be manually set to 1 (or another number so that the number of MKL threads times the number of parallel R processes does not become too large).
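To make the "threads times processes" rule concrete, here is a small sketch that picks a per-process MKL thread count so that the product stays within the number of logical cores. The function name mkl_threads_per_worker is hypothetical, and integer division with a floor of 1 is just one reasonable policy, not a recommendation from Intel. It assumes the worker count is at least 1.

```shell
# mkl_threads_per_worker CORES WORKERS   (hypothetical helper)
# Prints the largest thread count T such that WORKERS * T <= CORES,
# but never less than 1. Assumes WORKERS >= 1.
mkl_threads_per_worker() {
  local cores="$1" workers="$2"
  local threads=$(( cores / workers ))   # integer division rounds down
  if [ "$threads" -lt 1 ]; then
    threads=1
  fi
  echo "$threads"
}

mkl_threads_per_worker 16 4    # prints 4
mkl_threads_per_worker 16 16   # prints 1
mkl_threads_per_worker 8 12    # prints 1 (more workers than cores)
```

On a real system the first argument could come from `nproc`, for example `mkl_threads_per_worker "$(nproc)" 4`, and the result is the value to use for MKL_NUM_THREADS.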
- In the Ubuntu terminal, check if nano is installed by running
nano --version
If it is not installed, it can be installed by running
sudo apt install nano
- Run
sudo nano /usr/lib/R/etc/Renviron.site
- Add the following line to the bottom of the file:
MKL_NUM_THREADS=1
- Press Ctrl + X followed by the Y key to save the changes and press Enter to overwrite the file.
- While oneMKL using 1 thread will not be as fast as allowing it to use multiple threads, it will still far outperform the default BLAS and LAPACK libraries, especially if single-threaded oneMKL is combined with efficient explicit parallelization in R scripts.
- The dynamic aspect of how many threads MKL actually uses can be disabled by similarly adding
MKL_DYNAMIC=FALSE
to
/usr/lib/R/etc/Renviron.site