# High-performance computing (HPC) on preemptible VMs

[HPC clusters](https://en.wikipedia.org/wiki/Computer_cluster) are used for computation-intensive workloads, particularly scientific calculations. A computing cluster consists of multiple servers (computing nodes) connected over a network. Each computing node has several multicore processors and its own local memory, and runs its own operating system. Homogeneous clusters, where all nodes have the same architecture and performance, are the most widespread.

Follow this tutorial to create a cluster of [preemptible VMs](../../compute/concepts/preemptible-vm.md) for performing a shared computational task. For example, you can solve a system of linear equations using the [Jacobi method](https://en.wikipedia.org/wiki/Jacobi_method).

To create a cluster and run a computational task:
1. [Get your cloud ready](#before-you-begin).
1. [Create a master VM in the cloud](#create-master-vm).
1. [Prepare the VM cluster](#prepare-cluster).
1. [Create a cluster](#create-cluster).
1. [Create a computing task in the cluster](#config-hpc).
1. [Run and analyze the computations](#start-hpc).
1. [Delete the resources you created](#clear-out).

## Get your cloud ready {#before-you-begin}

Sign up for Yandex Cloud and create a [billing account](../../billing/concepts/billing-account.md):
1. Navigate to the [management console](https://console.yandex.cloud) and log in to Yandex Cloud or create a new account.
1. On the **[Yandex Cloud Billing](https://center.yandex.cloud/billing/accounts)** page, make sure you have a billing account linked and it has the `ACTIVE` or `TRIAL_ACTIVE` [status](../../billing/concepts/billing-account-statuses.md). If you do not have a billing account, [create one](../../billing/quickstart/index.md) and [link](../../billing/operations/pin-cloud.md) a cloud to it.

If you have an active billing account, you can create or select a [folder](../../resource-manager/concepts/resources-hierarchy.md#folder) for your infrastructure on the [cloud page](https://console.yandex.cloud/cloud).

[Learn more about clouds and folders here](../../resource-manager/concepts/resources-hierarchy.md).

### Required paid resources {#paid-resources}

The costs for hosting servers include:
* Fee for multiple continuously running [VMs](../../compute/concepts/vm.md) (see [Yandex Compute Cloud pricing](../../compute/pricing.md)).
* Fee for using a dynamic or static [public IP address](../../vpc/concepts/address.md#public-addresses) (see [Yandex Virtual Private Cloud pricing](../../vpc/pricing.md)).

## Create a master VM in the cloud {#create-master-vm}

### Create a VM {#create-vm}

To create a VM:

1. In the [management console](https://console.yandex.cloud), select the [folder](../../resource-manager/concepts/resources-hierarchy.md#folder) where you want to create your VM.
1. Navigate to **Compute Cloud**.
1. In the left-hand panel, select ![image](../../_assets/console-icons/server.svg) **Virtual machines**.
1. Click **Create virtual machine**.
1. Under **Boot disk image**, select the [Ubuntu](https://yandex.cloud/en/marketplace?tab=software&search=Ubuntu&categories=os) image.
1. Under **Location**, select an [availability zone](../../overview/concepts/geo-scope.md) where your VM will reside.
1. Under **Disks and file storages**, select `SSD` as the boot [disk](../../compute/concepts/disk.md) type.
1. Under **Computing resources**, go to the **Custom** tab and specify parameters for your current computing tasks:

    * **Platform**: `Intel Ice Lake`
    * **vCPU**: `4`
    * **Guaranteed vCPU performance**: `100%`
    * **RAM**: `4 GB`
    * **Additional**: `Preemptible`

1. Under **Network settings**:

    * In the **Subnet** field, enter the ID of a subnet in the new VM’s availability zone. Alternatively, select a [cloud network](../../vpc/concepts/network.md#network) from the list.

        * Each network must have at least one [subnet](../../vpc/concepts/network.md#subnet). If there is no subnet, create one by selecting **Create subnet**.
        * If there are no networks in the list, click **Create network** to create one:

            * In the window that opens, specify the network name and select the folder where it will be created.
            * Optionally, enable the **Create subnets** setting to automatically create subnets in all availability zones.
            * Click **Create network**.

    * In the **Public IP address** field, select `Auto` to assign a random external IP address from the Yandex Cloud pool to the VM. Alternatively, select a static address from the list if you reserved one.

1. Under **Access**, select **SSH key** and specify the VM access credentials:

    * In the **Login** field, enter a name for the user you want to create on the VM, e.g., `ubuntu`.

      {% note alert %}

      Do not use `root` or other OS-reserved usernames. For operations requiring root privileges, use the `sudo` command.

      {% endnote %}

    * In the **SSH key** field, select the SSH key saved in your [organization user](../../organization/concepts/membership.md) profile.
      
      If there are no SSH keys in your profile or you want to add a new key:
      
      1. Click **Add key**.
      1. Enter a name for the SSH key.
      1. Select one of the following:
      
          * `Enter manually`: Paste the contents of the public SSH key. You need to [create](../../compute/operations/vm-connect/ssh.md#creating-ssh-keys) an SSH key pair on your own.
          * `Load from file`: Upload the public part of the SSH key. You need to create an SSH key pair on your own.
          * `Generate key`: Automatically create an SSH key pair.
          
            When you generate a new SSH key, an archive containing the key pair is created and downloaded. In Linux or macOS, unpack the archive to the `~/.ssh` directory (`/home/<user_name>/.ssh` in Linux, `/Users/<user_name>/.ssh` in macOS). In Windows, unpack the archive to the `C:\Users\<user_name>\.ssh` directory. You do not need to additionally enter the public key in the management console.
      
      1. Click **Add**.
      
      The system will add the SSH key to your organization user profile. If the organization has [disabled](../../organization/operations/os-login-access.md) the ability for users to add SSH keys to their profiles, the added public SSH key will only be saved in the user profile inside the newly created resource.

1. Under **General information**, specify the VM name. For clarity, enter `master-node`.
1. Click **Create VM**.

### Set up the VM {#setup-vm}

1. [Use SSH to connect](../../compute/operations/vm-connect/ssh.md) to the VM and switch to administrator mode in the console:

   ```bash
   sudo -i
   ```

1. Update the repository and install the required utilities:

   ```bash
   apt update
   apt install -y net-tools htop libopenmpi-dev nfs-common
   ```

1. Exit admin mode and generate SSH keys for access between the VMs:

   ```bash
   exit
   ssh-keygen -t ed25519
   ```

1. Add the key you generated to the list of allowed ones:

   ```bash
   cd ~/.ssh
   cat id_ed25519.pub >> authorized_keys
   ```
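
If you repeat the step above, the same key is appended to `authorized_keys` again. The following is a minimal sketch of an idempotent variant; the `authorize_key` helper name is hypothetical and not part of any package installed above.

```bash
# Append a public key to an authorized_keys file only if it is not there yet.
# Both paths are parameters, so you can try the helper on scratch files first.
authorize_key() {
  pub_key_file=$1
  auth_file=$2

  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  # With the default StrictModes setting, sshd rejects key files
  # that are writable by other users.
  chmod 600 "$auth_file"

  # -x matches whole lines, -F treats the key as a fixed string (not a regex),
  # -f reads the key to look for from the public key file.
  if ! grep -qxF -f "$pub_key_file" "$auth_file"; then
    cat "$pub_key_file" >> "$auth_file"
  fi
}
```

For the key generated in this section, the call would be `authorize_key ~/.ssh/id_ed25519.pub ~/.ssh/authorized_keys`.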

## Prepare the VM cluster {#prepare-cluster}

### Create a cluster {#create-cluster}

1. In the [management console](https://console.yandex.cloud), go to **Disks**.
1. To the right of the `master-node` VM disk, click ![image](../../_assets/options.svg) and select **Create snapshot**. Enter the name: `master-node-snapshot`. After you create the snapshot, it will appear in the list under **Snapshots**.
1. Go to **Instance groups** and click **Create group of virtual machines**.
1. Create an [instance group](../../compute/concepts/instance-groups/index.md):
   * In the **Name** field, enter a name for your instance group, e.g., `compute-group`.
   * In the **Service account** field, add a [service account](../../compute/concepts/instance-groups/access.md) to the instance group. If you do not have a service account, click **Create**, enter a name, and click **Create**.

     To create, update, and delete VMs in the group, assign the [compute.editor](../../compute/security/index.md#compute-editor) role to the service account. By default, all operations in Instance Groups are performed on behalf of a service account.

   * In the **Availability zone** field, select the availability zone the `master-node` VM resides in. Make sure the VMs are in the same availability zone to reduce latency between them.
   * Under **Instance template**, click **Define**. This will open a screen for creating a [template](../../compute/concepts/instance-groups/instance-template.md).
     * Under **Disks and file storages**, select **Add disk**. In the window that opens, specify:
       * **Type**: [SSD](../../compute/concepts/disk.md#disks-types).
       * **Contents**: From the created [snapshot](../../compute/concepts/snapshot.md) named `master-node-snapshot`.
     * Under **Computing resources**, reproduce the master VM configuration:
       * **Platform**: `Intel Ice Lake`
       * **vCPU**: `4`
       * **Guaranteed vCPU performance**: `100%`
       * **RAM**: `4 GB`
       * **Additional**: `Preemptible`
     * Under **Network settings**, specify the same network and subnet as those of the master VM. Leave **Auto** as the IP address type.
     * Under **Access**, specify the information required to access the VM:
       * In the **Login** field, enter your preferred login for the user you will create on the VM.
       * Paste your public SSH key into the **SSH key** field. You will need to create a key pair for the SSH connection on your own. To learn more, see [Connecting to a VM over SSH](../../compute/operations/vm-connect/ssh.md).
     * Click **Save**. This will take you back to the instance group creation screen.
1. Under **Scaling**, select the number of instances to create. Specify three instances.
1. Click **Create**.

### Test the cluster {#test-cluster}

[Log in over SSH](../../compute/operations/vm-connect/ssh.md) to each VM in `compute-group` and make sure you can access the `master-node` VM from them over SSH:

```bash
ping master-node
ssh master-node
```
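
To check several hosts in one pass, you could wrap the SSH test in a loop. This is a sketch rather than part of the tutorial's required steps: the `check_nodes` name is hypothetical, and the host names you pass to it depend on your own cluster.

```bash
# Report which hosts accept a non-interactive SSH login.
# BatchMode=yes makes ssh fail instead of prompting for a password,
# so a missing key shows up as FAIL rather than stalling the loop.
# StrictHostKeyChecking=accept-new (OpenSSH 7.6 or later) records the key
# of a host seen for the first time instead of asking for confirmation.
check_nodes() {
  failed=0
  for host in "$@"; do
    if ssh -o BatchMode=yes -o StrictHostKeyChecking=accept-new \
           -o ConnectTimeout=5 "$host" true; then
      echo "OK   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  return "$failed"
}
```

For example, run `check_nodes master-node` on a group VM, or pass the internal IP addresses of the group VMs when running it on `master-node`. The function returns a non-zero status if any host fails.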

### Configure the NFS {#configure-nfs}

To allow the VMs to use the same source files, create a shared network directory using [NFS](https://en.wikipedia.org/wiki/Network_File_System):
1. Log in to the `master-node` VM over SSH and install an NFS server:

   ```bash
   ssh <master-node VM public IP address>
   sudo apt install nfs-kernel-server
   ```

1. Create a `shared` directory for the VMs:

   ```bash
   mkdir ~/shared
   ```

1. Open the `/etc/exports` file in any text editor, e.g., `nano`:
   
   ```bash
   sudo nano /etc/exports
   ```

1. Add an entry to the file to enable access to the `shared` directory:

   ```text
   /home/<username>/shared *(rw,sync,no_root_squash,no_subtree_check)
   ```

   Save the file.
1. Apply the settings and restart the service:

   ```bash
   sudo exportfs -a
   sudo service nfs-kernel-server restart
   ```
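
The `*` in the entry above exports the directory to any client that can reach the server. If you prefer to limit access to the cluster subnet, you can replace `*` with the subnet's CIDR. The sketch below builds such an entry; the `export_line` helper and the `10.128.0.0/24` range are illustrative assumptions, so substitute the CIDR of your own subnet.

```bash
# Build one /etc/exports entry in the form: <directory> <client>(<options>).
# The options are the same ones used in the entry above.
export_line() {
  directory=$1
  client=$2
  printf '%s %s(rw,sync,no_root_squash,no_subtree_check)\n' "$directory" "$client"
}

# Example (hypothetical user name and subnet):
#   export_line /home/ubuntu/shared 10.128.0.0/24 | sudo tee -a /etc/exports
#   sudo exportfs -ra
```

Here, `exportfs -ra` re-exports all directories and syncs the export table with `/etc/exports`.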

#### Mount the directories on the group VMs {#mount}

On each VM in `compute-group`, mount the directory you created:
1. Create a `shared` directory and mount the directory exported by the `master-node` VM on it:

   ```bash
   mkdir ~/shared
   sudo mount -t nfs master-node:/home/<username>/shared ~/shared
   ```

1. Make sure the directory is successfully mounted:

   ```bash
   df -h
   ```

   Result:

   ```text
   Filesystem                           Size  Used  Avail  Use%  Mounted on
   ...
   master-node:/home/<username>/shared  13G   1.8G  11G    15%   /home/<username>/shared
   ```
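
A directory mounted with `mount` does not survive a reboot, and a preemptible VM may be stopped and started again at any time. One possible way to restore the mount automatically is an `/etc/fstab` entry. The sketch below only prints such an entry; the `fstab_line` helper is hypothetical, and the `_netdev` and `nofail` options are a suggestion, not a tutorial requirement.

```bash
# Print an /etc/fstab entry for an NFS share.
fstab_line() {
  server=$1
  remote_dir=$2
  mount_point=$3
  # _netdev marks the mount as network-dependent; nofail lets the VM
  # finish booting even if the NFS server is unreachable.
  printf '%s:%s %s nfs defaults,_netdev,nofail 0 0\n' \
    "$server" "$remote_dir" "$mount_point"
}

# Example (hypothetical user name):
#   fstab_line master-node /home/ubuntu/shared /home/ubuntu/shared | sudo tee -a /etc/fstab
#   sudo mount -a
```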

## Create a computing task in the cluster {#config-hpc}

1. Log in to the `master-node` VM over SSH, go to the `shared` directory, and download the `task.c` source file with a computing task:

   ```bash
   cd ~/shared
   wget https://raw.githubusercontent.com/cloud-docs-writer/examples/master/hpc-on-preemptible/task.c
   ```

   This code solves a system of linear equations using the Jacobi method. The implementation is distributed and uses MPI.
1. Compile the source file into an executable:

   ```bash
   mpicc task.c -o task
   ```

   As a result, the `task` executable file should appear in the `shared` directory.
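
For orientation, the Jacobi method recomputes every unknown from the values the other unknowns had in the previous iteration, and repeats this until the values stop changing. The sketch below runs the method serially on a small made-up system (`4x + y = 9`, `x + 3y = 5`, exact solution `x = 2`, `y = 1`). It is not the code from `task.c`; it only illustrates the update rule.

```bash
# Run a fixed number of Jacobi iterations for the system
#   4x +  y = 9
#    x + 3y = 5
# Each new value depends only on the previous iteration's values,
# which is why the updates can be computed in parallel.
jacobi_2x2() {
  awk -v iters="$1" 'BEGIN {
    x = 0; y = 0
    for (k = 0; k < iters; k++) {
      new_x = (9 - y) / 4
      new_y = (5 - x) / 3
      x = new_x; y = new_y
    }
    printf "%.4f %.4f\n", x, y
  }'
}

jacobi_2x2 25   # prints: 2.0000 1.0000
```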

## Run and analyze the computations {#start-hpc}

{% note tip %}

You can check the load on the VM cores by running the `htop` command in a separate SSH session on each VM.

{% endnote %}

1. Run the task on two cores using only the `master-node` VM resources:

   ```bash
   mpirun -np 2 task
   ```

   Once the task has been completed, the program will display the time spent performing it:

   ```text
   JAC1 STARTED
   1: Time of task=45.104153
   0: Time of task=45.103931
   ```

1. Run the task on four cores using only the `master-node` VM resources:

   ```bash
   mpirun -np 4 task
   ```

   Result:

   ```text
   JAC1 STARTED
   1: Time of task=36.562328
   2: Time of task=36.562291
   3: Time of task=36.561989
   0: Time of task=36.561695
   ```

1. Run the task on four cores across two VMs, two cores per VM. To do this, run the task with the `-host` option, which accepts a value in `<VM IP address>:<number of cores>[,<ip>:<cores>[,...]]` format:

   ```bash
   mpirun -np 4 -host localhost:2,<VM IP address>:2 task
   ```

   Result:

   ```text
   JAC1 STARTED
   0: Time of task=24.539981
   1: Time of task=24.540288
   3: Time of task=24.540619
   2: Time of task=24.540781
   ```

1. Similarly, you can further increase the number of VMs and cores in use and see how much distributed computing speeds up the computation.
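
When you add more VMs, two small helpers can save some typing. The sketch below builds the `-host` value from a list of addresses and computes the speedup between two runs. Both function names are hypothetical, and `10.128.0.5` in the test call stands in for the internal IP address of one of your group VMs.

```bash
# Join hosts into the value expected by -host:
#   <host>:<slots>[,<host>:<slots>[,...]]
# The first argument is the number of cores to use on every host.
host_list() {
  slots=$1
  shift
  list=""
  for host in "$@"; do
    # ${list:+$list,} adds a comma only when the list is not empty.
    list="${list:+$list,}$host:$slots"
  done
  echo "$list"
}

# Speedup = baseline time / measured time, rounded to two decimals.
speedup() {
  awk -v base="$1" -v t="$2" 'BEGIN { printf "%.2f\n", base / t }'
}

host_list 2 localhost 10.128.0.5   # prints: localhost:2,10.128.0.5:2
```

You could then run `mpirun -np 4 -host "$(host_list 2 localhost <VM IP address>)" task`. Taking the slowest process of each run shown above, `speedup 45.104153 36.562328` prints `1.23` and `speedup 45.104153 24.540781` prints `1.84`.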

## Delete the resources you created {#clear-out}

To stop paying for the deployed server and VM group you created, delete the `master-node` VM and `compute-group`.

If you reserved a static public IP address for this VM:
1. From your folder, navigate to **Virtual Private Cloud**.
1. Navigate to the **Public IP addresses** tab.
1. Find the required IP address, click ![ellipsis](../../_assets/options.svg), and select **Delete**.