# Exporting and importing Hive metadata in an Apache Hive™ Metastore cluster

## Getting started {#before-you-begin}

1. [Create a service account](../../../iam/operations/sa/create.md) named `my-account` with the `storage.uploader` and `managed-metastore.integrationProvider` roles.
1. [Configure the network and create an Apache Hive™ Metastore cluster](cluster-create.md). When creating it, specify the `my-account` service account.
1. [Create a bucket](../../../storage/operations/buckets/create.md) in Yandex Object Storage. It will store the metadata file for import and export.
1. [Grant](../../../storage/operations/buckets/edit-acl.md) the `READ and WRITE` permission to `my-account` for the bucket you created earlier.

For more on connecting to a bucket with configured bucket policies, see [this guide](s3-policy-connect.md).

## Exporting data {#export}

{% list tabs group=instructions %}

- Management console {#console}

   1. In the [management console](https://console.yandex.cloud), select the relevant folder.
   1. [Navigate](../../../console/operations/select-service.md#select-service) to **Yandex MetaData Hub**.
   1. In the left-hand panel, select ![image](../../../_assets/console-icons/database.svg) **Metastore**.
   1. Click ![image](../../../_assets/console-icons/ellipsis.svg) for the cluster you need and select ![image](../../../_assets/console-icons/arrow-up-from-square.svg) **Export**.
   1. In the window that opens, specify the following:

      * Bucket you [created earlier](#before-you-begin) for cluster data export.
      * The `.sql` file the cluster data will be written to. If a file with that name already exists, it will be overwritten.

   1. Click **Export**.

- CLI {#cli}

   If you do not have the Yandex Cloud CLI yet, [install and initialize it](../../../cli/quickstart.md#install).

   The folder used by default is the one specified when [creating](../../../cli/operations/profile/profile-create.md) the CLI profile. To change the default folder, use the `yc config set folder-id <folder_ID>` command. You can also specify a different folder for any command using `--folder-name` or `--folder-id`. If you access a resource by its name, the search will be limited to the default folder. If you access a resource by its ID, the search will be global, i.e., through all folders based on access permissions.

   To export metadata from an Apache Hive™ Metastore cluster, run this command:

   ```bash
   yc managed-metastore cluster export-data <cluster_name_or_ID> \
      --bucket <bucket_name> \
      --filepath <data_file>
   ```

   Where:

   * `--bucket`: Bucket you [created earlier](#before-you-begin) for cluster data export.
   * `--filepath`: Path to the `.sql` file to which the cluster data will be written. If a file with that name already exists, it will be overwritten.

   You can get the cluster ID and name with the [list of clusters in the folder](cluster-list.md#list-clusters).

- REST API {#api}

    1. [Get an IAM token for API authentication](../../api-ref/authentication.md) and put it into an environment variable:

        ```bash
        export IAM_TOKEN="<IAM_token>"
        ```

    1. Use the [Cluster.ExportData](../../api-ref/Cluster/exportData.md) method and send the following request, e.g., via [cURL](https://curl.se/):

        ```bash
        curl \
            --request POST \
            --header "Authorization: Bearer $IAM_TOKEN" \
            --url 'https://metastore.api.cloud.yandex.net/managed-metastore/v1/clusters/<cluster_ID>:export' \
            --data '{
                       "bucket": "<bucket_name>",
                       "filepath": "<data_file>"
                    }'
        ```

        Where:

         * `bucket`: Bucket you [created earlier](#before-you-begin) for cluster data export.
         * `filepath`: Path to the `.sql` file to which the cluster data will be written. If a file with that name already exists, it will be overwritten.

        You can get the cluster ID with the [list of clusters in the folder](cluster-list.md#list-clusters).

    1. Check the [server response](../../api-ref/Cluster/exportData.md#yandex.cloud.operation.Operation) to make sure your request was successful.

- gRPC API {#grpc-api}

    1. [Get an IAM token for API authentication](../../api-ref/authentication.md) and put it into an environment variable:

        ```bash
        export IAM_TOKEN="<IAM_token>"
        ```

    1. Clone the [cloudapi](https://github.com/yandex-cloud/cloudapi) repository:
       
       ```bash
       cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
       ```
       
       Below, we assume that the repository contents reside in the `~/cloudapi/` directory.

    1. Use the [ClusterService.ExportData](../../api-ref/grpc/Cluster/exportData.md) call and send the following request, e.g., via [gRPCurl](https://github.com/fullstorydev/grpcurl):

        ```bash
        grpcurl \
            -format json \
            -import-path ~/cloudapi/ \
            -import-path ~/cloudapi/third_party/googleapis/ \
            -proto ~/cloudapi/yandex/cloud/metastore/v1/cluster_service.proto \
            -rpc-header "Authorization: Bearer $IAM_TOKEN" \
            -d '{
                    "cluster_id": "<cluster_ID>",
                    "bucket": "<bucket_name>",
                    "filepath": "<data_file>"
                }' \
            metastore.api.cloud.yandex.net:443 \
            yandex.cloud.metastore.v1.ClusterService.ExportData
        ```

        Where:

         * `bucket`: Bucket you [created earlier](#before-you-begin) for cluster data export.
         * `filepath`: Path to the `.sql` file to which the cluster data will be written. If a file with that name already exists, it will be overwritten.

        You can get the cluster ID with the [list of clusters in the folder](cluster-list.md#list-clusters).

    1. Check the [server response](../../api-ref/grpc/Cluster/exportData.md#yandex.cloud.operation.Operation) to make sure your request was successful.

{% endlist %}
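
Because an export overwrites an existing file with the same name, one way to keep earlier exports is to give each run a unique file name. Below is a minimal sketch, assuming a UTC timestamp suffix suits your naming scheme. The `export_filepath` function and the `backups/metastore` prefix are hypothetical and not part of the Yandex Cloud CLI:

```bash
# Hypothetical helper: prints a unique object path that ends in ".sql",
# built from a prefix and the current UTC time, so that repeated exports
# do not overwrite each other.
# $1: path prefix (optional, defaults to "metastore-export")
export_filepath() (
    prefix="${1:-metastore-export}"
    printf '%s-%s.sql\n' "$prefix" "$(date -u +%Y%m%dT%H%M%SZ)"
)

# Usage with the export command shown above (placeholders unchanged):
#   yc managed-metastore cluster export-data <cluster_name_or_ID> \
#      --bucket <bucket_name> \
#      --filepath "$(export_filepath backups/metastore)"
```

Two exports started within the same second would still get the same name, so this only helps for runs that are at least a second apart.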

## Importing data {#import}

Before importing, [upload](../../../storage/operations/objects/upload.md#simple) the `.sql` file with metadata into the bucket you [created earlier](#before-you-begin). For information on how to prepare a file and how the import process works, see [Transferring metadata between Yandex Data Processing clusters using Apache Hive™ Metastore](../../tutorials/metastore-import.md).
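
Before uploading, it may help to run a quick local check on the file. Below is a minimal sketch, assuming the file is on your local disk and that an empty file is never a valid import source. The `check_sql_file` function is hypothetical. It does not validate the SQL itself, only the file name and size:

```bash
# Hypothetical pre-upload check: succeeds only if the path ends in ".sql"
# and points to an existing, non-empty regular file.
# $1: local path to the metadata file
check_sql_file() {
    case "$1" in
        *.sql) ;;
        *) echo "not a .sql file: $1" >&2; return 1 ;;
    esac
    if [ ! -f "$1" ]; then
        echo "file not found: $1" >&2
        return 1
    fi
    if [ ! -s "$1" ]; then
        echo "file is empty: $1" >&2
        return 1
    fi
    return 0
}

# Usage: run the check, then upload the file as described in the upload guide.
#   check_sql_file ./metastore.sql && echo "ready to upload"
```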

{% list tabs group=instructions %}

- Management console {#console}

   To import data to an Apache Hive™ Metastore cluster:

   1. In the [management console](https://console.yandex.cloud), select the relevant folder.
   1. [Navigate](../../../console/operations/select-service.md#select-service) to **Yandex MetaData Hub**.
   1. In the left-hand panel, select ![image](../../../_assets/console-icons/database.svg) **Metastore**.
   1. Click ![image](../../../_assets/console-icons/ellipsis.svg) for the cluster you need and select ![image](../../../_assets/console-icons/arrow-down-to-square.svg) **Import**.
   1. In the window that opens, select the bucket you [created earlier](#before-you-begin) and the file to import the cluster data from.
   1. Click **Import**.

- CLI {#cli}

   If you do not have the Yandex Cloud CLI yet, [install and initialize it](../../../cli/quickstart.md#install).

   The folder used by default is the one specified when [creating](../../../cli/operations/profile/profile-create.md) the CLI profile. To change the default folder, use the `yc config set folder-id <folder_ID>` command. You can also specify a different folder for any command using `--folder-name` or `--folder-id`. If you access a resource by its name, the search will be limited to the default folder. If you access a resource by its ID, the search will be global, i.e., through all folders based on access permissions.

   To import metadata to an Apache Hive™ Metastore cluster, run this command:

   ```bash
   yc managed-metastore cluster import-data <cluster_name_or_ID> \
      --bucket <bucket_name> \
      --filepath <data_file>
   ```

   Where:

   * `--bucket`: Bucket you [created earlier](#before-you-begin) to import the cluster data from.
   * `--filepath`: Path to the `.sql` file to import the cluster data from.

   You can get the cluster ID and name with the [list of clusters in the folder](cluster-list.md#list-clusters).

- REST API {#api}

    1. [Get an IAM token for API authentication](../../api-ref/authentication.md) and put it into an environment variable:

        ```bash
        export IAM_TOKEN="<IAM_token>"
        ```

    1. Use the [Cluster.ImportData](../../api-ref/Cluster/importData.md) method and send the following request, e.g., via [cURL](https://curl.se/):

        ```bash
        curl \
            --request POST \
            --header "Authorization: Bearer $IAM_TOKEN" \
            --url 'https://metastore.api.cloud.yandex.net/managed-metastore/v1/clusters/<cluster_ID>:import' \
            --data '{
                       "bucket": "<bucket_name>",
                       "filepath": "<data_file>"
                    }'
        ```

        Where:

         * `bucket`: Bucket you [created earlier](#before-you-begin) to import the cluster data from.
         * `filepath`: Path to the `.sql` file to import the cluster data from.

        You can get the cluster ID with the [list of clusters in the folder](cluster-list.md#list-clusters).

    1. Check the [server response](../../api-ref/Cluster/importData.md#yandex.cloud.operation.Operation) to make sure your request was successful.

- gRPC API {#grpc-api}

    1. [Get an IAM token for API authentication](../../api-ref/authentication.md) and put it into an environment variable:

        ```bash
        export IAM_TOKEN="<IAM_token>"
        ```

    1. Clone the [cloudapi](https://github.com/yandex-cloud/cloudapi) repository:
       
       ```bash
       cd ~/ && git clone --depth=1 https://github.com/yandex-cloud/cloudapi
       ```
       
       Below, we assume that the repository contents reside in the `~/cloudapi/` directory.

    1. Use the [ClusterService.ImportData](../../api-ref/grpc/Cluster/importData.md) call and send the following request, e.g., via [gRPCurl](https://github.com/fullstorydev/grpcurl):

        ```bash
        grpcurl \
            -format json \
            -import-path ~/cloudapi/ \
            -import-path ~/cloudapi/third_party/googleapis/ \
            -proto ~/cloudapi/yandex/cloud/metastore/v1/cluster_service.proto \
            -rpc-header "Authorization: Bearer $IAM_TOKEN" \
            -d '{
                    "cluster_id": "<cluster_ID>",
                    "bucket": "<bucket_name>",
                    "filepath": "<data_file>"
                }' \
            metastore.api.cloud.yandex.net:443 \
            yandex.cloud.metastore.v1.ClusterService.ImportData
        ```

        Where:

         * `bucket`: Bucket you [created earlier](#before-you-begin) to import the cluster data from.
         * `filepath`: Path to the `.sql` file to import the cluster data from.

        You can get the cluster ID with the [list of clusters in the folder](cluster-list.md#list-clusters).

    1. Check the [server response](../../api-ref/grpc/Cluster/importData.md#yandex.cloud.operation.Operation) to make sure your request was successful.

{% endlist %}
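
The REST requests for export and import differ only in the `:export` or `:import` suffix of the URL, so a script that does both can share one wrapper. Below is a minimal sketch that reuses the endpoint, request body fields, and `IAM_TOKEN` variable from the cURL examples on this page. The function names are hypothetical, and the body builder assumes that the bucket name and file path contain no characters that need JSON escaping:

```bash
# Prints the request URL for the given action.
# $1: "export" or "import"; $2: cluster ID
metastore_data_url() {
    case "$1" in
        export|import) ;;
        *) echo "unknown action: $1" >&2; return 1 ;;
    esac
    printf 'https://metastore.api.cloud.yandex.net/managed-metastore/v1/clusters/%s:%s\n' "$2" "$1"
}

# Prints the JSON request body.
# $1: bucket name; $2: path to the .sql file in the bucket
# Assumption: neither value needs JSON escaping.
metastore_data_body() {
    printf '{"bucket": "%s", "filepath": "%s"}\n' "$1" "$2"
}

# Sends the request with the same cURL options as the examples on this page.
# $1: action; $2: cluster ID; $3: bucket name; $4: file path
metastore_data_request() (
    url="$(metastore_data_url "$1" "$2")" || exit 1
    curl \
        --request POST \
        --header "Authorization: Bearer $IAM_TOKEN" \
        --url "$url" \
        --data "$(metastore_data_body "$3" "$4")"
)

# Usage:
#   metastore_data_request export <cluster_ID> <bucket_name> <data_file>
#   metastore_data_request import <cluster_ID> <bucket_name> <data_file>
```

Keeping the URL and body builders separate from the function that calls `curl` makes it possible to print and inspect a request before sending it.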

_Apache® and [Apache Hive™](https://hive.apache.org/) are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries._