Apache Hive

Overview¶

This guide provides instructions on how to set up and use Apache Hive with Team Edition.

Before you start, you must create a connection in Team Edition and select Hive. If you have not done this, please refer to our Database Connection article.

Team Edition interacts with the Hive server using a specific driver. It supports all versions of Hive, but the correct driver must be selected: use Hive 2 for versions 2.x and earlier, and Hive 4 for version 4 and later. Team Edition also supports Hive extensions such as Kyuubi and Spark, depending on your environment configuration. For these extensions, select the appropriate driver in the Connect to a database window.

Tip

Systems like Impala use Hive-compatible drivers, so the standard Hive JDBC driver works with them.

Apache Hive specialty¶

Apache Hive is a data warehouse system built on top of Hadoop for querying and analyzing large datasets using a SQL-like language called HiveQL. It is optimized for batch processing and is commonly used for data summarization, ad-hoc queries, and analysis of structured data. Hive translates HiveQL queries into jobs for a distributed execution engine such as MapReduce or Tez, which makes it well suited to large-scale data stored in the Hadoop Distributed File System (HDFS). Hive is a good fit for OLAP-style workloads and integrates with other big data tools in the Hadoop ecosystem.

Info

For more detailed information and a comprehensive understanding of Apache Hive, see the official documentation.

Setting up¶

This section covers the Team Edition settings for establishing a direct connection to Hive and for configuring secure connections through SSH and proxies.

Apache Hive connection settings¶

The connection settings page requires you to fill in specific fields to establish the initial connection.

Field	Description
Connect by (Host/URL)	Choose whether you want to connect using a host or a URL.
URL	If you are connecting via URL, enter the URL of your Hive database here. This field is hidden if you are connecting via the host.
Host	If you are connecting via host, enter the host address of your Hive database here.
Database/Schema	Enter the name of the Hive database you want to connect to.
Port	Enter the port number for your Hive database. The default Hive port is `10000`.
Authentication	Choose the type of authentication you want to use for the connection. For detailed guides on authentication types, refer to the following articles: Native Database Authentication, DBeaver Profile Authentication.
Connection Details	Provide additional connection details if necessary.
Driver Name	This field will be auto-filled based on your selected driver type.
Driver Settings	If there are any specific driver settings, configure them here.
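
As a rough illustration of how the Host, Port, and Database/Schema fields relate to the URL field, the sketch below assembles a URL in the standard Hive JDBC form `jdbc:hive2://<host>:<port>/<database>`. The function name `build_hive_jdbc_url` and the host `hive.example.com` are hypothetical, and whether Team Edition generates exactly this string internally is an assumption.

```python
def build_hive_jdbc_url(host: str, port: int = 10000, database: str = "default") -> str:
    """Assemble a Hive JDBC URL from the Host, Port, and Database/Schema values.

    Hypothetical helper for illustration only; it is not part of Team Edition.
    """
    if not host:
        raise ValueError("host must not be empty")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    # 10000 is the default Hive port mentioned in the table above.
    return f"jdbc:hive2://{host}:{port}/{database}"


# Hypothetical host name; replace it with your own server address.
print(build_hive_jdbc_url("hive.example.com"))
# prints: jdbc:hive2://hive.example.com:10000/default
print(build_hive_jdbc_url("hive.example.com", 10001, "sales"))
# prints: jdbc:hive2://hive.example.com:10001/sales
```

A URL of this shape is what you would typically enter in the URL field when Connect by is set to URL.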

Connection details¶

The Connection Details section in Team Edition lets you customize how you work with a Hive database. It includes options for adjusting the Navigator View, setting up Security measures, applying Filters, configuring Connection Initialization settings, and setting up Shell Commands. For detailed guides on these settings, refer to the corresponding articles.

Apache Hive driver properties¶

The Hive driver properties let you adjust how the Hive driver behaves. These adjustments can affect the efficiency, compatibility, and available features of your Hive connection.

You can customize the Hive driver in Team Edition on the Edit Driver page, which you open by clicking the Driver Settings button on the first page of the connection settings. This page offers a range of settings that can influence your Hive database connections. For a comprehensive guide on these settings, refer to our Database drivers article.

Secure Connection Configurations¶

Team Edition supports secure connections to your Hive database, including SSH, Proxy, Kubernetes, and AWS SSM connections. For configuration guidance, refer to the article for each connection type.

Secure Storage with Secret Providers¶

Team Edition supports various cloud-based secret providers to retrieve database credentials. For detailed setup instructions, see Secret Providers.

Powering Apache Hive with Team Edition¶

Team Edition provides features designed for Hive databases, including the ability to view and manage databases and Hive-specific capabilities aimed at optimizing database operations.

Apache Hive database objects¶

Team Edition lets you view and manipulate a range of Hive database objects and metadata types, such as:

Databases/Schemas
Tables
- Columns
Views

Info

Hive doesn’t enforce referential integrity, so you won’t see primary keys or foreign keys. ER diagrams aren’t relevant for the same reason.

Apache Hive features¶

Team Edition is not limited to typical SQL tasks. It also provides a range of Hive-oriented capabilities, such as:

Category	Feature
Data Types	Hive-specific complex types such as `ARRAY` and `STRUCT`.
File System	HDFS.
Query Language	HiveQL (SQL-like query language).
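
To make the `ARRAY` and `STRUCT` types and HiveQL more concrete, here is a minimal sketch that defines a table with both complex types and reads individual elements back. The table `page_views`, its columns, and the helper `create_and_query` are hypothetical names chosen for illustration. The helper accepts any Python DB-API 2.0 style cursor; obtaining one from a Hive client library such as PyHive is an assumption and is not shown here. In Team Edition you would run the two statements directly in the SQL editor.

```python
from typing import Any, List, Tuple

# HiveQL DDL with an ARRAY column and a STRUCT column (hypothetical table).
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS page_views (
    user_id BIGINT,
    tags    ARRAY<STRING>,
    device  STRUCT<os:STRING, version:STRING>
)
STORED AS ORC
"""

# Index into the array with [n] and into the struct with dot notation.
SELECT_SQL = """
SELECT user_id,
       tags[0]   AS first_tag,
       device.os AS device_os
FROM page_views
LIMIT 10
"""


def create_and_query(cursor: Any) -> List[Tuple]:
    """Create the example table if needed, then return up to ten rows.

    `cursor` is assumed to expose DB-API style execute() and fetchall().
    """
    cursor.execute(CREATE_TABLE_SQL)
    cursor.execute(SELECT_SQL)
    return cursor.fetchall()
```

Passing the cursor in keeps the sketch independent of any particular client library, so the same statements work wherever HiveQL can be executed.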

Additional features compatible with Hive, but not exclusive to it:

Category	Feature
Data Transfer	Data Import
	Data Export
Data Management	Data Compare