Apache Hive
Overview
This guide provides instructions on how to set up and use Apache Hive with Team Edition.
Before you start, you must create a connection in Team Edition and select Hive. If you have not done this, please refer to our Database Connection article.
Team Edition interacts with the Hive server using a specific driver. It supports all versions of Hive, but the correct driver must be selected: use Hive 2 for versions 2.x and earlier, and Hive 4 for version 4 and later.
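The version rule above can be sketched as a small helper. This is only an illustration: the function name `pick_hive_driver` is hypothetical and not part of Team Edition, and because this guide does not state which driver applies to Hive 3.x, the sketch raises an error for that case rather than guessing.

```python
def pick_hive_driver(server_version: str) -> str:
    """Map a Hive server version string to the driver name used in this guide.

    Hypothetical helper for illustration; not a Team Edition API.
    """
    major = int(server_version.split(".")[0])
    if major <= 2:
        return "Hive 2"
    if major >= 4:
        return "Hive 4"
    # The guide gives no rule for 3.x, so this sketch refuses to guess.
    raise ValueError(f"No driver rule stated for Hive {server_version}")


print(pick_hive_driver("2.3.9"))
print(pick_hive_driver("4.0.1"))
```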
Team Edition also supports Hive extensions such as Kyuubi and Spark, depending on your environment configuration. For these extensions, select the appropriate driver in the Connect to a database window.
Tip
Systems like Impala use Hive-compatible drivers, so the standard Hive JDBC driver works in this case.
Hive specialty
Apache Hive is a data warehouse system built on top of Hadoop for querying and analyzing large datasets using a SQL-like language called HiveQL. It is optimized for batch processing and is commonly used for data summarization, ad-hoc queries, and analysis of structured data. Hive compiles HiveQL queries into distributed jobs (classically MapReduce, with Tez and Spark also available as execution engines), making it well-suited for handling large-scale data stored in the Hadoop Distributed File System (HDFS). Hive is a good fit for OLAP-style workloads and integrates with other big data tools in the Hadoop ecosystem.
Info
For more detailed information and a comprehensive understanding of Apache Hive, see the official documentation.
Setting up
This section provides an overview of Team Edition's settings for establishing a direct connection and the configuration of secure connections using SSH and proxies for Hive.
Hive connection settings
To establish the initial connection, fill in the following fields on the connection settings page.
Field | Description |
---|---|
Connect by (Host/URL) | Choose whether you want to connect using a host or a URL. |
URL | If you are connecting via URL, enter the URL of your Hive database here. This field is hidden if you are connecting via the host. |
Host | If you are connecting via host, enter the host address of your Hive database here. |
Database/Schema | Enter the name of the Hive database you want to connect to. |
Port | Enter the port number for your Hive database. The default Hive port is 10000. |
Authentication | Choose the type of authentication you want to use for the connection. For detailed guides on authentication types, please refer to the following articles: Native Database Authentication and DBeaver Profile Authentication. |
Connection Details | Provide additional connection details if necessary. |
Driver Name | This field will be auto-filled based on your selected driver type. |
Driver Settings | If there are any specific driver settings, configure them here. |
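When connecting by URL instead of by host, the Host, Port, and Database/Schema values from the table are combined into a single JDBC URL. The sketch below shows the common `jdbc:hive2://host:port/database` form used by the HiveServer2 JDBC driver. The function name, the example host `hive.example.com`, and the choice of `default` as the fallback database are assumptions for illustration; your driver or environment may require extra URL parameters.

```python
def build_hive_jdbc_url(host: str, port: int = 10000, database: str = "default") -> str:
    """Assemble a HiveServer2-style JDBC URL from host, port, and database.

    Hypothetical helper for illustration; 10000 is the default Hive port
    named in this guide.
    """
    if not host:
        raise ValueError("host must not be empty")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"jdbc:hive2://{host}:{port}/{database}"


# Example host is a placeholder, not a real server.
print(build_hive_jdbc_url("hive.example.com"))
print(build_hive_jdbc_url("hive.example.com", 10001, "sales"))
```

The resulting string is what you would paste into the URL field when Connect by is set to URL.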
Connection details
The Connection Details section in Team Edition allows you to customize your experience while working with a Hive database. This includes options for adjusting the Navigator View, setting up Security measures, applying Filters, configuring Connection Initialization settings, and setting up Shell Commands. Each of these settings can affect your database operations and workflow. For detailed guides on these settings, please refer to the following articles:
- Connection Details Configuration
- Database Navigator
- Security Settings Guide
- Filters Settings Guide
- Connection Initialization Settings Guide
- Shell Commands Guide
Hive driver properties
Hive driver properties let you tune how the Hive driver behaves. These adjustments can influence the efficiency, compatibility, and available features of your Hive connections.
You can customize the Hive driver in Team Edition via the Edit Driver page, accessible by clicking the Driver Settings button on the first page of the connection settings. For a comprehensive guide on these settings, please refer to our Database drivers article.
Secure Connection Configurations
Team Edition supports secure connections to your Hive database. For guidance on configuring such connections, see the dedicated articles on SSH, Proxy, Kubernetes, and AWS SSM connections.
Secure Storage with Secret Providers
Team Edition supports various cloud-based secret providers to retrieve database credentials. For detailed setup instructions, see Secret Providers.
Powering Hive with Team Edition
Team Edition provides a host of features designed for Hive databases. This includes the ability to view and manage databases, along with numerous unique capabilities aimed at optimizing database operations.
Hive database objects
Team Edition supports a range of Hive metadata types, letting you view and manipulate database objects such as:
- Databases/Schemas
- Tables
- Columns
- Views
Info
Hive doesn't support referential integrity, so you won't see primary keys or foreign keys. Diagrams also aren't relevant.
Hive features
Team Edition is not limited to typical SQL tasks. Beyond regular SQL operations, it provides a range of Hive-oriented capabilities, such as:
Category | Feature |
---|---|
Data Types | Hive-specific types such as ARRAY and STRUCT. |
File System | HDFS. |
Query Language | HiveQL (SQL-like query language). |
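To make the ARRAY and STRUCT types from the table concrete, the sketch below assembles a HiveQL `CREATE TABLE` statement and a query that reads nested values. It only builds strings and does not connect to Hive. The helper names and the `employees` table with its columns are hypothetical examples, not objects from this guide.

```python
def array_of(element_type: str) -> str:
    """Return a HiveQL ARRAY type declaration, e.g. ARRAY<STRING>."""
    return f"ARRAY<{element_type}>"


def struct_of(**fields: str) -> str:
    """Return a HiveQL STRUCT type declaration from name=type pairs."""
    body = ", ".join(f"{name}:{hive_type}" for name, hive_type in fields.items())
    return f"STRUCT<{body}>"


# Hypothetical table definition using the two complex types.
columns = {
    "name": "STRING",
    "skills": array_of("STRING"),
    "address": struct_of(city="STRING", zip="STRING"),
}
column_sql = ",\n  ".join(f"{col} {hive_type}" for col, hive_type in columns.items())
create_table = f"CREATE TABLE employees (\n  {column_sql}\n)"

# Array elements are read by index, struct fields with dot notation.
select_nested = "SELECT name, skills[0] AS first_skill, address.city FROM employees"

print(create_table)
print(select_nested)
```

The printed statements can be pasted into an SQL editor connected to Hive to try the types out.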
Additional features compatible with Hive, but not exclusive to it:
Category | Feature |
---|---|
Data Transfer | Data Import |
Data Transfer | Data Export |
Data Management | Data Compare |