What is the Hive Metastore URI address? hive-site.xml? hive config resources?

Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data summarization, query and analysis.Hive gives an SQL-like interface to query data stored in various databases and file systems that integrate with Hadoop.

In configuring an Apache NiFi Data Flow (within Hortonworks Dataflow) I ran in to the need to configure the Hive Streaming component to connect to a Hive Table, this personal knowledge base article documents the the locations of the resources I needed.

What is my Hive Metastore URI?

This is located on your Hive Metastore host at port 9083 and uses the Thrift protocol, an example URI would look like this:

thrift://<host_name>:9083

 

Where is my hive-site.xml file located? What should I enter under Hive Config Resources?

When configuring Apache NiFi to connect to a Hive table using Hive Streaming you will need to enter the location of your hive-site.xml file under Hive config resources. Below you can see the location in my hadoop node, to find the location in your installation look under directory /etc/hive the script below can help you with this:

 

#find the Hive folder
cd /etc/hive
#run a search for the hive-site.xml file, starting at the current location
find . -name hive-site.xml

#in my case after examining the results from the command the file is located at:

/etc/hive/2.6.5.0-292/0/hive-site.xml


 

 

 

 

Error deploying Hive using the Ambari agent (MySQL JAVA Connector JAR file missing)

Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data summarization, query and analysis.Hive gives an SQL-like interface to query data stored in various databases and file systems that integrate with Hadoop.

 

I recently ran into an issue when deploying Hive using Ambari. Here’s what my personal Knowledge Base article on the issue.

Symptoms

When deploying Hive and Hive Metastore to a new node, the “Hive Metastore Start” task fails. At the bottom of stderr I found the following message:

 

File "/usr/lib/ambari-agent/lib/resource_management/core/source.py", line 52, in __call__

    return self.get_content()

  File "/usr/lib/ambari-agent/lib/resource_management/core/source.py", line 197, in get_content

    raise Fail("Failed to download file from {0} due to HTTP error: {1}".format(self.url, str(ex)))

resource_management.core.exceptions.Fail: Failed to download file from http://<my_node_host>:8080/resources/ mysql-connector-java.jar due to HTTP error: HTTP Error 404: Not Found

 

Root Cause

I did deploy the Amari, HDP and HDP Utils repositories on a local mirror host. It seems as if the Ambari Agent assumes that the MySQL JAVA Connector JAR file would also be hosted by my local mirror.

This could also happen if the repositories configured during deployment no longer host the MySQL JAVA Connector JAR file.

Solution

Once I figured the root cause out it was easy to solve the issue by using YUM to manually install the MySQL JAVA Connector JAR file:

 

#On the Linux/Unix host(s) experiencing the install error, issue the following command:

sudo yum -y -q install mysql-connector-java