kandi X-RAY | PyHive Summary
Python interface to Hive and Presto. 🐝
Top functions reviewed by kandi - BETA
- Read the data from an iprot
- Reads the data from an iprot
- Read this value from an IProt
- Execute the given operation
- Escape an object
- Escape parameters
- Retrieve the description of the result
- Fetch data from the fetcher
- Check response status
- Process a GetCrossReference message
- Process a CancelDelegationToken request
- Process a CancelOperation request
- Process a close operation
- Process GetFunctions message
- Poll the status of the operation
- Process FetchResults message
- Process a GetDelegateToken message
- Get the columns of a table
- Reads this object from an iprot
- Read the data from iprot
- Fetch more results from the server
- Reads this struct from an iprot
- Get the indexes of the table
- Process GetOperationStatus request
- Process GetTypeInfo message
- Process GetPrimaryKeys message
PyHive Key Features
PyHive Examples and Code Snippets
CREATE DATABASE hive_datasource
WITH
  engine='hive',
  parameters={
    "user": "demo_user",
    "password": "demo_password",
    "host": "127.0.0.1",
    "port": "10000",
    "database": "default"
  };
SELECT * FROM hive_datasource.test_hdb;
# Install additional packages and do any other bootstrap configuration in this script.
# For production clusters it's recommended to build your own image with this step done in CI.
bootstrapScript: |
  #!/bin/bash
  rm -rf /var/lib/apt/lists/* &&
import unittest
from unittest import mock

from my_hive import Hive  # class under test; my_hive.py is the file named in the question below


class TestHive(unittest.TestCase):
    @mock.patch('pyhive.hive.connect')
    def test_workflow(self, mock_connect):
        hive_ip = "localhost"
        processor = Hive(hive_ip)
        mock_connect.assert_called_with(hive_ip)
from sqlalchemy import create_engine

engine = create_engine('hive://localhost:10000/default')
HKEY_LOCAL_MACHINE\SOFTWARE\Carnegie Mellon\Project Cyrus\SASL Library\SearchPath
C:\Users\cdarling\Miniconda3\envs\hive\Library\bin\sasl2
from pyhive import hive


class Hive(object):
    def connect(self):
        return hive.connect(host='hive.hadoop-prod.abc.com',
                            port=10000,
                            database='temp',
RUN apt-get update && apt-get install -y libsasl2-dev
hive.connect('host', configuration={'hive.strict.checks.cartesian.product':'false'})
hive.connect('host', configuration={'hive.mapred.mode':'strict'})
from pyhive import hive
cursor = hive.connect('localhost').cursor()
cursor.execute('SELECT * FROM my_awesome_data LIMIT 10')
print(cursor.fetchone())
print(cursor.fetchall())
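PyHive's DB-API cursors also accept bound parameters in the pyformat (%(name)s) style, which is safer than interpolating values into the SQL string by hand. A short sketch, with an illustrative table and column name:

from pyhive import hive

cursor = hive.connect('localhost').cursor()
# Parameters are escaped client-side and substituted into the query text.
cursor.execute(
    'SELECT * FROM my_awesome_data WHERE id = %(id)s LIMIT 10',
    {'id': 42},
)
print(cursor.fetchall())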
Community Discussions
Trending Discussions on PyHive
QUESTION
I want Superset to connect to a HiveServer2 datasource with custom auth (that is, with a specified username and password). Python code like the one below works fine:
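The asker's snippet is not reproduced on this page; a minimal sketch of a PyHive connection with custom username/password auth, with placeholder host and credentials:

from pyhive import hive

# Placeholder host and credentials; auth='CUSTOM' passes the username and
# password to HiveServer2 over SASL PLAIN.
conn = hive.connect(
    host='hiveserver2.example.com',
    port=10000,
    username='demo_user',
    password='demo_password',
    auth='CUSTOM',
    database='default',
)
cursor = conn.cursor()
cursor.execute('SHOW DATABASES')
print(cursor.fetchall())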
...ANSWER
Answered 2022-Feb-14 at 04:10
QUESTION
Is there a way for Presto to check whether a string is hex or not? I have the following query, which keeps failing:
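Neither the failing query nor the answer body appears on this page. Purely as an illustration (the table and column names are hypothetical), one way to test whether a string is hexadecimal in Presto is a regexp_like check, run here through PyHive:

from pyhive import presto

cursor = presto.connect('localhost', port=8080).cursor()
# regexp_like returns true only when every character is a hex digit.
cursor.execute(
    "SELECT my_column, regexp_like(my_column, '^[0-9a-fA-F]+$') AS is_hex "
    "FROM my_table LIMIT 10"
)
print(cursor.fetchall())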
...ANSWER
Answered 2021-Nov-12 at 21:44
QUESTION
I want to automate the whole process to test a scenario where I would like to create a database, perform an operation, and then delete the database. Is there any way to do it using Python 3? I tried PyHive, but it requires the database name to connect to.
...ANSWER
Answered 2021-Aug-11 at 07:48
If you cannot connect to the database and need to run Hive commands from the terminal, you can use the Popen function from the subprocess module to drive Hive from the terminal.
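The answer's snippet is not shown on this page; a rough sketch of that approach (the database name and statements are placeholders), driving the hive CLI from Python via subprocess:

import subprocess

# Run a Hive statement through the hive CLI; the database name is a placeholder.
proc = subprocess.Popen(
    ['hive', '-e', 'CREATE DATABASE IF NOT EXISTS my_temp_db'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
out, err = proc.communicate()
print(out.decode(), err.decode())

# ... perform operations, then drop the database the same way.
subprocess.run(['hive', '-e', 'DROP DATABASE IF EXISTS my_temp_db CASCADE'], check=True)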
QUESTION
I want to connect Superset to a Databricks workspace to query its tables. Superset uses SQLAlchemy to connect to databases, which in this case requires a PAT (Personal Access Token).
It is possible to connect and run queries when I use the PAT I generated for my own account through the Databricks web UI, but I do not want to use my personal token in a production environment. However, I was not able to find how to generate a PAT-like token for a Service Principal.
The working SQLAlchemy URI looks like this:
...ANSWER
Answered 2021-Jun-24 at 06:20
You can create a PAT for a service principal as follows (examples are taken from the docs; do export DATABRICKS_HOST="https://hostname" before executing):
- Add the service principal to the Databricks workspace using the SCIM API (doc):
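The remaining steps and the exact requests from the answer are not reproduced here. As a rough sketch only (endpoint adapted from the Databricks SCIM documentation; the admin token, application ID, and display name are placeholders), the service principal can be created with a plain HTTP call:

import os
import requests

# Placeholders: an admin PAT and the application (client) ID of the service principal.
host = os.environ['DATABRICKS_HOST']
admin_token = os.environ['DATABRICKS_ADMIN_TOKEN']

resp = requests.post(
    f'{host}/api/2.0/preview/scim/v2/ServicePrincipals',
    headers={'Authorization': f'Bearer {admin_token}'},
    json={
        'schemas': ['urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal'],
        'applicationId': '00000000-0000-0000-0000-000000000000',
        'displayName': 'superset-service-principal',
    },
)
resp.raise_for_status()
print(resp.json())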
QUESTION
I'm new to testing, and to testing in Python in particular. I have a Python class that looks like this:
File name: my_hive.py
ANSWER
Answered 2021-Apr-27 at 06:02
Your problem is that you have already mocked connect, so the subsequent calls on the result of connect will be made on the mock, not on the real object. To check that call, you have to make the check on the returned mock object instead:
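The answer's example is not included on this page; a self-contained sketch of the idea (the helper function and query are made up for illustration, not taken from the original post):

import unittest
from unittest import mock

from pyhive import hive


def fetch_one_row(host):
    # Helper used only for this sketch: open a connection and read one row.
    cursor = hive.connect(host).cursor()
    cursor.execute('SELECT 1')
    return cursor.fetchone()


class TestFetchOneRow(unittest.TestCase):
    @mock.patch('pyhive.hive.connect')
    def test_calls_are_made_on_the_returned_mock(self, mock_connect):
        fetch_one_row('localhost')
        mock_connect.assert_called_with('localhost')
        # connect() was mocked, so cursor()/execute() happened on the mock it
        # returned; check them there rather than on a real connection object.
        mocked_cursor = mock_connect.return_value.cursor.return_value
        mocked_cursor.execute.assert_called_once_with('SELECT 1')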
QUESTION
I launch a Dataproc cluster and serve Hive on it. Remotely from any machine I use Pyhive or PyODBC to connect to Hive and do things. It's not just one query. It can be a long session with intermittent queries. (The query itself has issues; will ask separately.)
Even during one single, active query, the operation does not show as a "Job" (I guess it's Yarn) on the dashboard. In contrast, when I "submit" tasks via Pyspark, they show up as "Jobs".
Besides the lack of task visibility, I also suspect that, w/o a Job, the cluster may not reliably detect a Python client is "connected" to it, hence the cluster's auto-delete might kick in prematurely.
Is there a way to "register" a Job to accompany my Python session, and cancel/delete the job at times of my choosing? For my case, it would be a "dummy", "nominal" job that does nothing.
Or maybe there's a more proper way to let Yarn detect my Python client's connection and create a job for it?
Thanks.
...ANSWER
Answered 2021-Mar-14 at 04:42
This is not supported right now; you need to submit jobs via the Dataproc Jobs API to make them visible on the Jobs UI page and to have them taken into account by the cluster TTL feature.
If you cannot use the Dataproc Jobs API to execute your actual jobs, then you can submit a dummy Pig job that sleeps for the desired time (5 hours in the example below) to prevent cluster deletion by the max idle time feature:
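The example command from the answer is not reproduced here; a sketch of submitting such a dummy job from Python (the cluster and region names are placeholders, and the Pig "sh sleep" trick is the assumption here):

import subprocess

# Placeholders: cluster name and region. The Pig job just shells out to sleep
# for 5 hours (18000 seconds) so the cluster does not look idle.
subprocess.run(
    [
        'gcloud', 'dataproc', 'jobs', 'submit', 'pig',
        '--cluster=my-cluster',
        '--region=us-central1',
        '--execute=sh sleep 18000',
    ],
    check=True,
)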
QUESTION
I am trying to connect to prestodb, which is running on localhost:8080, from Apache Superset (installed from scratch, not the Docker one).
I have installed the PyHive connector as indicated in the documentation, and I have also tried:
- hive://hive@localhost:8080/mysql
- presto://localhost:8080/
- presto://localhost:8080/mysql
- presto://localhost:8080/mysql/test
- hive://hive@localhost:8080/mysql/test, where mysql is the catalog and test is the name of the db, and nothing works :/
Any ideas?
Thank you
...ANSWER
Answered 2021-Mar-10 at 22:11
Actually, I was missing the Python package "requests", which is needed by PyHive, so a pip3 install requests solved the issue!
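For completeness, a small sketch of the resulting setup; the URI reuses the catalog/schema form the asker tried, and whether that exact form is right for their Presto setup is an assumption:

# pip3 install requests pyhive
from sqlalchemy import create_engine, text

# Hypothetical URI: catalog 'mysql', schema 'test', as in the attempts above.
engine = create_engine('presto://localhost:8080/mysql/test')
with engine.connect() as conn:
    print(conn.execute(text('SHOW TABLES')).fetchall())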
QUESTION
I am new to Apache Pinot, PrestoDB and Superset. I have successfully set up PrestoDB and connected it to Apache Pinot using the following steps:
...ANSWER
Answered 2021-Feb-22 at 09:14
When you try to access Presto from Superset, the network connection goes from the Superset container to the Presto container, so localhost will not work.
You will need to use the real IP of the prestodb container, either the container IP or the host IP. Can you try the following?
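The exact connection string suggested in the answer is not reproduced here. As an illustration only (the hostname presto and the pinot catalog name are hypothetical), the Superset SQLAlchemy URI should point at the container rather than localhost:

from sqlalchemy import create_engine

# 'presto' is a hypothetical Docker service/container name reachable from the
# Superset container; replace it with the container's real name or IP.
engine = create_engine('presto://presto:8080/pinot')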
QUESTION
I need to replace a table in Hive with a new pandas dataframe. I am using pyhive to create a connection engine and subsequently using pandas.to_sql with 'if_exists' as replace.
...ANSWER
Answered 2020-Nov-09 at 18:22
Other answers seem to indicate that this is related to to_sql expecting a SQLAlchemy engine - I was under the impression that this is what pyhive uses to create a connection.
PyHive can create a SQLAlchemy Engine object, but not the way you're doing it. As illustrated in the PyHive docs, you need to do something like:
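The snippet from the PyHive docs is not reproduced on this page; a minimal sketch of the pattern being pointed at (host, database, and table name are placeholders):

import pandas as pd
from sqlalchemy import create_engine

# PyHive registers the 'hive' SQLAlchemy dialect, so to_sql receives a real Engine.
engine = create_engine('hive://localhost:10000/default')

df = pd.DataFrame({'id': [1, 2], 'value': ['a', 'b']})
df.to_sql('my_table', engine, if_exists='replace', index=False)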
QUESTION
I'm a newbie to Python. I want to pass a command-line argument to my Presto query, which is inside a function, and then write the result to a CSV file. But when I try to run it in the terminal it says: Traceback (most recent call last): File "function2.py", line 3, in from pyhive import presto ModuleNotFoundError: No module named 'pyhive'
The pyhive requirement is already satisfied. Please find attached my code:
...ANSWER
Answered 2020-Aug-24 at 11:32
I will only describe how to pass command-line arguments to the function and the query.
If you define a function that takes the value as a parameter, for example:
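The original code is not reproduced here; the sketch below is illustrative only, with a hypothetical table name and output file: accept the argument with argparse, bind it into the Presto query, and write the rows to CSV.

import argparse
import csv

from pyhive import presto


def run_query(limit):
    # Table name is a placeholder; the limit value comes from the command line.
    cursor = presto.connect('localhost', port=8080).cursor()
    cursor.execute('SELECT * FROM my_table LIMIT %(limit)s', {'limit': limit})
    return cursor.fetchall()


def write_csv(rows, path):
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows(rows)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('limit', type=int, help='row limit passed into the query')
    args = parser.parse_args()
    write_csv(run_query(args.limit), 'result.csv')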
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install PyHive
You can use PyHive like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.