turbodbc | Python module to access relational databases | Database library
kandi X-RAY | turbodbc Summary
Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. Its primary target audience is data scientists who use databases for which no efficient native Python drivers are available. For maximum compatibility, turbodbc complies with the Python Database API Specification 2.0 (PEP 249). For maximum performance, turbodbc offers built-in NumPy and Apache Arrow support and internally relies on batched data transfer instead of the single-record communication used by other popular ODBC modules. Turbodbc is free to use (MIT license), open source (GitHub), works with Python 3.8+, and is available for Linux, macOS, and Windows. Turbodbc is routinely tested with MySQL, PostgreSQL, EXASOL, and MSSQL, but probably also works with your database.
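A minimal usage sketch, assuming an ODBC data source named "my_database" and a placeholder table, showing the PEP 249 interface alongside the NumPy and Arrow result fetchers:

```python
from turbodbc import connect

# Minimal sketch; the DSN name, table, and column names are placeholders.
connection = connect(dsn="my_database")
cursor = connection.cursor()

cursor.execute("SELECT a, b FROM my_table")
rows = cursor.fetchall()           # standard PEP 249 access, row by row

cursor.execute("SELECT a, b FROM my_table")
batches = cursor.fetchallnumpy()   # OrderedDict of NumPy masked arrays

cursor.execute("SELECT a, b FROM my_table")
table = cursor.fetchallarrow()     # Apache Arrow table

connection.close()
```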
Community Discussions
Trending Discussions on turbodbc
QUESTION
ANSWER
Answered 2021-Jan-11 at 20:49
Boost is not installed. You can try this:
QUESTION
I'm using Cloudera Hive ODBC driver in my code and I'm trying to containerize the app. Below is my Dockerfile,
...ANSWER
Answered 2020-Dec-11 at 04:38
As suggested by @DavidMaze, I managed to create a successful Dockerfile, which is shown below.
QUESTION
I have a dask dataframe which has 220 partitions and 7 columns. I have imported this file from a bcp file and completed some wrangling in dask. I then want to write this whole file to mssql using turbodbc. I connect to the DB as follows:
...ANSWER
Answered 2020-Oct-06 at 14:20
I needed to convert to a masked array by changing:
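The exact change is not shown above, but a minimal sketch of the idea, with a hypothetical DSN and table, looks roughly like this: turbodbc's executemanycolumns expects NumPy MaskedArrays so that NULLs are represented by the mask rather than by NaN.

```python
import numpy as np
from turbodbc import connect

# Hypothetical data standing in for one dask partition's column values.
values = np.array([1.5, 2.0, np.nan, 4.2])

# Wrap the plain ndarray in a MaskedArray; masked entries become SQL NULLs.
masked = np.ma.MaskedArray(values, mask=np.isnan(values))

connection = connect(dsn="MSSQL")  # assumed DSN name
cursor = connection.cursor()
cursor.executemanycolumns(
    "INSERT INTO my_table (value) VALUES (?)",  # hypothetical target table
    [masked],
)
connection.commit()
```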
QUESTION
I am trying to test my Dataflow pipeline on the DataflowRunner. My code always gets stuck at 1 hr 1 min and says: "The Dataflow appears to be stuck." When digging through the stack trace in the Dataflow Stackdriver logs, I come across the error "Failed to install packages: failed to install workflow: exit status 1". I saw other Stack Overflow messages saying that this can be caused when pip packages are not compatible. This is causing my worker startup to always fail.
This is my current setup.py. Can someone please help me understand what I am missing? The job id is 2018-02-09_08_22_34-6196858167817670597.
setup.py
...ANSWER
Answered 2018-Feb-24 at 20:05
So I have figured out that workflow is not a PyPI package in this case, but actually the name of the .tar file that is created by Dataflow and contains the source code. Dataflow will compress your source code and create a workflow.tar file in your staging environment, then it will try to run pip install workflow.tar. If any issues come up from this install, it will fail to install the packages onto the workers.
My issue was resolved by a few things: 1) I added six==1.10.0 to my requires, as I found from the question "Workflow failed. Causes: (35af2d4d3e5569e4): The Dataflow appears to be stuck" that there is an issue with the latest version of six. 2) I realized that sqlalchemy-vertica and sqlalchemy are out of sync and have issues with their dependency versions, so I removed my need for both and found a different Vertica client.
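As a hedged illustration only, a stripped-down setup.py reflecting the fix described above might look like the following; the package name, version, and the remaining dependencies are placeholders.

```python
import setuptools

setuptools.setup(
    name="my-dataflow-pipeline",     # placeholder package name
    version="0.0.1",
    packages=setuptools.find_packages(),
    install_requires=[
        "six==1.10.0",               # pinned per the fix described above
        "apache-beam[gcp]",          # plus whatever else the pipeline needs
    ],
)
```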
QUESTION
I wanted to install SQLAlchemy for Python 3 for working with databases.
I searched for the package using pip3 search SQLAlchemy, but I didn't find SQLAlchemy in the results.
Why doesn't SQLAlchemy show up in the output below, when the package is available on PyPI?
https://pypi.org/project/SQLAlchemy/
SQLAlchemy 1.3.15
...ANSWER
Answered 2020-Apr-01 at 18:38
$ pip search sqlalchemy | wc -l
100
pip search prints at most 100 matching packages, and plenty of PyPI projects contain "sqlalchemy" in their name or description, so SQLAlchemy itself can be pushed out of the truncated result list even though it is on PyPI.
QUESTION
I'm currently trying to tune the performance of a few of my scripts, and it seems that the bottleneck is always the actual insert into the DB (MSSQL) with the pandas to_sql function.
One factor that plays into this is MSSQL's limit of 2100 parameters per statement.
I establish my connection with sqlalchemy (with the mssql + pyodbc flavour):
...ANSWER
Answered 2019-Dec-22 at 13:13
If you are using the most recent version of pyodbc with ODBC Driver 17 for SQL Server and fast_executemany=True in your SQLAlchemy create_engine call, then you should be using method=None (the default) in your to_sql call. That will allow pyodbc to use an ODBC parameter array and give you the best performance under that setup. You will not hit the SQL Server stored procedure limit of 2100 parameters (unless your DataFrame has ~2100 columns). The only limit you would face would be if your Python process does not have sufficient memory available to build the entire parameter array before sending it to SQL Server.
The method='multi' option for to_sql is only applicable to pyodbc when using an ODBC driver that does not support parameter arrays (e.g., FreeTDS ODBC). In that case fast_executemany=True will not help and may actually cause errors.
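A sketch of that recommended setup; the server, credentials, and table name are placeholders, and ODBC Driver 17 for SQL Server is assumed to be installed.

```python
import pandas as pd
import sqlalchemy as sa

# Placeholder connection details; ODBC Driver 17 for SQL Server is assumed.
engine = sa.create_engine(
    "mssql+pyodbc://user:password@my_server/my_db"
    "?driver=ODBC+Driver+17+for+SQL+Server",
    fast_executemany=True,
)

df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})

# method=None (the default) lets pyodbc send the whole frame as one
# ODBC parameter array instead of building a giant multi-values INSERT.
df.to_sql("my_table", engine, index=False, if_exists="append", method=None)
```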
QUESTION
I have a package that allows the user to use any one of 4 packages they want to connect to a database. It works great but I'm unhappy with the way I'm importing things.
I could simply import all the packages, but I don't want to do that in case the specific user doesn't ever need to use turbodbc, for example:
ANSWER
Answered 2018-Oct-16 at 00:10
You can put imports in places other than the beginning of the file. "Re-importing" something doesn't actually do anything, so it's not computationally expensive to import x frequently:
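A small sketch of that pattern, with hypothetical backend names and a hypothetical helper function, deferring each import until the corresponding backend is actually requested:

```python
def get_connection(backend, **kwargs):
    """Return a DB connection, importing the driver only when needed."""
    if backend == "turbodbc":
        import turbodbc              # cheap after the first time: cached in sys.modules
        return turbodbc.connect(**kwargs)
    if backend == "pyodbc":
        import pyodbc
        return pyodbc.connect(**kwargs)
    raise ValueError(f"unsupported backend: {backend!r}")
```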
QUESTION
I am trying to identify whether two values held in different numpy OrderedDict objects are the same.
Both dictionaries were created by using the fetchallnumpy() option in turbodbc and consist of two keys. The first key is an id field; the second key is a string value of variable length. I want to see whether the string value in the first set of dictionary items is present in the second set of dictionary items.
It's probably worth noting that both dictionary objects are holding approximately 60 million values under each key.
I've tried several things so far:
- np.isin(dict1[str_col], dict2[str_col]) as a function, but this was extremely slow, presumably because the string values are stored as dtype object.
- Converting both dictionary objects to numpy arrays with an explicit string type as np.asarray(dict1[str_col], dtype='S500') and then trying to use the isin and in1d functions, at which point the system runs out of RAM. I have swapped out 'S500' for dtype=np.string_ but still get a MemoryError (ar=np.concatenate((ar1,ar2))) whilst performing the isin function.
- A for loop: [r in dict2[str_col] for r in dict1[str_col]]. Again this was extremely slow.
My aim is to have a relatively quick way of testing the two string columns without running out of memory.
Additional bits: in the long run I'll be running more than one check, as I'm trying to identify new values and values that have changed.
Dictionary A = Current Data ['ID': [int,int,int]] Dictionary B = Historic Data ['record':[str,str,str]]
So the bits I'm interested in are:
- A != B (current record is different to historic record)
- A not present in B (New record added to the database)
- B not present in A (Records need to be redacted)
For the last two elements, the quickest way I've found so far has been to pass the id columns to a function that uses np.isin(arr1, arr2). It takes on average 15 seconds to compare the data.
...ANSWER
Answered 2019-Apr-24 at 15:37
You can use np.searchsorted for faster searches:
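A minimal sketch of the searchsorted-based membership test; in practice the two arrays would be dict1[str_col] and dict2[str_col] from fetchallnumpy(), and the small stand-in arrays here are illustrative only.

```python
import numpy as np

# Small stand-in arrays; real data would come from fetchallnumpy().
a = np.array([b"alpha", b"beta", b"gamma"])
b = np.array([b"beta", b"delta", b"alpha"])

b_sorted = np.sort(b)                 # searchsorted requires a sorted array
idx = np.searchsorted(b_sorted, a)
idx[idx == len(b_sorted)] = 0         # clamp indices that fell off the end
present = b_sorted[idx] == a          # True where a value of `a` also occurs in `b`
print(present)                        # [ True  True False]
```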
QUESTION
I'm trying to create a temporary table in Microsoft SQL Server, then insert data into it, then return the data to Python as a dataframe, preferably.
Here is my connection, which works fine (password hidden).
...ANSWER
Answered 2018-Aug-13 at 02:53
I had to use [enmax].[smccarth].[#retaildeals] instead of just #retaildeals.
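A hedged sketch of that approach, assuming a turbodbc connection with a placeholder DSN and placeholder column names; the fully qualified name in the final SELECT is the one quoted in the answer above.

```python
import turbodbc

connection = turbodbc.connect(dsn="MSSQL")  # placeholder DSN
cursor = connection.cursor()

cursor.execute("CREATE TABLE #retaildeals (deal_id INT, amount FLOAT)")
cursor.execute("INSERT INTO #retaildeals VALUES (?, ?)", [1, 9.99])

# Referencing the temp table by its fully qualified name was the key change.
cursor.execute("SELECT * FROM [enmax].[smccarth].[#retaildeals]")
rows = cursor.fetchall()
```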
QUESTION
I'm trying to generate and insert many (>1,000,000) rows into a Microsoft Access database. For the generation I use numpy functions, so I am trying to access the database with Python. I started with pyodbc:
...ANSWER
Answered 2018-Apr-24 at 16:27
The pyodbc fast_executemany feature uses an ODBC mechanism called "parameter arrays". Not all ODBC drivers support parameter arrays, and apparently the Microsoft Access ODBC driver is one that doesn't. As mentioned in the pyodbc wiki:
Note that this feature ... is currently only recommended for applications running on Windows that use Microsoft's ODBC Driver for SQL Server.
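Since the Access driver cannot use parameter arrays, the usual fallback is a plain executemany with fast_executemany left off; a sketch with a placeholder file path and table name:

```python
import pyodbc

# Placeholder DSN-less connection string for the Access ODBC driver.
conn = pyodbc.connect(
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\path\to\database.accdb;"
)
cursor = conn.cursor()
# cursor.fast_executemany = True   # not supported by the Access driver, so leave it off
rows = [(i, float(i) * 1.5) for i in range(1000)]
cursor.executemany("INSERT INTO my_table (id, val) VALUES (?, ?)", rows)
conn.commit()
```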
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.