HiveSQL in Python: Which module to choose - pypyodbc, pyodbc, or pymssql?

17 comments-0 reblogs
avatar of @geekgirl
LeoFinance Badge
2 years ago - 3 minutes read


For those who still don't know what HiveSQL is, it is a centralized MSSQL database developed and maintained by Arcange. It is one of the oldest services on Hive and has been around for years. It is always updated and provides access to Hive blockchain data with simple or complex SQL queries.

HiveSQL is great for anybody who wants to get data from Hive blockchain fast. If you plan to make an App, or compile some specific stats, or just want to learn and practice your SQL skills with real world database, give HiveSQL a try. It is free for anybody with a Hive account.

When I first started using HiveSQL, I had absolutely zero knowledge of SQL. Using HiveSQL is how learned SQL.

I prefer to use HiveSQL within python scripts. Because before I was started learning SQL, I was trying to learn python. Knowing a little bit of python gives more options of how to get data from Hive. Using Beem module by Holger80, we can get data from Hive blockchain directly. Beem is great. However, getting data directly from the blockchain in some cases may be slow. Sometime even too slow. Perhaps, best way is to use both Beem and HiveSQL.

When I first started using HiveSQL, I used pypyodbc module. At first it was working good. But later I realized it wasn't delivering all the data I wanted and some data was getting truncated. Then, Crokkon suggested using pyodbc instead and it worked perfectly.

I have used pyodbc module for a while and didn't have any issues until I decided to deploy a simple app to heroku. The app wouldn't build properly due to some problems with pyodbc. I suspect it had to do with having proper driver. Normally on my local computer I would install the microsoft driver. In this case I couldn't do it on Heroku. I tries various solutions offered online. None of them worked.

Before giving up on deploying the app to heroku, I decided to try one more thing. It was to use a different module - pmssql. It uses FreeTDS, which is easy to install on a computer and seems to be available on heroku already.

The best thing is to connect to the an MSSQL database all three of these modules use the similar methods, so it is not difficult to adjust to code from one to another. With pypyodbc and pyodbc we need to pass the driver name as argument, and it is not needed with pymssql. Everything else, was almost same.

Once I changed to pymssql, the app deployed to heroku without any issue. For this reason I would recommend using pymssql. I haven't had any problems using it yet.

I use the following simple python function to connect to HiveSQL and the results back. Afterwards I used the results within python code to do whatever else needed to be come with the data received.

def hive_sql(SQLCommand, limit):
    db = os.environ['HIVESQL'].split()
    conn = pymssql.connect(server=db[0], user=db[1], password=db[2], database=db[3])
    cursor = conn.cursor()
    result = cursor.fetchmany(limit)
    return result

It takes SQL query commands as an argument and returns the results from HiveSQL. Within the python code we can write our SQL query and assign it to SQLCommand variable in a text format like this.

SQLCommand = '''
WHERE TABLE_NAME = 'DynamicGlobalProperties'

Hive blockchain and HiveSQL are great place to start for those who are interested in learning a little bit of python and SQL.

I would like to continue sharing my experiences using HiveSQL with more task focused code snippets and keep building into something more interesting. Feel free to share you thoughts and experiences with HiveSQL.

Posted Using LeoFinance Beta