Posts

Connecting to PostgreSQL database and making queries within Python code.

avatar of @geekgirl
25
@geekgirl
·
0 views
·
2 min read

Cheat sheet from a Udemy course

I code as a hobby. I like how simple codes can automate daily tasks, experiment with new ideas, and just be fun. Whenever I get a chance I do like to learn new coding skills. But one aspect of coding I always tried to avoid was dealing with databases. I did have experience in the past retrieving data from database using SQL. I learned about SQL from friends here, @carlgnash and @markangeltrueman. First time I used SQL was for curating purposes using SteemSQL(now HiveSQL). Usually there are some GUI programs to use SQL. I used it within a python code, so that I could do other operations on the data. In the following post I wrote about how to do the initial setup on Mac to use SQL commands in a python code:

It can still be useful for those need to connect to a MSSQL server on a Mac computer. One change I would make is to use pyodcb, instead of pypyodcb as was suggested by @crokkon back then, and worked perfectly.

Last few days I have been trying to learn more about SQL, specifically PostgreSQL. It is open source, widely used, and powerful relational database. I almost finished a Udemy course by Jose Portilla. I would highly recommend this Udemy course to anybody who is interested in learning SQL. Jose is a really good at teaching how to code. He also has useful courses on python, data science, etc.

We can download PostgreSQL from PostgreSQL.org. To connect to the database, create databases, query data, etc we can use pgAdmin 4, which can be found at pgadmin.org. pgAdmin 4 runs on a browser and is really cool. I enjoyed using when learning how to SQL. However, my personal preference is to use SQL commands within python code. To do that we can use psycopg2 module. The following code does just that, it allows to connect to a PostgreSQL within a python code. Afterwards we can execute any python operations we want with the retrieved data. Moreover, we should be able to create our own databases, tables, and store data all within python code.

import psycopg2 as pg2 
import pprint 
 
def queryDB(sql): 
    database = 'awesome_database' 
    user = 'postgres' 
    password = 'XXXXXXXXXXXX' 
    conn = pg2.connect(database=database, user=user, password=password) 
    cur = conn.cursor() 
    cur.execute(sql) 
    result = cur.fetchmany(100) 
    conn.close() 
    return result 
 
sql = '''SELECT * FROM awesome_table''' 
 
result = queryDB(sql) 
pprint.pprint(result) 
 

By changing the content of the text inside sql variable we can execute any SQL commands. This code connects to a local database. To connect to a remote database we will need to provide details for host address and port. For more details on pscycopg2 visit Psycopg.org

To experiment more with SQL, I want to explore the Hive blocks, learn what and how data is stored within Hive blocks, and store them in my own local PostgresSQL database. Of course not all of Hive, just few recent transactions. All of Hive blocks would probably take a lot of time and space. Doing so will give me an opportunity to understand Hive more, and experiment with SQL. I hope to share more about that next time.

Coding is fun!