Posts

HiveSQL is under (unexpected) maintenance

avatar of @arcange
25
@arcange
·
0 views
·
4 min read

This first of March 2022 will not have been a day like the others!

TLDR; The Hive blockchain has exceeded the number of 2,147,483,647 transactions!

For many of you, the above number may not represent anything special. But for computer scientists, it is the maximum positive value of a 32-bit signed integer.

And guess what, the blockchain hitting this limit caused HiveSQL to shut down and put itself into maintenance.

A bit of history

I designed the HiveSQL database almost 6 years ago. When doing so, I had no idea that we would reach such a number of transactions so quickly. For you (as for me at that time), this number seems quite inaccessible.

But that was without taking into account the recent dazzling success of Splinterlands and the growing number of applications that are developed on top of the Hive blockchain, which in itself is good news.

Several tables in the HiveSQL database store the various operations performed on the blockchain with the transaction number and block number in which they were included.

To store the transaction number, I chose to use a 32-bit integer (4 bytes). This means that storing the 2 billion transaction references consumes at least 8,590 GB of disk space. Add to that the indexes and foreign keys, you quickly find yourself with ~30GB required to store that data.

If I had chosen a 64-bit integer (8 bytes), thus allowing 9,223,372,036,854,775,807 transactions to be referenced, it would not only have required double the storage space, but it would also have had a negative impact on the performance of the database.

Remember that six years ago, high-performance, high-capacity NVMe drives were uncommon and extremely expensive.

The level of performance of HiveSQL, even when it is hammered by several thousand queries every day, has always been a concern of mine and that is what makes it successful today.

As always in these cases, choices had to be made!

Back to yesterday ...

You will no doubt have understood what happened yesterday: once the limit of the number of transactions was reached, the blockchain data injector simply stopped!

In itself, nothing too serious since it is designed to react in this way in the event of an error, in order to avoid injecting incorrect data and corrupting the integrity of the database.

Except that to be able to restart it, you have to modify the schema of several tables and that these sometimes contain several billion records.

I won't go into all the details but, although it seems very simple to do (just change some fields size from 32 bits to 64 bits), it is an extremely complex job that requires:

  • a lot of caution so as not to corrupt existing data,
  • total and exclusive access to the infrastructure resources,
  • additional resources, especially in terms of storage space (HiveSQL database uses over to 3TB of disk space),
  • a recovery solution and the certainty of being able to rollback if anything goes wrong,
  • a lot of time and patience!

HiveSQL in maintenance mode

This is why yesterday afternoon, as soon as I understood the reasons for the stoppage of the data injector and that I considered the actions to be taken to restart as soon as possible, I decided to put HiveSQL under (unplanned) maintenance and to block all access to the database, including read-only.

I didn't do it happily because I know how much some of you, analysts or application developers rely on the availability of this service, just like me.

Over the past 6 years, I have always been proud to be able to claim an availability rate of over 99%. The last maintenance that required downtime was more than a year and a half ago. Each time it happened, I made sure to minimize the effect and duration of the maintenance carried out.

When back?

I slept very little last night (barely two short hours). I take advantage of a waiting time during a long-running operation to write this post.

I apologize for this unexpected interruption and the late feedback on what is happening. But as already said above, choices have to be made and priorities set.

If all goes well, I hope that the services will be available again at the end of the day (no promise). It could be shorter or longer.

In any case, I will notify you as soon as this maintenance is over. Any communication and support related to this maintenance will be done on HiveSQL Discord Channel

Thank you for understanding.

PS: A few hours before reaching the fateful number of transactions that caused HiveSQL to stop, my Hive API node crashed! Is this called the law of series? As I said in the introduction: a f**** day like no other!


Check out my apps and services

Vote for me as a witness