LeoGlossary: Data

@leoglossary

Data is facts or numbers collected to be examined, considered, and used to support decision-making. Many people associate data with bits and bytes because of the digital era, but data can also be text or numbers written on paper.

Data is measured, collected, reported, and analyzed, after which it is often visualized using graphs, images, or other analysis tools.

Data can be generated by:

  • Humans
  • Machines
  • Human-Machine combinations

Due to the expansion of the Internet along with smartphones, the amount of data created by humanity (along with machines) has exploded. Its growth is now often described as exponential.

Process

Raw data (“unprocessed data”) is a collection of numbers or characters before it has been “cleaned” and corrected by researchers. Cleaning removes outliers and corrects instrument or data entry errors. Data processing commonly occurs in stages, so the “processed data” from one stage can also be considered the “raw data” of the next stage.
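The staged cleaning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (parse_readings, remove_outliers), the sample readings, and the outlier rule (dropping values more than k median absolute deviations from the median) are all hypothetical choices, not a method taken from the text.

```python
from statistics import median


def parse_readings(raw):
    """Stage 1: keep only entries that can be read as numbers.

    Entries such as "abc" or None stand in for data entry errors.
    """
    parsed = []
    for item in raw:
        try:
            parsed.append(float(item))
        except (TypeError, ValueError):
            continue  # skip unreadable entries
    return parsed


def remove_outliers(values, k=5.0):
    """Stage 2: drop values far from the median.

    A value is kept if it lies within k median absolute deviations
    (MAD) of the median. The threshold k is an assumed tuning knob.
    """
    if not values:
        return []
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return list(values)  # no spread to judge outliers against
    return [v for v in values if abs(v - center) <= k * mad]


# Hypothetical raw sensor readings with two bad entries and one outlier.
raw_data = ["21.5", "22.0", "abc", "21.8", "500", "22.1", None, "21.9"]

stage_one = parse_readings(raw_data)    # "processed" output of stage 1 ...
stage_two = remove_outliers(stage_one)  # ... is the "raw" input of stage 2
print(stage_two)
```

Note how the output of the first stage is fed directly into the second, which mirrors the idea that processed data from one stage becomes raw data for the next.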

Experimental data is data generated through observation during a scientific investigation. It is often used in research.

Artificial Intelligence

The advancement of artificial intelligence requires a great deal more data. To date, most progress has come from combining growing amounts of computing power with expanding data sets.

Machine learning developers scrape data from social media sites over the Internet. This led platforms such as Twitter and Reddit to take steps to limit the pulling of their data. Since it is on their servers, these technology companies look at it as theirs.

Corporations that generate the most data have an obvious advantage. Two of the leaders are Facebook and Google, both of which operate leading websites.

AI requires a foundation of specialized hardware and software for writing and training machine learning algorithms. This tends to be a process that comes with a heavy price tag.

That said, due to trends such as Moore's Law, we see the capability for AI development expanding. No longer are breakthroughs coming only from large corporations with the ability to hire the best developers.

Why Is Data Important?

It helps to support organizational decision-making and strategy.

  • Data helps one make better decisions.
  • Data helps one solve problems by finding the reason for underperformance.
  • Data helps one evaluate performance.
  • Data helps one improve processes.

All of this can help businesses to understand both consumers and the market.

Decentralized Data

Blockchain is introducing a new form of data storage.

Most databases are held by individual entities that house them on private servers. This means the data is controlled by that company through authorized access.

In a network of decentralized nodes, by contrast, data is duplicated across multiple unrelated servers.

This was originally brought to the masses with the release of the Bitcoin network. It was the first blockchain to operate in this manner.

The Bitcoin network is a ledger, housing data similar to that of a bank or financial institution. It is a series of transactions. All balances are adjusted as more information is processed.
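The idea of a ledger that is a series of transactions, copied across unrelated nodes, can be illustrated with a toy Python sketch. It is deliberately simplified and is not how Bitcoin itself is implemented: the block layout, the account names, and the helper functions (block_hash, append_block, balances, is_consistent) are all hypothetical, and real networks add consensus rules, signatures, and much more.

```python
import copy
import hashlib
import json


def block_hash(block):
    """Return a SHA-256 fingerprint of a block's contents."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, transactions):
    """Add a block that points back to the previous block by its hash."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "transactions": transactions})


def balances(chain):
    """Replay every transaction in order to derive current balances."""
    totals = {}
    for block in chain:
        for tx in block["transactions"]:
            totals[tx["from"]] = totals.get(tx["from"], 0) - tx["amount"]
            totals[tx["to"]] = totals.get(tx["to"], 0) + tx["amount"]
    return totals


def is_consistent(chain):
    """Check that each block still matches the hash stored in its successor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


# Build a small ledger, then duplicate it on three independent "nodes".
ledger = []
append_block(ledger, [{"from": "mint", "to": "alice", "amount": 10}])
append_block(ledger, [{"from": "alice", "to": "bob", "amount": 4}])
nodes = [copy.deepcopy(ledger) for _ in range(3)]

# One node alters an old transaction; its copy no longer hangs together.
nodes[2][0]["transactions"][0]["amount"] = 1000

print(balances(nodes[0]))
print([is_consistent(node) for node in nodes])
```

Because every node holds its own copy and each block commits to the one before it, a change made on a single server is detectable rather than silently accepted.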

Ethereum took things to the next level with the introduction of smart contracts. These allow data beyond that of a ledger to be stored on these decentralized chains.

Steem brought the storage of text to the base layer. This started to bring social media into the picture. It took another turn when a fork resulted in Hive.

Many feel blockchain is going to help usher in the era of distributed computing. With storage being decentralized, the Internet would no longer be built upon centralized cloud storage. This different way of distributing data is at the core of Web 3.0.

Data Accessibility

The last few decades saw a rise in data accessibility.

Databases are now stored on servers that people can access remotely. Entire companies such as Netflix are built around this model.

When information is stored in physical form, it is hard to access, duplicate, and distribute. Digitization both accelerated and simplified all of these tasks.
