SteemOps: Extracting and Analyzing Key Operations in Steemit Blockchain-based Social Media Platform


Advancements in distributed ledger technologies are driving the rise of blockchain-based social media platforms such as Steemit, where users interact with each other in ways similar to conventional social networks. These platforms are autonomously managed by their users through decentralized consensus protocols in a cryptocurrency ecosystem. The deep integration of social networks and blockchains in these platforms opens up numerous cross-domain research studies of interest to both research communities. However, processing and analyzing large volumes of raw Steemit data is challenging: it requires specialized skills in both software engineering and blockchain systems and involves substantial effort in extracting and filtering various types of operations. To tackle this challenge, we collect over 38 million blocks generated in Steemit during a 45-month period from 2016/03 to 2019/11 and extract ten key types of operations performed by users. The result is SteemOps, a new dataset that organizes over 900 million operations from Steemit into three sub-datasets: 1) the social-network operation dataset (SOD); 2) the witness-election operation dataset (WOD); and 3) the value-transfer operation dataset (VOD). We describe the dataset schema and its usage in detail and outline potential research directions based on SteemOps. SteemOps is designed to facilitate future studies that provide better insights into emerging blockchain-based social media platforms.

In 2021 ACM Conference on Data and Application Security and Privacy (CODASPY 2021 Dataset/Tool Paper).