Sincerity Lab Instructions - Part 1

in #steemdev, 6 years ago (edited)

The Sincerity API is intended to provide data (and meta-data) about the active accounts of the Steem blockchain to third-party developers. The Sincerity Lab is a data analysis tool for the data stored and used by the API. It allows developers to explore the data they might wish to use in their applications, and investigators of the Steem blockchain to examine a large amount of relevant data visually, to identify patterns of behaviour.

For the purposes of this tool, 'active accounts' are those which have commented or voted within the last 14 days. Any accounts that have not done so will not be found in our datasets. We limit the data to 14 days to make the spam classification process efficient, and so that queries sent to the API return data quickly.
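As a rough illustration of the 'active account' rule above, here is a minimal Python sketch. The function name, its arguments and the inclusive cut-off are assumptions made for the example; this is not how the Sincerity software itself is written.

```python
from datetime import datetime, timedelta
from typing import Optional

# The 14-day window comes from the post; everything else here is illustrative.
ACTIVE_WINDOW = timedelta(days=14)


def is_active(last_comment: Optional[datetime],
              last_vote: Optional[datetime],
              now: datetime) -> bool:
    """Return True if the account commented or voted within the last 14 days.

    A timestamp of None means the account has never performed that action.
    Treating the cut-off as inclusive is an assumption.
    """
    cutoff = now - ACTIVE_WINDOW
    return any(t is not None and t >= cutoff for t in (last_comment, last_vote))
```

An account whose most recent comment and vote are both older than the cut-off would simply be absent from the datasets.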

The following is the first in a short series of posts which attempt to document the Sincerity Lab's functionality.

To get started with the tool, visit it and click 'Existing User Log In'. Use demo/demo as the account name and password.

There are currently around 118,000 active accounts, and by default 15% of these are included in a report that you generate. Using a small sample makes the chart generation quicker, but you can increase this to 100% by sliding the 'Sample Size' slider if you prefer.
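To give a feel for the numbers, a 15% sample of roughly 118,000 accounts is about 17,700 points on the chart. The sketch below shows one way such a sample could be drawn in Python; the function name and the use of uniform random sampling are assumptions, since the post does not say how the tool picks its sample.

```python
import random


def sample_accounts(accounts, sample_percent, seed=None):
    """Return a random subset holding sample_percent of the accounts.

    Uniform sampling without replacement is an assumption for this sketch.
    """
    if not 0 <= sample_percent <= 100:
        raise ValueError("sample_percent must be between 0 and 100")
    k = round(len(accounts) * sample_percent / 100)
    rng = random.Random(seed)  # a seed makes the sample repeatable
    return rng.sample(accounts, k)


# Hypothetical account names, standing in for the real dataset.
accounts = [f"account{i}" for i in range(118_000)]
print(len(sample_accounts(accounts, 15, seed=1)))  # 17700
```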

With the slider set at 100%, the full list of 118,000 accounts will be plotted on the chart, and the default values to plot are the 'readable_reputation' against the 'classification_human_score'. Adding your Steem account name to the 'Highlight' field will highlight your position in red; see mine here:

Adding several account names, separated by commas, will highlight several accounts, so you can see, for example, the positions of the top 20 witnesses here:

You can see from this that all these witnesses have reputations of more than 60, and that most have high classification_human_scores. This score is a measure of how likely an account is to be a human content creator, as determined by the Sincerity software algorithm. The main exception is the Curie account, which is considered to be a bot. The six accounts which have not posted or commented in the last 14-day period do not provide the algorithm with enough information to assess their purpose, so they receive our neutral score of 0.48. Many new Steem accounts receive the same score, as can be seen from the thick band across the middle of the chart.
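When working with the scores in your own code, it can help to treat the neutral value of 0.48 as a special case. The following Python sketch does that. The labels and the tolerance are assumptions made for illustration; the post does not give any cut-off between 'human' and 'bot' beyond the neutral value.

```python
import math

# Neutral score given in the post for accounts the algorithm cannot assess.
NEUTRAL_SCORE = 0.48


def describe_score(score, tolerance=1e-9):
    """Label a classification_human_score relative to the neutral value.

    The three labels are illustrative only, not official categories.
    """
    if math.isclose(score, NEUTRAL_SCORE, abs_tol=tolerance):
        return "neutral"
    return "more human-like" if score > NEUTRAL_SCORE else "less human-like"
```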

To remove the accounts with negative reputations and low human scores, we can add two filters as follows:

Filters allow us to 'zoom in' on the data we are most interested in, so we can see it more clearly:

As well as the scatter chart, the raw data behind the chart is available below it as a link to a downloadable CSV file for use in a spreadsheet. The first 25 rows of data are also shown in a table, which allows us to get some useful information quickly.
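If you would rather work with the downloaded CSV in code than in a spreadsheet, the filtering and the 25-row preview described above can be approximated as follows. This is only a sketch: the 'name' column, the sample rows and the filter thresholds (reputation of at least 0, human score of at least 0.48) are assumptions, and only the two field names 'readable_reputation' and 'classification_human_score' come from the tool.

```python
import csv
import io


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def apply_filters(rows, min_reputation=0.0, min_human_score=0.48):
    """Keep rows at or above both thresholds (threshold values are assumed)."""
    return [
        row for row in rows
        if float(row["readable_reputation"]) >= min_reputation
        and float(row["classification_human_score"]) >= min_human_score
    ]


def preview(rows, limit=25):
    """Return the first rows, like the table shown under the chart."""
    return rows[:limit]


# Made-up rows standing in for a downloaded file.
SAMPLE_CSV = """name,readable_reputation,classification_human_score
alice,62.5,0.91
bob,-3.0,0.48
carol,45.1,0.12
"""

rows = load_rows(SAMPLE_CSV)
kept = apply_filters(rows)
print([row["name"] for row in preview(kept)])  # ['alice']
```

For a real download, you would read the file contents from disk and pass them to load_rows in place of SAMPLE_CSV.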

If we reduce the sample size to 0%, only the data for the selected witnesses will be shown on the chart. The same data is also available in the table beneath it, with clickable links:

(steemreports is not a witness, but was accidentally included here!)

That's all for now. In the next post I'll explain what the other available data fields are, and explain how you might use some of them.


This looks really interesting. I will give it a spin when I can.
