Questions by data: What are the most social #categories? [OC]
Before I begin the analysis, I have to give endless credit to @arcange, the creator of the http://steemsql.com/ database. If he had not made this amazing project free, I could not have done this analysis. All the data used here is extracted from the SteemSQL database by Tableau. Please go and support him if you can, even in the smallest way, so that people can continue to produce transparent and interesting analyses about the world of Steem.
I would like to try something: Instead of me creating an elaborate data analysis as I usually do, I would like to try a simple question-answer format.
For a first attempt messing around with the data, I'm curious as to what we can learn about some social patterns from looking at comment data in different #categories.
Can we answer this question: what are the most social #categories?
I thought perhaps a good initial indicator would be the level of commenting on one another's posts.
Due to the extremely large number of categories that have been posted and commented under, the first thing to do is to filter the results to a more manageable size: "total sum of comment length in category" > 50,000,000 characters.
As a proxy for measuring sociability or interaction, I chose the average comment text length and the average depth of comments (i.e. how many levels of comments there are on average in the comment hierarchy for a post).
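For anyone who wants to try this outside Tableau, here is a minimal Python sketch of the filter and the two proxy metrics. The sample rows, the field order (category, comment body, depth) and the tiny threshold are made up for illustration; they are assumptions, not the actual SteemSQL schema or the 50,000,000-character cut-off. It also reads "average depth" as the mean depth over all comments in a category, which is only one possible interpretation.

```python
from collections import defaultdict

# Hypothetical sample rows: (category, comment body, depth in the comment tree).
# These are NOT real SteemSQL records.
comments = [
    ("ripple", "XRP will go far because the banks are adopting it.", 1),
    ("ripple", "Long opinion about price targets and adoption.", 1),
    ("money", "How do you save?", 1),
    ("money", "I budget monthly.", 2),
    ("money", "Same here, and I invest the rest.", 3),
    ("tiny", "ok", 1),
]


def category_stats(rows, min_total_chars):
    """Return {category: (avg_length, avg_depth)} for every category whose
    total comment length is greater than min_total_chars."""
    total_chars = defaultdict(int)
    total_depth = defaultdict(int)
    count = defaultdict(int)
    for category, body, depth in rows:
        total_chars[category] += len(body)
        total_depth[category] += depth
        count[category] += 1
    return {
        c: (total_chars[c] / count[c], total_depth[c] / count[c])
        for c in count
        if total_chars[c] > min_total_chars
    }


# Toy threshold of 10 characters instead of 50,000,000.
stats = category_stats(comments, min_total_chars=10)

# Sort by average comment length, longest first.
for name, (avg_len, avg_depth) in sorted(stats.items(), key=lambda kv: -kv[1][0]):
    print(f"#{name}: avg length {avg_len:.1f}, avg depth {avg_depth:.2f}")
```

With real data the rows would come from the database rather than a hard-coded list, and the threshold would be raised accordingly.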
So we can start off with a look at the average comment text length.
Curious results: #spam has by far the longest average comment length. I'm sure someone has a very reasonable theory for this :)
Let's take #spam out for a moment and sort by average comment text length to get a better comparison.
I've also highlighted three groups: anything more or less crypto-related, creative-related, and regional/language-related.
We do get a pattern emerging here somewhat.
- People tend to discuss crypto stuff more than other things.
- Creative-related categories are surprisingly low down on the interactivity level. Perhaps it's more about looking at stuff than discussing stuff? That would make sense, since writing and fiction are high-up exceptions.
- Some region/language-specific categories are more sociable than others :). Korean packs 1-4 letters into each character, with 2-3 being the norm, so you could multiply the kr average by about 2.5, which puts kr very close to the others.
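The Korean adjustment in the last bullet is plain arithmetic, sketched here in Python. The function name and the sample average of 100 characters are hypothetical; only the 2.5 multiplier comes from the estimate above.

```python
def adjusted_length(avg_chars, letters_per_char=2.5):
    """Scale an average comment length (in characters) by an assumed
    number of letters per character."""
    return avg_chars * letters_per_char


# Hypothetical kr average, not a figure read off the chart.
kr_avg = 100.0
print(adjusted_length(kr_avg))
```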
Now, let's shade it in with the average comment depth, or the number of hierarchical levels of comments per category.
The comment depth scale isn't very broad: no category goes above an average comment depth of 2.2.
Funnily enough, the top interactive category by comment length is ICE COLD by comment depth - that's #ripple.
#money, on the other hand, has an average comment depth of 2.2. Am I right in interpreting that people on #ripple just throw their opinions out there en masse but don't engage in discussion, whereas people on #money like to discuss ideas on how to best make money in much more depth?
Thanks for bearing with me, I know this was a long post! Please drop your feedback, theories and opinions below - I am so excited to hear them all.
Also, please let me know what kinds of things you've wondered about Steemit, and what kinds of questions you have that I might answer in a future post!! I would love to do this on a regular basis, it was excellent fun :).
Thank you for reading and happy steeming! :)




Thank you so much for this! I would love to see a visualization of interactions on Steemit.
For instance, something like this or this video.
There are a lot of different communities including the Korean, Indonesian, German, Spanish, OCD, MSP, et al.
I think it would be interesting to see the individual blobs of connections and see how centralized they are. What circles of upvoting exist?
Also, if you join steemit.chat, please feel free to send me a message. I wouldn't mind paying for a specific type of data visualization and also talking about what tools you use for this type of visualization!
Thank you for bringing some amazing data and visualizations to Steem!
-BiasNarrative
This is a great idea!
I really like this idea too! I’ll look into it. Might be a bigger project as I’ve never done a network analysis with so much data
Pretty interesting results! Especially the comment depth, in my opinion. Some are quite obvious, like #news having a really low depth and #philosophy having a (somewhat) large one (even though I expected #philosophy to have a larger depth). But it's really surprising that all the technology-related stuff like #technology, #crypto and #blockchain has such a low comment depth. I wonder what the cause of that is.
I have seen a lack of interest in specific technology/science posts and it makes me sad. I loved /r/futurology and other tech forums on reddit. I don't spend much time on reddit anymore because of steemit and I just wish there were more options.
Guess we gotta grow Steem first or take the reins ourselves!
Glad you enjoyed the read :). It is strange. Perhaps #technology is more similar to #news in some ways? Also in #crypto there is a lot of quick-paced trading advice and investment advice going on. #blockchain surprises me too :/
Might be, but I'd say #technology is as newslike as #politics, which has quite a large depth. But maybe #technology stuff is just a bit more quick-paced in general. It could be interesting to watch this over time, to see if anything changes.
yeah it would be interesting! and I'm sure it will. The categories on steemit are so biased toward the crypto-world still, and I think as it matures, the other categories will gain popularity at an increased rate. I wonder how "social" they will be though!
Perfect for a follow-up post. :P
That’s right!
So what you're saying is I should write really spammy yet elaborate articles about Ripple in Korean and then never answer any comments. Got it.
On a more serious note I think you managed to visualise the current Ripple pump (at least I think that's what's going on).
Really fascinating entry yet again.
haha love it
that's what I was thinking XD
Still reading thru the analysis, but the start of it already gave me the impression that it's a good one. Have you considered posting for utopian?
I’m not fully sure I understand what utopian entails :). I’m pretty new here. Is it relevant for utopian, or do you mean something else?
Great piece of analysis, nice job. I'm just about to start learning SQL, it's gibberish to me ATM, Tableau's a great piece of software too.
Good to see politics/ philosophy/ history 'up there'. No surprise on the crypto front, and interesting ideas on why the variations. Think you'd need quals to really find out what's going on there.
Thanks! SQL is gibberish to me too :). But Tableau makes it so easy to use.
But surely you still have to enter all that peculiar SQL code? Or does Tableau have a translation matrix that just 'makes it so'? Quite new to all this.
Yeah it has a translation matrix :). No coding whatsoever. Just connect to the Microsoft SQL Server as per the tutorial on the SteemSQL website. Then select which table you want to look at and voila! If you’re struggling, I can do a 2 minute video to show you tomorrow!
HOWEVER, it is a huge database and it took around 3 minutes per change in query for me when my CPU (7700k) was calculating stuff.
Hey that sounds great, I'll have to get onto it... video would be cool. Just bloody work means I am very limited time wise! And I have a budget laptop on which I run tableau so that might not be the best either...
This is really interesting. Thanks for sharing. I didn't realise we could access this kind of data so double thanks for that. Hopefully I will get some spare time at some stage to have a play with it myself.
huh. I found this incredibly interesting actually, and I'm normally only (very) marginally interested in data analysis (yes, yes, sad, I know).
It helps though, that all of your posts are, from my humble clueless perspective, extremely well structured and informative without being overbearing, and also usually prettyy~~~ which doesn't hurt. haha
I'll be looking forward to more steemit-data related stuff from you.
(Now there's a sentence I never thought I'd utter!)
Fantastic information! A very interesting read about what people are really interested in talking about on Steemit.