Steemit User Popularity Score / Puntuación por popularidad en Steemit [ENG/ESP]
Hi there,
I had some fun today with @furion's SteemData NoSQL database, trying to figure out a way to measure "popularity" in the network. After a couple of tries I decided to use the following formula:
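popularity_score = (num_of_followers - num_of_following) * num_of_comments_per_post
where num_of_comments_per_post = total_num_of_comments / total_num_of_posts is the average number of comments each post receives.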
The higher the popularity_score, the more popular the user.
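As a quick illustration, here is a minimal sketch of the score in plain Python, following what the script further down computes: the follower difference times the average number of comments received per post. The function name `popularity_score`, its parameter names and the sample numbers are made up for this example; they are not taken from the table.

```python
def popularity_score(num_followers, num_following, total_comments, total_posts):
    """Follower difference times the average number of comments per post."""
    if total_posts == 0:
        # Same fallback as the script: with no posts the average stays at 0.
        avg_comments = 0
    else:
        avg_comments = total_comments / total_posts
    return (num_followers - num_following) * avg_comments


# Invented numbers, for illustration only.
print(popularity_score(1200, 200, 400, 50))  # (1200 - 200) * (400 / 50) = 8000.0
```

An account with many followers but few comments per post can therefore score lower than a smaller account whose posts are heavily discussed.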
To reduce the number of samples I restricted the search to accounts with more than 500000 VESTS, so do not be surprised if you do not find yourself in the table.
You can see the results in here:
- https://docs.google.com/spreadsheets/d/11l3IJBXnfoK9SLmUHxz1NAZbQYQGqgV2KWh0pHevPWo/edit?usp=sharing
Do not take this table too seriously; it is just a small experiment to start some discussion about ways to measure popularity and influence in the network.
Also, if you are interested, here is the small piece of code I wrote for it:
from steemdata import SteemData


def get_num_of_average_received_comments(sd, account):
    # Number of posts written by the account.
    # Note: Cursor.count() and useCursor are pymongo 3.x APIs.
    num_of_posts = sd.Posts.find({"author": account}).count()
    # Sum the "children" field over all posts of the account.
    result = sd.Posts.aggregate([
        {"$match": {"author": account}},
        {"$group": {"_id": "null", "totalComments": {"$sum": "$children"}}},
    ], useCursor=False)
    avg_num_of_comments = 0
    tot_num_of_comments = 0
    for c in result:
        tot_num_of_comments = c["totalComments"]
    if num_of_posts != 0:
        avg_num_of_comments = tot_num_of_comments / num_of_posts
    return num_of_posts, tot_num_of_comments, avg_num_of_comments


def popularity():
    min_vests = 500000
    s = SteemData()
    # Only accounts holding more than min_vests VESTS are sampled.
    list_of_authors = s.Accounts.find({"balances.total.VESTS": {"$gt": min_vests}})
    print("author;follower_factor;followers_count;following_count;n_posts;n_comms;avg_comm;popularity_index")
    for author in list_of_authors:
        # Accounts that follow nobody are skipped.
        if author["following_count"] != 0:
            follower_factor = author["followers_count"] - author["following_count"]
            n_posts, n_comms, avg_comm = get_num_of_average_received_comments(s, author["account"])
            popularity_index = follower_factor * avg_comm
            print(author["account"] + ";" + str(follower_factor) + ";" + str(author["followers_count"]) + ";" + str(author["following_count"]) + ";" + str(n_posts) + ";" + str(n_comms) + ";" + str(avg_comm) + ";" + str(popularity_index))


popularity()
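If you want to check the per-author numbers without a database connection, the same count, sum and average can be reproduced on an in-memory list. This is only a sketch: the helper name `comment_stats` and the sample posts are invented, and I assume each post document carries the "author" and "children" fields used in the query above.

```python
def comment_stats(posts, account):
    """Return (num_of_posts, total_comments, avg_comments) for one account."""
    own_posts = [p for p in posts if p["author"] == account]
    num_of_posts = len(own_posts)
    tot_num_of_comments = sum(p["children"] for p in own_posts)
    # Leave the average at 0 when the account has no posts.
    avg = tot_num_of_comments / num_of_posts if num_of_posts else 0
    return num_of_posts, tot_num_of_comments, avg


# Invented sample documents, for illustration only.
sample_posts = [
    {"author": "alice", "children": 4},
    {"author": "alice", "children": 10},
    {"author": "bob", "children": 1},
]

print(comment_stats(sample_posts, "alice"))  # (2, 14, 7.0)
```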
Best regards,
Pablo @pgarcgo
PS: Do not forget to vote for our @cervantes witness!!
Hello,
Playing around a bit with the fantastic NoSQL replica of the Steem blockchain created by @furion (SteemData), I have put together a small list to measure the popularity of users.
In the end I settled on this formula:
popularity_score = (num_of_followers - num_of_following) * num_of_comments_per_post
where num_of_comments_per_post = total_num_of_comments / total_num_of_posts.
The higher the popularity_score, the more popular the account in question.
I should also mention that I restricted the sampled accounts to those with more than 500000 VESTS.
To avoid repetition and save bandwidth, both the list and the code can be found above in the English version.
Until next time,
Pablo @pgarcgo.
Oh, and do not forget to vote for the @cervantes witness!!
@Trafalgar is ruling!
How can he keep up interacting with his sockpuppet accounts this much? Red Bull into the veins? :P
I don't know, man. He is commenting 24/7 and will end up getting depressed by the 20-second restriction haha
This is interesting data. Great job with the code.
Thanks mate. Just playing around...
Wow this is an awesome table. Thanks for the info buddy
You're welcome.
Excellent work, my friend @pgarcgo
At first glance it looks like you never compensate for if a user is following more people than are following them, which would yield a negative popularity. Unless this is acceptable in your final results?
Yes, with this formula users who have more followers than accounts they follow are preferred.
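To make the effect concrete, here is a small sketch with three invented accounts that all receive the same average number of comments per post. The account names and numbers are hypothetical; only the formula comes from the post.

```python
# (name, followers, following, average comments per post), all invented.
accounts = [
    ("alice", 900, 100, 5.0),
    ("bob", 50, 400, 5.0),
    ("carol", 300, 300, 5.0),
]


def score(followers, following, avg_comments):
    # Follower difference times average comments per post.
    return (followers - following) * avg_comments


# Sort from highest to lowest score.
ranking = sorted(
    ((name, score(fo, fg, avg)) for name, fo, fg, avg in accounts),
    key=lambda row: row[1],
    reverse=True,
)

for name, value in ranking:
    print(name, value)
```

An account that follows more people than follow it gets a negative score and drops to the bottom of the ranking, and an account with equal counts scores exactly zero no matter how many comments it receives.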
Great formulation
I loved the post, Pablo. Trafalgar is killing it. Is there any way to see at what time there is the most traffic, for publishing?
Yes. But not the countries.
Very interesting, @pgarcgo. I have been experimenting with raw data obtained through the Python API. I did not know about the SteemData database; I am going to take a look at it. About the popularity measure: is num_of_comments_per_post an average? I think the number of times the posts get resteemed should also go into the equation.
Good idea!