Steemit User Popularity Score / Puntuación por popularidad en Steemit [ENG/ESP]

in #steem, 7 years ago

Hi there,

I had some fun today with @furion's SteemData NoSQL database, trying to figure out a way to measure "popularity" in the network. After a couple of tries I decided to use the following formula:

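popularity_score = (num_of_followers - num_of_following) * num_of_comments_per_post

where num_of_comments_per_post = total_num_of_comments / total_num_of_posts, i.e. the average number of comments a user's posts receive.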

The higher the popularity_score, the more popular the user.
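
As a quick sanity check, the score can be computed by hand from four raw counts. The sketch below mirrors what the script further down calculates (follower surplus times average comments per post); the function name and the sample numbers are made up for illustration and do not come from the results table.

```python
def popularity_score(followers, following, total_posts, total_comments):
    """Follower surplus multiplied by the average number of comments per post."""
    if total_posts == 0:
        # No posts means no average to compute; treat the score as zero.
        return 0.0
    avg_comments_per_post = total_comments / total_posts
    return (followers - following) * avg_comments_per_post


# Invented account: 1200 followers, follows 200, 50 posts, 400 comments received.
print(popularity_score(1200, 200, 50, 400))  # (1200 - 200) * (400 / 50) = 8000.0
```

Note that the score grows with both factors, so an account with few followers but very lively posts can rank close to one with many followers and quiet posts.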

To reduce the number of samples I restricted the search to accounts with more than 500000 VESTS, so do not be surprised if you do not find yourself in the table.

You can see the results here:

(results table; image not preserved)

Do not take this table too seriously, it is just a small experiment to initiate some discussion about ways to measure popularity and influence in the network.

Also, if you are interested, here is the small piece of code I wrote for it:



from steemdata import SteemData


def get_num_of_average_received_comments(sd, account):
    """Return (number of posts, total comments, average comments per post)."""
    num_of_posts = sd.Posts.count_documents({"author": account})

    # Sum the "children" field, used here as the comment count of each post.
    result = sd.Posts.aggregate([
        {"$match": {"author": account}},
        {"$group": {"_id": "null", "totalComments": {"$sum": "$children"}}},
    ])

    tot_num_of_comments = 0
    for c in result:
        tot_num_of_comments = c["totalComments"]

    avg_num_of_comments = 0
    if num_of_posts != 0:
        avg_num_of_comments = tot_num_of_comments / num_of_posts

    return num_of_posts, tot_num_of_comments, avg_num_of_comments


def popularity():
    min_vests = 500000

    s = SteemData()
    # Only sample accounts holding more than min_vests VESTS.
    list_of_authors = s.Accounts.find({"balances.total.VESTS": {"$gt": min_vests}})

    print("author;follower_factor;followers_count;following_count;n_posts;n_comms;avg_comm;popularity_index")

    for author in list_of_authors:
        # Accounts that follow nobody are skipped.
        if author["following_count"] != 0:
            follower_factor = author["followers_count"] - author["following_count"]
            n_posts, n_comms, avg_comm = get_num_of_average_received_comments(s, author["account"])
            popularity_index = follower_factor * avg_comm
            fields = (author["account"], follower_factor, author["followers_count"],
                      author["following_count"], n_posts, n_comms, avg_comm, popularity_index)
            print(";".join(str(f) for f in fields))


popularity()
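
The script prints its rows in whatever order the database returns them, so getting a ranked table takes one more step. Here is a minimal sketch of that step, assuming the semicolon-separated output has been captured as text; the helper name rank_rows and the sample rows are invented for illustration, and only the author and popularity_index columns are used.

```python
import csv
import io


def rank_rows(text):
    """Parse semicolon-separated rows and sort them by popularity_index, highest first."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return sorted(reader, key=lambda row: float(row["popularity_index"]), reverse=True)


# Invented sample output, not real SteemData results.
sample = """author;follower_factor;followers_count;following_count;n_posts;n_comms;avg_comm;popularity_index
alice;900;1000;100;20;100;5.0;4500.0
bob;-50;50;100;10;30;3.0;-150.0
carol;400;500;100;40;400;10.0;4000.0
"""

for row in rank_rows(sample):
    print(row["author"], row["popularity_index"])
```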

Best regards,
Pablo @pgarcgo

PS: Do not forget to vote for our @cervantes witness!!



Hi,

Playing around a bit with the fantastic NoSQL replica of the Steem blockchain created by @furion (SteemData), I have put together a small list to measure the popularity of users.

In the end I settled on this formula:

popularity_score = (num_of_followers - num_of_following) * num_of_comments_per_post

where num_of_comments_per_post = total_num_of_comments / total_num_of_posts.

The higher the popularity_score, the more popular the account in question.

I should also mention that I restricted the sampled accounts to those with more than 500000 VESTS.

To avoid repetition and wasted bandwidth, both the list and the code can be found above, in the first part of this post.

Until next time,

Pablo @pgarcgo.

Oh, and do not forget to vote for the @cervantes witness!!


How can he keep up interacting with his sockpuppet accounts this much? Red Bull into the veins? :P

I don't know, man. He is commenting 24/7 and will end up getting depressed by the 20-second restriction haha

This is interesting data. Great job with the code.

Thanks mate. Just playing around...

Wow this is an awesome table. Thanks for the info buddy

You're welcome.

Excellent work, my friend @pgarcgo

At first glance it looks like you never compensate for a user following more people than follow them, which would yield a negative popularity. Unless this is acceptable in your final results?

Yes, with this formula users who have more followers than accounts they follow are preferred.
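
To make the point in this exchange concrete, here is a small sketch with invented numbers. It shows that an account following more people than follow it gets a negative score, and that more comments per post push that score further down. The clamped variant is purely hypothetical and is not something the post's script does.

```python
def popularity_score(followers, following, avg_comments_per_post):
    """Score as computed in the post: follower surplus times average comments per post."""
    return (followers - following) * avg_comments_per_post


def clamped_popularity_score(followers, following, avg_comments_per_post):
    """Hypothetical variant: a follower deficit counts as zero instead of going negative."""
    return max(followers - following, 0) * avg_comments_per_post


# Invented account: 100 followers, follows 300.
print(popularity_score(100, 300, 5.0))          # -1000.0
print(popularity_score(100, 300, 10.0))         # -2000.0, more comments make it worse
print(clamped_popularity_score(100, 300, 5.0))  # 0.0
```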

Great formulation

I loved the post, Pablo. Trafalgar is crushing it. Is there any way to see at which hour there is the most traffic, for choosing when to publish?

Yes, but not by country.

Very interesting, @pgarcgo. I have been experimenting with raw data obtained through the Python API. I did not know about the SteemData database. I am going to take a look at it. About the popularity measure, is num_of_comments_per_post an average? I think the equation should also take into account how many times the posts get resteemed.

Good idea!
