User Profiling: Detailed Metrics (Steemit Feature Proposal)
As a follow-up to the user engagement and content discovery suggestion I made yesterday, this post gives a more detailed proposal for the methods that could be used to derive the metrics that make up a user profile.
Creating the profile
To create the profile, the data collected may be organized into the following categories (in order of importance) and their data fields:
User preferences
- communities followed
- manual tag or category inputs (from settings "please tell Steemit what you like")
Posts a person upvoted or downvoted
- list of posts
- upvote or downvote
- voting strength
- community it belongs to
- tags used
- payout value
Posts a person has read
- list of posts
- time spent vs length ratio / time spent vs video length ratio
- resultant action (nothing/upvote/downvote/comment/resteem)
- payout value
Comments a user has made
- list of the main posts
- communities they belong to
- tags used
- payout value of posts
Blocked/Muted/Toned Down users
- list of users the person has blocked or muted
- list of users and communities whose content the person asked to see less of through the Discover interface ("show me less content from person/community")
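To make the five categories more concrete, here is a minimal sketch of how the collected data might be held in memory. Every class and field name below is hypothetical and chosen only for illustration; nothing here is an existing Steemit structure, and the comment data is simplified to a list of post IDs.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class VoteRecord:
    post_id: str
    weight: float        # signed voting strength: positive = upvote, negative = downvote
    community: str
    tags: List[str]
    payout: float


@dataclass
class ReadRecord:
    post_id: str
    seconds_spent: float
    expected_seconds: float  # estimated reading time or video length
    action: str              # "nothing", "upvote", "downvote", "comment" or "resteem"
    payout: float

    def attention_ratio(self) -> float:
        """Time spent relative to the post's length, capped at 1.0."""
        if self.expected_seconds <= 0:
            return 0.0
        return min(self.seconds_spent / self.expected_seconds, 1.0)


@dataclass
class UserProfile:
    followed_communities: Set[str] = field(default_factory=set)
    preferred_tags: Set[str] = field(default_factory=set)   # manual inputs from settings
    votes: List[VoteRecord] = field(default_factory=list)
    reads: List[ReadRecord] = field(default_factory=list)
    commented_posts: List[str] = field(default_factory=list)
    muted_users: Set[str] = field(default_factory=set)      # blocked or muted
    toned_down: Set[str] = field(default_factory=set)       # "show me less content from..."
```

Capping the ratio at 1.0 is one possible design choice: it stops a tab left open overnight from looking like intense interest.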
Processing these five data sets, gathered both on-chain and off-chain, yields the data needed to estimate the probability that a person will enjoy a post. This is what I call the profile. Its data can be queried for a post's score, which compares the post's attributes with the profile's preferred attributes. The following are some of the types of scores that may be queried:
Relevance Score
How relevant is the post to the user?
- Focuses more on the user preferences
- Prioritizes tag and community matching
- Prioritizes the user's upvote behavior
- Penalizes posts similar to those the user downvoted
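As a rough illustration of the points above, the sketch below adds a bonus for followed communities and manually chosen tags, then adds the user's signed voting history per tag so that tags the user downvoted pull the score down. The function names, the weights (2.0 and 1.0) and the use of shared tags as the measure of "similar" are all assumptions made for this example, not a finished formula.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def tag_affinity(votes: Iterable[Tuple[List[str], float]]) -> Counter:
    """Sum signed vote weights per tag; downvotes push a tag negative."""
    affinity = Counter()
    for tags, weight in votes:
        for tag in tags:
            affinity[tag] += weight
    return affinity


def relevance_score(post_tags: List[str], post_community: str,
                    followed: Set[str], preferred_tags: Set[str],
                    affinity: Counter,
                    pref_weight: float = 2.0,   # assumed weight for stated preferences
                    vote_weight: float = 1.0    # assumed weight for voting behavior
                    ) -> float:
    score = 0.0
    if post_community in followed:
        score += pref_weight
    for tag in post_tags:
        if tag in preferred_tags:
            score += pref_weight
        # Positive for tags the user upvoted, negative for tags they downvoted.
        score += vote_weight * affinity.get(tag, 0.0)
    return score
```

Giving stated preferences a larger weight than voting behavior mirrors the order of importance listed earlier, but the exact numbers would need tuning.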
Valuation Score
How much value does the post have in comparison with other posts the user likes?
- Focuses more on a post's value in comparison with the profile's value averages, medians, etc.
- Prioritizes payout value
- Penalizes posts with low value, even if they are relevant
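One simple way to express this comparison, sketched here under my own assumptions, is the ratio of a post's payout to the median payout of the posts the user has liked. The function name, the neutral value of 1.0 for users with no history and the cap of 2.0 are hypothetical choices.

```python
from statistics import median
from typing import Sequence


def valuation_score(post_payout: float, liked_payouts: Sequence[float]) -> float:
    """Post payout relative to the median payout of liked posts, capped at 2.0."""
    if not liked_payouts:
        return 1.0  # no history yet: treat the post as neutral (assumption)
    baseline = median(liked_payouts)
    if baseline <= 0:
        return 1.0  # avoid dividing by zero when liked posts earned nothing
    return min(post_payout / baseline, 2.0)
```

The median is used instead of the average so that a single very high payout in the user's history does not make every other post look worthless.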
Popularity Score
How popular is the post compared to other posts the user likes?
- Focuses more on the amount of interaction that a post has garnered
- Prioritizes interaction counts: number of upvotes, resteems, comments and views
- Prioritizes posts read and liked by people with similar profiles
- Penalizes posts with little interaction
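The interaction-count part of this score could be sketched as a weighted sum that is then dampened with a logarithm, so that a post with 10,000 views does not completely bury one with 1,000. The weights below are invented for the example, and the sketch leaves out the "read and liked by people with similar profiles" factor.

```python
import math
from typing import Dict

# Hypothetical weights: a resteem or comment is treated as a stronger signal than a view.
WEIGHTS = {"upvotes": 1.0, "comments": 2.0, "resteems": 3.0, "views": 0.1}


def popularity_score(counts: Dict[str, int]) -> float:
    """Log-dampened weighted sum of interaction counts; missing counts are zero."""
    raw = sum(weight * counts.get(name, 0) for name, weight in WEIGHTS.items())
    return math.log1p(raw)
```

A post with no interaction at all scores exactly zero, which matches the idea of penalizing posts with little interaction.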
Each of the three scores will need a minimum threshold, so that posts falling below it are not recommended to a user, even if that leaves zero recommended posts and an empty Discover feed. That is highly unlikely to occur once the platform has millions of users all posting and voting.
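The filtering step might look something like the sketch below, which assumes a post must clear the minimum on all three scores to be recommended. The threshold values and the choice to rank the surviving posts by the sum of their scores are placeholders of mine, not part of the proposal.

```python
from typing import Dict, List, NamedTuple


class Scores(NamedTuple):
    relevance: float
    valuation: float
    popularity: float


# Hypothetical minimums; real values would have to be tuned on actual data.
MINIMUMS = Scores(relevance=1.0, valuation=0.5, popularity=1.0)


def recommend(candidates: Dict[str, Scores],
              minimums: Scores = MINIMUMS) -> List[str]:
    """Return IDs of posts that meet every minimum, highest total score first."""
    passing = [
        post_id for post_id, scores in candidates.items()
        if all(value >= floor for value, floor in zip(scores, minimums))
    ]
    return sorted(passing, key=lambda post_id: sum(candidates[post_id]), reverse=True)
```

If nothing passes, the function returns an empty list, which corresponds to the empty Discover feed described above.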
Profiles may even strengthen each other: another person's similar profile may be used to recommend posts to a user, in which case the recommendation will be clearly labeled "enjoyed by people you share similar tastes with."
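One possible way to decide that two profiles are "similar" is Jaccard similarity over the sets of posts each user has upvoted. This is a technique I am swapping in for illustration; the proposal does not commit to any particular similarity measure, and the function names and the 0.5 cutoff are assumptions.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of posts two users have in common out of all posts either upvoted."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def posts_from_similar_users(user: str, upvotes: Dict[str, Set[str]],
                             min_similarity: float = 0.5) -> List[str]:
    """Posts upvoted by sufficiently similar users that `user` has not upvoted."""
    mine = upvotes[user]
    suggestions: Set[str] = set()
    for other, theirs in upvotes.items():
        if other != user and jaccard(mine, theirs) >= min_similarity:
            suggestions |= theirs - mine
    return sorted(suggestions)
```

Posts returned this way would be the ones shown with the "enjoyed by people you share similar tastes with" label.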
Thanks for reading.
Posted on Utopian.io - Rewarding Open Source Contributors