[Analysis] Blocktrades worldcup contest Round of 16 #mypicks analysis on choices made

in #utopian-io6 years ago (edited)

Repository

https://github.com/superoo7/worldcup

Introduction

In my previous post I shared about the analysis of the group stage of world cup competition prediction hosted by @blocktrades. In this analysis, will be the follow-up competition for Round of 16 predictions.

The analysis extraction program is hosted on github which is a forked version of the previous analysis. https://github.com/superoo7/worldcup-2

Outline

  • Scope
  • Summary
  • Data Extraction
  • Data Visualization
  • Possible errors
  • Conclusion
  • Tools and Scripts
  • Relevant Links and Resources

Scope

The data is extracted from SteemSQL on 3/7/2018, where the timeframe for the analysis is between 2018-06-28 20:26:03 (GMT +0) and 2018-06-30 13:59:00 (GMT +0). The first timestamp is when the post is made, whereas the second timestamp is the deadline given on the post.

Result

Summary

There are a total of 1351 posts which fulfill the previous timeframe, but the finalize data is 1226 posts only. This means that there are 9.25% of invalid entries from all the participants which could be due to post without following the format, violation of participation rule, post that used both #blocktradesworldcup and #mypicks but not meant for entry of the competition.

Here are some rules that are considered in this analysis:

  • Last edit and created should be before 2018-06-30 13:59:00 (GMT +0)
  • Use of both tags #blocktradesworldcup and #mypicks
  • Only submission in English for the choices made (e.g. Spain instead of Espanyol)
  • Follow the markdown table templates given
  • Reputation > 35

Data Extraction

The data extraction process is as followed:

  • Check reputation of the authors.
  • Extract out data by using string replacement and regex
  • Check whether an author make multiple posts
  • Stored data in both JSON and CSV format

There are 1351 posts extracted from SteemSQL

Check reputation of the authors

Screen Shot 2018-07-05 at 10.45.08 AM.png

In this section of filtering, there are 63 authors that violated the rule of a minimum reputation of 35.

Extract out data from the post body

Screen Shot 2018-07-05 at 10.47.01 AM.png

In this section, there are 54 posts that violated this section. This could be caused by the post used both tags but not intend to participate in the competition, invalid table format, post entry that does not use English and etc.

Check whether an author make multiple posts

Screen Shot 2018-07-05 at 10.49.01 AM.png

In this section of filtering, there are 4 authors, a total of 12 posts violated the rule of submitting duplicate entries.


After the filter, there are 1226 posts left, with a 9.25% of violated post rate from all the post extracted.

Data Visualize

By using Tableau, a tool for doing data visualization, I am able to share a few plotted out data.

Choices made

wc2.png

This is the choices made by the participants on the goal score between both teams at 90 minutes full-time mark.

Win, Lose or Tie based on choices made

wc2-1.png

The green color indicates the LHS as the winner, the blue color indicates a draw, red color indicates LHS as the loser and RHS as the winner, and orange indicates invalid data.

Based on the data previously made, I further investigate on which team are more favorable in terms of winning instead of goal scores.

Final Score result

Since the Round of 16 is over, these are the result of goal scores and condition.

Screen Shot 2018-07-05 at 11.06.47 AM.png

MatchScoreCondition
France vs Argentina4:3w
Uruguay vs Portugal2:1w
Spain vs Russia1:1t
Croatia vs Denmark1:1t
Brazil vs Mexico2:0w
Belgium vs Japan3:2w
Sweden vs Switzerland1:0w
Colombia vs England1:1t

Possible errors

  • The script also only accept the country being sorted in the order given in the template.
  • The script is able to extract only markdown table and most of the HTML table. Without using the markdown table given, the script is not able to extract out the data.

Conclusion

I am glad that I was again being assigned to carry out the task of making analysis for this competition.

If you interested in joining, there is another round going on by @worldcup-russia: https://steemit.com/blocktradesworldcup/@worldcup-russia/quarter-finals-stage-submit-your-entry-to-keep-competing-for-the-2-000-sbd-prize-pool

Tools and Scripts

  • SteemSQL (MSSQL) - Extracting data of all the posts
  • TypeScript - To run data extraction with Regex and string replacement
  • Jest - To test individual functions created for data extraction
  • Tableau - Data visualize tools

Relevant Links and Resources

  • SQL Query
SELECT
  'https://steemit.com' + Comments.url AS link,
  Comments.author,
  Comments.title,
  Comments.created,
  Comments.last_update,
  Comments.body,
  Accounts.reputation
FROM
  Comments
LEFT JOIN Accounts ON Comments.author = Accounts.name
WHERE
  depth = 0 AND
  Comments.created > '2018-06-28 20:26:03' AND
  Comments.created < '2018-06-30 13:59:00' AND
  Comments.last_update < '2018-06-30 13:59:00' AND
  CONTAINS(Comments.json_metadata, 'blocktradesworldcup') AND
  CONTAINS(Comments.json_metadata, 'mypicks')

Proof of Authorship

https://github.com/superoo7/worldcup-2

Sort:  

Btw, will you also provide an update for the ranking?

Exactly, we're all waiting for them. We count on you very much ;)

Lovely post!

Is there any possibility to detect if someone predicted all the wins/tight/loss correct?
And who scored the most points?

Hi @superoo7, impressive work again! As for the previous post, this topic does not fit very well with Utopian. Neither Blocktrades nor the contest are open source projects. You've submitted this as an analysis of your tool, but the data you shown is unrelated to that. For an analysis of the steemit/steem open source project, this contest is a very narrow and social aspect, it's analysis can barely help the project. I can only leave a tip from utopian in this case. We'd love to reward you for analysis posts slightly more related to open source projects! :)

Thanks for the review again, I understand the concern about it the contest which is not related to open source, and I am totally fine with it.

Thank you for sharing your posts with us. This post was curated by TeamMalaysia as part of our community support. Looking forward for more posts from you.

To support the growth of TeamMalaysia Follow our upvotes by using steemauto.com and follow trail of @myach

Vote TeamMalaysia witness bitrocker2020 using this link vote bitrocker2020 witness

Go BRAZIL! NYahahaha....

Hey @superoo7
Thanks for contributing on Utopian.
We’re already looking forward to your next contribution!

Want to chat? Join us on Discord https://discord.gg/h52nFrV.

Vote for Utopian Witness!

Coin Marketplace

STEEM 0.26
TRX 0.11
JST 0.033
BTC 64383.21
ETH 3098.60
USDT 1.00
SBD 3.89