Update: Steem-Exif-Spider-bot New Voting and Comments Queues

in #utopian-io, 7 years ago (edited)

This update adds several improvements and features:

  1. Comments queue
  2. Voting queue
  3. Switch from filesystem streams to in-memory buffers

Comments and Voting queues


In other projects, I have noticed significant efficiency improvements from switching to queues that spread voting and commenting out over time instead of processing all votes and comments simultaneously. The queue implementation is fairly simple: items are pushed onto an array and shifted off it. A scheduler manages the queue, shifting items off the array at regular intervals. Ideally, this would be driven by an event instead; that may come in a future change.
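
The idea can be sketched in a few lines. This is only an illustration, not the bot's code: `enqueue`, `drainOne`, and the commented-out `broadcast` handler are hypothetical names, and a plain timer stands in for the scheduler.

```javascript
// Hypothetical sketch of an array-backed work queue drained one item at a time.
const queue = []

// Producers push work as it arrives instead of broadcasting right away.
function enqueue(item) {
    queue.push(item)
}

// Take at most one item off the front of the queue and hand it to `handler`.
// Returns true if an item was processed, false if the queue was empty.
function drainOne(handler) {
    if (queue.length < 1) {
        return false
    }
    handler(queue.shift())
    return true
}

// A long-running bot would call drainOne from a timer or scheduler, e.g.:
// setInterval(() => drainOne(broadcast), 60 * 1000)
```

Because only one item leaves the queue per tick, a burst of incoming posts turns into a steady trickle of votes and comments.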

Switching from Filesystem Streams to in-memory Buffers

Prior to this change, JPG data was read over HTTP as a stream, written to a temporary file, and then read back from that file into a buffer. This change skips the file step. Rather than blocking on filesystem I/O, the response data is loaded straight into an in-memory buffer and never written to disk.
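
For readers unfamiliar with the technique, here is a minimal sketch of collecting a stream into a buffer using only Node's built-in `stream` module. The helper name `streamToBuffer` is an assumption for illustration; the bot itself gets a buffer directly from `got` by passing `encoding: null`, as the diff further down shows.

```javascript
const { Readable } = require('stream')

// Hypothetical helper: gather every chunk of a readable stream into one Buffer.
function streamToBuffer(stream) {
    return new Promise((resolve, reject) => {
        const chunks = []
        stream.on('data', (chunk) => chunks.push(Buffer.from(chunk)))
        stream.on('error', reject)
        stream.on('end', () => resolve(Buffer.concat(chunks)))
    })
}

// Example: an in-memory stream standing in for an HTTP response body.
function exampleStream() {
    return Readable.from([Buffer.from('ab'), Buffer.from('cd')])
}
```

Calling `streamToBuffer(exampleStream())` resolves with a single buffer holding all the bytes, with no temporary file involved.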

Changes

diff --git a/steem-exif-spider-bot/helpers/bot/comment.js b/steem-exif-spider-bot/helpers/bot/comment.js
new file mode 100644
index 0000000..10c2093
--- /dev/null
+++ b/steem-exif-spider-bot/helpers/bot/comment.js
@@ -0,0 +1,61 @@
+const Promise = require('bluebird')
+const steem = require('steem')
+const { user, wif, weight } = require('../../config')
+const schedule = require('node-schedule')
+const Handlebars = require('handlebars')
+const fs = Promise.promisifyAll(require('fs'))
+const path = require('path')
+
+const MINUTE = new schedule.RecurrenceRule();
+MINUTE.second = 1
+
+function loadTemplate(template) {
+    return fs.readFileAsync(template, 'utf8')
+}
+
+
+function execute(comments) {
+
+    if (comments.length() < 1) {
+        return {};
+    }
+
+    const { author, permlink } = comments.shift();
+
+    var context = {
+    }
+
+    return loadTemplate(path.join(__dirname, '..', 'templates', "exif.hb"))
+        .then((template) => {
+            var templateSpec = Handlebars.compile(template)
+            return templateSpec(context)
+        })
+        .then((message) => {
+            var new_permlink = 're-' + author 
+                + '-' + permlink 
+                + '-' + new Date().toISOString().replace(/[^a-zA-Z0-9]+/g, '').toLowerCase();
+            console.log("Commenting on ", author, permlink)
+
+            return steem.broadcast.commentAsync(
+                wif,
+                author, // Parent author
+                permlink, // Parent permlink
+                user, // Author
+                new_permlink, // Permlink
+                new_permlink, // Title
+                message, // Body
+                { tags: [], app: "steemit-exif-spider-bot/0.1.0" }
+            ).then((results) => {
+                console.log(results)
+                return results
+            })
+            .catch((err) => {
+                console.log("Error ", err.message)
+            })
+        })
+}
+
+module.exports = {
+    execute
+}

Comment queue implementation. It pulls comments off the queue and posts them to the Steem blockchain.
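
The reply permlink in the diff above is built from the parent author, the parent permlink, and a timestamp stripped of punctuation. A small sketch of that step in isolation, where the function name `replyPermlink` and the explicit `date` parameter are assumptions added for illustration:

```javascript
// Hypothetical helper mirroring the permlink construction in comment.js.
// `date` is passed in so the result is reproducible; the bot uses new Date().
function replyPermlink(author, permlink, date) {
    const stamp = date.toISOString()
        .replace(/[^a-zA-Z0-9]+/g, '') // drop '-', ':' and '.'
        .toLowerCase()
    return 're-' + author + '-' + permlink + '-' + stamp
}

const example = replyPermlink('alice', 'my-post', new Date('2018-04-01T12:00:00.000Z'))
// example is 're-alice-my-post-20180401t120000000z'
```

Including the timestamp keeps each reply permlink unique even when the bot replies to the same post more than once.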

diff --git a/steem-exif-spider-bot/helpers/bot/exif.js b/steem-exif-spider-bot/helpers/bot/exif.js
index bc49a21..738c48a 100644
--- a/steem-exif-spider-bot/helpers/bot/exif.js
+++ b/steem-exif-spider-bot/helpers/bot/exif.js
@@ -17,6 +17,9 @@ module.exports = {
     execute
 }
 
+let VOTING = {}
+let COMMENTS = {}
+
 function loadTemplate(template) {
     return fs.readFileAsync(template, 'utf8')
 }
@@ -35,60 +38,32 @@ function processComment(comment) {
             }
             return [];
         })
-        .each((image) => {
-            if (image.indexOf(".jpg") > -1|| image.indexOf(".JPG") > -1) {
-                const dest = tempfile('.jpg');
-                try {
-                    got.stream(image).pipe(fs.createWriteStream(dest))
-                        .on('close', () => {
-                            try {
-                                const input = ExifReader.load(fs.readFileSync(dest));
-                                const tags = []
-                                for (let key in input) {
-                                    const value = input[key];
-                                    if (key != "MakerNote"
-                                        && key.indexOf("undefined") < 0
-                                        && key.indexOf("omment") < 0
-                                        && key.indexOf("ersion") < 0) {
-                                        tags.push({ name: key, value: value.value, description: value.description })
-                                    }
-                                }
-
-                                reply(comment, tags)
-                            }
-                            catch(err) {
-                                if (err.message == "No Exif data") {
-
-                                }
-                            }
-                        })
-                }
-                catch (err) {
-                    console.log("Error ", err)
-                }
-                finally {
-                    fs.unlink(dest, (err) => {
-                        // file deleted
+        .map((image) => {
+            if (image.indexOf(".jpg") > -1 || image.indexOf(".JPG") > -1) {
+                const buffers = [];
+                return got(image, {encoding: null })
+                    .then((response) => {
+                        console.log("Loading ", image);
+                        return ExifReader.load(response.body);
+                    })
+                    .catch((error) => {
+                        console.log("Error ", error);
                     });
+            }
+        })
+        .filter((tags) => tags ? true : false)
+        .each(input => {
+            const tags = []
+            for (let key in input) {
+                const value = input[key];
+                if (key != "MakerNote"
+                    && key.indexOf("undefined") < 0
+                    && key.indexOf("omment") < 0
+                    && key.indexOf("ersion") < 0) {
+                    tags.push({ name: key, value: value.value, description: value.description })
                 }
             }
+            reply(comment, tags)
         })
         .catch((error) => {
             console.log("Error ", error)
@@ -101,51 +76,23 @@ function reply(comment, tags) {
         tags: tags
     }
 
-
-    return loadTemplate(path.join(__dirname, '..', 'templates', 'exif.hb'))
-    .then((template) => {
-        var templateSpec = Handlebars.compile(template)
-        return templateSpec(context)
-    })
-    .then((body) => {
-        console.log("Body ", body)
-        return body;
-    })
-    .then((body) => {
-        var permlink = 're-' + comment.author 
-            + '-' + comment.permlink 
-            + '-' + new Date().toISOString().replace(/[^a-zA-Z0-9]+/g, '').toLowerCase();
-
+    return new Promise((resolve, reject) => {
         console.log("Replying to ", {author: comment.author, permlink: comment.permlink})
-        return steem.broadcast.commentAsync(
-            wif,
-            comment.author, // Leave parent author empty
-            comment.permlink,
-            user, // Author
-            permlink, // Permlink
-            permlink, // Title
-            body, // Body
-            { "app": "steem-exif-spider-bot/0.1.0" }
-        )
-        .catch((err) => {
-            console.log("Unable to process comment. ", err)
-        })
+        COMMENTS.push({ author: comment.author, permlink: comment.permlink })
+
+        resolve([ comment.author, comment.permlink ])
     })
-    .then((response) => {
-        return steem.broadcast.voteAsync(wif, user, comment.author, comment.permlink, weight)
-            .then((results) =>  {
-                console.log(results)
-            })
-            .catch((err) => {
-                console.log("Vote failed: ", err)
-            })
+    .spread((author, permlink) => {
+        VOTING.push({ author: author, permlink: permlink, weight: weight });
     })
     .catch((err) => {
         console.log("Error loading template ", err)
     })
 }
 
-function execute() {
+function execute(voting, comments) {
+    VOTING = voting
+    COMMENTS = comments
 
     steem.api.streamOperations((err, results) => {
         return new Promise((resolve, reject) => {

Moves the reply and vote logic out of exif.js, which now only pushes work onto the comment and voting queues. Also switches from filesystem streams to in-memory buffers.
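
The tag filtering in the diff above can be read in isolation. The sketch below applies the same conditions to a made-up object; the function name `filterTags` and the sample data are assumptions, shaped like the `{ value, description }` entries the diff reads from the `ExifReader.load` result.

```javascript
// Hypothetical helper with the same filter conditions as exif.js.
function filterTags(input) {
    const tags = []
    for (const key in input) {
        const value = input[key]
        if (key != 'MakerNote'
            && key.indexOf('undefined') < 0
            && key.indexOf('omment') < 0   // skips Comment, UserComment, ...
            && key.indexOf('ersion') < 0) { // skips Version, ExifVersion, ...
            tags.push({ name: key, value: value.value, description: value.description })
        }
    }
    return tags
}

// Made-up sample data for illustration only.
const sample = {
    Make: { value: 'FUJIFILM', description: 'FUJIFILM' },
    MakerNote: { value: [1, 2, 3], description: '[raw]' },
    UserComment: { value: 'hello', description: 'hello' },
    ExifVersion: { value: '0230', description: '0230' }
}
const filtered = filterTags(sample)
```

Here only the `Make` entry survives, since the other three keys match one of the exclusion rules.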

diff --git a/steem-exif-spider-bot/helpers/bot/index.js b/steem-exif-spider-bot/helpers/bot/index.js
index 083546c..64b61ac 100644
--- a/steem-exif-spider-bot/helpers/bot/index.js
+++ b/steem-exif-spider-bot/helpers/bot/index.js
@@ -1,6 +1,27 @@
+const voting_queue = [];
+const comment_queue = [];
+
+const voting = {
+    length: () => { return voting_queue.length },
+    push: (obj) => { return voting_queue.push(obj) },
+    pop: () => { return voting_queue.pop() },
+    shift: () => { return voting_queue.shift() },
+    unshift: (obj) => { return voting_queue.unshift(obj) }
+}
+
+const comments = {
+    length: () => { return comment_queue.length },
+    push: (obj) => { return comment_queue.push(obj) },
+    pop: () => { return comment_queue.pop() },
+    shift: () => { return comment_queue.shift() },
+    unshift: (obj) => { return comment_queue.unshift(obj) }
+}
+
 
 function run() {
-    return require("./exif").execute();
+    require('./comment').execute(comments)
+    require('./vote').execute(voting)
+    require('./exif').execute(voting, comments)
 }

The model for queue management: two arrays wrapped in small queue objects that are shared with the comment, vote, and exif modules.
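
Since the two wrappers are identical apart from the array they close over, one possible refactoring is a small factory. This is only a sketch of an alternative, not what the commit does; `createQueue` is a hypothetical name.

```javascript
// Hypothetical factory producing the same wrapper shape used in index.js.
function createQueue() {
    const items = []
    return {
        length: () => items.length,
        push: (obj) => items.push(obj),
        pop: () => items.pop(),
        shift: () => items.shift(),
        unshift: (obj) => items.unshift(obj)
    }
}

const voting = createQueue()
const comments = createQueue()
```

Each call gets its own private array, so `voting` and `comments` stay independent, and `length` remains a function call just as the other modules expect.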

diff --git a/steem-exif-spider-bot/helpers/bot/vote.js b/steem-exif-spider-bot/helpers/bot/vote.js
new file mode 100644
index 0000000..58714c8
--- /dev/null
+++ b/steem-exif-spider-bot/helpers/bot/vote.js
@@ -0,0 +1,84 @@
+const Promise = require('bluebird')
+const steem = require('steem')
+const { user, wif, weight } = require('../../config')
+const schedule = require('node-schedule')
+const moment = require('moment');
+
+const MINUTE = new schedule.RecurrenceRule();
+MINUTE.second = 1
+
+const SECONDS_PER_HOUR = 3600
+const PERCENT_PER_DAY = 20
+const HOURS_PER_DAY = 24
+const MAX_VOTING_POWER = 10000
+const DAYS_TO_100_PERCENT = 100 / PERCENT_PER_DAY
+const SECONDS_FOR_100_PERCENT = DAYS_TO_100_PERCENT * HOURS_PER_DAY * SECONDS_PER_HOUR
+const RECOVERY_RATE = MAX_VOTING_POWER / SECONDS_FOR_100_PERCENT
+const DEFAULT_THRESHOLD = 9500
+
+
+function current_voting_power(vp_last, last_vote) {
+    console.log("Comparing %s to %s ", moment().utc().add(7, 'hours').local().toISOString(), moment(last_vote).utc().local().toISOString())
+
+    var seconds_since_vote = moment().utc().add(7, 'hours').local().diff(moment(last_vote).utc().local(), 'seconds')
+    return (RECOVERY_RATE * seconds_since_vote) + vp_last
+}
+
+function time_needed_to_recover(voting_power, threshold) {
+    return (threshold - voting_power) / RECOVERY_RATE
+}
+
+function check_can_vote() {
+    return steem.api.getAccountsAsync([ user]).then((accounts) => {
+        if (accounts && accounts.length > 0) {
+            const account = accounts[0];
+            console.log("Voting threshold for %s: %s", user, DEFAULT_THRESHOLD)
+            console.log("Getting voting power for %d %s", account.voting_power, account.last_vote_time)
+            var voting_power = current_voting_power(account.voting_power, account.last_vote_time)
+            if (voting_power > DEFAULT_THRESHOLD) {
+                return true;
+            }
+        }
+        return false;
+    })
+}
+
+function vote(author, permlink, weight) {
+    return steem.broadcast.voteAsync(
+        wif, 
+        user, 
+        author,
+        permlink,
+        weight
+    )
+    .then((results) =>  {
+        console.log("Vote results: ", results)
+        return results;
+    },
+    (err) => {
+        console.log("Vote failed for %s: %s", user, err.message)
+    })
+}
+
+function execute(voting) {
+    schedule.scheduleJob(MINUTE, function() {
+        if (voting.length() < 1) {
+            return {};
+        }
+               
+        const { author, permlink, weight } = voting.shift();
+
+        return check_can_vote().then((can_vote) => {
+            if (can_vote) {
+                vote(author, permlink, weight)
+            }
+            else {
+                voting.push({ author, permlink, weight })
+            }
+        })
+    })
+}
+
+module.exports = {
+    execute
+}

This is the module for the voting queue implementation. It periodically pulls a vote off the queue and casts it if the account's voting power is above the threshold; otherwise the vote goes back on the queue.
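
The recovery arithmetic is easier to follow with numbers. At 20% per day, a full recharge takes 5 days, or 432000 seconds, so recovering 500 points (for example from 9000 to the 9500 threshold) takes 21600 seconds, which is 6 hours. The sketch below reuses the constants from the diff; the camelCase function names and the clamping with `Math.min` and `Math.max` are my additions and are not in the commit.

```javascript
const MAX_VOTING_POWER = 10000
const SECONDS_FOR_100_PERCENT = 5 * 24 * 3600 // 5 days at 20% per day
const RECOVERY_RATE = MAX_VOTING_POWER / SECONDS_FOR_100_PERCENT

// Estimated voting power after `secondsSinceVote`, capped at the maximum
// (the cap is an addition; the commit does not clamp).
function currentVotingPower(vpLast, secondsSinceVote) {
    return Math.min(MAX_VOTING_POWER, vpLast + RECOVERY_RATE * secondsSinceVote)
}

// Seconds until `threshold` is reached, or 0 if it already has been.
function secondsToRecover(votingPower, threshold) {
    return Math.max(0, (threshold - votingPower) / RECOVERY_RATE)
}

const wait = secondsToRecover(9000, 9500) // roughly 21600 seconds
```

A value like `wait` could be used to delay the next attempt instead of re-checking every minute, though the commit keeps the simpler periodic check.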

diff --git a/steem-exif-spider-bot/package.json b/steem-exif-spider-bot/package.json
index 1abbdbe..8024fd2 100644
--- a/steem-exif-spider-bot/package.json
+++ b/steem-exif-spider-bot/package.json
@@ -31,6 +31,7 @@
     "got": "^8.3.0",
     "handlebars": "^4.0.11",
     "jdataview": "^2.5.0",
+    "node-schedule": "^1.3.0",
     "request": "^2.85.0",
     "steem": "^0.7.1",
     "tempfile": "^2.0.0",

Adds the node-schedule dependency so the queues can be run on a schedule.



Posted on Utopian.io - Rewarding Open Source Contributors

Thank you for the contribution.

Great effort, and I like seeing how often you contribute to these projects and utopian.io.

When you write these contributions I would like to think that you know there is at least another human being reading your contribution. I am not sure what function these long diffs achieve for other people, but they are not helpful at all to me.

What I would do, maybe, is at least leave the removed lines out of the post, and consider making the remaining lines more about the domain logic than the lower-level implementation.

You can contact us on Discord.
[utopian-moderator]

Thanks so much. Yeah, the diffs are really lengthy and you can probably just look at the PR, so I don't know what's so great about them. It's just that there was a time a moderator asked me to add them in.

BTW, all lines (I am pretty sure) are related to the features. I think I actually left out a few that I think aren't (package-lock.json). If you see any that you think are not, let me know. I'll be sure to keep a look out in the future.

Thanks for the feedback. I appreciate the chance to improve.

Hey @r351574nc3 I am @utopian-io. I have just upvoted you!

Achievements

  • You have less than 500 followers. Just gave you a gift to help you succeed!
  • Seems like you contribute quite often. AMAZING!

Community-Driven Witness!

I am the first and only Steem Community-Driven Witness. Participate on Discord. Let's GROW TOGETHER!

Up-vote this comment to grow my power and help Open Source contributions like this one. Want to chat? Join me on Discord https://discord.gg/Pc8HG9x

@r351574nc3 @salty-mcgriddles @stranded
Why don't you use APP Engine to create a parser and interface for users to generate their own metadata - posting it directly to comments looks like nothing more than comment spamming for upvotes.

@r351574nc3 @salty-mcgriddles @stranded
I don't know where you learned to code - but you should not be running alpha code in a live environment.

Hehe...it's not even alpha. This is called a/b testing. You sound like you know just enough to know where software comes from, but not at all how it's made. Also, pretty conceited like you think you can do better. You should. Competition is what drives quality, but you already knew that.

Also, private alpha and closed/open beta testing is done on ... live/in-production systems (not test environments).

I mean - adapt your scraper to do what it is doing now (collecting steemit EXIF and formatting it) - but move it to another platform. Then, let a user input their steemit name and present them with pre-formatted exif data which they can link to or include in their posts.

You mean like this one? https://steemit.com/utopian-io/@r351574nc3/new-project-steemit-exif-microservice

let a user input their steemit name

Check (not just a name, but directed content by permlink)

present them with pre-formatted exif data which they can link to or include in their posts.

Check (preformatted JSON, that they can then refine and prune for whatever data they want)

So yes, you could create a very useful and refined service which would be unique to steemit - I think a project like that would have a lot of merit.

Here's your demo https://steemit-exif-ms.herokuapp.com/soma909/le-fond-de-l-air-est-rouge-a-grin-without-a-cat-2017-fuji-x100-mk1-tele-conversion-lens

With some mediocre skills, you can extract exactly what fields you want and simultaneously convert it to markdown to include in a post.
