Introducing Steem.Cloud - A (near) Realtime Word Cloud Experiment - v0.0.1 Alpha "PB&J" Release

in #steemit, 8 years ago (edited)

Inspired by the subtle majesty of @trogdor and @team's word clouds, and the realtime goodness of @roelandp's SteamStreem, I couldn't get my mind off the possibility of a realtime Steem-driven Word Cloud that kisses each passing block, extracting, scrubbing, sanitizing and analyzing the macro-discussion that is the product of the Steemian experience.

Yeah, just like that

So for almost two weeks my computer has been an extension of my body, each idle moment replaced by lines of code and "Sleep".replace(/^Sleep$/g, "Can't Stop, Won't Stop").

The struggle is real.

Memory leaks, non-blocking induced race conditions and that tiny voice in the back of my head telling me to go to sleep, watch Mr. Robot or do anything other than stare blankly with bloodshot eyes at code that is begging to be refactored.

And while the server-trolls have not let me off easy, and it's far from the perfect flower I had envisioned, it has to be liberated from the clutches of my perfectionism at some point. Without further ado, I present the alpha release of Steem.Cloud 0.0.1 "PB&J".

tl;dr -> The Takeaway, and some Release Notes

Never has a blockchain been so accessible to me. For this I would like to thank @xeroc, because this.piston.rocks, and without it I would have been stuck chasing a runaway steemd with no time to focus on the tasks at hand. I would also like to thank svk31 (not sure if you're on this site) for steem-rpc, for helping me out with an issue, and for introducing me to some ES6 patterns. Many thanks to those mentioned at the top of this article for the inspiration, and to the developers at d3, Node, Engine.io and many more. A tip of the hat to the developers here who are powering through the learning curve, adapting to new methodologies and applying themselves to new and better patterns.

The Road Ahead

If any of you have worked with roadmaps before, you know how often they change, so I'm going to keep this one as loose as possible and abstain from release dates.

  • 0.0.X "PB&J" Optimization and Refactoring [optimization]
  • 0.1.X "Caprese" GUI/UX/More Realtime-ish (Beta?) [design]
  • 0.2.X Statistics, Further Analysis, Historical Snapshots (more energy efficient than historical processing ... after all, why redo the analysis if it is already complete?) [design, interactivity]
  • 0.3.X 100% Client-Side Version, probably not realtime yet, maybe not ever... [interactivity]
  • 0.4.X Account/Topic Filtering, not realtime... [interactivity, optimization]
  • 0.5.X Secret sauce
  • 0.?.X TBA
  • 1.0.0

When the code is acceptable I will git push origin master and publish the link here for all to hack away at, and hopefully contribute back to.

Wait, why didn't you do a browser-side version first?

Because, admittedly, I was being a bit selfish: I realllllly wanted to see a realtime cloud and to figure out the problems that come along with it. The biggest of these is processing everything fast enough before the next block arrives. Escaping the restrictive sensitivities of browsers and pushing those challenges onto server-side V8 (Node) gave me the breathing room to move forward and complete the proof of concept.
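One way to keep the per-block work small is to maintain running totals over a sliding window of recent blocks, so each new block only costs as much as its own words. The sketch below is an illustration of that idea, not Steem.Cloud's actual code; the class name `WindowedCounts` and its methods are hypothetical.

```javascript
// Hypothetical sketch: running word totals over the last `maxBlocks` blocks.
class WindowedCounts {
  constructor(maxBlocks) {
    this.maxBlocks = maxBlocks;
    this.blocks = [];        // one Map<word, count> per block, oldest first
    this.totals = new Map(); // running totals across the whole window
  }

  // Add the (already sanitized) words extracted from one block.
  addBlock(words) {
    const blockCounts = new Map();
    for (const w of words) {
      blockCounts.set(w, (blockCounts.get(w) || 0) + 1);
    }
    for (const [w, n] of blockCounts) {
      this.totals.set(w, (this.totals.get(w) || 0) + n);
    }
    this.blocks.push(blockCounts);
    if (this.blocks.length > this.maxBlocks) this.evictOldest();
  }

  // Subtract the oldest block's counts instead of recounting the window.
  evictOldest() {
    const old = this.blocks.shift();
    for (const [w, n] of old) {
      const left = this.totals.get(w) - n;
      if (left > 0) this.totals.set(w, left);
      else this.totals.delete(w);
    }
  }

  // The k most frequent words currently in the window.
  top(k) {
    return [...this.totals.entries()]
      .sort((a, b) => b[1] - a[1])
      .slice(0, k);
  }
}
```

Calling `addBlock` once per incoming block and `top(k)` whenever the cloud redraws keeps the redraw independent of how much history the window holds.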

I am 95% there, and I promise, once complete, a 100% client-side version for generating Steem clouds with a variety of filters and customizations will follow. Some of this is already at work in another branch, which will be merged into master for 0.1, but it is presently not compatible with the alpha release, so you're going to have to wait.

Known Issues

  • Text sanitization suffers from memory heap issues and overlapping processes from separate workers, resulting in lost text. This is a result of overly flexible content parameters in Steem, i.e. it accepts bad/non-standard code that requires custom scraping techniques (high priority)
  • Slow startup.
  • <IE9: You may be frustrated when your browser crashes.
  • All browsers: the same can happen when the tab is left open for an extended period of time.
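To give the sanitization issue above some shape, here is a minimal sketch of scrubbing a post body down to countable words. It is not the code Steem.Cloud runs: the function names, the tiny stop-word list and the regular expressions are all assumptions made for illustration, and real post bodies will need more cases than these.

```javascript
// Hypothetical helpers; the names and the stop-word list are illustrative only.
const STOP_WORDS = new Set(["the", "and", "a", "to", "of", "in", "is", "it"]);

function sanitize(body) {
  return body
    .replace(/<[^>]*>/g, " ")                   // drop HTML tags
    .replace(/!?\[([^\]]*)\]\([^)]*\)/g, "$1")  // keep markdown link/image text, drop the URL
    .replace(/https?:\/\/\S+/g, " ")            // drop bare URLs
    .toLowerCase();
}

function tokenize(text) {
  // \p{L} matches letters in any script, so non-English words survive.
  const words = text.match(/[\p{L}\p{M}]+/gu) || [];
  return words.filter((w) => w.length > 1 && !STOP_WORDS.has(w));
}

function countWords(body) {
  const counts = new Map();
  for (const word of tokenize(sanitize(body))) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}
```

Because every function here takes a string and returns a new value, separate workers can run it without sharing state, which is one way to avoid the overlapping-process problem described in the list.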

Could use help with...

  • Logo/Favicon
  • Advice on best practices for querying content specific to a user and topics ... I have dug through the API and have been unable to determine a solid pattern for this.
  • Feedback, both negative and positive, and your ideas! I'll add everything within reason to the list.
  • Opinions on the fastest language for text sanitization and processing (please don't say lisp)

Version Agnostic Todo List

  • Credits (For Library Authors)
  • Better Compression
  • General Design UX, Color Scheme (Feel free to suggest a scheme!)

Closing Statements and Lessons

There's more to Steem than meets the eye

  • While this community stems from Cryptocurrency, observing the cloud over the past week has shown me an otherwise hidden diversity. Don't take my word for it: take a screenshot, and then come back to it in a couple of hours.
  • Contrary to popular belief, whales and their relative positioning to minnows aren't that big of a topic.
  • English isn't the only language on Steem. As the earth spins through the different time zones this becomes self-evident. For example, خواب was dominating the cloud during much of my development. It's Parsi for Dream. Chinese, Korean, Ukrainian and Russian are other languages I have observed, and each has claimed notable dominance at some time or another.

Steem is a living entity

It's a living, breathing organism for which you and I compose the anatomy that gives it purpose. Remember, Steem is what you make of it and the real value is in your participation. So I'm going to keep that in my mind and heart, while I work towards a better Sandwich that doesn't get stuck at the top of your mouth...

See you at Caprese!

Follow Me for updates :)

Without further ado, I present the alpha release of Steem.Cloud 0.0.1 "PB&J".

I think there are 2 important questions no one is asking

  • CRUNCHY or CREAMY ??
  • GRAPE or STRAWBERRY??

All kidding aside, very well laid out post and interesting project you have. I like the idea of it being "alive" and constantly changing as keywords are used rather than the 30 Second Refresh it currently has. I'd love to see it constantly changing and moving.

Thanks for the feedback!

I would personally opt for creamy peanut butter with Fig Jam, with some chopped baked bacon and a small amount of ground black pepper as a bonus, on pan-fried sourdough dashed with grated parmesan on top, cut diagonally. However, given the alpha state of Steem.Cloud, it's probably closer to crunchy with grape jam on wonderbread ;)

Indeed, I wanted to touch more on this, but was afraid to make promises in this area. In the v0.1.0 branch I am working on realtime cloud adjustments; however, I have to do some icky tricks to make it work.

  • I have to impose superfluous padding to avoid constant collisions and readjustments. This affects the aesthetic of the cloud more than I am comfortable with.
  • Infinite loops: once the size of one object changes, the positioning of another object needs to change, which then basically just continues indefinitely, making the Cloud impossible to read. It's very much alive, as you put it, but more of an alive blur cloud. I have some solutions to create the effect of realtime updates without the adjustment of size or positioning and I touch more on that below.

These problems stem from the algorithms that build word clouds. More from the creator of Wordle:

The hard part is in doing the intersection-testing efficiently, for which I use last-hit caching, hierarchical bounding boxes, and a quadtree spatial index (all of which are things you can learn more about with some diligent googling).

The three algorithmic patterns mentioned are all heavily dependent on pre-placement calculations, which implies that the calculations for each element need to occur at the same time. They are all interconnected.
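For a feel of what the intersection-testing involves, here is a deliberately naive sketch: an axis-aligned bounding-box overlap test with padding, plus a placement loop that checks a candidate against every box already placed. It is not Wordle's or d3.layout.cloud.js's implementation. The function names and the spiral search are assumptions for illustration, and the linear scan stands in for the quadtree and hierarchical bounding boxes that make the real thing efficient.

```javascript
// Boxes are { x, y, width, height } with (x, y) as the top-left corner.
// Two boxes "collide" if they overlap or sit closer than `padding` apart.
function overlaps(a, b, padding = 0) {
  return (
    a.x < b.x + b.width + padding &&
    b.x < a.x + a.width + padding &&
    a.y < b.y + b.height + padding &&
    b.y < a.y + a.height + padding
  );
}

// Hypothetical placement loop: walk outward along a spiral from the box's
// starting position until it clears everything placed so far.
function place(box, placed, padding = 0, maxSteps = 2000) {
  for (let step = 0; step < maxSteps; step++) {
    const angle = 0.3 * step;
    const radius = 1.5 * angle;
    const candidate = {
      ...box,
      x: box.x + radius * Math.cos(angle),
      y: box.y + radius * Math.sin(angle),
    };
    // Linear scan over all placed boxes; a spatial index would replace this.
    if (!placed.some((p) => overlaps(candidate, p, padding))) {
      placed.push(candidate);
      return candidate;
    }
  }
  return null; // gave up: no free spot found within maxSteps
}
```

The sketch also shows why resizing one word is so disruptive: every position in `placed` was chosen against the boxes that existed at the time, so a box that grows can invalidate its neighbours, which then need placing again.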

Is it impossible? I doubt it. I am presently modifying the d3.layout.cloud.js library, and I am making some progress. I have some fallback plans to make the cloud feel more alive, but those will have to wait until I have either succeeded in the initial mission, determined that it is not feasible in the browser, or determined that it is beyond my mathematical capabilities at this moment in time.

Beyond the cloud I have an alternative app I am considering building that would not take me very long, and would satiate your and likely many others' desires. But it will have to wait until after Caprese.

First things first: I need to clean up Steem.Cloud and make it more beautiful :)

Released v0.0.20, it's alive... ;). Let me know what you think @stoner19

Great. Now I'm craving caprese...

So you did all this to make a bot that just posts the most popular words at that time and rake in the big bucks, right?

Not quite. I have a fascination with language and its implications on a macro level. I've always had a fetish for word clouds, and wanted to get involved with Blockchain related development, but the bar has always been too high for my capabilities. This experiment is the equivalent of me dipping my toes into the water.

Greetings! Very useful advice in this particular post! It's the little changes that make the most important changes. Thanks for sharing!
