RE: Response to @dantheman "notice-to-bot-spammers"

in #steemit8 years ago

The thing with a code of conduct is that it is not as rigid as rules. And even when there are rules, people will still break them if the penalty is not high enough.

Regarding bots like cheetah, yeah, this type of automation is nice to have - but I guess the level of implementation is rather fluid. It could be embedded in the page, be displayed by a bot in the comments, or run locally by a user.

Btw, I'm seeing some bypasses for original content detection with slightly broken English. Perhaps someone is using Google Translate from English => other language => English to reorder the text, or some other rewording technique. This would need a more evolved type of cheetah to detect the similarities...

@alexgr Yeah thank you for your comments on my next week's blog posting :) I actually have a much better solution in mind based on contextual fingerprinting algorithms.

Basically a machine translates like a machine. Ergo you can spot machine translations when they occur, and each translator has a way of screwing things up that is absolutely unique to it. For example, try putting "I would like a hotdog please." into Google Translate, then translate to Spanish and watch the hilarity ensue (especially if you show it to someone who actually speaks Spanish).

Now machine translation itself isn't a bad thing. I couldn't survive one day in a foreign country (some parts of the USA too) without tools such as Google Translate.

It does mean you need to look at the conceptual flow through the document, see if anyone has said anything which is conceptually and structurally the same. Identify those documents, and run them through various machine translators to see if you get a strong match.

This is called strong attribution via contextual analysis and knowledge extraction. There isn't a way to cheat this without actually rewriting the entire document yourself first, and at that point it's pretty much the same as writing a term paper.
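
To make the matching step a bit more concrete, here is a rough Python sketch of the "find similar documents, then compare against their machine-translated variants" idea. It is not the fingerprinting algorithm mentioned above, which isn't spelled out here. It swaps in a plain bag-of-words cosine similarity, chosen only because it ignores word order and so survives simple reordering. The names `cosine`, `best_match`, and `round_trips` are made up for illustration, and `round_trips` is assumed to be a list of functions you supply yourself that wrap whatever translators you want to test.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity; word order is ignored."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(suspect, candidates, round_trips=()):
    """Return (score, document) for the candidate closest to `suspect`.

    Each candidate is compared as-is and after every hypothetical
    round-trip function in `round_trips` (e.g. English -> other -> English).
    """
    best_score, best_doc = 0.0, None
    for doc in candidates:
        variants = [doc] + [rt(doc) for rt in round_trips]
        score = max(cosine(suspect, v) for v in variants)
        if score > best_score:
            best_score, best_doc = score, doc
    return best_score, best_doc
```

A reordered copy such as "please a hotdog I would like" scores close to 1.0 against "I would like a hotdog please." with this measure, while unrelated text scores near 0. Swapped synonyms would still slip past it, which is where comparing against the translators' own output is supposed to help.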

But umm that's next week's blog, so I hope you'll stop back by and comment on this then.

Looking forward to it :)
