The 5 Rules of a Valid Username on the Steem Blockchain (and a 3 SBD contest to make an account name validation RegEx)

in #programming, 6 years ago (edited)

The rules

The rules for a valid account name on the Steem blockchain are not as simple as you might think. They have some quirks that I did not know about until I actually read the code that validates each username at account creation.

Here's Steem's code if you want to read it

I had to read it a few times, since I don't actually know C++ (the programming language in which the validation code is written), and I will share my findings.

1. Each part around a period (.) is a name segment


The account name validation does not read the name as a whole. It divides the name into segments separated by periods (.).

The following rules apply mostly to each segment rather than to the whole account name.
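As a small illustration of rule 1, here is a minimal Python sketch. The helper name `split_segments` is hypothetical and is not taken from Steem's code.

```python
def split_segments(name):
    """Split an account name into its period-separated segments."""
    # str.split keeps empty strings, so "abc..def" yields an empty middle
    # segment that the later rules can reject.
    return name.split(".")


print(split_segments("cryptosharon"))  # ['cryptosharon']
print(split_segments("cry.pto"))       # ['cry', 'pto']
```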

2. Each segment must be at least 3 characters long

Valid          Invalid
cryptosharon   cr
cry.pto        sh.ar
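A minimal sketch of the length check in rule 2, assuming Python. The constant and function names are made up for illustration.

```python
MIN_SEGMENT_LENGTH = 3  # rule 2: each segment needs at least 3 characters


def segments_long_enough(name):
    """Return True when every period-separated segment has 3+ characters."""
    return all(len(seg) >= MIN_SEGMENT_LENGTH for seg in name.split("."))


print(segments_long_enough("cryptosharon"))  # True
print(segments_long_enough("cr"))            # False
```

Note that the check runs per segment, so a long name can still fail if any one segment is too short.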

What's the maximum length?

The maximum length of a segment doesn't seem to be defined within the code.

However, Steemit enforces a 16-character maximum on the whole account name, and the test code seems to assume that the blockchain's maximum would be 63 or 64.


3. Each segment must begin with a letter (a-z, English alphabet) and end with a letter or a number (0-9)


Valid           Invalid
cryptosharon9   9cryptosharon
cry.pto9        9sh.aron
c-r.y-p.t-o9    sha.9ron
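Rule 3 only looks at the first and last character of a segment. A minimal Python sketch, with hypothetical names, could be:

```python
import string

LETTERS = set(string.ascii_lowercase)              # a-z
LETTERS_OR_DIGITS = LETTERS | set(string.digits)   # a-z plus 0-9


def segment_ends_ok(segment):
    """Rule 3: first character is a letter, last is a letter or a digit."""
    if not segment:
        return False
    return segment[0] in LETTERS and segment[-1] in LETTERS_OR_DIGITS


print(segment_ends_ok("cryptosharon9"))  # True
print(segment_ends_ok("9cryptosharon"))  # False
```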

4. All letters contained in a segment must be lowercase

Valid          Invalid
cryptosharon   CryptoSharon

5. Hyphens (-) must be accompanied side by side by letters or numbers


This means no double hyphens. Hyphens can't be at the beginning or end of a segment either, because of rule 3.

Valid          Invalid
cry-9-pto-5    sh--aron
s-h-a-r-o-n9   crptshrn-
a12.b-3        otp-.yrc
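The hyphen rule can be sketched by looking at the neighbours of each hyphen. This is a Python illustration with a hypothetical function name, not the blockchain's own check.

```python
import string

ALNUM = set(string.ascii_lowercase + string.digits)  # a-z and 0-9


def hyphens_ok(segment):
    """Rule 5: every hyphen has a letter or digit directly on both sides."""
    for i, ch in enumerate(segment):
        if ch != "-":
            continue
        # A hyphen at either edge is missing a neighbour on one side.
        if i == 0 or i == len(segment) - 1:
            return False
        if segment[i - 1] not in ALNUM or segment[i + 1] not in ALNUM:
            return False
    return True


print(hyphens_ok("s-h-a-r-o-n9"))  # True
print(hyphens_ok("sh--aron"))      # False
```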

That's it!

These rules are weird sometimes, and some tidbits are unclear, like the maximum length, but follow them and don't go beyond 16 characters and you'll be ok.
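Putting the five rules together, a minimal Python sketch might look like the following. The function names and the 16-character constant are assumptions based on the description above. This is not Steem's C++ code or Condenser's validator.

```python
import string

LETTERS = set(string.ascii_lowercase)
ALNUM = LETTERS | set(string.digits)
MAX_NAME_LENGTH = 16  # assumption: the Steemit limit mentioned above


def is_valid_segment(seg):
    if len(seg) < 3:                                    # rule 2
        return False
    if seg[0] not in LETTERS or seg[-1] not in ALNUM:   # rule 3
        return False
    previous = ""
    for ch in seg:
        if ch == "-":
            if previous == "-":                         # rule 5: no "--"
                return False
        elif ch not in ALNUM:                           # rule 4: lowercase only
            return False
        previous = ch
    return True


def is_valid_account_name(name):
    if len(name) > MAX_NAME_LENGTH:
        return False
    # rule 1: validate each period-separated segment on its own
    return all(is_valid_segment(seg) for seg in name.split("."))


print(is_valid_account_name("a12.b-3"))   # True
print(is_valid_account_name("otp-.yrc"))  # False
```

Because rule 3 already rejects a hyphen at either end of a segment, the loop only has to look for two hyphens in a row.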

The Contest



I wanted to make a RegEx to validate the names, but I couldn't find a way to make sure it didn't grab double hyphens while making the number of hyphens unlimited and optional. Since I really want to have a RegEx for this, I decided to make a contest for it.

This is the way Steemit does it (Condenser GitHub repo).

It simply iterates over the characters and checks them with individual regexes, but that's no fun.

I hope a single regex is possible.

If you manage to do it, post it in the comment section. I'll give you 3 SBD :3

And if anyone else wants to give the person more money, that's ok, but I'm poor enough already that this contest seems like a crazy expenditure.

The rules for the contest are simple (more rules!):

  • 1 regex
  • 1 week
  • I'll pick the best (if there is one)

If you're up for the challenge

Don't go alone! Take these resources:

  • RegExr - Test your regexes and learn RegExp the easy way
  • RegEx101 - A more in-depth learning tool that also has a tester



What did you think of this guide?
Do you think the regex is possible?

Leave me a vote and a comment (and maybe resteem)


Edit:

Best regex so far (by @eonwarped):

  • ^[a-z](-[a-z0-9](-[a-z0-9])*)?(-[a-z0-9]|[a-z0-9])*(?:\.[a-z](-[a-z0-9](-[a-z0-9])*)?(-[a-z0-9]|[a-z0-9])*)*$
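The pattern can be tried against the examples from the rules with a short Python sketch. The helper name `accepts` is hypothetical. Tracing it by hand suggests the pattern handles hyphens correctly but does not enforce the 3-character minimum from rule 2, since a two-letter name such as cr still matches.

```python
import re

# The pattern exactly as posted in the edit above
BEST_SO_FAR = re.compile(
    r"^[a-z](-[a-z0-9](-[a-z0-9])*)?(-[a-z0-9]|[a-z0-9])*"
    r"(?:\.[a-z](-[a-z0-9](-[a-z0-9])*)?(-[a-z0-9]|[a-z0-9])*)*$"
)


def accepts(name):
    return BEST_SO_FAR.match(name) is not None


for name in ["s-h-a-r-o-n9", "a12.b-3", "sh--aron", "crptshrn-", "CryptoSharon", "cr"]:
    print(f"{name!r}: {accepts(name)}")
# The first two print True, the next three False, and 'cr' prints True
```

A separate length check per segment, or a lookahead inside the pattern, would be needed to cover rule 2.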

I would first use a "maxlength" command (every language has one) to find if the length has been violated. I would then use a split command to split the segments based on the period (.) then use this regular expression to verify each segment:

/^[a-z][a-z0-9\-][a-z0-9]+$/

This is untested and might need tweaking but it fits all your rules.

Oh, I could use some code for sure. Maybe I am too much of a dreamer, but I really want a regexp that kinda checks it completely. But in the absence of any other, I could use yours after some tweaks :)

Your dash is bound to the second place, so it only validates dashes if they're the second character.
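That behaviour is easy to reproduce with a small Python sketch (the helper name is hypothetical, and the pattern is the per-segment one proposed above):

```python
import re

# Only the second character class allows "-", so a hyphen anywhere else fails
SECOND_PLACE_ONLY = re.compile(r"^[a-z][a-z0-9\-][a-z0-9]+$")


def accepts(segment):
    return SECOND_PLACE_ONLY.match(segment) is not None


print(accepts("a-bc"))  # True: hyphen is the second character
print(accepts("ab-c"))  # False: a valid segment, but the hyphen is third
```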

Didn't test this yet but it's pretty so I'm going to draw it.

. (-.)? (.-.|.)* (.-)? .

Just as an idea for dealing with the hyphen.

This one matches pretty much anything :P

And with a variant that has [a-z] at the beginning and numbers at the end, accepting dots in the middle, there is a bit more detection.

[a-z](-[a-z0-9.])?([a-z0-9.]-[a-z0-9.]|[a-z0-9.])*([a-z0-9.]-)?[a-z0-9]

Oh I was already operating under the assumption that you split the segments. The purpose of this was to give an idea of how to handle hyphens. Of course you should replace the dots with the appropriate character classes :)

It also accepts two characters, so that's not ideal. Maybe change the middle * to a +

Oh!

Then we're only missing the alternating hyphen and it's done.

^[a-z](-[a-z0-9])?([a-z0-9]-[a-z0-9]|[a-z0-9])*([a-z0-9]-)?[a-z0-9](?:\.[a-z](-[a-z0-9])?([a-z0-9]-[a-z0-9]|[a-z0-9])*([a-z0-9]-)?[a-z0-9])*$
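The alternating-hyphen gap can be reproduced with a short Python sketch. The helper name is hypothetical, and the pattern is copied from the comment above. By my reading, hyphens work when at least two characters sit between them, single characters between hyphens are rejected, and two-character names still get through.

```python
import re

SEG = r"[a-z](-[a-z0-9])?([a-z0-9]-[a-z0-9]|[a-z0-9])*([a-z0-9]-)?[a-z0-9]"
CANDIDATE = re.compile(r"^" + SEG + r"(?:\." + SEG + r")*$")


def accepts(name):
    return CANDIDATE.match(name) is not None


print(accepts("ab-cd-ef"))  # True
print(accepts("a-b-c"))     # False, although the rules allow it
print(accepts("sh--aron"))  # False
print(accepts("ab"))        # True, although rule 2 forbids it
```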

More tests:

How about

. (-.(-.)*)? (.-.|.)* ((.-)*.-)? .

Alternating hyphens wouldn't work either

Further testing:

That gets dangerous as the more complicated the regex the more unpredictable the behaviour. I think my solution is actually the most robust method. But some people like a long regex.

You're right! And I absolutely agree. The problem is that I'll be working in different setups and I think that the most portable method is regex. It can just be copied into any test and it will also allow for capturing and detection of certain parts of posts that are supposed to be account names.

A longer test such as Steemit's would be ideal for many things and I think that you're right about it being more robust and solid, but it's at the same time more complicated and time-consuming.

Hmm. You are correct that the dash poses a logistical problem but looking at it closer it just simply does not prevent a double dash. Back to the drawing board.

These are some tests I wrote: https://pastebin.com/7dpgMiZC

You can check for them on RegExr

You can include the split in the regex like this:

/^regex(?:\.regex)*$/
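As a sketch of that composition in Python, the per-segment pattern below is an assumption of mine rather than anything posted in this thread: a lookahead demands at least three non-period characters, and `-?[a-z0-9]` lets a hyphen appear only directly before a letter or digit. `re.fullmatch` plays the role of the `^` and `$` anchors.

```python
import re

# Hypothetical per-segment pattern: 3+ characters, starts with a letter,
# and every hyphen is immediately followed by a letter or digit.
SEGMENT = r"(?=[^.]{3})[a-z](?:-?[a-z0-9])*"

# Same shape as ^regex(?:\.regex)*$ with SEGMENT substituted for "regex"
ACCOUNT_NAME = re.compile(SEGMENT + r"(?:\." + SEGMENT + r")*")


def accepts(name):
    return ACCOUNT_NAME.fullmatch(name) is not None


print(accepts("c-r.y-p.t-o9"))  # True
print(accepts("sh.ar"))         # False
print(accepts("otp-.yrc"))      # False
```

The 16-character limit is left out here and would still need a separate length check.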

Sure. But here's a solution in Perl with very few headaches. I can also write it in C++, C, Python or Javascript rather painlessly. I can write it in any other language you prefer with a little bit of reference work.

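# Returns 1 if $username is a valid account name, 0 otherwise.
sub is_valid_username {
    my ($username) = @_;
    # Shortest possible name is one 3-character segment; 16 is Steemit's limit.
    return 0 if length($username) < 3 || length($username) > 16;
    # The -1 limit keeps trailing empty segments, so "abc." is rejected below.
    my @segments = split(/\./, $username, -1);
    foreach my $seg (@segments) {
        # At least 3 characters: a letter first, a letter or digit last.
        return 0 if $seg !~ /^[a-z][a-z0-9\-]+[a-z0-9]\z/;
        # No two hyphens in a row.
        return 0 if $seg =~ /--/;
    }
    # Only report success after every segment has passed.
    return 1;
}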


Interesting, I didn't know all of this, thank you for the information, Sharon.

I've been hearing on the steem subreddit that creating new accounts is taking ages, that's a bummer. Good info tho!

It is! I told my friend a week ago to register and just today I got his follow notification (and another friend also got his account today).

yeah it took me over a week

Mine took 30 minutes beginning January :PPP But my friends at that moment also had to wait weeks for their accounts to be approved.

Great work and post Sharon, I really liked that you offered 3 SBD too. Very cool!

Can see your hard work and processing. I liked eon’s added help too.

Good job guys and gals!

Hey everyone,

I'm testing something here...can I crowdsource people's ideas and brains to find innovative solutions to problems? With the price of SBD and Steem right now...an upvote at 100% from me is almost worth $1000 US. Seems like a good enough incentive to get people thinking, right?

This seems to be as close as it gets to being able to create custom bounties using my vote. You are welcome to participate and pitch in your ideas. If I find one particularly insightful, I might very well give a generous upvote!

The Problem
Right now I'm getting frustrated visiting the #photography tag. I'm working with friends to remedy the situation but I would like to crowdsource your brain for solutions.

It seems more like you cannot be trusted. Your vote is not worth $1000; it is worth more like $0.005.
If you lie about the value of your vote, what else do you lie about?

I see from your wallet you are still using the delegated SP you were given at the start 4.794 STEEM
(+10.345 STEEM) I tell you what. You give me an upvote now of good value and I will give you insightful ideas on any topic you choose. Alternative solution to those you think of. Prove your worth.

He's a plagiarist. He copies and pastes without reading either the post or the thing he copies and pastes. I reported him and all his posts to Steemcleaners. He'll be a goner in 3 days, hopefully never to be seen again.

Oh, he's also an identity thief. I had forgotten.

Wow this was intense, even for me! Great work Jan and Sharon!

Please don't spam my posts...
