RE: The 5 Rules of a Valid Username on the Steem Blockchain (and a 3 SBD contest to make an account name validation RegEx)

in #programming, 7 years ago (edited)

I would first use a length function (every language has one) to check whether the maximum length has been exceeded. I would then split the name into segments on the period (.) and verify each segment with this regular expression:

/^[a-z][a-z0-9\-][a-z0-9]+$/

This is untested and might need tweaking, but it fits all your rules.

Oh, I could use some code for sure. Maybe I am too much of a dreamer, but I really want a regexp that kinda checks it completely. But in the absence of any other, I could use yours after some tweaks :)

Your dash is bound to the second position, so it only accepts a dash when it is the second character.
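
To make the point about the dash concrete, here is a quick check in Python of the per-segment pattern quoted above, unchanged. The sample segments are made up purely to probe where a dash is allowed; they are not from the original post.

```python
import re

# The per-segment pattern from the comment above, copied as-is.
SEGMENT = re.compile(r"^[a-z][a-z0-9\-][a-z0-9]+$")

# Hypothetical sample segments, chosen only to move the dash around.
samples = ["abc", "a-bc", "ab-c", "abc-d", "a--b"]

results = {s: bool(SEGMENT.match(s)) for s in samples}
for name, ok in results.items():
    print(f"{name!r}: {'match' if ok else 'no match'}")
```

Only the second character may be a dash here, so a segment like ab-c is rejected even though nothing else about it looks wrong.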

Didn't test this yet but it's pretty so I'm going to draw it.

. (-.)? (.-.|.)* (.-)? .

Just as an idea for dealing with the hyphen.

This one matches pretty much anything :P

And with a variant that has [a-z] at the beginning and [a-z0-9] at the end, accepting dots in the middle, there is a bit more detection.

[a-z](-[a-z0-9.])?([a-z0-9.]-[a-z0-9.]|[a-z0-9.])*([a-z0-9.]-)?[a-z0-9]

Oh I was already operating under the assumption that you split the segments. The purpose of this was to give an idea of how to handle hyphens. Of course you should replace the dots with the appropriate character classes :)

It also accepts two characters, so that's not ideal. Maybe change the middle * to a +
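
A rough way to compare the two quantifiers, sketched in Python. It assumes the dots in the drawing are replaced with [a-z0-9] and the first character with [a-z]; the helper name segment_pattern and the sample names are made up for illustration.

```python
import re


def segment_pattern(quantifier: str) -> re.Pattern:
    """Build the hyphen sketch with the given quantifier on the middle group."""
    return re.compile(
        r"[a-z](-[a-z0-9])?([a-z0-9]-[a-z0-9]|[a-z0-9])"
        + quantifier
        + r"([a-z0-9]-)?[a-z0-9]"
    )


star = segment_pattern("*")
plus = segment_pattern("+")

# Hypothetical sample segments.
samples = ["ab", "abc", "ab-cd", "a-bc"]

# Maps each sample to (matches with *, matches with +).
results = {
    s: (bool(star.fullmatch(s)), bool(plus.fullmatch(s))) for s in samples
}
for name, (with_star, with_plus) in results.items():
    print(f"{name!r}: *={with_star} +={with_plus}")
```

With + the two-character name is rejected, but so is a-bc: the middle group is now forced to consume the character that the final class needs. So the change is worth testing before relying on it.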

Oh!

Then we're only missing the alternating hyphen and it's done.

^[a-z](-[a-z0-9])?([a-z0-9]-[a-z0-9]|[a-z0-9])*([a-z0-9]-)?[a-z0-9](?:\.[a-z](-[a-z0-9])?([a-z0-9]-[a-z0-9]|[a-z0-9])*([a-z0-9]-)?[a-z0-9])*$
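
Here is a small Python sketch that runs the regex above against a few made-up names, mainly to show the dot handling and the alternating-hyphen gap. The sample names are assumptions for illustration, not the test cases from the original post.

```python
import re

# The full regex from the comment above, copied as-is.
FULL = re.compile(
    r"^[a-z](-[a-z0-9])?([a-z0-9]-[a-z0-9]|[a-z0-9])*([a-z0-9]-)?[a-z0-9]"
    r"(?:\.[a-z](-[a-z0-9])?([a-z0-9]-[a-z0-9]|[a-z0-9])*([a-z0-9]-)?[a-z0-9])*$"
)

# Hypothetical sample names.
samples = ["alice", "alice.bob", "my-name", "a--b", "a-b-c", ".alice", "alice."]

results = {s: bool(FULL.match(s)) for s in samples}
for name, ok in results.items():
    print(f"{name!r}: {'match' if ok else 'no match'}")
```

Double dashes and stray dots are rejected as intended, but a-b-c is rejected too. That is the alternating-hyphen case still missing.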

More tests:

How about

. (-.(-.)*)? (.-.|.)* ((.-)*.-)? .

Alternating hyphens wouldn't work either

Further testing:

How about

. (-.(-.)*)? (.-.|.)*

It's overeager in matching that last character. I should actually put your test cases into a tester before throwing this out there haha... I'll do that next time I swear.

That gets dangerous: the more complicated the regex, the more unpredictable its behaviour. I think my solution is actually the most robust method. But some people like a long regex.

You're right! And I absolutely agree. The problem is that I'll be working in different setups and I think that the most portable method is regex. It can just be copied into any test and it will also allow for capturing and detection of certain parts of posts that are supposed to be account names.

A longer test such as Steemit's would be ideal for many things and I think that you're right about it being more robust and solid, but it's at the same time more complicated and time-consuming.

Hmm. You are correct that the dash poses a logistical problem, but looking at it more closely, it simply does not prevent a double dash. Back to the drawing board.

These are some tests I wrote: https://pastebin.com/7dpgMiZC

You can check them on RegExr.

You can include the split in the regex like this:

/^regex(?:\.regex)*$/
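
As a sketch of that composition in Python: write the per-segment pattern once as a string and build the whole-name pattern from it. The SEGMENT pattern below is a hypothetical placeholder, not one proposed in this thread, and it does not enforce any length limits; swap in whichever segment regex you settle on.

```python
import re

# Hypothetical per-segment pattern, used only to show the composition:
# a letter, then letters/digits, with single dashes allowed between runs.
SEGMENT = r"[a-z][a-z0-9]*(?:-[a-z0-9]+)*"

# Same shape as /^regex(?:\.regex)*$/ above.
FULL = re.compile(rf"^{SEGMENT}(?:\.{SEGMENT})*$")

# Hypothetical sample names.
samples = ["alice", "alice.bob", "a-b-c", "a--b", "alice..bob", "alice."]

results = {s: bool(FULL.match(s)) for s in samples}
for name, ok in results.items():
    print(f"{name!r}: {'match' if ok else 'no match'}")
```

Building the pattern from one string keeps the two copies of the segment regex from drifting apart when one of them gets edited.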

Sure. But here's a solution in Perl with very few headaches. I can also write it in C++, C, Python or JavaScript rather painlessly. I can write it in any other language you prefer with a little bit of reference work.

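use strict;
use warnings;

# Returns 1 if $username passes the checks, 0 otherwise.
sub is_valid_username {
    my ($username) = @_;
    my $l = length($username);
    return 0 if $l < 3 || $l > 16;

    # A limit of -1 keeps trailing empty segments, so "abc." is rejected below.
    my @segments = split(/\./, $username, -1);
    foreach my $seg (@segments) {
        # Reject two dashes in a row.
        return 0 if $seg =~ /--/;
        # Letter first, letter or digit last, at least three characters.
        return 0 if $seg !~ /^[a-z][a-z0-9\-]+[a-z0-9]$/;
    }
    return 1;    # every segment passed
}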
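
For comparison, here is a rough Python version of the same split-and-check idea. The rules it encodes are inferred from this thread rather than taken from an official spec, and the names MAX_LENGTH, SEGMENT and is_valid_username are made up for this sketch.

```python
import re

# Assumed rules, inferred from the discussion above:
# - at most 16 characters overall;
# - each dot-separated segment starts with a letter, ends with a letter
#   or digit, and has at least 3 characters;
# - no two dashes in a row.
MAX_LENGTH = 16
SEGMENT = re.compile(r"[a-z][a-z0-9-]+[a-z0-9]")


def is_valid_username(username: str) -> bool:
    """Split on dots and check every segment; all of them must pass."""
    if len(username) > MAX_LENGTH:
        return False
    for segment in username.split("."):
        if "--" in segment:
            return False
        if SEGMENT.fullmatch(segment) is None:
            return False
    return True


# Hypothetical sample names.
for name in ["alice", "alice.bob", "a-b-c", "ab", "a--b", "abc."]:
    print(f"{name!r}: {is_valid_username(name)}")
```

The function only returns True after the loop has finished, so a valid first segment cannot hide an invalid later one.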
