Programming Diary #4: Learning about aesthetics; Document Listeners; Improved scraping and validation of metadata; Challenges with "git"

in Steemit Dev Group, 2 years ago (edited)

This update covers the last two weekends and includes my activity in the following areas: learning more about aesthetics; Document Listeners; improved scraping and validation of metadata; and challenges with "git".


Introduction

image.png

Pixabay license, source

Well, it has been about two weeks since my last programming update, but I've been continuing to work behind the scenes, slowly working my way through the checklist from previous posts and setting the stage for future improvements. In general, I've had four focus areas during the last two weeks: learning about JavaFX for possible improvements to the look and feel of the Steem Links Creator tool that I'm working on; learning how to implement Java Document Listeners within NetBeans so that I can start validating fields in the form in near real time; improving web scraping capabilities, especially for keywords and publication dates; and learning to navigate "git" in order to set the stage for future collaboration.

Before I get into descriptions of each of the above, here's the current state of my "checklist":

  • [_] Figure out how to download the functional version of SteemJ and update its dependencies to current versions.
  • [x] Add fields to specify tags.
  • [x] Figure out how to scrape the publication date from a web site using jSoup.
  • [_] Put min/max lengths on blockquotes and author commentary.
  • [_] Require a publication date.
  • [x] Protect against accidentally double-posting from the same form data (clear form after posting).
  • [x] Protect against accidentally clearing the form before posting.
  • [~] Lots of error handling, all over the place.
  • [_] Find a way to add multiple beneficiaries
  • [~] Additional jSoup web scraping for alternate publication date formats, and possibly to suggest categories/tags/keywords.
  • [_] Identify open source software and do security audit of SteemJ and my own code. One possible starting point appears to be: OWASP
  • [~] Cosmetic improvements (look & feel - maybe JavaFX and/or CSS)
  • [x] Change "Save" to "Save & Exit" in pop-up form for account entry
  • [x] Prevent posting with stale data in the autogenerated HTML after updates/corrections in other fields.
  • [_] HTML/Markdown preview option before posting.
  • [_] Spell check in author commentary
  • [~] Tag length and character-type validation
  • [_] Code cleanup - move code to methods where possible and eliminate redundant copy/paste code blocks. (new)
  • [x] Merge git development branch into "Master" branch (new)
  • [x] Highlight Metadata field borders in red if they cannot be scraped. (new)
  • [_] Fix OpenGraph scraping for Steemit links. (new)
  • [_] Rewrite Document Listener to make use of lambda function. (new)
  • [_] Figure out how to distinguish between MM/DD/YYYY and DD/MM/YYYY date formats in web scraping (if possible?). (new)

These items are being dropped, since they were previously completed:

  • [x] Manual entry of account name and Posting key (in a password field).
  • [x] Figure out how to apply beneficiary settings to a post through SteemJ.
  • [x] Provide a checkbox and entry field to set @penny4thoughts as a beneficiary and specify the percentage.

As before, [_] means it hasn't been started, [~] means that it's in progress, and [x] means that it's completed.

For me, a major milestone will be when I am able to do a security audit of my code and the SteemJ code. I probably won't be entering my own posting key into the web form until that is complete, so for now I'm continuing to test with the @social account. I haven't even begun to look at that item yet.

And now on to brief descriptions of the largest areas of focus during the last two week-ends.

Learning about aesthetics

I have zero front end experience with any programming languages since cobbling together some HTML and perl CGI scripts back in the 1990s. I guess I know some of the buzzwords and what they mean (at a really high level), but I don't have anywhere near the level of detail I need, so I've been reading and watching tutorial videos to plan for the eventual point where I want to make the application look more attractive. So far, this has mostly involved experimentation and tutorial videos with JavaFX. I've made it about halfway through this YouTube playlist. I also downloaded SceneBuilder and made it work from inside of NetBeans.

As of now, I am able to parrot some examples, but I'm still a long way from being able to replace the Swing forms that I'm using now. Also, I'm not sure if it might be better to look into JavaScript and CSS. So, this remains a long-term work in progress.

Document Listeners

It turns out that Java Swing's JTextField and JTextArea components automatically create an underlying "Document" to hold their text. In order to monitor that text for changes, then, it is necessary to attach a "Document Listener" to the Document. I needed to be able to do this so that I can prevent the tool from posting stale HTML/markdown code and also (eventually) validate the field contents.

Conceptually, it makes sense, and it's fairly easy to understand. In practice, though, it's not so simple. I found lots of examples that told me "how" to do it, but not the context where the code is needed. I also found an old NetBeans bug claiming that it couldn't be done in NetBeans without using a separate editor. I wasn't willing to accept that answer, though, and eventually found the right context for the code. Here's what the Document Listener looks like in my main class file.

// Imports needed at the top of the class file:
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;

private void listenForDocChanges()
{
    // Any edit to the title field makes the generated HTML/Markdown stale,
    // so posting is disabled until that block is regenerated.
    jTextFieldArticleTitle.getDocument().addDocumentListener(new DocumentListener() {

        @Override
        public void insertUpdate(DocumentEvent e) {
            disable_posting(); // text was inserted
        }

        @Override
        public void removeUpdate(DocumentEvent e) {
            disable_posting(); // text was removed
        }

        @Override
        public void changedUpdate(DocumentEvent e) {
            disable_posting(); // attributes changed
        }
    });

    // (corresponding listener blocks for the other text fields follow here)
}

Right now, the only thing these listeners are doing is disabling posting after changes until the HTML/Markdown block is regenerated. There is a corresponding block of code inside the "listenForDocChanges" method for each text field in the form. This is also where I will eventually add the field validation logic.

It seems that a better way to do this might be through the use of a lambda function, but I'm not there yet.

Improved metadata scraping

It seems that various web pages each have their own particular way of presenting metadata, so metadata scraping is likely to be an iterative process where I'll need to make modifications to accommodate newly discovered formats for the foreseeable future. These past two week-ends, my efforts have been focused on scraping the date and the keywords.

With regard to the date, as described in my previous post, the app is currently looking for three different formats, and I am quite sure that I'll need to add more. Changes over the last two week-ends focused on parsing the date and then printing it in a format that's friendly to human readers. In particular, I can now parse YYYY-MM-DD, YYYY/MM/DD, and DD/MM/YYYY and then put the date in the post with the format, "Full_Month_Name DD, YYYY", which I believe is most readable for humans.
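To illustrate the idea, here is a minimal sketch of that kind of parse-then-reformat step using java.time. It is not the tool's actual code: the class name DateNormalizerSketch and the method toReadableDate are hypothetical, and the sketch assumes that a slash-separated date starting with a two-digit number is always day-first.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not taken from the Steem Links Creator source.
class DateNormalizerSketch {

    // Input patterns tried in order: YYYY-MM-DD, YYYY/MM/DD, DD/MM/YYYY.
    private static final List<DateTimeFormatter> INPUT_FORMATS = List.of(
            DateTimeFormatter.ofPattern("uuuu-MM-dd"),
            DateTimeFormatter.ofPattern("uuuu/MM/dd"),
            DateTimeFormatter.ofPattern("dd/MM/uuuu"));

    // Output pattern: full month name, two-digit day, four-digit year.
    private static final DateTimeFormatter OUTPUT_FORMAT =
            DateTimeFormatter.ofPattern("MMMM dd, uuuu", Locale.ENGLISH);

    // Returns the reformatted date, or empty if no known pattern matches.
    static Optional<String> toReadableDate(String scraped) {
        if (scraped == null) {
            return Optional.empty();
        }
        String trimmed = scraped.trim();
        for (DateTimeFormatter format : INPUT_FORMATS) {
            try {
                return Optional.of(LocalDate.parse(trimmed, format).format(OUTPUT_FORMAT));
            } catch (DateTimeParseException ignored) {
                // Not this pattern; fall through to the next one.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(toReadableDate("2022-03-05").orElse("(unparsed)")); // March 05, 2022
        System.out.println(toReadableDate("05/03/2022").orElse("(unparsed)")); // March 05, 2022
        System.out.println(toReadableDate("yesterday").orElse("(unparsed)"));  // (unparsed)
    }
}
```

An empty result could be the cue to flag the field for manual entry. Note that a sketch like this cannot tell MM/DD/YYYY from DD/MM/YYYY: a value such as 03/05/2022 is simply read as the 3rd of May, which is the ambiguity noted in the checklist.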

With regard to tags, I wrote a routine to scrape keywords from a web site, replace spaces with dashes, and then validate the characters. I still need to add logic to make sure the tags are of the proper length (24 characters or fewer).
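A rough sketch of what the keyword-to-tag step could look like, with the length check included, is below. It is illustrative only and makes several assumptions that are not from the tool itself: the class and method names are hypothetical, the scraped keywords are assumed to arrive as one comma-separated string, and the allowed character set is assumed to be lowercase letters, digits, and dashes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not taken from the Steem Links Creator source.
class TagNormalizerSketch {

    // Length limit mentioned in the post.
    static final int MAX_TAG_LENGTH = 24;

    // Splits a comma-separated keyword string into candidate tags,
    // replaces runs of whitespace with a dash, and keeps only valid, unique tags.
    static List<String> toTags(String keywords) {
        List<String> tags = new ArrayList<>();
        if (keywords == null) {
            return tags;
        }
        for (String raw : keywords.split(",")) {
            String tag = raw.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", "-");
            if (isValidTag(tag) && !tags.contains(tag)) {
                tags.add(tag);
            }
        }
        return tags;
    }

    // Assumption: a valid tag is 1 to 24 characters of a-z, 0-9, or '-'.
    static boolean isValidTag(String tag) {
        return !tag.isEmpty()
                && tag.length() <= MAX_TAG_LENGTH
                && tag.matches("[a-z0-9-]+");
    }

    public static void main(String[] args) {
        // "C++" is dropped because '+' is outside the assumed character set.
        System.out.println(toTags("Java Swing, git, C++ , Open Source"));
        // prints [java-swing, git, open-source]
    }
}
```

Here invalid keywords are silently dropped. Whether to drop them, trim them, or show them to the user for correction is a design choice this sketch does not try to settle.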

One oddity is that scraping of the Open Graph information is not working for Steemit links.

Here is what the web form currently looks like with the formatted date and the red borders on a field that couldn't be scraped.

image.png

Challenges with "git"

Git is another tool that I'm relatively inexperienced with. I have cloned repositories many times, but I have never contributed updates. Pulling from and pushing to the repository was easy to understand at first glance, but this week I decided to merge my changes to date back into the "Master" branch.

For some reason, when I did so, the code for the pop-up button to enter posting credentials did not find its way into the "Master" branch. I was able to fix that by using git in a Linux shell instead of the one in NetBeans, but I'm still not quite sure what I did wrong in the first place.

Conclusion

In all, considering that I'm mostly working on this in short bursts from week-end to week-end, I'm happy with the progress I'm making, but it still feels like there's a long way to go for a relatively simple project.

One question I have for the platform is whether this community still has active moderators and whether it should be used for non-Steemit topics (the name is "Steemit Dev Group", not "Steem"). If not, I'm going to suggest that we set up a new community (or choose one that already exists) for Steem Development discussions. My suggestions for this or any Steem development community are to:

  1. Increase the number of moderators to avoid having it orphaned. (5? 10?)
  2. Moderate aggressively to keep the focus on Steem development efforts.
  3. Set up a pinned post in the community as a registry for active, orphaned, and hoped-for projects so that developers can easily find each other and get ideas for things to work on. Each project would list its language(s), licensing, repo (if published), and the Steem account of the person who is/was contributing to it. (anything else?)
  4. Create a community curation account.

Personally, off the top of my head, some projects that I would love to see resurrected or replaced include:

I think it would be good to maintain a list of projects like these in a pinned post, so that new developers who come along are aware of them as potential opportunities. What projects would you like to see tracked?


A 36% beneficiary has been assigned to @penny4thoughts in order to reward relevant and substantive commentary; A 5% beneficiary has been assigned to @steemj.


Thank you for your time and attention.

As a general rule, I up-vote comments that demonstrate "proof of reading".




Steve Palmer is an IT professional with three decades of professional experience in data communications and information systems. He holds a bachelor's degree in mathematics, a master's degree in computer science, and a master's degree in information systems and technology management. He has been awarded 3 US patents.


It seems that a better way to do this might be through the use of a lambda function, but I'm not there yet.

I think so too. Essentially, this shortens the function considerably, since you don't have to overwrite every method of the interface (not tested):

private void listenForDocChanges()
{
    jTextFieldArticleTitle.getDocument().addDocumentListener(e -> {
        disable_posting();
        // throw new UnsupportedOperationException("Not supported yet."); // Generated from nbfs://nbhost/SystemFileSystem/Templates/Classes/Code/GeneratedMethodBody
    });
}

In all, considering that I'm mostly working on this in short bursts from week-end to week-end, I'm happy with the progress I'm making, but it still feels like there's a long way to go for a relatively simple project.

I can well understand that. But you're doing it in your free time and it can take longer. And it should also be fun.

Thanks! That code kicked out an error because "DocumentListener is not a functional interface", but with guidance from here, I was able to extend DocumentListener into a Functional Interface, and now it's working. The code is far more streamlined that way.
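For anyone following along, one common way to do this looks roughly like the sketch below. It is not the code from the project: the interface name SimpleDocumentListener, its update method, and the small demo class are all hypothetical. The idea is that the three DocumentListener callbacks get default implementations that forward to a single abstract method, which leaves exactly one abstract method for a lambda to implement.

```java
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;
import javax.swing.text.BadLocationException;
import javax.swing.text.PlainDocument;

// Hypothetical interface: funnels all three callbacks into one method.
@FunctionalInterface
interface SimpleDocumentListener extends DocumentListener {
    void update(DocumentEvent e);

    @Override
    default void insertUpdate(DocumentEvent e) { update(e); }

    @Override
    default void removeUpdate(DocumentEvent e) { update(e); }

    @Override
    default void changedUpdate(DocumentEvent e) { update(e); }
}

// Small demo that uses a bare PlainDocument instead of a real form field.
class DocumentListenerSketch {

    // Inserts and then removes the text, returning how many events were seen.
    static int countChanges(String text) {
        PlainDocument doc = new PlainDocument();
        int[] changes = {0};
        // The cast tells the compiler which functional interface the lambda targets.
        doc.addDocumentListener((SimpleDocumentListener) e -> changes[0]++);
        try {
            doc.insertString(0, text, null); // triggers insertUpdate
            doc.remove(0, text.length());    // triggers removeUpdate
        } catch (BadLocationException ex) {
            throw new IllegalStateException(ex);
        }
        return changes[0];
    }

    public static void main(String[] args) {
        System.out.println(countChanges("hello")); // 2
    }
}
```

In the form itself, the same pattern would presumably reduce each field's block to a single line along the lines of jTextFieldArticleTitle.getDocument().addDocumentListener((SimpleDocumentListener) e -> disable_posting());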

Oh yes, right! Especially because you need the code multiple times, the additional definition is useful.

I have zero front end experience with any programming languages since cobbling together some HTML and perl CGI scripts back in the 1990s.

Hehe. I am a dev born in an era where everyone learns React and JS. Yet I am really very bad at anything UI-related.

Coming to the Git-related issue with NetBeans, why would you prefer anything over the shell for making commits? Even though there are several Git GUIs today, I still feel that I have more control over things when I use the shell to make commits, pushes, pulls, or merges.

Yeah, I normally prefer to work at the command line, but I also hate switching between windows and moving my hands between keyboard and mouse. So once I'm in the IDE, I stay there unless forced to move. ;-)

It seems to me an excellent suggestion for the panel. I have seen the panel very quiet, so to speak; the routine is not good and is demotivating. It is an issue that the panel developers should take into account, or they should read your post. There are good approaches here that could be implemented.

I don't understand much about programming, and this topic is not my forte, but I think that giving the page a more striking appearance would be the easiest job to do.

When I was learning programming, it was because I wanted to make my own web page. After spending money on a programming course, I regretted it, because my brain did not retain the commands. When I tried to make my web page, it gave me a headache whenever the commands did not do what I expected, so to avoid the headache I preferred to use WordPress.

The no code approach is very popular!
