Migrating from Python to C++


Using C++ for CGI scripts

Python is great for writing programs fast, and I used Python 3 because there was a somewhat documented API for the Steem blockchain. However, only one script actually needs that API: the one that runs in the background and feeds the API data to the database. I have always been partial to strong typing in languages. I am a big fan of Rust, but I don't really know all of its APIs and I fear they might be lacking. In compiled languages, syntax errors are caught before you even run your tests, and type safety prevents other, non-syntax problems. Many problems that would throw a run-time exception in Python are caught at compile time in C++.

Note: Three dots in a row on one line indicate that there are more lines before or after the excerpt shown in the example.

Before, I had a Python file for the configuration. Python is so simple that it looks a lot like a config file.
Old steemfiles.py

root_dir = "/var/steemfiles/"
data_dir = root_dir + "data/"
bits_dir = root_dir + "bits/"
log_dir  = root_dir + "log/"

The above is an excerpt from the actual Python code file. Essentially, converting it to an ini file just means getting rid of operations other than = and adding [all] to the top of the file. All of the scripts still expect these variables: root_dir, data_dir, etc.

New steemfiles.ini

[all]
root_dir = /var/steemfiles/
data_dir = /var/steemfiles/data/
bits_dir = /var/steemfiles/bits/
log_dir  = /var/steemfiles/log/

Similar, isn't it? Then we need Python code to pull information out of this ini file. In C++ we will do the same with the same file.
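To make the C++ side less mysterious: the Boost parser used further down refers to an option inside a section as section.key, which is why the C++ code asks for all.root_dir rather than root_dir. The sketch below is not that parser. It is only a hand-rolled stand-in (flatten_ini and trim are hypothetical names, and it assumes # starts a comment line) that shows the naming scheme on INI text shaped like the file above.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Strip leading and trailing spaces, tabs and carriage returns.
static std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: flatten INI text into "section.key" -> value pairs.
std::map<std::string, std::string> flatten_ini(const std::string& text) {
    std::map<std::string, std::string> out;
    std::istringstream in(text);
    std::string line, section;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == '#') continue;  // blank line or comment
        if (line.front() == '[' && line.back() == ']') {
            // "[all]" switches the current section to "all"
            section = trim(line.substr(1, line.size() - 2));
            continue;
        }
        const auto eq = line.find('=');
        if (eq == std::string::npos)
            throw std::runtime_error("expected key = value, got: " + line);
        const std::string key = trim(line.substr(0, eq));
        const std::string name = section.empty() ? key : section + "." + key;
        out[name] = trim(line.substr(eq + 1));  // value is kept verbatim, quotes included
    }
    return out;
}
```

Note that the value is stored exactly as written, so a quoted value would keep its quote characters. That is one reason to leave the quotes out of the ini file.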

Here is the Python code for loading the configuration from the file.
New steemfiles.py

import configparser
import os
import json
config = configparser.ConfigParser()
config.read(os.path.dirname(os.path.realpath(__file__)) + '/steemfiles.ini')
config = config['all']

Later on in the file we have these lines, so that all of the existing Python code sees the same interface:

# the data directory.  Where user's files are stored
data_dir = config["data_dir"]

# Where templates and bits are stored
bits_dir = config["bits_dir"]
log_dir = config["log_dir"]

Now, for boolean options I decided to validate things a bit better. After all, there are only two possible values, right?

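try:
    testnet_value = config['testnet']
except KeyError:
    raise Exception("testnet not set in config file: Must be True or False")
# An explicit check instead of assert, so it still runs under python -O
if testnet_value not in ('True', 'False'):
    raise Exception("testnet has an invalid value in config file: Must be True or False")
testnet = testnet_value == 'True'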

Where's the C++?

I had to port the Python configuration into an INI file so that I could use this parser, which works on ini files. Depending on the response I get from this post, I will decide whether it is worth doing the next one. Comment, vote, resteem.

#include <vector>
#include <string>
#include <stdexcept>

#include <boost/program_options/config.hpp>
#include <boost/program_options/options_description.hpp>
#include <boost/program_options/parsers.hpp>
#include <boost/program_options/variables_map.hpp>

#include <fstream>
#include <iostream>
namespace po = boost::program_options;
using namespace std;

#include "steemfiles-config.hpp"

struct config_type steemfiles_setup(string program_name) {
    // configuration as a C++ structure
    config_type cfgs;
    // description of configuration file
    boost::program_options::options_description cfg_od("Options", 80, 40);
    // configuration as a string to any map
    po::variables_map cfgm;

    
    try {
        
        cfg_od.add_options()
        ("all.webserver_host", po::value< std::vector<std::string> >(), "localhost or steemfiles.com")
        ("all.root_dir", po::value< vector<string> >(), "root_dir")
        
        // the data directory.  Where user's files are stored
        ("all.data_dir", po::value< vector<string> >(), "")
        
        // Where templates and bits are stored
        ("all.bits_dir", po::value< vector<string> >(), "")
        ("all.log_dir", po::value< vector<string> >(), "")
        // much deleted...
        ("all.testnet", po::value< vector<string> >(), "");
        

Well, here is a cute way to parse not only configuration files but also command lines. We pass the type of each value to po::value<>, for example when it is numeric. I found Boolean values awkward to handle in this Boost C++ library, so I read them as strings and convert them into bool values afterwards.
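The string-to-bool step described above can be sketched as a small function. This is only an illustration: parse_bool is a hypothetical name, and it assumes the same two spellings, True and False, that the Python validation accepts.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: convert the config file's True/False text to a bool.
// Any other spelling is rejected instead of being silently treated as false.
bool parse_bool(const std::string& option, const std::string& text) {
    if (text == "True") return true;
    if (text == "False") return false;
    throw std::runtime_error(option + " has an invalid value in the configuration file: "
                             "It should be True or False not " + text);
}
```

A call such as parse_bool("testnet", value) then either yields a real bool or stops the program with a message naming the offending option.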

Now, just like in Python, I want these values to be stored in a global variable, but instead of separate variables without a type, I put all of the values in a struct. The struct gives everything a type.

        static std::runtime_error webserver_host_not_in_cfg("webserver_host not specified in the config file!");
        static std::runtime_error webserver_host_repeated_in_cfg("webserver_host specified more than once in the config file!");
        if (cfgm.count("all.webserver_host") == 0) {
            throw webserver_host_not_in_cfg;
        } else if (cfgm.count("all.webserver_host") > 1) {
            throw webserver_host_repeated_in_cfg;
        } else {
            cfgs.webserver_host = cfgm["all.webserver_host"].as< vector< string> >()[0];
            

It becomes tedious to write these again and again, so right away I decided to make macros. After some refactoring, I settled on these macros:

        // Throws unless all.<opt> is present; the final else branch is supplied by the macros below.
        // Exceptions are thrown by value (not with new) so catch (const std::exception &) sees them.
        #define check_option(opt) \
        if (cfgm.count("all." #opt) == 0) {\
            throw std::runtime_error(string(#opt) +  " not specified in the config file!");\
        } else if (cfgm.count("all." #opt) > 1) {\
            throw std::runtime_error(string(#opt) +  " specified more than once in the config file!");\
        } else
        
        #define check_option_string(opt) check_option(opt)  {\
            cfgs.opt = cfgm["all." #opt].as< vector< string> >()[0];\
        }
        
        #define check_option_int(opt) check_option(opt) {\
            cfgs.opt = cfgm["all." #opt].as< int >();\
        }
        
        #define check_option_bool(opt) check_option(opt) {\
            vector< string > vs = cfgm["all." #opt].as< vector< string > >();\
            if (vs[0] == "False") {\
                cfgs.opt = false;\
            } else if (vs[0] == "True") {\
                cfgs.opt = true;\
            } else {\
                throw std::runtime_error(string(#opt) + " has an invalid value in the configuration file: It should be True or False not " + vs[0]);\
            }\
        }

After that, you can have options like these:

        check_option_string(webserver_host)
        check_option_string(root_dir)
        check_option_string(data_dir)
        
        // Where templates and bits are stored
        check_option_string(bits_dir)
        check_option_string(log_dir)
        check_option_string(thumb_dir)

The header file is just a struct and declaration of this routine:

#ifndef STEEMFILES_CONFIG_HPP
#define STEEMFILES_CONFIG_HPP

#include <string>

struct config_type {
    std::string webserver_host;
    std::string root_dir;

    // the data directory.  Where user's files are stored
    std::string data_dir;

    // Where templates and bits are stored
    std::string bits_dir;
    std::string log_dir;
    std::string thumb_dir;
    //....
};
config_type steemfiles_setup(std::string program_name);

#endif
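As a small illustration of what the struct's types buy, here is a sketch with a cut-down, hypothetical struct called mini_config. The testnet field is an assumption: the option exists in the config file, but its place in config_type is elided above.

```cpp
#include <string>
#include <type_traits>

// Cut-down, hypothetical version of config_type, for illustration only.
struct mini_config {
    std::string root_dir;
    bool testnet = false;
};

// A std::string cannot be assigned to a bool field, so a mistake such as
// cfg.testnet = std::string("True") is rejected by the compiler.
static_assert(!std::is_assignable<bool&, std::string>::value,
              "std::string must not convert to bool");

// A bool field accepts a bool, as expected.
static_assert(std::is_assignable<bool&, bool>::value, "bool assigns to bool");

// Caveat: a bare string literal such as "True" is a pointer, and pointers do
// convert to bool, so cfg.testnet = "True" would still compile (and be true).
```

In Python the equivalent slip, storing the text 'False' where a boolean is expected, only shows up when the program runs, because a non-empty string counts as true.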

Finally

Well, I hope that was interesting to some of you. I have just finished porting the download.cgi script over to C++. It is so much faster now. It is still running on my first test server at the moment.

Dash XjzjT4mr4f7T3E8G9jQQzozTgA2J1ehMkV
LTC LLXj1ZPQPaBA1LFtoU1Gkvu5ZrxYzeLGKt
BitcoinCash 1KVqnW7wZwn2cWbrXmSxsrzqYVC5Wj836u
Bitcoin 1Q1WX5gVPKxJKoQXF6pNNZmstWLR87ityw (too expensive to use for tips)
