How to recognize data on your credit card
Recognizing data on a credit card is a highly relevant task and a very interesting one from the standpoint of algorithms. Well-implemented card recognition software saves people from entering most of the data manually when they pay online or in mobile applications. In terms of recognition, a bank card is a complex document of standard size (85.6 × 53.98 mm), produced on a standard template and containing a specific set of fields (both mandatory and optional): card number, cardholder name, issue date, expiration date, account number, and CVV2 code or its equivalent.
Although the recognition steps are the same for all three mandatory fields, their complexity varies widely. The card number is the easiest to recognize; the remaining two fields, the validity (expiration date) and the cardholder name, are more complicated.
In this article, we will have a closer look at the card validity recognition procedure (recognition of the name is similar).
Algorithm of validity recognition
Assume the image of the card has already been straightened (a projective transformation that yields an orthogonal view of the card at a fixed resolution). The result of the algorithm must be 4 decimal digits: two for the month and two for the year. The answer is considered correct if all 4 digits coincide with those shown on the card. The symbol that separates them is not included.
The first step is to locate the field on the card (unlike the card number, the location of this field is not standardized). A brute-force search over the whole card area is unpromising: the text fragment is very short (usually 5 characters), its syntactic redundancy is small, and the probability of falsely detecting an arbitrary piece of text or even a colorful background area is unacceptably high. Therefore, we apply a trick: we look not for the date itself but for an information zone that is located under the card number and has a stable geometric structure.
This zone is divided into three lines, one of which is often empty. When the zone contains two non-empty lines, their spacing either coincides with the line spacing of a three-line zone or is approximately equal to the sum of twice the line spacing and the line height.
Finding the zone and splitting it into 3 lines is complicated by the background printed on the card. To solve this problem, we apply a combination of filters to the card image; their purpose is to highlight the vertical boundaries of letters and blank out the other parts of the image.
The sequence of filters is as follows:
- The image is converted to grayscale by averaging the values of the color channels: I = (R + G + B) / 3.
- The vertical boundaries are computed from the grayscale image.
- Small vertical boundaries are filtered out using mathematical morphology.
After filtering, the pixel intensities of the processed image are projected onto the vertical axis, that is, summed along each row.
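The filter chain and the projection can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the edge measure (absolute horizontal central difference) and the morphological step (a grayscale opening with a 3-pixel vertical element) are stand-ins chosen for the example, since the exact formulas are not given in the text. All function names are hypothetical.

```python
def to_gray(rgb):
    """Average the three color channels of every pixel."""
    return [[(r + g + b) / 3.0 for (r, g, b) in row] for row in rgb]


def vertical_edges(gray):
    """Assumed edge measure: absolute difference of the left and right neighbours."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(1, w - 1):
            out[y][x] = abs(gray[y][x + 1] - gray[y][x - 1])
    return out


def _column_filter(img, length, pick):
    """Apply `pick` (min or max) over a vertical window clamped to the image."""
    h, w = len(img), len(img[0])
    r = length // 2
    return [[pick(img[yy][x] for yy in range(max(0, y - r), min(h, y + r + 1)))
             for x in range(w)] for y in range(h)]


def vertical_opening(img, length=3):
    """Assumed morphology: erosion then dilation with a vertical line element.
    Edge fragments shorter than `length` pixels are suppressed."""
    return _column_filter(_column_filter(img, length, min), length, max)


def project_rows(img):
    """Project onto the vertical axis: one sum per image row."""
    return [sum(row) for row in img]


def row_profile(rgb, length=3):
    """Full chain: grayscale, vertical edges, morphology, projection."""
    return project_rows(vertical_opening(vertical_edges(to_gray(rgb)), length))
```

Rows that contain text produce high values in the profile, while the gaps between lines stay close to zero, which is what the next step relies on.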
Using the resulting projection, we can now find the most probable positions of the lines, assuming that the gaps between lines contain no boundaries. We minimize the sum of the projection values taken at periodically spaced positions, over all periods and initial phases in a predetermined range.
Since the local minima are usually quite pronounced at the outer boundaries of the text as well, the optimal number of minima to search for is four (the card has three lines and hence four local minima). As a result, we find the parameters that define the centers of the gaps between lines, as well as the outer limits of the text.
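A minimal sketch of this search, assuming the projection is available as a list with one value per image row. The function name, the exhaustive double loop, and the returned tuple are choices made for this example rather than details taken from the original implementation.

```python
def find_line_grid(profile, periods, phases, n_gaps=4):
    """Return (cost, period, phase) minimizing the sum of profile values
    at the positions phase, phase + period, ..., phase + (n_gaps - 1) * period.
    Returns None if no combination fits inside the profile."""
    best = None
    for period in periods:
        for phase in phases:
            positions = [phase + k * period for k in range(n_gaps)]
            if positions[-1] >= len(profile):
                continue  # the grid does not fit into the zone
            cost = sum(profile[p] for p in positions)
            if best is None or cost < best[0]:
                best = (cost, period, phase)
    return best
```

With `n_gaps=4` the four positions correspond to the two outer limits of the text and the two gaps between the three lines.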
Now the search area can be reduced substantially by taking into account both the original geometry of the zone and the line positions found in the image. For this intersection, a set of possible substring positions is generated, and from here on we work with these candidates.
Each candidate substring is segmented into characters, given that all the symbols on these cards are monospaced. This allows us to use a dynamic programming algorithm to find the gaps between characters without recognizing the characters themselves (all you need to know is the permissible range for the character width).
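One way such a dynamic program might look is sketched below. It assumes a column profile of the candidate substring (for example, the amount of ink in each pixel column) and picks the cut positions whose profile values sum to the minimum while every character width stays within the permitted range. The cost function and all names are assumptions made for this illustration.

```python
def segment(profile, n_chars, w_min, w_max):
    """Return n_chars + 1 cut positions with consecutive distances in
    [w_min, w_max] that minimize the sum of profile values at the cuts,
    or None if no such segmentation fits."""
    inf = float("inf")
    n = len(profile)
    # cost[k][x]: best sum for k + 1 cuts when the last cut is at column x
    cost = [[inf] * n for _ in range(n_chars + 1)]
    prev = [[-1] * n for _ in range(n_chars + 1)]
    cost[0] = list(profile)  # the first cut may be placed anywhere
    for k in range(1, n_chars + 1):
        for x in range(n):
            for w in range(w_min, w_max + 1):
                px = x - w
                if px < 0:
                    break
                candidate = cost[k - 1][px] + profile[x]
                if candidate < cost[k][x]:
                    cost[k][x] = candidate
                    prev[k][x] = px
    end = min(range(n), key=lambda x: cost[n_chars][x])
    if cost[n_chars][end] == inf:
        return None
    cuts = [end]
    for k in range(n_chars, 0, -1):  # walk the back-pointers
        end = prev[k][end]
        cuts.append(end)
    return cuts[::-1]
```

The running time is proportional to the number of characters times the profile length times the size of the width range, so it stays cheap for a five-character field.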
After segmentation into characters, it is time for recognition using an artificial neural network (ANN).
Note a couple of facts:
- For recognition, we use a convolutional neural network trained with the cuda-convnet tool.
- The alphabet of the trained network includes digits, punctuation marks, the space, and a non-symbol ("garbage") class.
Thus, for each character image we get an array of pseudo-probability estimates, one for each symbol of the alphabet. It might seem that the correct answer is simply the line built from the best option at every position. However, the ANN is sometimes mistaken. Some of its errors can be corrected in post-processing thanks to the known constraints on the date value (for example, there is no 13th month). For this we use an algorithm called "roulette", which iteratively enumerates all possible readings of the line in descending order of total pseudo-probability. The first reading that satisfies the constraints is taken as the answer.
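The enumeration in descending order can be sketched with a priority queue. This is an illustrative best-first search, not the original "roulette" implementation: it assumes the total pseudo-probability of a reading is the product of the per-character scores, and the names `roulette` and `is_valid_date` are hypothetical. Only the month constraint mentioned above is checked.

```python
import heapq
from math import prod


def is_valid_date(text):
    """Accept four digits whose first two form a month from 01 to 12."""
    return len(text) == 4 and text.isdigit() and 1 <= int(text[:2]) <= 12


def roulette(scores, is_valid):
    """scores: one dict per character position, symbol -> pseudo-probability.
    Returns (reading, total score) for the best valid reading, or None."""
    ranked = [sorted(s.items(), key=lambda kv: -kv[1]) for s in scores]

    def total(idx):
        return prod(ranked[i][j][1] for i, j in enumerate(idx))

    start = (0,) * len(ranked)  # the top choice at every position
    heap = [(-total(start), start)]
    seen = {start}
    while heap:
        neg_score, idx = heapq.heappop(heap)
        text = "".join(ranked[i][j][0] for i, j in enumerate(idx))
        if is_valid(text):
            return text, -neg_score
        # Successors: demote exactly one position to its next-best symbol.
        for i in range(len(idx)):
            if idx[i] + 1 < len(ranked[i]):
                nxt = idx[:i] + (idx[i] + 1,) + idx[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-total(nxt), nxt))
    return None
```

Because every successor scores no higher than its parent, readings leave the queue in non-increasing order of total score, so the first valid one is also the most probable valid one.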