A FUZZY RECOGNITION MODEL FOR ARABIC HANDWRITTEN ALPHABET

A fuzzy recognition model for a subset of the handwritten Arabic alphabet is designed. The model can be viewed as an algorithm built on two concepts. First, the variation in handwritten characters is modeled by the fuzziness of the elements of the feature vector. Second, the notion of entropy is fuzzily modified to extract the amount of information in the elements of the feature vector, so as to speed up the recognition process. Consequently, a fuzzy recognition graph, in the form of an optimum-path decision tree, is designed for recognizing handwritten Arabic alphabet characters.

Each character is written inside a square frame with sixteen radial axes drawn from its center. The number of intersections that a character makes with each radial axis is taken as an element of its feature vector. The amount of information per ray is extracted from the feature's vagueness (fuzziness) by an index of fuzziness, and the maximum information per ray is the criterion for designing the decision tree graph.
An experiment on some handwritten Arabic alphabet characters is conducted. The design data of 56 characters are shown in Fig. (1).

Fig. (1):
The alphabet character design samples to be recognized by the fuzzy model. For example, the sample $X_9^5$ within one of the fuzzy recognition tree graphs denotes sample number 5 of the entity $X_9$.

A FUZZY RECOGNITION MODEL:
The handwritten Arabic alphabet recognition technique in this paper is a fuzzy entropic recognition approach, which can be formally described by the following fuzzy model. Let the handwritten characters belong to n distinguishable entities, each sample being written $x_n^m$, where the superscript m is the sample number within each entity (i.e., a handwritten character class); see Appendix (1). In practice, a handwritten character, x, is represented by a feature vector, y, of sixteen elements. Each element, $y(r)$, is the number of intersections of the character with one of the sixteen radial axes drawn from the center of the square frame on which the character was written. These elements are no-crossing, single-crossing, double-crossing and triple-crossing features, so that $y(r)$ takes a value from 0 to 3, where r is the radial axis (ray) number.
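The feature extraction described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the paper's procedure: the character is assumed to be a square binary raster (1 for ink, 0 for background), each ray is sampled at a fixed number of points, and a crossing is counted each time the ray enters an ink stroke. The function name `ray_crossings` and its parameters are hypothetical.

```python
import math

def ray_crossings(image, n_rays=16, n_steps=200):
    """Count ink crossings along n_rays radial axes drawn from the frame centre.

    image: square list of lists, 1 for ink and 0 for background (assumed format).
    Returns a list of n_rays crossing counts, i.e. the feature vector y.
    """
    size = len(image)
    centre = (size - 1) / 2.0
    features = []
    for r in range(n_rays):
        angle = 2.0 * math.pi * r / n_rays
        dx, dy = math.cos(angle), math.sin(angle)
        # Distance from the centre to the border of the square along this ray.
        reach = centre / max(abs(dx), abs(dy))
        crossings = 0
        previous = 0
        for step in range(1, n_steps + 1):
            t = reach * step / n_steps
            col = min(size - 1, max(0, int(round(centre + t * dx))))
            row = min(size - 1, max(0, int(round(centre - t * dy))))
            pixel = image[row][col]
            if pixel == 1 and previous == 0:
                crossings += 1  # entering an ink stroke counts as one crossing
            previous = pixel
        features.append(crossings)
    return features
```

A real implementation would also need the crossing position along each ray, since the membership grade depends on the crossing distance from the ray's end.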
Formally, a fuzzy subset, F, of the feature vector, y, is established to model the vagueness of the linguistic expression "that handwritten character is almost the entity $x_n$":

$$\mu_F(r): y(r) \to [0, 1] \qquad (4)$$

where $\mu_F(r)$ is the grade of membership value of the r-th feature. Subjectively, this membership function is chosen as the π-function [9][11], since it reflects the context suitability. Its independent variable is the distance of the crossing point from the nearer end of the ray (the smallest distance). Thus the fuzzy subset, F, is a set of ordered pairs of a feature and its grade of membership:

$$F = \{ (y(r), \mu_F(r)) \}, \quad r = 1, 2, \dots, 16 \qquad (5)$$

Notably, a unity grade of membership value is assigned to the no-crossing feature, since it carries the largest amount of vagueness as to which entity the character belongs.
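As an illustration of the membership function, the sketch below implements the standard π-function built from Zadeh's S-function. The paper does not state the bandwidth and center it uses, so `bandwidth` and `centre` are assumed parameters, and the function names are hypothetical.

```python
def s_function(x, a, c):
    """Zadeh's S-function: rises from 0 at a to 1 at c, crossing 0.5 at the midpoint."""
    b = (a + c) / 2.0
    if x <= a:
        return 0.0
    if x <= b:
        return 2.0 * ((x - a) / (c - a)) ** 2
    if x < c:
        return 1.0 - 2.0 * ((x - c) / (c - a)) ** 2
    return 1.0

def pi_function(x, bandwidth, centre):
    """Pi-function: 1 at centre, 0.5 at centre +/- bandwidth/2, 0 beyond centre +/- bandwidth."""
    if x <= centre:
        return s_function(x, centre - bandwidth, centre)
    return 1.0 - s_function(x, centre, centre + bandwidth)
```

Here `x` would be the crossing-point distance measured from the nearer end of the ray. With `bandwidth = 2.0` and `centre = 4.0`, for instance, the grade is 1 at a distance of 4 and falls to 0.5 at distances of 3 and 5.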
The compound grade of membership value for a k-crossing feature, where the highest value of k is three, is evaluated as in equ. (6). This is because the amount of information revealed by a certain ray depends on the elimination of the feature's vagueness (fuzziness). Thus an index of fuzziness, which measures the amount of information in a certain ray, may be either of the fuzzy entropies of equ. (7) and equ. (8), where m is the number of design handwritten samples; $\mu_F(r, i)$ is the grade of membership value in the fuzzy subset, F, for the feature of the r-th ray from the i-th sample; $\mu_{F^c}(r, i) = 1 - \mu_F(r, i)$; and $P(r, i)$ is the probability of occurrence of a feature in the r-th ray in a sample space of the m samples.
Both formulae indicate the amount of information per ray. In the first formula, equ. (7), the index of uncertainty is the fuzziness of the feature, while in equ. (8) the randomness as well as the fuzziness of that feature is considered. Both formulae satisfy certain properties [9][10][11]; in particular, the amount of information is zero when the fuzzy subset becomes a sharpened (crisp) subset, that is, when each membership value is either 0 or 1.
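The two indices can be illustrated with a short sketch. It assumes a Shannon-type (De Luca and Termini) fuzzy entropy with a 1/m normalization for the fuzziness-only index, and a probability-weighted sum for the index that also accounts for randomness. These forms are assumptions chosen to match the symbol definitions and the zero-at-crisp property stated above, and they may differ from the exact equ. (7) and equ. (8). All function names are hypothetical.

```python
import math

def shannon_term(mu):
    """-mu*ln(mu) - (1 - mu)*ln(1 - mu), taken as 0 when mu is 0 or 1."""
    total = 0.0
    for p in (mu, 1.0 - mu):
        if p > 0.0:
            total -= p * math.log(p)
    return total

def fuzzy_entropy(memberships):
    """Fuzziness of one ray averaged over the m design samples (assumed 1/m factor)."""
    m = len(memberships)
    return sum(shannon_term(mu) for mu in memberships) / m

def weighted_fuzzy_entropy(memberships, probabilities):
    """Fuzziness of one ray weighted by each feature's probability of occurrence."""
    return sum(p * shannon_term(mu) for mu, p in zip(memberships, probabilities))
```

Under these assumptions a ray whose grades are all 0 or 1 carries zero information, and a ray whose grades are all 0.5 carries the maximum, ln 2.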
The ray, $r_0$, with the largest amount of information is selected as a candidate for recognizing those handwritten characters that are uniquely characterized by only one type of feature. We refer to the first formula for simplicity of representation. After selecting, observing, and recognizing, there remain sets of different handwritten alphabet entities, each set characterized by one type of feature observed in the selected ray. That ray is then excluded, the common-featured sets are separated, and the amount of information of the remaining rays is calculated for each set, where M is the number of unrecognized characters in a common-featured set, M ≤ m; R = 1, 2, …, r − s indexes the remaining rays; and s is the number of excluded rays.
Each set will then have a ray with maximum information, which must be observed first. This process of calculating, selecting, observing, excluding, and calculating again is continued until all of the alphabetic characters in the design set are recognized. The result is a decision tree graph of hierarchically observed rays, which gives optimum paths of ray observation for recognizing an unknown handwritten alphabetical character.
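The calculate, select, observe, exclude loop can be sketched as a recursive tree construction. This is a simplified illustration, not the paper's exact procedure: it assumes a Shannon-type fuzziness averaged over the samples as the information measure, stops a branch as soon as a single entity remains, and encodes the tree as nested dictionaries. The names `build_tree` and `classify` and the sample format are hypothetical.

```python
import math

def fuzziness(mu):
    """Shannon-type fuzziness of one membership grade (0 when mu is 0 or 1)."""
    return -sum(p * math.log(p) for p in (mu, 1.0 - mu) if p > 0.0)

def build_tree(samples, rays):
    """samples: list of (entity, features, memberships); rays: ray indices still available."""
    entities = {entity for entity, _, _ in samples}
    if len(entities) == 1 or not rays:
        return sorted(entities)  # termination box: the recognized entity or entities

    def information(r):
        # Average fuzziness of ray r over the samples of this common-featured set.
        return sum(fuzziness(mus[r]) for _, _, mus in samples) / len(samples)

    best = max(rays, key=information)           # ray with maximum information
    remaining = [r for r in rays if r != best]  # exclude the observed ray
    groups = {}
    for sample in samples:                      # separate the common-featured sets
        groups.setdefault(sample[1][best], []).append(sample)
    return {"ray": best,
            "branches": {feature: build_tree(group, remaining)
                         for feature, group in groups.items()}}

def classify(tree, features):
    """Follow the observed rays down the tree; returns [] for an unseen feature."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(features[tree["ray"]])
        if tree is None:
            return []
    return tree
```

Each recursion level removes one ray, so the depth of any path is bounded by the number of rays, sixteen in the paper's setting.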

EXPERIMENT:
The handwritten alphabet sample was placed on a square frame. The design set consisted of 56 samples, six samples for each entity, as shown above in Fig. (1), and the testing set consisted of 144 samples, twelve samples for each entity.
After measuring the distance(s) of the crossing point(s) from one of the ray's ends to determine the grade of membership value, we tabulated the design set as an entity-ray number matrix, (m × r), where each element is a pair of a feature and its membership grade value, as in equ. (5). The amount of information per ray is then obtained by calculating the fuzzy entropy by equ. (7) or equ. (8). Therefore, there are two decision tree graphs, as shown in Fig. (2) and Fig. (3). Next, as an example, the decision tree graph of Fig. (2) will be traced. Observing the 6th ray yields a no-crossing feature set of sixteen samples and a single-crossing feature set of thirty-two samples. After excluding the 6th ray, there are two entity-ray number matrices of (16 × 15) and (32 × 15). For each of them, calculate the fuzzy entropy, select the ray with the maximum information, scan it for uniquely featured alphabet characters, and so on, until all of the design data have been classified as in Fig. (2). This decision tree graph is tested with twelve samples for each entity; all of them are recognized except some samples of the second, third and fourth entities, which are misclassified, giving a recognition rate of 95%.
• At each termination box the recognized alphabet character(s) are stated as $X_n^m$, where m indicates the handwritten sample and n indicates the identity number.

Fig. (3):
A fuzzy recognition tree graph obtained by utilizing equ. (8) as the fuzzy entropy.

CONCLUSION
A fuzzy entropy algorithm is designed for the computer recognition of handwritten Arabic alphabet characters. The decision tree graph can be translated into a fuzzy algorithm consisting of a set of rules of the form "If … then", so that a ray's feature is observed and either a recognized handwritten character is obtained or another ray's feature is to be observed. This fuzzy recognition tree is a computer fuzzy algorithm that can be extended in future work to the recognition of Arabic handwritten script in bank checks, mail addresses, forms, and manuscripts [22][23]. Related applications in man-machine communication [27] include handwritten text entered with a stylus on a mobile phone screen and robotic optical character recognition for handwritten messaging, which assist in the automatic processing of handwritten documents and messages, web-based translation [24], search engines [25], and information retrieval [26].
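The translation of a decision tree into "If … then" rules can be illustrated with a small sketch. It assumes a hypothetical nested-dictionary encoding of the tree, in which an inner node names the observed ray and maps each crossing count to a subtree, and a leaf is a list of recognized entities. The ray numbers and entity names in `example_tree` are made up for illustration and are not taken from the paper's figures.

```python
def tree_to_rules(tree, conditions=()):
    """Flatten a decision tree into a list of 'If ... then ...' rule strings."""
    if not isinstance(tree, dict):
        premise = " and ".join(conditions) or "always"
        return [f"If {premise} then the character is {' or '.join(tree)}"]
    rules = []
    for feature, subtree in sorted(tree["branches"].items()):
        clause = f"ray {tree['ray']} has {feature} crossing(s)"
        rules.extend(tree_to_rules(subtree, conditions + (clause,)))
    return rules

# Hypothetical two-level tree: observe ray 6 first, then ray 3 if needed.
example_tree = {
    "ray": 6,
    "branches": {
        0: ["x1"],
        1: {"ray": 3, "branches": {1: ["x2"], 2: ["x3"]}},
    },
}

for rule in tree_to_rules(example_tree):
    print(rule)
```

Each root-to-leaf path becomes one rule, so the number of rules equals the number of termination boxes in the tree.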
A recognition rate of 95% is obtained for the random Arabic handwritten testing set. This high recognition rate arises because the recognition model mathematically represents both the subjectivity (fuzziness) and the objectivity (probability) of the Arabic handwritten alphabet. Margner and El Abed [27][28] compared 14 Arabic handwriting recognition models based on the IFN/ENIT database. The highest recognition rate, 94.58%, was achieved by the SIEMENS recognition model, whose recognition engine is a hidden Markov recognizer. This technique was originally developed in 1993: a feature vector is created by a sliding window, then decoded and recognized by multiple left-to-right models. Since then, the model has undergone a series of improvements until the above-mentioned recognition rate was reached. Amin [4] designed a handwritten letter recognition model utilizing a skeleton-based graph representation; the feature vectors were fed into a five-layer neural network, which yielded a 92% recognition rate. Mostafa and Darwish [4] created a baseline-independent algorithm to segment handwritten words into letters and primitives, utilizing the chain code representation to achieve a 97.7% recognition rate.