KurdSet: A Kurdish Handwritten Characters Recognition Dataset Using Convolutional Neural Network
Handwritten character recognition (HCR) involves identifying characters in images, documents, and various
sources such as forms, surveys, questionnaires, and signatures, and transforming them into a machine-readable
format for subsequent processing. Successfully recognizing complex and intricately shaped handwritten characters
remains a significant obstacle. The recent use of convolutional neural networks (CNNs) has
notably advanced HCR by leveraging their ability to extract discriminative features from extensive sets of raw data.
Because no pre-existing datasets were available in the Kurdish language, we created a Kurdish handwritten dataset
called KurdSet. The dataset consists of Kurdish characters, digits, texts, and symbols. It was collected from
1560 participants and contains 45,240 characters. In this study, we used only the characters from our dataset
for handwritten character recognition. The study also utilizes various models, including
InceptionV3, Xception, DenseNet121, and a custom CNN model. To assess the KurdSet dataset,
we compared it with the Arabic handwritten character recognition dataset (AHCD) by applying the same models to both
datasets. The performance of the models is evaluated using
test accuracy, which measures the percentage of correctly classified characters in the evaluation phase. All models
performed well in the training phase. DenseNet121 exhibited the highest accuracy among the models, achieving
99.80% on the Kurdish dataset, while the Xception model achieved 98.66% on the Arabic dataset.
2024-04