• Ph.D. (1990)

    Electrical Engineering

    Electronic Systems Engineering, University of Essex, Colchester, England

  • M.Sc. (1985)

    B.Sc. continued to M.Sc. in Electrical Engineering

    Faculty of Engineering, University of Tehran, Tehran, Iran

  • Document image analysis and recognition
  • Handwriting analysis and recognition

    Ehsanollah Kabir is a professor of electrical engineering at Tarbiat Modares University. He was born in Tehran, Iran on Nov. 1st, 1958. He obtained his B.Sc. and M.Sc. degrees in electrical and electronics engineering from the University of Tehran. He received his Ph.D. degree in 1990 from the University of Essex, where he worked on the recognition of handwritten postal addresses. His main areas of research are document image analysis and cultural heritage technology.



    A Multi-Focus Image Fusion Method based on Fractal Dimension and Guided Filtering

    Nikoo Dehghani, Ehsanollah Kabir
    Conference Paper, 2020 25th International Conference on Pattern Recognition (ICPR), 2021 January 10, Pages 10697-10703


    Fractal dimension (FD) is widely used for image segmentation because it quantifies texture information well. In this paper, we present an FD-based multi-focus image fusion method that uses FD to identify focused regions as the primary step of the fusion process. The algorithm extracts the local FD features of each multi-focus pair, estimated using the differential box-counting method. A guided filter is employed to further specify the spatial information and increase the robustness of the FD features to noise. The outcome is analyzed to obtain a focus map that identifies sharp regions in each partially focused image. Afterwards, the detected regions are combined into a single al

    Late Combination shows that MEG adds to MRI in classifying MCI versus Controls

    D Vaghari, E Kabir, RN Henson
    Journal Paper


    A Modified Inexact Arithmetic Median Filter for Removing Salt-and-Pepper Noise from Gray-Level Images

    M Monajati, E Kabir
    Journal Paper, IEEE Transactions on Circuits and Systems II: Express Briefs, 2019 May 29


    We have recently proposed the approximate median filters (APMF). They are based on the sorting network and achieve acceptable image quality with low-cost hardware. In this brief, we develop a specific comparator to improve the noise-elimination capabilities of those filters. The architecture of our inexact median filters (IMF) is regular and modular. We also introduce the histogram-based error dispersion plot as a new error evaluation method for a better assessment of IMF performance. Simulation results show that the proposed filter is effectively low cost in power, area, and speed. Despite the trade-off between the filtering accuracy and circuit characteristics, the output quality of the filter is largely similar to that of the pr

    Handwritten Farsi word recognition using NN-based fusion of HMM classifiers with different types of features

    Seyed Ali Asghar Abbaszadeh Arani, Ehsanollah Kabir, Reza Ebrahimpour
    Journal Paper, International Journal of Image and Graphics, Volume 19, Issue 01, 2019 January 28, Pages 1950001


    In this paper, an off-line method based on the hidden Markov model (HMM) is used for holistic recognition of handwritten words of a limited vocabulary. Three feature sets based on image gradient, black–white transition and contour chain code are used. For each feature set, an HMM is trained for each word. In the recognition step, the outputs of these classifiers are combined through a multilayer perceptron (MLP). The high number of connections in this network makes training computationally expensive. To avoid this problem, a new method is proposed. In the experiments on 16000 images of 200 names of Iranian cities, from the “Iranshahr 3” dataset, the results of the proposed method are presented and compared with some similar methods. An er

    Synthesizing the note-specific atoms based on their fundamental frequency, used for single-channel musical source separation

    Mohammadali Azamian, Ehsanollah Kabir
    Journal Paper, Multimedia Tools and Applications, 2019 January, Pages 1-20


    Musical source separation deals with extracting the musical signals from a mixture. One efficient way to attain this goal is to decompose the mixture over a dictionary of basic functions that inherently describe the instruments. Usually, a unique function, called the note-specific atom, is synthesized for each note of each instrument. In this paper, a sine-harmonic model is utilized to synthesize note-specific atoms, and the note’s fundamental frequency is used as prior information to determine the model parameters. To calculate these parameters, the training signal spectrum is processed only around the main note harmonics. Experimental results demonstrated that the proposed method is much faster

    Training a whole-book LSTM-based recognizer with an optimal training set

    Mohammad Reza Soheili, Mohammad Reza Yousefi, Ehsanollah Kabir, Didier Stricker
    Conference Paper, Tenth International Conference on Machine Vision (ICMV 2017), Volume 10696, 2018 April 13, Pages 1069610


    Despite recent progress in OCR technologies, whole-book recognition is still a challenging task, in particular for old and historical books, where unknown font faces and low quality of paper and print add to the challenge. Therefore, pre-trained recognizers and generic methods do not usually perform up to the required standards, and performance usually degrades for larger-scale recognition tasks, such as a whole book. Such reportedly low error-rate methods turn out to require a great deal of manual correction. Generally, such methodologies do not make effective use of concepts such as redundancy in whole-book recognition. In this work, we propose to train Long Short Term Memory (LSTM) networks on a minimal training set obtai

    A query-by-example music retrieval system using feature and decision fusion

    Nastaran Borjian, Ehsanollah Kabir, Sanaz Seyedin, Ellips Masehian
    Journal Paper, Multimedia Tools and Applications, Volume 77, Issue 5, 2018 March 1, Pages 6165-6189


    An attractive topic in Music Information Retrieval (MIR) is query-by-example (QBE), which receives a user-provided query and aims to find the target song in an associated music dataset. In this paper, we use feature and decision fusion techniques to develop a two-stage, accurate and rapid QBE-based MIR system. For this purpose, a proposed diverse ensemble of recognizers automatically recognizes the genre of the query in the first stage. This diversity is obtained through feature extraction over different frequency bands followed by feature fusion to train the recognizers, and then a decision fusion technique fuses the individual results obtained by the members of the ensemble. The second stage measures similarity between query and o

    Combining RtL and LtR HMMs to recognise handwritten Farsi words of small-and medium-sized vocabularies

    Seyed Ali Asghar Abbaszadeh Arani, Ehsanollah Kabir, Reza Ebrahimpour
    Journal Paper, IET Computer Vision, 2018 April 30


    In this study, a method for holistic recognition of handwritten Farsi words is proposed, which fuses the outputs of right-to-left (RtL) and left-to-right (LtR) hidden Markov models (HMMs). The experimental results on 16,000 images of 200 names of Iranian cities, from the ‘Iranshahr 3’ dataset, are presented and compared with those of methods using only RtL or LtR models. Experimental results show that the main sources of error are similar beginnings or similar endings of the words. Since RtL and LtR models behave differently when dealing with the words, there is notable error diversity between the two classifiers, so that their combination increases the recognition rate. Compared to the RtL-HMM, the product of output scores of the RtL and

    Onset detection for tar solo

    Behraz Farrokhi, Ehsanollah Kabir, Hedieh Sajedi
    Journal Paper, International Journal of Speech Technology, Volume 21, Issue 4, 2018 December 15, Pages 761-771


    This paper develops a new method of onset detection for the Tar, a traditional Iranian musical instrument. The proposed method is based on both pitch and energy features. Therefore, it can be utilized to detect either soft or hard onsets. Through this combination, we obtained a more precise separation between two adjacent notes. This ability is especially useful to detect the reaz, repeatedly played notes with the same frequency and short durations. For the evaluation of the method, a data set with predetermined onsets was produced, and the results were compared, in terms of F-measure, with those of an energy-based method.
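As an illustration of the F-measure evaluation mentioned in the abstract above, the following is a minimal sketch and not the authors' evaluation protocol. The function name `onset_f_measure`, the 50 ms tolerance window, and the greedy one-to-one matching of sorted onset times are all assumptions made here for illustration; the abstract does not specify them.

```python
def onset_f_measure(detected, reference, tolerance=0.05):
    """Return (precision, recall, F-measure) for onset times given in seconds.

    Hypothetical helper: a detected onset counts as correct if it lies within
    `tolerance` seconds of a not-yet-matched reference onset (assumed rule).
    """
    detected = sorted(detected)
    reference = sorted(reference)
    i = j = hits = 0
    # Walk both sorted lists once, pairing onsets greedily in time order.
    while i < len(detected) and j < len(reference):
        diff = detected[i] - reference[j]
        if abs(diff) <= tolerance:
            hits += 1          # matched pair: true positive
            i += 1
            j += 1
        elif diff < 0:
            i += 1             # detection too early to match: false positive
        else:
            j += 1             # reference onset has no detection: miss
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    # Made-up onset times, for demonstration only.
    p, r, f = onset_f_measure([0.10, 0.52, 1.30], [0.12, 0.50, 0.90, 1.31])
    print(f"precision={p:.2f} recall={r:.2f} F={f:.3f}")
```

A narrower tolerance makes the score stricter for closely spaced onsets such as repeated short notes, so the value chosen should be reported alongside the F-measure.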

    Color Reduction in Hand-drawn Persian Carpet Cartoons before Discretization using image segmentation and finding edgy regions

    M Fateh, E Kabir
    Journal Paper, Journal of AI and Data Mining, Volume 6, Issue 1, 2018 March 1, Pages 47-58


    In this paper, we present a method for color reduction of Persian carpet cartoons that increases both the speed and accuracy of editing. Carpet cartoons fall into two categories: machine-printed and hand-drawn. Hand-drawn cartoons are divided into two groups: before and after discretization. The purpose of this study is color reduction of hand-drawn cartoons before discretization. The proposed algorithm consists of the following steps: image segmentation, finding the color of each region, color reduction around the edges and final color reduction with C-means. The proposed method requires knowing the desired number of colors in any cartoon. In this method, the number of colors is not reduced to more than about 1.3 times the desired number. Auto

    Text-image super-resolution through anchored neighborhood regression with multiple class-specific dictionaries

    Ali Abedi, Ehsanollah Kabir
    Journal Paper, Signal, Image and Video Processing, Volume 11, Issue 2, 2017 February 1, Pages 275-282


    In the dictionary-based image super-resolution (SR) methods, the resolution of the input image is enhanced using a dictionary of low-resolution (LR) and high-resolution (HR) image patches. Typically, a single dictionary is learned from all the patches in the training set. Then, the input LR patch is super-resolved using its nearest LR patches and their corresponding HR patches in the dictionary. In this paper, we propose a text-image SR method using multiple class-specific dictionaries. Each dictionary is learned from the patches of images of a specific character in the training set. The input LR image is segmented into text lines and characters, and the characters are preliminarily classified. Likewise, overlapping patches a


    Journal Paper, Volume 15, Issue 2, 2017 January 1, Pages 102-112


    The results of color reduction algorithms are usually evaluated by visual or qualitative criteria. An evaluation that ignores quantitative criteria is neither complete nor accurate, and it is strongly influenced by the viewer's preferences. In some articles, the result is evaluated with MSE. This error measure treats any difference between the pixel colors of the final image and those of the original image as a failure, which makes it unsuitable for evaluating color reduction methods. In color reduction, if a color is completely replaced by a color close to the original one, the replacement should not be counted as a failure. If these replacements don’t happen for all of those specific color pixels, t

    An adaptive sparse algorithm for synthesizing note specific atoms by spectrum analysis, applied to music signal separation

    Mohammadali Azamian, Ehsanollah Kabir, Sanaz Seyedin, Ellips Masehian
    Journal Paper, Advances in Electrical and Computer Engineering, Volume 17, Issue 2, 2017 January 1, Pages 103-113


    In this paper, a sparse method is proposed to synthesize the note-specific atoms for musical notes of different instruments, and is applied to separate the sounds of two instruments coexisting in a monaural mixture. The main idea is to explore the inherent time structures of the musical notes by a novel adaptive method. These structures are used to synthesize some time-domain functions called note-specific atoms. The note-specific atoms of different instruments are integrated in a global dictionary. In this dictionary, there is only one note-specific atom for each note of any instrument, resulting in a sparse space for each instrument. The signal separation is done by mapping the mixture signal to the global dictionary. The signal related t

    Merging clustering and classification results for whole book recognition

    Mohammad Reza Soheili, Mohammad Reza Yousefi, Ehsanollah Kabir, Didier Stricker
    Conference Paper, 2017 10th Iranian Conference on Machine Vision and Image Processing (MVIP), 2017 November 22, Pages 134-138


    OCR of historical printed books is one of the challenging tasks in the area of document image analysis. Low quality of print and paper and unfamiliar font faces are the best-known problems. However, redundancy of word and sub-word occurrences in the document can be used to improve the recognition results. In this paper, we propose a highly accurate recognition system for old printed books. We use the combination of sub-word clustering and an LSTM neural network as a character recognizer to reduce the error rate. Due to the lack of information about the font faces, we manually label some parts of the books. We show that the recognition error rate can be reduced noticeably by combining the results of the LSTM recognizer and sub-word clustering and

    Text image super resolution using within-scale repetition of characters and strokes

    Ali Abedi, Ehsanollah Kabir
    Journal Paper, Multimedia Tools and Applications, Volume 76, Issue 15, 2017 August 1, Pages 16415-16438


    In text images, some frequently used characters repeat more than others. Likewise, some characters have common strokes. This characteristic is used in this paper for machine-printed text-image super resolution. After segmenting the input low-resolution image into text lines and characters, 1) the characters are clustered and the clusters with a large number of members, corresponding to the frequent characters, are detected. 2) A text-specific multiple-image super resolution is applied to the members of each large cluster and the result is verified by the recognition confidence of an OCR system. 3) A training example set is then constructed by extracting patches from the low-resolution frequent characters and their


    Journal Paper, Volume 14, Issue 1, 2016 January 1, Pages 82-88


    This paper develops a new method of onset detection for the Tar, a traditional Iranian musical instrument. The proposed method is based on both pitch and energy features, and an adaptive peak-picking algorithm is utilized for primary onset detection. An improved template matching method is used to detect fundamental frequencies and, finally, onsets are tagged based on primary onsets and fundamental frequencies. This step is especially useful to detect the reaz, repeatedly played notes with the same frequency and short durations. For the evaluation of the method, a data set with predetermined onsets was produced, and the results were compared, in terms of F-measure, with those of an energy-based method.


    Journal Paper, Volume 14, Issue 3, 2016 January 1, Pages 177-192


    In this paper, a new method for resolution enhancement of single document images is presented. The proposed method is example-based, using an example set of low-resolution and high-resolution training patches. According to the Bayes rule, one function is considered as the likelihood or data-fidelity term, which measures the fidelity of the output high-resolution image to the input low-resolution image. In addition, three other functions are considered as the regularization terms containing the prior knowledge about the desired high-resolution document image. The three priors enforced by the regularization terms are bimodality of document images, smoothness of background and text regions, and similarity to the patches in the example set. By minimi

    Stroke width-based directional total variation regularisation for document image super resolution

    Ali Abedi, Ehsanollah Kabir
    Journal Paper, IET Image Processing, Volume 10, Issue 2, 2016 February 1, Pages 158-166


    In Bayesian image super resolution (SR), a regularisation term is minimised along with a data-fidelity term to generate a high-resolution (HR) image from an input low-resolution (LR) image. The regularisation term is incorporated into the SR to enforce prior knowledge about the HR image. For instance, smoothness in the background and foreground regions is prior knowledge about document images. The bilateral total variation (BTV), as a known regularisation term, uniformly smooths the image in all directions while preserving the edges. In this study, the authors present a document image SR method by introducing a new regularisation term called the stroke width-based directional total variation (SWDTV). It is a modified version of the BTV,

    A weakly supervised large margin domain adaptation method for isolated handwritten digit recognition

    Hamidreza Hosseinzadeh, Farbod Razzazi, Ehsanollah Kabir
    Journal Paper, Journal of Visual Communication and Image Representation, Volume 38, 2016 July 1, Pages 307-315


    Classifiers that learn handwriting categories fail to perform well when trained and tested on data from different databases. In this paper, we propose a novel large margin domain adaptation algorithm which is able to learn a transformation between training and test datasets in addition to adapting the parameters of the classifier using few or even no labeled training samples from the target handwriting dataset. Additionally, we developed an ensemble projection feature learning framework for dataset representation as a front end for our algorithm, to utilize the abundant unlabeled samples in the target domain. Experiments on adaptation between different handwritten digit datasets demonstrate that the proposed large margin domain adaptation algorithm achieves superior cl

    Current Teaching

    • Ph.D.

      Special Topics in Electronics (Statistical Pattern Recognition)

    Teaching History

    • M.Sc.

      Image Processing

    • 2019
      Esmaili, Masoud
      Dewarping of text images by deep learning
    • 2020
      Rezaei, Ashkan
    • 2020
      Attaran Bondarabadi, Atie


