Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Stroke Segmentation by FCN

less than 1 minute read

Published:

In this post, I will try to replicate the experiments from this paper.

Build TF from source on Windows

6 minute read

Published:

0. Environment

  • I verified the following steps on Windows Server 2012 R2 (Standard) 64-bit with Microsoft Visual Studio Community 2015 Update 3 and TensorFlow 1.12.
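
As a quick sanity check once the build succeeds, a minimal sketch (assuming the wheel produced by the build has been pip-installed into the active Python environment; not from the original post):

```python
# Minimal post-build check for a source-built TensorFlow 1.12
# (assumes the built wheel is already installed).
import tensorflow as tf

print(tf.__version__)  # expect 1.12.x
with tf.Session() as sess:
    msg = tf.constant("Hello from source-built TensorFlow")
    print(sess.run(msg))
```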

CTC in Tensorflow

1 minute read

Published:

Connectionist Temporal Classification (CTC) was proposed by Alex Graves et al. in 2006.
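
For orientation, a minimal CTC setup in TensorFlow 1.x (shapes and names below are illustrative assumptions, not taken from the original post):

```python
# CTC loss and greedy decoding in TensorFlow 1.x; the dimensions are made up.
import tensorflow as tf

max_time, batch, num_classes = 50, 4, 28      # 27 symbols + 1 CTC blank
logits = tf.placeholder(tf.float32, [max_time, batch, num_classes])  # time-major
labels = tf.sparse_placeholder(tf.int32)      # sparse target label sequences
seq_len = tf.placeholder(tf.int32, [batch])   # valid frame count per example

# ctc_loss returns one loss value per example; average over the batch.
loss = tf.reduce_mean(tf.nn.ctc_loss(labels, logits, seq_len))
decoded, log_prob = tf.nn.ctc_greedy_decoder(logits, seq_len)
```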

R2E and R4E

less than 1 minute read

Published:

https://blogs.nvidia.com/blog/2017/12/11/top-5-things-enterprise/

Autoencoder

less than 1 minute read

Published:

https://arxiv.org/pdf/1606.03498.pdf

Notes

less than 1 minute read

Published:

OpenAI

Software License

6 minute read

Published:

Software licenses and Rights

Install Scientific Development Environment for Mac OSX 10.6.8

1 minute read

Published:

It can take an enormous amount of time to install some packages and prepare a scientific development environment, because this is an old version of Mac OS X (from 2009). In this tutorial, I will show how to install netCDF, Python, and Boost using Homebrew.

ICDAR 2017

1 minute read

Published:

Topics

Topics I find interesting

NLP Basic

less than 1 minute read

Published:

Clustering

3 minute read

Published:

Clustering

Chu Nom

14 minute read

Published:

Nom database

Databases for AI

39 minute read

Published:

Classification:

  • MNIST (error %): Classify handwritten digits. Some additional results are available on the original dataset page.
  • CIFAR-10 (accuracy %): Classify 32x32 colour images.
  • CIFAR-100 (accuracy %): Classify 32x32 colour images.
  • STL-10 (accuracy %): Similar to CIFAR-10 but with 96x96 images. Original dataset website.
  • SVHN (error %): The Street View House Numbers (SVHN) Dataset. SVHN is a real-world image dataset for developing machine learning and object recognition algorithms with minimal requirement on data preprocessing and formatting. It can be seen as similar in flavor to MNIST (e.g., the images are of small cropped digits), but incorporates an order of magnitude more labeled data (over 600,000 digit images) and comes from a significantly harder, unsolved, real-world problem (recognizing digits and numbers in natural scene images). SVHN is obtained from house numbers in Google Street View images.
  • ILSVRC2012 task 1 (error %, 5 guesses): 1000-category classification challenge with tens of thousands of training, validation and testing images. See this interesting comparative analysis. Results: http://www.image-net.org/challenges/LSVRC/2012/results.html#t1

Detection:

  • Pascal VOC 2007 comp3 (mAP %): Pascal VOC 2007 is commonly used because the test set has been released. comp3 is the object detection competition, using only the comp3 Pascal training data.
  • Pascal VOC 2007 comp4 (mAP %): Just like comp3, but "any training data" can be used.
  • Pascal VOC 2010 comp3 (mAP %): Pascal VOC 2010 version of the challenge. comp3 is the object detection competition.
  • Pascal VOC 2010 comp4 (mAP %): Just like comp3, but "any training data" can be used.
  • Pascal VOC 2011 comp3 (mAP %): Last Pascal VOC challenge instance (the 2012 version had identical data).
  • Caltech Pedestrians USA (average miss-rate %): Project website. Results: http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/rocs/UsaTestRocReasonable.pdf
  • INRIA Persons (average miss-rate %): Evaluated using the Caltech Pedestrians toolkit. Original dataset website. Results: http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/rocs/InriaTestRocReasonable.pdf
  • ETH Pedestrian (average miss-rate %): Evaluated using the Caltech Pedestrians toolkit. Only left images used. Original dataset website. Results: http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/rocs/ETHRocReasonable.pdf
  • TUD-Brussels Pedestrian (average miss-rate %): Evaluated using the Caltech Pedestrians toolkit. Original dataset website. Results: http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/rocs/TudBrusselsRocReasonable.pdf
  • Daimler Pedestrian (average miss-rate %): Evaluated using the Caltech Pedestrians toolkit. Original dataset website. Results: http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/rocs/DaimlerRocReasonable.pdf
  • KITTI Vision Benchmark (average recall %): A rich dataset for evaluating multiple computer vision tasks, including car, pedestrian and bicycle detection. Results: http://www.cvlibs.net/datasets/kitti/eval_object.php

Pose estimation:

  • Leeds Sport Poses (PCP %): 2000 pose-annotated pictures from Flickr, from a selected set of activities and with the person at the center of the pictures.

Semantic labeling:

  • MSRC-21 (accuracy %, per-class / per-pixel): One of the oldest and classic datasets for semantic labelling. 21 different categories of surfaces are considered. Despite the inaccuracies in the annotations and how unbalanced the classes are, this dataset is still commonly used as a reference point. Note that here we consider the original annotations (where most results are published), not the cleaned-up version. The results are reported per-class and per-pixel (sometimes called the "average" and "global" result, respectively). Original dataset website.

Saliency/Segmentation:

  • Salient Object Detection benchmark (AUC, i.e. precision/recall area under the curve, and MAE, mean absolute error): This benchmark aggregates results from 36 methods over five datasets (MSRA10K, ECSSD, THUR15K, JuddDB, and DUT-OMRON). Results: http://mmcheng.net/salobjbenchmark/

  • COIL-100
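
For a quick look at the first classification benchmarks above, a minimal loading sketch (assuming a TensorFlow install that bundles tf.keras and downloads the datasets on first use):

```python
# Load two of the classification benchmarks listed above via tf.keras.
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
print(x_train.shape, x_test.shape)   # (60000, 28, 28) (10000, 28, 28)

(c_train, cy_train), (c_test, cy_test) = tf.keras.datasets.cifar10.load_data()
print(c_train.shape, c_test.shape)   # (50000, 32, 32, 3) (10000, 32, 32, 3)
```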

Awesome RNNs

24 minute read

Published:

Awesome Recurrent Neural Networks

LSTM Related

less than 1 minute read

Published:

  1. RNN
    1. Sequence labeling task
  2. LSTM
    1. Original structure
    2. Variant
    3. Learning algorithm
  3. Applications/Experiments
    1. Sequence to sequence… Grammar as a Foreign Language here
  4. Framework/Library support
  5. References
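
The LSTM sequence-labeling items in the outline above, as a minimal Keras sketch (the 16 input features and 5 output tags are hypothetical numbers, not from the original post):

```python
# A tiny per-timestep (sequence labeling) LSTM: one tag prediction per frame.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, return_sequences=True, input_shape=(None, 16)),
    tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(5, activation="softmax")),  # one tag per step
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```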

Useful links

less than 1 minute read

Published:

2015/11 Links

Tensorflow Installation

less than 1 minute read

Published:

  1. Ubuntu Server tips
    • Monitor Linux performance from the command line with top, sudo iotop, and nvidia-smi here
  2. TensorFlow
  3. Performance
    • Comparison of different deep learning frameworks/libraries by @soumith here
    • Details of the benchmark for TensorFlow and Torch from @soumith here
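
Alongside the command-line monitors above, a minimal TensorFlow 1.x device-placement check (a sketch; the /gpu:0 device name assumes a single-GPU machine):

```python
# Run one matmul on the GPU and log where each op is placed
# (TF 1.x API; errors out if no GPU is present, since placement is pinned).
import tensorflow as tf

with tf.device("/gpu:0"):
    a = tf.random_normal([2000, 2000])
    b = tf.matmul(a, a)

config = tf.ConfigProto(log_device_placement=True)
with tf.Session(config=config) as sess:
    sess.run(b)
```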

portfolio

publications

Gesture recognition in cooking video based on image features and motion features using Bayesian Network classifier

Published in Proceedings of the 2014 International Conference on Image Processing, Computer Vision and Pattern Recognition, 2014

Access to paper

Recommended citation: Nguyen Hung, Nguyen Binh, Pham Bao, Jin Kim, "Gesture recognition in cooking video based on image features and motion features using Bayesian Network classifier." In: Proceedings of the 2014 International Conference on Image Processing, Computer Vision and Pattern Recognition, 2014. https://worldacademyofscience.org/worldcomp14/ws/program/ipc23.html

Identification of Bacillus species using support vector machine and codon pair relative frequency

Published in Proceedings of the 8th International Conference on Ubiquitous Information Management and Communication, ICUIMC 2014, 2014

Abstract: In this paper, we propose a new approach to identify Bacillus species using a new feature, codon pair relative frequency, and support vector machines (SVM). Our problem is how to use information from some genes of a species to identify which species it is. This problem can be applied not only to research on the evolutionary process but also to predicting the species of damaged samples. First, a gene database of sixteen Bacillus species is collected from the National Center for Biotechnology Information (NCBI) website. Then, we extract the codon pair relative frequency feature of each gene for each species. Finally, the SVM "one-against-rest" method is applied to train these feature vectors. Using the proposed method, we obtained good identification results on our database. Copyright 2014 ACM.

Access to paper

Recommended citation: Tran Anh, Pham Bao, Nguyen Hung, "Identification of Bacillus species using support vector machine and codon pair relative frequency." In: Proceedings of the 8th International Conference on Ubiquitous Information Management and Communication, ICUIMC 2014, 2014. https://dl.acm.org/citation.cfm?id=2558005

A Vietnamese Online Handwriting Database

Published in Proceedings of the 2015 Fourth ICT International Student Project Conference, 2015

Access to paper

Recommended citation: Hung Nguyen, Cuong Nguyen, Pham Bao, Masaki Nakagawa, "A Vietnamese Online Handwriting Database." In: Proceedings of the 2015 Fourth ICT International Student Project Conference, 2015.

Gesture recognition in cooking video based on image features and motion features using Bayesian network classifier

Published in Emerging Trends in Image Processing, Computer Vision and Pattern Recognition, 2015

Abstract: In this chapter, we propose an advanced method, which combines image features and motion features, for gesture recognition in cooking video. First of all, the image features, including global and local features of Red-Green-Blue color images, are extracted and then represented using the bag-of-features method. Concurrently, motion features are also extracted from these videos and represented through dense trajectory descriptors such as histogram of oriented gradients, histogram of optical flow, or motion boundary histogram. In addition, we use the relative positions between objects, whose positions are detected in each frame, to decrease misclassification. Next, we combine both image features and motion features to describe the cooking gestures. In the last step, Bayesian network models are applied to predict which action class a certain frame belongs to, based on the action class of previous frames and the cooking gesture in the current frame. Our method has been validated on the Actions for Cooking Eggs dataset, showing that it can recognize human cooking actions as expected. We regard our method as a general method for solving many different action recognition problems; in the near future, therefore, we are going to apply it to other action recognition problems.

Access to paper

Recommended citation: Nguyen Hung, Pham Bao, Jin Kim, "Gesture recognition in cooking video based on image features and motion features using Bayesian network classifier." Emerging Trends in Image Processing, Computer Vision and Pattern Recognition, 2015. http://www.sciencedirect.com/science/article/pii/B9780128020456000247

Preparation of an unconstrained Vietnamese online handwriting database and recognition experiments by BLSTM

Published in IEICE Technical Report, 2016

Abstract: In this paper, we present the preparation of a Vietnamese online handwriting database and recognition experiments employing BLSTM on the database. All patterns in our database were collected on pen-based computers. In addition, we introduce our software tool for collecting and analyzing handwriting patterns in our database. To develop an online Vietnamese handwriting recognizer and demonstrate the suitability of our database for further research, we apply BLSTM, a well-known recurrent neural network, for recognition. The result is promising, although we cannot compare it with previous research, since this is, to our knowledge, the first publication on the topic. We are going to make our database publicly available for research purposes, since it would be the basis for future research in Vietnamese handwriting recognition.

Access to paper

Recommended citation: Hung Nguyen, Cuong Nguyen, Pham Bao, Masaki Nakagawa, "Preparation of an unconstrained Vietnamese online handwriting database and recognition experiments by BLSTM." IEICE Technical Report, 2016. https://www.ieice.org/ken/paper/20160324CbHQ/eng/

Preparation of an unconstrained Vietnamese online handwriting database and recognition experiments by recurrent neural networks

Published in Proceedings of the International Conference on Frontiers in Handwriting Recognition, ICFHR, 2016

Abstract: This paper presents our attempts to collect and analyze unconstrained Vietnamese online handwritten text patterns on pen-based computers. In total, our database contains over 120,000 strokes from more than 140,000 characters, making it one of the largest Vietnamese online handwriting pattern databases currently available. For building and analyzing our database, we made a collection tool, a line segmentation tool, and a delayed-stroke detection tool. Moreover, we investigated some statistics drawn from the personal information of writers. To solve the unconstrained handwriting recognition problem, we conducted experiments using Bidirectional Long Short-Term Memory (BLSTM) networks. The BLSTM network is a Recurrent Neural Network (RNN) architecture that has recently been applied to many related problems. The performance of the BLSTM network on our database is nearly 80% accuracy, even though the database contains many delayed strokes. In the near future, we are going to make our database available for research purposes, as it would be fundamental for handwriting recognition research.

Access to paper data

Recommended citation: Hung Nguyen, Cuong Nguyen, Pham Bao, Masaki Nakagawa, "Preparation of an unconstrained Vietnamese online handwriting database and recognition experiments by recurrent neural networks." In: Proceedings of International Conference on Frontiers in Handwriting Recognition, ICFHR, 2016.

Attempts to recognize anomalously deformed Kana in Japanese historical documents

Published in Proceedings of the 4th International Workshop on Historical Document Imaging and Processing - HIP2017, 2017

Access to paper

Recommended citation: Hung Nguyen, Nam Ly, Kha Nguyen, Cuong Nguyen, Masaki Nakagawa, "Attempts to recognize anomalously deformed Kana in Japanese historical documents." In: Proceedings of the 4th International Workshop on Historical Document Imaging and Processing - HIP2017, 2017. http://dl.acm.org/citation.cfm?doid=3151509.3151514

ICFHR 2018 – Competition on Vietnamese Online Handwritten Text Recognition using HANDS-VNOnDB (VOHTR2018)

Published in Proceedings of the 16th International Conference on Frontiers in Handwriting Recognition, 2018

Access to paper data

Recommended citation: Hung Nguyen, Cuong Nguyen, Masaki Nakagawa, "ICFHR 2018 – Competition on Vietnamese Online Handwritten Text Recognition using HANDS-VNOnDB (VOHTR2018)." In: Proceedings of the 16th International Conference on Frontiers in Handwriting Recognition, 2018. https://ieeexplore.ieee.org/document/8583810/

Online Japanese Handwriting Recognizers using Recurrent Neural Networks

Published in Proceedings of the 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2018

Access to paper

Recommended citation: Hung Nguyen, Cuong Nguyen, Masaki Nakagawa, "Online Japanese Handwriting Recognizers using Recurrent Neural Networks." In: Proceedings of the 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2018. https://ieeexplore.ieee.org/document/8583800/

Recognizing Unconstrained Vietnamese Handwriting By Attention Based Encoder Decoder Model

Published in Proceedings of the 2018 International Conference on Advanced Computing and Applications (ACOMP), 2018

Access to paper

Recommended citation: Anh Le, Hung Nguyen, Masaki Nakagawa, "Recognizing Unconstrained Vietnamese Handwriting By Attention Based Encoder Decoder Model." In: Proceedings of the 2018 International Conference on Advanced Computing and Applications (ACOMP), 2018. https://ieeexplore.ieee.org/document/8589493/

Robust and real-time stroke order evaluation using incremental stroke context for learners to write Kanji characters correctly

Published in Pattern Recognition Letters, 2019

Abstract: Writing Kanji characters of Chinese origin in the correct stroke order and direction is still one of the important subjects in Japanese elementary education. So far, stroke order evaluation has been made by stroke-to-stroke matching without stroke context, so it was not robust to Kanji characters having multiple similar strokes. Here, we employ shape context features around each feature point, not only in conventional fan-shaped bins but also in square bins, applying a Gaussian function. We also propose simple incremental context and augmented context from future strokes. Our approach can judge whether the stroke order and direction are correct every time a new stroke is written on a tablet, by matching a partially written Kanji pattern with the reference pattern written to the same number of strokes. Evaluation shows that the best-tuned method, with square bins and the Gaussian function, records the highest performance and correctly evaluates stroke order 98.5% of the time, with a maximum time of 0.12 sec/character for Kanji patterns after all strokes are written, using an average desktop PC. The method is also shown to possess high reliability in detecting wrong stroke order and direction incrementally every time a stroke is written.

Access to paper

Recommended citation: Cuong Nguyen, Hung Nguyen, Kazuhiro Mita, Masaki Nakagawa, "Robust and real-time stroke order evaluation using incremental stroke context for learners to write Kanji characters correctly." Pattern Recognition Letters, 2019. https://www.sciencedirect.com/science/article/abs/pii/S0167865518303258

Text-independent writer identification using convolutional neural network

Published in Pattern Recognition Letters, 2019

Abstract: The text-independent approach to writer identification does not require the writer to write some predetermined text. Previous research on text-independent writer identification has been based on identifying writer-specific features designed by experts. However, in the last decade, deep learning methods have been successfully applied to learn features from data automatically. We propose here an end-to-end deep-learning method for text-independent writer identification that does not require prior identification of features. A Convolutional Neural Network (CNN) is trained initially to extract local features, which represent characteristics of individual handwriting in the whole character images and their sub-regions. Randomly sampled tuples of images from the training set are used to train the CNN and aggregate the extracted local features of images from the tuples to form global features. For every training epoch, the process of randomly sampling tuples is repeated, which is equivalent to a large number of training patterns being prepared for training the CNN for text-independent writer identification. We conducted experiments on the JEITA-HP database of offline handwritten Japanese character patterns. With 200 characters, our method achieved an accuracy of 99.97% to classify 100 writers. Even when using 50 characters for 100 writers or 100 characters for 400 writers, our method achieved accuracy levels of 92.80% or 93.82%, respectively. We conducted further experiments on the Firemaker and IAM databases of offline handwritten English text. Using only one page per writer to train, our method achieved over 91.81% accuracy to classify 900 writers. Overall, we achieved a better performance than the previously published best result based on handcrafted features and clustering algorithms, which demonstrates the effectiveness of our method for handwritten English text also.

Access to paper code

Recommended citation: Hung Nguyen, Cuong Nguyen, Takeya Ino, Bipin Indurkhya, Masaki Nakagawa, "Text-independent writer identification using convolutional neural network." Pattern Recognition Letters, 2019. https://www.sciencedirect.com/science/article/abs/pii/S0167865518303180

Strategy and Tools for Collecting and Annotating Handwritten Descriptive Answers for Developing Automatic and Semi-Automatic Marking - An Initial Effort to Math

Published in Proceedings of the 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), 2019

Access to paper

Recommended citation: Huy Ung, Minh Phan, Hung Nguyen, Masaki Nakagawa, "Strategy and Tools for Collecting and Annotating Handwritten Descriptive Answers for Developing Automatic and Semi-Automatic Marking - An Initial Effort to Math." In: Proceedings of the 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), 2019. https://ieeexplore.ieee.org/document/8892993/

CNN based spatial classification features for clustering offline handwritten mathematical expressions

Published in Pattern Recognition Letters, 2020

Abstract: To help human markers mark a large number of answers of handwritten mathematical expressions (HMEs), clustering them makes marking more efficient and reliable. Clustering HMEs, however, faces the problem of extracting both localization and classification representations of mathematical symbols for an HME image and defining the distance between two HME images. First, we propose a method based on Convolutional Neural Networks (CNN) to extract the representations for an HME. Symbols in various scales are located and classified by a combination of features from a multi-scale CNN. We use weakly supervised training combined with symbol attention to enhance localization and classification predictions. Second, we propose a multi-level spatial distance between two representations for clustering HMEs. Experiments on the CROHME 2016 and CROHME 2019 datasets show promising results of 0.99 and 0.96 in purity, respectively.

Access to paper

Recommended citation: Cuong Nguyen, Vu Khuong, Hung Nguyen, Masaki Nakagawa, "CNN based spatial classification features for clustering offline handwritten mathematical expressions." Pattern Recognition Letters, 2020. https://www.sciencedirect.com/science/article/abs/pii/S0167865519303782

Classifying the Kinematics of Fast Pen Strokes in Children with ADHD using Different Machine Learning Models

Published in The Lognormality Principle and Its Applications in e-Security, e-Learning and e-Health, 2021

Abstract: This exploratory study examines whether the sigma-lognormal model derived from the Kinematic Theory of rapid human movements discriminates between the handwriting strokes produced by children with and without Attention Deficit Hyperactivity Disorder (ADHD). Twelve children with ADHD and 12 controls aged 8–11 years were asked to produce handwriting strokes on a digitizing tablet. The strokes were analyzed using the sigma-lognormal model. Strokes made by children with ADHD reflected poorer motor control, action planning and execution than strokes made by controls. Different machine learning models were trained to classify the subjects according to the discriminatory parameters used as features. Although the sample size and data are modest and will require replication in a larger forthcoming study, promising preliminary results are obtained, suggesting that the sigma-lognormal model may be a useful tool in the assessment of ADHD.

Access to paper

Recommended citation: Nadir Faci, Hung Nguyen, Patricia Laniel, Gauthier Bruno, Miriam Beauchamp, Masaki Nakagawa, Réjean Plamondon, "Classifying the Kinematics of Fast Pen Strokes in Children with ADHD using Different Machine Learning Models." The Lognormality Principle and Its Applications in e-Security, e-Learning and e-Health, 2021.

talks

teaching

TUAT - 3-month Internship

Undergraduate course, Tokyo University of Agriculture and Technology, Nakagawa Laboratory, 2015

Students from ASEAN countries (Thailand, Malaysia, Vietnam) join our laboratory for a 3-month internship during the spring and summer.

TUAT - B3 Experiments

Undergraduate course, Tokyo University of Agriculture and Technology, Nakagawa Laboratory, 2017

Every year, the Nakagawa laboratory recruits 3-4 third-year students to work on preliminary experiments. The students are free to select topics that interest them.