For each of the transformed images, we create a 2D value co-occurrence histogram p[i, j], i.e., the histogram of how often an original grayscale value i occurs in the region of interest of image I together with a value j at the same position in the transformed image J:

p[i, j] = \frac{\#\{(x, y) \in \mathrm{ROI} : I[x, y] = i \wedge J[x, y] = j\}}{\#\{(x, y) \in \mathrm{ROI}\}}

Figure 5. (a) Contour image and dilation; (b) ROI in original and binarization. A region of interest (ROI) mask is constructed by dilating the contour image (a). The cross-correlation between the ROI pixels in the original image and in the binarized image (b) measures how close the contour region is to an ideal step edge.

From each such histogram, we extract the four features contrast, correlation, energy, and homogeneity for classification:

\mathrm{contrast} = \sum_{i,j} |i - j|^2 \, p[i, j]

\mathrm{correlation} = \sum_{i,j} \frac{(i - \mu_x)(j - \mu_y) \, p[i, j]}{\sigma_x \sigma_y}

\mathrm{energy} = \sum_{i,j} p^2[i, j]

\mathrm{homogeneity} = \sum_{i,j} \frac{p[i, j]}{1 + |i - j|}

where

\mu_x = \sum_{i,j} i \, p[i, j], \quad \mu_y = \sum_{i,j} j \, p[i, j]

Figure 6. Local binary pattern: a 3 × 3 window around each position is extracted (left). Its center value is used as threshold for binarization (center). The 8 binarized neighbors are multiplied with a mask of powers of two and summed up (right). This arranges them in clockwise order, forming a byte value, in this example 1 + 2 + 4 + 16 + 128 = 151.

Local Binary Maps. We calculate the rotation-invariant local binary map [8]. It can be illustrated in the following way: assume a 3 × 3 window in the image. We calculate a binary version of this window using the center point as threshold. The resulting 8 single-bit values are combined into a bitstring in clockwise order, resulting in an 8-bit integer value. This process is illustrated in Figure 6. The resulting value depends on the bit with which the combination starts.
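As a minimal sketch, the basic (rotation-dependent) pattern value just described can be computed as follows; the starting corner of the clockwise traversal and the use of >= as the binarization rule are assumptions, since the text does not fix these conventions:

```python
def lbp_byte(window):
    """Basic 3x3 local binary pattern: binarize the 8 neighbors
    against the center value and read them as bits in clockwise order."""
    center = window[1][1]
    # clockwise neighbor coordinates, starting at the top-left corner
    coords = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    value = 0
    for bit, (y, x) in enumerate(coords):
        if window[y][x] >= center:  # neighbor at least as bright as center -> 1
            value |= 1 << bit       # weight by the corresponding power of two
    return value

# A window whose binarized neighbors reproduce the Figure 6 example:
# bits 0, 1, 2, 4, 7 are set, giving 1 + 2 + 4 + 16 + 128 = 151.
print(lbp_byte([[9, 9, 9],
                [9, 5, 1],
                [1, 1, 9]]))  # -> 151
```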
To make the feature rotationally invariant, we use each bit as a starting point once and keep the minimum resulting value as output. The same process can also be applied to larger window sizes. In our experiments, we use a 5 × 5 window. The values at its boundary are binarized and combined into a 16-bit output value, which is again made rotationally invariant by minimizing over all 16 possible rotations.

The standard deviations used in the correlation feature above are

\sigma_x = \sqrt{\sum_{i,j} (i - \mu_x)^2 \, p[i, j]}, \quad \sigma_y = \sqrt{\sum_{i,j} (j - \mu_y)^2 \, p[i, j]}

This results in a 4D feature vector per transformation, i.e., the total vector of texture features is 12-dimensional.

3.3 Classification

For classification, the 15-dimensional feature vectors are reduced to 12 dimensions using the PCA transform and normalized to the range [0, 1]. Afterwards, they are used as input to a support vector machine with a Gaussian kernel (RBF-SVM). Support vector machines have proven to be a powerful tool for classification tasks where high accuracy is required. By using a Gaussian kernel function, we avoid the frequent problem of parameter selection, because an RBF-SVM has only two free parameters, the penalty C and the kernel width σ, and there are straightforward testing methods for choosing them [9]. In our case, the parameters were set to C = 20 and σ = 1.

Another reason for choosing an RBF-SVM was that it exploits the advantages of a high-dimensional (even infinite-dimensional) classification space and can be thought of as subsuming the more fundamental linear-kernel SVMs as well [10].

3.4 Visualization

A big problem of fully automatic classification systems is that users do not fully trust their decisions. This is even more the case in sensitive areas like document authenticity,
where matters of national security or financial transactions are involved. We have therefore designed our system not to make decisions independently, but to guide the user into making his or her own decision. This is done by presenting the user with the classification results in the form of a color annotation of the original document image: green represents laser-printed characters and red represents inkjet.

This representation allows the user to see at a single glance whether a document is uniformly colored in the way he or she expects, indicating that it is an original; whether the color is uniformly wrong, indicating a complete forgery; or whether it consists of different colors, which can be an indicator of a partial forgery. In the last case, the color coding shows its special strength, because the user can immediately see which parts of the document have been detected as potentially forged. Based on the semantic meaning of these parts, he or she can decide either to consult the original document for an in-depth check, or to attribute the detection to a false positive and proceed. Typically, names and numbers are positions where missing a forgery is potentially dangerous, whereas in the company logo or the running text of a letter, a detection is more likely to be an error. Encoding this information into the system would require a lot of background knowledge about the documents to be checked, something that a generic counterfeit detection system typically does not have. By instead leaving the choice to the user in an interactive fashion, the system stays generic overall.
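A minimal sketch of this color annotation step, assuming classified regions arrive as axis-aligned bounding boxes with class labels (the data layout and helper name are hypothetical):

```python
# green for laser, red for inkjet, as in the annotation scheme above
LABEL_COLORS = {"laser": (0, 255, 0), "inkjet": (255, 0, 0)}

def annotate(width, height, regions):
    """Paint each classified region's bounding box in its class color
    on a white RGB canvas (row-major list of rows of RGB tuples)."""
    canvas = [[(255, 255, 255)] * width for _ in range(height)]
    for (x0, y0, x1, y1), label in regions:
        color = LABEL_COLORS[label]
        for y in range(y0, y1):
            for x in range(x0, x1):
                canvas[y][x] = color
    return canvas
```

In a real system the colored boxes would be blended over the scanned document image rather than drawn on a blank canvas.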
We also believe that the acceptance rate among users will be higher, because of the positive feeling of deciding for oneself instead of giving control to "a machine".

4 Experiments

4.1 Setup

We implemented a prototype of the described system in MatLab and tested it on a dataset of 26 printouts from 8 laser and 5 inkjet printers. All documents contain only text and were scanned at a resolution of 3200 dpi. Since our algorithm works on the level of individual letters, the document images were split into connected components, resulting in 9217 image regions in total.

For evaluation, we performed two kinds of experiments. To measure the numerical classification accuracy, we performed leave-one-out testing, each time selecting all regions of one document for testing and all regions of all other documents for training. This resulted in a classification accuracy of 94.8%. However, the classification accuracy varied rather strongly, for some documents reaching 100%, but in one case also dropping as low as 78%. We believe that this is caused by a lack of diversity in the training material, since we only had access to a limited number of printer and paper combinations.

To demonstrate our approach to visualization and counterfeit detection, we created an example document from laser-printed stationery with an inkjet-printed text body. It was classified, and the result is presented in Figure 7. As one can see, the laser part is recognized perfectly, whereas in the inkjet section some letters are falsely marked as laser print. However, the visualization makes it possible to see that the errors occur mainly for very small bounding boxes, in particular punctuation, and not for critical parts like names or bank data.
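The document-level leave-one-out protocol described above can be sketched as follows; a nearest-centroid rule on scalar stand-in features replaces the RBF-SVM here, purely to keep the illustration self-contained:

```python
def leave_one_document_out(regions):
    """Hold out all regions of one document for testing and train on the
    regions of all other documents; returns per-document accuracy.
    `regions` is a list of (doc_id, feature, label) tuples."""
    per_doc_accuracy = {}
    for held_out in sorted({doc for doc, _, _ in regions}):
        train = [(f, y) for d, f, y in regions if d != held_out]
        test = [(f, y) for d, f, y in regions if d == held_out]
        # stand-in classifier: nearest class centroid (not the paper's RBF-SVM)
        centroids = {}
        for label in {y for _, y in train}:
            feats = [f for f, y in train if y == label]
            centroids[label] = sum(feats) / len(feats)
        correct = sum(
            1 for f, y in test
            if min(centroids, key=lambda c: abs(f - centroids[c])) == y
        )
        per_doc_accuracy[held_out] = correct / len(test)
    return per_doc_accuracy
```

Reporting accuracy per held-out document, as this sketch does, exposes exactly the per-document variance discussed above.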
A human observer should therefore be able to classify the letter as non-forged.

4.2 Discussion

For the prototypical stage of the system, we believe that the results presented are good. It also looks promising that they can be improved by more careful parameter selection and, in particular, more training material. However, the current error rate of 5.2% is not low enough to establish a fully automatic system. For that, the task of counterfeit detection itself is much too sensitive to errors: a single forged letter makes a whole document a forgery. Therefore, a single false positive can cause a genuine document to be detected as a forgery, and a single false negative can cause a forged document to be overlooked.

A system working with such a strict decision rule would instead require an error rate in the range of 0.01% (one error per 10 documents), which appears unreachable with current methods. We have avoided this problem by not letting the system decide for itself, but by presenting the user with a visualization of the classification results in the form of an assistance system.

5 Conclusion

We have presented a system that uses machine learning to detect different types of printing techniques on the level of individual letters, or even parts of letters. Furthermore, we have described a setup to detect counterfeiting or manipulation of printed documents. Here, the aim was not to build a completely autonomous system in the form of a black-box classifier, but a system that assists a possible user in his or her own decision. The reported results indicate that automatic classification of the technique used to create a printed document is indeed possible with ordinary consumer hardware.
So far we have used two classes, inkjet and laser, but by switching to a multi-class classifier and with additional research on suitable features, it should be possible to distinguish even more types of print and also other writing devices like ballpoint pens. Our goal is to create a system that can indeed fulfill a useful purpose outside of the lab environment, e.g. by integration into a content management system handling invoices or receipts, as they are frequently used today at