IIT Kanpur

        Image Compression Graphical User Interface

Supported  & Funded by

E-Learning Division

 Department of Information Technology

Ministry of Communications & Information Technology

Government of India


 Contents

 

  1. Introduction
    1.1 Unique Features
    1.2 User Friendly Interface
  2. Interactive Elements of GUI Front End
  3. System Requirements
  4. Description of Algorithms
 

1. Introduction

Image Compression GUI is an easy-to-use graphical user interface developed for image compression applications in the MATLAB™ computing environment. Through the GUI, users can apply several state-of-the-art image compression technologies and learn the techniques included in it. The GUI includes several image compression techniques developed using key concepts from areas such as artificial neural networks, fuzzy logic, and wavelet transforms. The layout of the GUI is designed for users who are unfamiliar with the technical details of these areas, and this feature makes the GUI an interactive front-end tool.

1.1 Unique Features

We believe in providing solutions through continuous research and technology enhancement. The Image Compression GUI brings together the latest image compression technologies by utilizing promising research concepts from soft-computing based techniques. The algorithms are the result of thorough testing and validation and provide a wide range of compression levels. Each technique has advantages in its respective area of application, which makes the collection suitable for a wide range of uses, and some techniques give results comparable to the JPEG image compression standard as far as compression level is concerned. The uniqueness of this GUI also lies in combining several research areas with conventional transform-based data compression techniques, thereby implementing hybrid image compression techniques. The system design, architecture, user interface, flexibility, and algorithmic capabilities also allow the GUI to be used as a stand-alone application or to be integrated with other image compression modules.

1.2 User Friendly Interface

The GUI's user-friendly technologies are easy to learn. Complex algorithms such as neural network and fuzzy logic based methods, which are defined by their network architecture (layers) and training algorithm (learning rules), are wrapped in a user-friendly interface. The GUI provides options for the user to select a particular technique through an interactive pull-down menu. A comprehensive list of parameters, along with their optional values, is embedded in the GUI via easy-to-use slider bars or text boxes. Parameters related to the compressed data and the reconstructed image quality are also displayed in the GUI. A detailed description of the various GUI features is provided in the following sections and in the interactive documentation.

 

2. Interactive Elements of GUI Front End

The interactive elements embedded in the GUI front end (shown in the figures below) are as follows:

•      Input image selection

        The input image can be selected in two ways, as shown in the figure below:

  • Source file from the hard disk

  • Through a camera, by selecting the Camera radio option

 Figure: Selection of the Input Image

•      Algorithm selection

        The coder algorithm can be selected through the combo box, as shown in the figure below.


•      Parameter selection

        Algorithmic parameters can be set by entering values in the text box provided (e.g., the image quality level), as shown in the figure below. Typical parameter values for the different algorithms are given in the tables below.

Algorithm          Full Name
---------          ---------

Artificial Neural Network Based Techniques
BPA                Back Propagation Algorithm
CPN                Counter Propagation Network

Statistical Theory Based Techniques
PCA                Principal Component Analysis
ICA                Independent Component Analysis
2D-PCA             2-Dimensional PCA
SVD                Singular Value Decomposition

Conventional Compression and Transform Based Techniques
JPEG               Joint Photographic Experts Group standard (DCT based)
BTC                Block Truncation Coding
GP                 Gaussian Pyramid
Wavelet            Discrete Wavelet Transform
Wavelet-SPIHT      Wavelet with Set Partitioning in Hierarchical Trees

Hybrid Techniques
Wavelet-BPA        Wavelet + Back Propagation Algorithm
Wavelet-CPN        Wavelet + Counter Propagation Network
GP-CPN             Gaussian Pyramid + Counter Propagation Network
BTC-Wavelet        Block Truncation Coding + Wavelet
DCT-PCA            Discrete Cosine Transform + Principal Component Analysis
SVM-DCT            Support Vector Machine + Discrete Cosine Transform
SVM-Fuzzy          Support Vector Machine + Fuzzy Logic

Algorithm              Parameter Description
---------              ---------------------
JPEG                   Image quality level from 1 (worst) to 100 (best)
BPA                    Number of iterations: 100-200; block size: 4, 8; number of hidden layer neurons: 4, 8, 16
BTC                    No parameters
CPN                    No parameters
Gaussian Pyramid       No parameters
Gaussian Pyramid-CPN   No parameters
Wavelet                No parameters
Wavelet-BPA            Number of wavelet decomposition levels: 2
BTC-Wavelet            No parameters
Wavelet-CPN            No parameters
Wavelet-SPIHT          Bit rate level from 0.01 (worst) to 1 (best)
ICA                    Number of independent components: 2-100
PCA                    Number of principal components: 2-10
DCT-PCA                Number of principal components: 2-10
2D-PCA                 No parameters
SVD                    Number of singular values: 20-40
SVM-DCT                Error value: 1-100; tolerance: 1-100
SVM-Fuzzy              Error value: 1-100; tolerance: 1-100


•      Begin the compression procedure by pressing the Compress button. The compression progress is displayed as shown in the figure below.


•      Save the compressed output file of the coder through the Save File option, as shown in the figure below.


•      Load and decode a compressed data file through the Load File option, as shown in the figure below.


•      Compare the algorithm's performance with JPEG through the "Comparison with JPEG" option; the output is shown in the figure below.


Results displayed in the GUI

 

PSNR (Peak Signal-to-Noise Ratio)

PSNR measures the fidelity of a coded/decoded image with respect to the original. It is defined as the ratio between the maximum possible pixel value and the mean squared error between the original and reconstructed images, and is usually expressed on the logarithmic decibel (dB) scale. Typical quoted PSNR figures are in the range of +25 to +35 dB.

RMSE (Root Mean Squared Error)

A measure of the total reconstruction error, defined as the square root of the mean squared difference between the original and reconstructed images (equivalently, the square root of the sum of the variance and the square of the bias of the error).

Compression Ratio

Compression ratio is defined as the ratio of the original image size to the compressed image size:

Compression Ratio = Original Image Size / Compressed Image Size

Entropy

The entropy measure conveys the information content of the image; a compression scheme should preserve the entropy measure while removing redundant information from the image.

Source Entropy - The entropy of the original image.

Destination Entropy - The entropy of the reconstructed image.

Execution Time - The time taken by the selected algorithm to compress (and reconstruct) the image.
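The metrics above can be computed as in the following sketch (illustrative Python/NumPy only, not the GUI's MATLAB code; the 8-bit maximum value of 255 and the byte counts are assumptions):

    import numpy as np

    def compression_metrics(original, reconstructed, compressed_bytes):
        """Quality and size metrics for 8-bit grayscale images of equal shape."""
        err = original.astype(float) - reconstructed.astype(float)
        rmse = np.sqrt(np.mean(err ** 2))                                   # root mean squared error
        psnr = 20 * np.log10(255.0 / rmse) if rmse > 0 else float("inf")    # dB, 255 = max pixel value
        cr = original.size / compressed_bytes                               # 1 byte/pixel original vs. compressed bytes

        def entropy(img):                                                   # bits per pixel, from the gray-level histogram
            p = np.bincount(img.astype(np.uint8).ravel(), minlength=256) / img.size
            p = p[p > 0]
            return float(-np.sum(p * np.log2(p)))

        return {"PSNR (dB)": psnr, "RMSE": rmse, "Compression Ratio": cr,
                "Source Entropy": entropy(original), "Destination Entropy": entropy(reconstructed)}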

 

3. System Requirements

  • Microsoft Windows XP Professional or Home Edition (Service Pack 1 or 2)
  • Windows 2000 (Service Pack 2)
  • Windows XP Tablet PC Edition
  • Windows Server 2003
  • Windows NT (Service Pack 6 or 6a)
  • 128MB of RAM (256MB recommended)
  • Up to 35MB of available hard-disk space

 

4. Description of Algorithms

4.1 JPEG (Joint Photographic Experts Group)

In computing, JPEG (pronounced JAY-peg) is a commonly used method of compression for photographic images. The name JPEG stands for Joint Photographic Experts Group, the name of the committee that created the standard. The group was organized in 1986, issuing a standard in 1992, which was approved in 1994 as ISO 10918-1. It is designed for compressing full-color or gray-scale images of natural, real-world scenes. It works well on photographs, naturalistic artwork, and similar material.

The JPEG standard specifies both the codec, which defines how an image is compressed into a stream of bytes and decompressed back into an image, and the file format used to contain that stream. The compression method is usually lossy compression, meaning that some visual quality is lost in the process and cannot be restored, although there are variations on the standard baseline JPEG that are lossless.

A good introduction and links can be found at: http://en.wikipedia.org/wiki/JPEG

Compression Parameters for JPEG in GUI

In the JPEG algorithm, the compression level can be varied directly in terms of reconstructed image quality on a scale of 1 to 100, where 1 gives the worst and 100 the best reconstructed image quality.
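As a hedged illustration only (the GUI itself is implemented in MATLAB), the sketch below uses the Pillow library in Python with hypothetical file names to show how the same quality scale trades file size against reconstructed image quality:

    import os
    from PIL import Image

    img = Image.open("input.png").convert("L")     # hypothetical 8-bit grayscale input image
    raw_bytes = img.width * img.height             # uncompressed size at 8 bits/pixel
    for quality in (10, 50, 90):                   # 1 = worst ... 100 = best
        out = f"output_q{quality}.jpg"
        img.save(out, "JPEG", quality=quality)
        print(f"quality={quality}: compression ratio = {raw_bytes / os.path.getsize(out):.1f}")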

4.2 Multilayer Artificial Neural Network based Back Propagation Algorithm

An artificial neural network is a computational model that tries to mimic the behaviour and functionality of the human brain, as shown in the figure below.

Figure: The biological neuron (about 100,000 synapses per neuron; computational power arises from connectivity; plasticity through new connections and modification of connection strengths)

The artificial neuron model is shown in the figure below.


The conventional Back Propagation Algorithm is basically a gradient-descent method; it has the problems of local minima and slow convergence. The aim of the network is to train on the input vectors so that, after training, the image matrix is reproduced at the output layer with the least error. Compression is achieved by using fewer neurons in the hidden layer than in the input layer. In the figure above, with l input neurons, k hidden neurons, and P training patterns:

               Compression ratio = (l × P) / (k × P) = l / k
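The sketch below is a minimal illustration of this idea, not the GUI's MATLAB implementation: a single-hidden-layer network with linear activations is trained by gradient descent (back-propagation of the reconstruction error) on flattened image blocks. The block size, hidden size, learning rate, and random data are illustrative assumptions.

    import numpy as np

    def train_bpa_compressor(blocks, hidden=8, iters=200, lr=0.01, seed=0):
        """Train a simple linear autoencoder on image blocks (rows = flattened blocks).

        blocks : (P, l) array of flattened image blocks scaled to [0, 1]
        hidden : number of hidden neurons k (< l), giving compression ratio l/k
        """
        rng = np.random.default_rng(seed)
        P, l = blocks.shape
        W_enc = rng.normal(scale=0.1, size=(l, hidden))   # input -> hidden weights
        W_dec = rng.normal(scale=0.1, size=(hidden, l))   # hidden -> output weights
        for _ in range(iters):
            h = blocks @ W_enc              # hidden (compressed) representation
            out = h @ W_dec                 # reconstructed blocks
            err = out - blocks              # reconstruction error
            # gradient-descent weight updates (back-propagation of the squared error)
            W_dec -= lr * h.T @ err / P
            W_enc -= lr * blocks.T @ (err @ W_dec.T) / P
        return W_enc, W_dec

    # Example: 4x4 blocks (l = 16) compressed to 8 hidden values per block (ratio 2:1)
    blocks = np.random.default_rng(1).random((500, 16))
    W_enc, W_dec = train_bpa_compressor(blocks, hidden=8)
    recon = (blocks @ W_enc) @ W_dec
    print("mean squared reconstruction error:", np.mean((recon - blocks) ** 2))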

The use of a neural network for image compression is shown in the figure below.


Compression Parameters for BPA in GUI

In the BPA algorithm, the compression parameters are given below.

Number of iterations:              100 – 200

Block size:                        4, 8

Number of hidden layer neurons:    4, 8, 16

  

4.3 BTC (Block Truncation Coding)

Block Truncation Coding (BTC) is a lossy image compression technique. BTC is a simple and fast algorithm which achieves a constant bit rate of 2.0 bits per pixel. The method is, however, suboptimal. It divides the original image into small sub-images and then uses a quantizer, which adapts itself to the local image statistics, to reduce the number of gray levels in each block.

Steps for Block Truncation Coding

  • The image is divided into non-overlapping blocks. The size of a block could be 4x4 or 8x8, etc. Calculate the average gray level of a 4x4 block as

        m = (1/16) * sum of x_i,  i = 1, ..., 16

    where x_i represents the pixels of the block.

  • Pixels in the image block are then classified into two kinds according to whether their gray level is greater than the block average gray level m. The average gray levels of these two kinds of pixels are calculated as

        m_u = (1/K) * sum of the x_i with x_i > m
        m_l = (1/(16 - K)) * sum of the x_i with x_i <= m

    where K is the number of pixels whose gray level is greater than m.

  • A binary block, denoted b, is also needed to classify the pixels. We use "1" to represent a pixel whose gray level is greater than m and "0" to represent a pixel whose gray level is less than or equal to m.

  • The encoder writes m_u, m_l and b to a file. Assuming 8 bits each for m_u and m_l, the total number of bits required for a 4x4 block is 8 + 8 + 16 = 32 bits. Thus, the bit rate for the very basic BTC algorithm is 2 bits/pixel.

  • In the decoder, an image block is reconstructed by replacing the "1"s with m_u and the "0"s with m_l. A minimal sketch of these steps is given after this list.
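A minimal Python/NumPy sketch of these steps (illustrative only, not the GUI code):

    import numpy as np

    def btc_block_encode(block):
        """Encode one image block with basic BTC: returns (m_u, m_l, bitmap)."""
        m = block.mean()
        bitmap = block > m                                      # "1" where pixel > block mean
        K = bitmap.sum()
        m_u = block[bitmap].mean() if K else m                  # mean of the "1" pixels
        m_l = block[~bitmap].mean() if K < block.size else m    # mean of the "0" pixels
        return m_u, m_l, bitmap

    def btc_block_decode(m_u, m_l, bitmap):
        """Reconstruct a block by replacing 1s with m_u and 0s with m_l."""
        return np.where(bitmap, m_u, m_l)

    # Example on a random 4x4 block: 8 + 8 + 16 = 32 bits, i.e. 2 bits/pixel
    block = np.random.default_rng(0).integers(0, 256, (4, 4)).astype(float)
    m_u, m_l, bitmap = btc_block_encode(block)
    print(np.round(btc_block_decode(m_u, m_l, bitmap)))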

4.4 Counterpropagation Network (CPN)

Counterpropagation network can be viewed as a self-programming lookup table as it essentially finds the closest fit of a particular input to an entry in its training set. A crucial feature for this application is the ability to accept input with noise and provide a solution and probability value for that solution. A special feature of the Counter Propagation Network exists if the function can be linearized. This CPN network adaptation is referred to as Interpolative Associative Memory and has the substantial benefit that it does not require training in the traditional sense.

        The conventional CPN architecture combines a portion of the self-organizing map of Kohonen with the outstar structure of Grossberg. The Kohonen network clusters the training vectors by adaptive vector quantization, approximating the centroid of each vector class in a self-organizing fashion, while Grossberg's outstar network adapts the associated output vectors through a supervised training method. The CPN architecture is given in the figure below.

Purpose: fast and coarse approximation of a vector mapping y = f(x):

·         not to map any given x to its f(x) with a given precision,

·         input vectors x are divided into clusters/classes,

·         each cluster of x has one output y, which is the average of f(x) for all x in that class.

Learning in two phases:

Training sample (x, d), where d = f(x) is the desired precise mapping.

Phase 1 (Kohonen layer): the weights w_j coming into the hidden nodes z_j are trained by competitive learning to become the representative vector of a cluster of input vectors x (using only x, the input part of (x, d)).

1.    For a chosen x, feedforward to determine the winning hidden node z_j*

2.    Update the winner's weights: w_j* := w_j* + a (x - w_j*)

3.    Reduce the learning rate a, then repeat steps 1 and 2 until the stop condition is met

Phase 2 (Grossberg layer): the weights v_j going out of the hidden nodes are trained by the delta rule to become the average output d = f(x) over the input vectors x that cause z_j to win (using both x and d).

1.    For a chosen x, feedforward to determine the winning hidden node z_j*

2.    w_j* := w_j* + a (x - w_j*)   (optional)

3.    v_j* := v_j* + b (d - v_j*)

4.    Repeat steps 1 – 3 until the stop condition is met

After phase 1, clusters are formed among the sample inputs x; each w_j is the representative (average) of one cluster. After phase 2, each cluster maps to an output vector y.

 

After training, the network works like a look-up table:

·         For any input x, find the region where x falls (represented by the winning z node);

·         Use that region as the index to look up the function value in the table.

·         CPN works in a multi-dimensional input space.

·         More cluster nodes (z) give a more accurate mapping.

·         Training is much faster than back-propagation.
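A minimal NumPy sketch of the two training phases described above (illustrative only; the function names, learning rates, and cluster count are assumptions, not the GUI's MATLAB code):

    import numpy as np

    def train_cpn(X, D, n_hidden=16, epochs=20, a=0.1, b=0.1, seed=0):
        """Two-phase counter-propagation training sketch.

        X : (P, n_in) input vectors, D : (P, n_out) desired outputs d = f(x).
        Returns Kohonen weights W (n_hidden, n_in) and Grossberg weights V (n_hidden, n_out).
        """
        rng = np.random.default_rng(seed)
        W = X[rng.choice(len(X), n_hidden, replace=False)].copy()   # init from sample inputs
        V = np.zeros((n_hidden, D.shape[1]))
        # Phase 1 (Kohonen layer): competitive learning on the inputs only
        for _ in range(epochs):
            for x in X:
                j = np.argmin(np.linalg.norm(W - x, axis=1))   # winning hidden node
                W[j] += a * (x - W[j])                         # move winner towards x
            a *= 0.9                                           # reduce the learning rate
        # Phase 2 (Grossberg layer): delta rule on the winner's outgoing weights
        for _ in range(epochs):
            for x, d in zip(X, D):
                j = np.argmin(np.linalg.norm(W - x, axis=1))
                V[j] += b * (d - V[j])
        return W, V

    def cpn_map(x, W, V):
        """Look-up: the winner's outgoing weights are the (coarse) output for x."""
        return V[np.argmin(np.linalg.norm(W - x, axis=1))]

    # Example: learn a coarse mapping from 2-D points to their squared norms
    X = np.random.default_rng(1).random((200, 2))
    D = (X ** 2).sum(axis=1, keepdims=True)
    W, V = train_cpn(X, D, n_hidden=8)
    print(cpn_map(np.array([0.5, 0.5]), W, V))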

4.5 Gaussian Pyramid

Neighbouring pixels of an image are highly correlated, so representing the image directly in terms of the pixel values is inefficient: most of the encoded information is redundant. To design an efficient compression code, the first task is to find a representation which, in effect, decorrelates the image pixels. An image pyramid is the representation of an image at different resolutions. The idea behind image pyramids is to generate a number of homogeneous parameters that represent the response of a bank of filters at different scales and possibly different orientations. To achieve this, the first step is to apply a low-pass filter on the original image g0 to obtain image g1. We say that g1 is a "reduced" version of g0 in that both resolution and sample density are decreased. In a similar way we form g2 as a reduced version of g1, and so on. Filtering is performed by a procedure equivalent to convolution with one of a family of local, symmetric weighting functions. An important member of this family resembles the Gaussian probability distribution, so the sequence of images g0, g1, …, gn is called the Gaussian pyramid.

 Gaussian Pyramid Generation

Suppose the image is represented initially by the array g0, which contains C columns and R rows of pixels. Each pixel represents the light intensity at the corresponding image point by an integer I between 0 and K - 1. This image becomes the bottom, or zero, level of the Gaussian pyramid. Pyramid level 1 contains image g1, which is a reduced or low-pass filtered version of g0. Each value within level 1 is computed as a weighted average of values in level 0 within a window (e.g., 5-by-5). Each value within level 2, representing g2, is then obtained from values within level 1 by applying the same pattern of weights.


The level-to-level averaging process is performed by the function REDUCE:

      g_l = REDUCE(g_{l-1})

which means, for levels 0 < l < N and nodes i, j with 0 <= i < C_l and 0 <= j < R_l,

      g_l(i, j) = sum over m = -2..2 and n = -2..2 of w(m, n) * g_{l-1}(2i + m, 2j + n).

Here N refers to the number of levels in the pyramid, C_l and R_l are the dimensions of the l-th level, and w(m, n) is the 5-by-5 generating (weighting) kernel. Note in the figure that the density of nodes is reduced by half in one dimension, or by a fourth in two dimensions, from level to level.

The dimensions of the original image are appropriate for pyramid construction if integers M_C, M_R and N exist such that C = M_C * 2^N + 1 and R = M_R * 2^N + 1.

The dimensions of g_l are C_l = M_C * 2^(N-l) + 1 and R_l = M_R * 2^(N-l) + 1.

Repeat:

  • Filter

  • Subsample

until the minimum resolution is reached, or for a specified number of levels (e.g., a 3-level pyramid). A minimal sketch of this loop is given below.
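A minimal sketch of the filter-and-subsample loop (assumes NumPy/SciPy, not the GUI code; it uses the standard 5-tap generating kernel with a = 0.4 and, for simplicity, ignores the M·2^N + 1 border convention):

    import numpy as np
    from scipy.ndimage import convolve

    def reduce_level(g, a=0.4):
        """One REDUCE step: 5-tap low-pass filtering followed by 2x subsampling."""
        w1 = np.array([0.25 - a / 2, 0.25, a, 0.25, 0.25 - a / 2])  # generating kernel
        w = np.outer(w1, w1)                                        # separable 5x5 weights
        return convolve(g, w, mode="reflect")[::2, ::2]

    def gaussian_pyramid(g0, levels=3):
        """Build [g0, g1, ..., g_levels] by repeated filter-and-subsample."""
        pyr = [g0]
        for _ in range(levels):
            pyr.append(reduce_level(pyr[-1]))
        return pyr

    img = np.random.default_rng(0).random((256, 256))
    for k, g in enumerate(gaussian_pyramid(img)):
        print(f"level {k}: {g.shape}")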

4.6 Gaussian Pyramid-CPN

The Gaussian Pyramid-CPN is a hybrid algorithm which uses the Gaussian-pyramid-domain coefficients as input for the CPN network.

4.7 Wavelet

A wavelet is a kind of mathematical function used to divide a given function or continuous-time signal into different frequency components and study each component with a resolution that matches its scale. A wavelet transform is the representation of a function by wavelets. The wavelets are scaled and translated copies (known as "daughter wavelets") of a finite-length or fast-decaying oscillating waveform (known as the "mother wavelet"). Wavelet transforms have advantages over traditional Fourier transforms for representing functions that have discontinuities and sharp peaks, and for accurately deconstructing and reconstructing finite, non-periodic and/or non-stationary signals. Wavelet transforms are classified into discrete wavelet transforms (DWTs) and continuous wavelet transforms (CWTs). Note that both the DWT and the CWT are continuous-time (analog) transforms; they can be used to represent continuous-time (analog) signals. CWTs operate over every possible scale and translation, whereas DWTs use a specific subset of scale and translation values, i.e., a representation grid.

Wavelet Time-Freq Mapping

 

Using a wavelet transform, the wavelet compression methods are adequate for representing transients, such as percussion sounds in audio, or high-frequency components in two-dimensional images, for example an image of stars on a night sky. This means that the transient elements of a data signal can be represented by a smaller amount of information than would be the case if some other transform, such as the more widespread discrete cosine transform, had been used.

A detailed account of the wavelet transform can be found at http://users.rowan.edu/~polikar/WAVELETS/WTtutorial.html
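A minimal sketch (assumes the PyWavelets package, which is not part of the GUI) showing how one level of the 2-D DWT separates an approximation band from three detail bands, and how keeping only the approximation already gives a coarse compressed representation:

    import numpy as np
    import pywt

    img = np.random.default_rng(0).random((256, 256))
    cA, (cH, cV, cD) = pywt.dwt2(img, "haar")              # one decomposition level
    print(cA.shape)                                        # (128, 128): 1/4 of the samples
    recon = pywt.idwt2((cA, (None, None, None)), "haar")   # reconstruct from the approximation only
    print("RMSE using only the approximation band:",
          np.sqrt(np.mean((recon - img) ** 2)))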

 

4.8 Wavelet-BPA (Back Propagation Algorithm)

The Wavelet-BPA is a hybrid algorithm which uses the wavelet coefficients as input for the BPA network.

Compression Parameters for Wavelet-BPA in GUI

In the Wavelet-BPA algorithm, the compression parameters are given below.

Number of iterations:              100 – 200

Block size:                        4, 8

Number of hidden layer neurons:    4, 8, 16

4.9 BTC-Wavelet

The BTC-Wavelet is a hybrid algorithm which uses the wavelet coefficients as input for the BTC algorithm.

4.10 Wavelet-CPN (Counter Propagation Network)

The Wavelet-CPN is a hybrid algorithm which uses the wavelet-domain coefficients as input for the CPN network.

4.11 Wavelet-SPIHT (Set Partitioning In Hierarchical Trees)

SPIHT is applied after the wavelet transform: the SPIHT algorithm is used to encode the wavelet coefficients. The SPIHT algorithm has received widespread recognition for its notable success in image coding.

The SPIHT method is not a simple extension of traditional methods for image compression, and represents an important advance in the field. The method deserves special attention because it provides the following features:

  • Good image quality, high PSNR especially for color images

  • It is optimized for progressive image   transmission

  • Produces a fully embedded coded file

  • Simple quantization algorithm

  • Fast coding/decoding (nearly symmetric)

  • Has wide applications, completely adaptive

  • Can be used for lossless compression

  • Can code to exact bit rate or distortion

  • Efficient combination with error protection.

The principles of the SPIHT algorithm are partial ordering of the transform coefficients by magnitude with a set partitioning sorting algorithm, ordered bit plane transmission, and exploitation of self-similarity across different layers. By following these principles, the encoder always transmits the most significant bit to the decoder. The important parts of the algorithm are as follows.

Temporal Orientation Trees

A tree structure, shown in the figure below and called the "temporal orientation tree", defines the temporal relationship in the wavelet domain. Every point in layer i corresponds to 2 points in the next layer i+1, with the arrow indicating the parent-offspring relation. This definition is analogous to that of spatial orientation trees. Each node either has no offspring or 2 offspring. In a typical 1-D signal, most of the energy is concentrated in the low frequency bands, so the coefficients are expected to be better magnitude-ordered as we move downward following the temporal orientation tree to the leaves (terminal nodes).

                            Figure: The Temporal Orientation Tree

Set Partitioning Sorting Algorithm

The same set partitioning rule is defined for both the encoder and the decoder. A subset T of subband coefficients c_i is said to be significant for bit depth n if max over i in T of |c_i| >= 2^n; otherwise it is said to be insignificant. If the subset is insignificant, a 0 is sent to the decoder.

If it is significant, a 1 is sent to the decoder and then the subset is further split according to the temporal orientation tree until all the significant sets are a single significant point. In this stage of coding, called the sorting pass, the indices of the coefficients are put onto three lists, the list of insignificant points (LIP), the list of insignificant sets (LIS), and the list of significant points (LSP). In this pass, only bits related to the LSP entries and binary outcomes of the magnitude tests are transmitted to the decoder. In implementation we grouped together the entries in the LIP and LIS which have the same parent into an entry atom. For each entry atom in LIP, we estimated a pattern in both encoder and decoder to describe the significance status of each entry in the current sorting pass. If the result of the significance test of the entry atom is the same as the specified pattern, we can use one bit to represent the status of the whole entry atom which otherwise had two entries and representation of significance by two bits.
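A minimal sketch of the significance test described above (illustrative only, not the GUI code):

    import numpy as np

    def is_significant(coeffs, n):
        """SPIHT significance test: a set of coefficients is significant at
        bit depth n if its largest magnitude is at least 2**n."""
        return np.max(np.abs(coeffs)) >= 2 ** n

    subset = np.array([3.2, -17.5, 0.8])
    print(is_significant(subset, 4))   # True:  |-17.5| >= 16
    print(is_significant(subset, 5))   # False: all magnitudes < 32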

Refinement Passes

After each sorting pass, we get the significant coefficients for the threshold 2n, and then send to the decoder the nth most significant bit of every coefficient found significant at a higher threshold. By transmitting the bit stream in this ordered bit plane fashion, we always transmit the most valuable (significant) remaining bits to the decoder. The outline of the full coding algorithm can be found at,

     http://www.cipr.rpi.edu/~pearlman/papers/bme99_lp.pdf

   Compression Parameters for SPIHT in GUI

In the Wavelet-SPIHT algorithm, the bit rate level lies between 0.01 (high compression, lower reconstructed image quality) and 1 (low compression, high reconstructed image quality). The user can select the parameter according to his or her requirement.

4.12 ICA (Independent Component Analysis)

ICA is a statistical and computational technique for revealing hidden factors that underlie sets of random variables, measurements, or signals. ICA defines a generative model for the observed multivariate data, which is typically given as a large database of samples. In the model, the data variables are assumed to be linear mixtures of some unknown latent variables, and the mixing system is also unknown. The latent variables are assumed non-Gaussian and mutually independent and they are called the independent components of the observed data. These independent components, also called sources or factors, can be found by ICA.  

ICA is superficially related to principal component analysis and factor analysis. ICA is a much more powerful technique, however, capable of finding the underlying factors or sources when these classic methods fail completely.

The data analyzed by ICA could originate from many different kinds of application fields, including digital images, document databases, economic indicators and psychometric measurements.

Applications of ICA include image compression, image denoising, feature extraction, face recognition, and image watermarking.

 Method (for example)

For the separation of two signals, ICA models the observations as x = A * S, where A is the unknown mixing matrix and S the set of independent source signals. This is illustrated in the diagram below.


Application of ICA for Image Denoising

Compression Parameters for ICA in GUI

In the ICA algorithm, the compression parameter is the number of independent components, between 2 and 200.
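A minimal sketch of the x = A * S model (assumes scikit-learn's FastICA, which is not part of the GUI; the sources and mixing matrix are synthetic):

    import numpy as np
    from sklearn.decomposition import FastICA

    t = np.linspace(0, 8, 2000)
    s = np.c_[np.sin(2 * t), np.sign(np.cos(3 * t))]   # two independent sources
    A = np.array([[1.0, 0.5], [0.4, 1.2]])             # unknown mixing matrix
    x = s @ A.T                                        # observed mixtures

    ica = FastICA(n_components=2, random_state=0)
    s_est = ica.fit_transform(x)                       # estimated independent components
    print(s_est.shape, ica.mixing_.shape)              # (2000, 2) (2, 2)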

 

4.13 PCA (Principal Components Analysis)

In statistics, Principal Components Analysis (Pearson, 1901) is a technique that can be used to simplify a dataset; more formally it is a linear transformation that chooses a new coordinate system for the data set such that the greatest variance by any projection of the data set comes to lie on the first axis (then called the first principal component), the second greatest variance on the second axis, and so on.

PCA can be used for reducing dimensionality in a dataset while retaining those characteristics of the dataset that contribute most to its variance by eliminating the later principal components (by a more or less heuristic decision). These characteristics may be the "most important", but this is not necessarily the case, depending on the application.

 PCA for Image Compression

 

Method followed (for example): the image is split into blocks of 8 x 8 pixels, and PCA is applied to the resulting distribution of 64-dimensional vectors.

The image is then reconstructed from the first 8 principal components, as sketched below.
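A minimal NumPy sketch of this block-based PCA scheme (illustrative only; the block size and number of components follow the example above, and the data are synthetic):

    import numpy as np

    def pca_compress_blocks(img, block=8, n_components=8):
        """Split img into block x block patches, project onto the top principal
        components of the patch distribution, and reconstruct."""
        h, w = img.shape
        patches = (img.reshape(h // block, block, w // block, block)
                      .swapaxes(1, 2).reshape(-1, block * block))    # (P, 64) vectors
        mean = patches.mean(axis=0)
        centered = patches - mean
        # principal axes = eigenvectors of the covariance, obtained via SVD of the data
        _, _, Vt = np.linalg.svd(centered, full_matrices=False)
        basis = Vt[:n_components]                                    # top principal components
        coded = centered @ basis.T                                   # (P, n_components)
        recon = (coded @ basis + mean).reshape(h // block, w // block, block, block)
        return recon.swapaxes(1, 2).reshape(h, w)

    img = np.random.default_rng(0).random((256, 256))
    out = pca_compress_blocks(img, n_components=8)
    print("RMSE:", np.sqrt(np.mean((out - img) ** 2)))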

 

Compression Parameters for PCA in GUI

In the PCA algorithm, the compression parameter is the number of principal components, between 2 and 10.

4.14 2D-PCA (2-Dimensional Principal Component Analysis)

     2D-PCA is a technique generally used for dimensionality reduction and is relatively new in the research area. It is an extension of PCA, and here it is used for image compression.

4.15 DCT-PCA (Discrete Cosine Transform-Principal Component Analysis)

DCT-PCA is a hybrid algorithm. PCA is described in the section above.

Discrete Cosine Transform (DCT)

           A discrete cosine transform (DCT) is a Fourier-related transform similar to the discrete Fourier transform (DFT), but using only real numbers. DCTs are equivalent to DFTs of roughly twice the length, operating on real data with even symmetry (since the Fourier transform of a real and even function is real and even), where in some variants the input and/or output data are shifted by half a sample. There are eight standard DCT variants, of which four are common. The most common variant of the discrete cosine transform is the type-II DCT, which is often called simply "the DCT"; its inverse, the type-III DCT, is correspondingly often called simply "the inverse DCT" or "the IDCT".

Two related transforms are the discrete sine transform (DST), which is equivalent to a DFT of real and odd functions, and the modified discrete cosine transform (MDCT), which is based on a DCT of overlapping data.


Components of a typical image/video transmission system

 

The Process

·   The image is broken into 8x8 blocks of pixels.

·   Working from left to right, top to bottom, the DCT is applied to each block.

·   The array of compressed blocks that constitutes the image is stored in a drastically reduced amount of space.

·   When desired, the image is reconstructed through decompression, a process that uses the Inverse Discrete Cosine Transform (IDCT).

The DCT Equation

The (i, j)-th entry of the DCT of an image block is

      D(i, j) = C(i) C(j) * sum over x = 0..N-1 and y = 0..N-1 of
                P(x, y) cos[ (2x + 1) i pi / (2N) ] cos[ (2y + 1) j pi / (2N) ]

      where C(u) = sqrt(1/N) for u = 0 and C(u) = sqrt(2/N) for u > 0.

P(x, y) is the (x, y)-th element of the image block represented by the matrix P, and N is the size of the block. For the standard 8 x 8 block that JPEG compression uses, N = 8 and x and y range from 0 to 7, so that C(0) = 1/(2*sqrt(2)) and C(u) = 1/2 for u > 0.

The DCT Matrix

To get the matrix form of the equation above, we use the following orthogonal matrix T:

      T(i, j) = sqrt(1/N)                                   for i = 0
      T(i, j) = sqrt(2/N) * cos[ (2j + 1) i pi / (2N) ]     for i > 0

so that the 2-D DCT of a block M is computed as D = T M T'. For an 8 x 8 block, evaluating T numerically gives the standard 8 x 8 DCT matrix; a small sketch computing the DCT of a block is given below.
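A minimal sketch of the 2-D DCT of an 8 x 8 block (assumes SciPy, which is not part of the GUI; zeroing the high-frequency coefficients is only a crude stand-in for JPEG's quantization step):

    import numpy as np
    from scipy.fft import dctn, idctn

    block = np.random.default_rng(0).integers(0, 256, (8, 8)).astype(float)
    D = dctn(block, norm="ortho")          # forward 2-D DCT (type II), equivalent to T M T'
    approx = D.copy()
    approx[4:, :] = 0                      # crude compression: drop high-frequency rows
    approx[:, 4:] = 0                      # and columns, keeping the 4x4 low-frequency corner
    recon = idctn(approx, norm="ortho")    # inverse 2-D DCT (type III)
    print("RMSE after dropping 3/4 of the coefficients:",
          np.sqrt(np.mean((recon - block) ** 2)))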

Compression Parameters for DCT-PCA in GUI

In the DCT-PCA algorithm, the compression parameter is the number of principal components, between 2 and 10.

4.16 SVD (Singular Value Decomposition)

SVD, one of the most useful tools of linear algebra, is a factorization and approximation technique which effectively approximates any matrix by one of much lower rank. The SVD provides answers where other linear approximation techniques fail: almost every other tool and technique in linear algebra comes with a restriction (for example, that the matrix be both invertible and square), whereas the singular value decomposition works in every scenario, including under- and over-determined matrices.

Using the SVD for image compression can be a very useful way to save storage space. An image that is practically indistinguishable from the original can be obtained while using only about 45% of the original storage space. The singular value decomposition is not only used for image compression; it has many other useful applications.

Method followed (for example):

  • Uncompressed m-by-n pixel image: m × n numbers

  • Rank-q approximation of the image requires:

      • q singular values

      • the first q columns of U (m-vectors)

      • the first q columns of V (n-vectors)

          Total: q × (m + n + 1) numbers

 

 

With about 60 singular values the image is clearly recognizable, and with about 100 singular values it is visually indistinguishable from the original, still with significant compression.
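A minimal NumPy sketch of the rank-q approximation described above (illustrative only; the image here is synthetic):

    import numpy as np

    def svd_rank_q(img, q=60):
        """Rank-q approximation: keep q singular values and the first q columns
        of U and V, i.e. q * (m + n + 1) numbers instead of m * n."""
        U, s, Vt = np.linalg.svd(img, full_matrices=False)
        return (U[:, :q] * s[:q]) @ Vt[:q, :]

    img = np.random.default_rng(0).random((256, 256))
    for q in (20, 40, 60):
        approx = svd_rank_q(img, q)
        stored = q * (img.shape[0] + img.shape[1] + 1)
        print(f"q={q}: stored numbers = {stored}, "
              f"RMSE = {np.sqrt(np.mean((approx - img) ** 2)):.4f}")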

4.17 SVM-DCT

SVM-DCT is a hybrid algorithm. DCT is described in the previous section.

SVM (Support Vector Machine)

Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression. They belong to a family of generalized linear classifiers. They can also be considered a special case of Tikhonov regularization. A special property of SVMs is that they simultaneously minimize the empirical classification error and maximize the geometric margin; hence they are also known as maximum margin classifiers.

Support vector machines map input vectors to a higher dimensional space where a maximal separating hyperplane is constructed. Two parallel hyperplanes are constructed on each side of the hyperplane that separates the data. The separating hyperplane is the hyperplane that maximizes the distance between the two parallel hyperplanes. An assumption is made that the larger the margin or distance between these parallel hyperplanes the better the generalization error of the classifier will be.

Support Vector Machine (SVM) models are a close cousin to classical multilayer perceptron neural networks. Using a kernel function, SVM’s are an alternative training method for polynomial, radial basis function and multi-layer perceptron classifiers in which the weights of the network are found by solving a quadratic programming problem with linear constraints, rather than by solving a non-convex, unconstrained minimization problem as in standard neural network training.

A detailed account of SVM is present at the following web links

http://www.autonlab.org/tutorials/svm.html

http://www.support-vector.net/tutorial.html
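As a hedged illustration of the maximum-margin idea only (it uses scikit-learn on synthetic data and is unrelated to the GUI's compression code):

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)

    clf = SVC(kernel="rbf", C=10.0)   # kernel maps inputs to a higher-dimensional space
    clf.fit(X, y)
    print("support vectors per class:", clf.n_support_)
    print("training accuracy:", clf.score(X, y))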

Compression Parameters in the SVM-DCT algorithm

•      Enter the error value: 1 to 100

•      Enter the tolerance: 1 to 100

4.18 SVM-Fuzzy

The SVM-Fuzzy is a hybrid algorithm which uses the wavelet coefficients as input for the SVM network.

Fuzzy Logic

Fuzzy logic is derived from fuzzy set theory dealing with reasoning that is approximate rather than precisely deduced from classical predicate logic. It can be thought of as the application side of fuzzy set theory dealing with well thought out real world expert values for a complex problem.

Degrees of truth are often confused with probabilities. However, they are distinct conceptually; fuzzy truth represents membership in vaguely defined sets, not likelihood of some event or condition. For example, if a 100-ml glass contains 30 ml of water, then, for two fuzzy sets, Empty and Full, one might define the glass as being 0.7 empty and 0.3 full. Note that the concept of emptiness would be subjective and thus would depend on the observer or designer. Another designer might equally well design a set membership function where the glass would be considered full for all values down to 50 ml. A probabilistic setting would first define a scalar variable for the fullness of the glass, and second, conditional distributions describing the probability that someone would call the glass full given a specific fullness level. Note that the conditioning can be achieved by having a specific observer that randomly selects the label for the glass, a distribution over deterministic observers, or both. While fuzzy logic avoids talking about randomness in this context, this simplification at the same time obscures what is exactly meant by the statement the 'glass is 0.3 full'.


Crisp Set: A crisp set defines the membership of its elements from the two-element set {0, 1}.

Fuzzy Set: A fuzzy set defines the membership of its elements from the continuous interval [0, 1].

Fuzzy logic allows for set membership values to range (inclusively) between 0 and 1, and in its linguistic form, imprecise concepts like "slightly", "quite" and "very". Specifically, it allows partial membership in a set. It is related to fuzzy sets and possibility theory. It was introduced in 1965 by Lotfi Zadeh at the University of California, Berkeley.
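A minimal sketch of the glass example above; the membership functions chosen here are illustrative assumptions, since emptiness and fullness are designer-defined:

    # Membership functions for the fuzzy sets "Full" and "Empty" of a 100-ml glass.
    def full(ml, capacity=100.0):
        return max(0.0, min(1.0, ml / capacity))   # degree of membership in "Full"

    def empty(ml, capacity=100.0):
        return 1.0 - full(ml, capacity)            # complementary membership in "Empty"

    print(full(30), empty(30))   # 0.3 full, 0.7 empty, as in the example above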

A general description of fuzzy logic can be found at  http://www.dementia.org/~julied/logic/index.html

A gentle introduction to fuzzy logic can be found in the tutorial “Fuzzy Logic Introduction” by Martin Hellmann, March 2001

Compression Parameters in the SVM-Fuzzy algorithm

•      Enter the error value: 1 to 100

•      Enter the tolerance: 1 to 100

 

Chief Investigator
Prof. Prem K. Kalra
Head, Department of Electrical Engineering
IIT Kanpur, India

Developed by Karmaa Lab, Indian Institute of Technology, Kanpur, www.iitk.ac.in/karmaa