
dc.contributor.advisor: Chandler, Damon
dc.contributor.author: Alam, Md Mushfiqul
dc.date.accessioned: 2016-09-29T18:30:27Z
dc.date.available: 2016-09-29T18:30:27Z
dc.date.issued: 2015-05
dc.identifier.uri: https://hdl.handle.net/11244/45128
dc.description.abstract: Studies of visual masking have provided a wide range of important insights into the processes involved in visual coding. However, very few of these studies have employed natural scenes as masks, and little is known about how natural scenes affect visual detection thresholds. This report describes a study designed to obtain local contrast detection thresholds for a database of natural images. Via a three-alternative forced-choice experiment, thresholds were measured for detecting 3.7 cycles/degree, vertically oriented log-Gabor noise targets placed within 85x85-pixel patches (1.9 degrees of visual angle) drawn from 30 natural images. Thus, for each image, a masking map was obtained in which each entry denoted the RMS contrast threshold for detecting the log-Gabor noise target at the corresponding spatial location in the image. Qualitative observations showed that detection thresholds were affected by several patch properties, such as visual complexity, fineness of textures, sharpness, and overall luminance. Quantitative analysis showed that, except for the sharpness measure (Pearson correlation coefficient, CC, of 0.7), the tested low-level mask features correlated only weakly (CC less than 0.52) with the detection thresholds. Three computational models of visual masking were used to predict the thresholds: a feature-regression model, an optimized gain-control model, and a three-layer convolutional neural network (CNN). In terms of CC and RMSE, the gain-control model performed best, with an overall CC of 0.83 and RMSE of 5.2 dB. In terms of execution time, however, the CNN model performed best, averaging 5 seconds per image compared to 40 seconds for the feature-regression model and 66 seconds for the gain-control model. Furthermore, a structural facilitation model is proposed to improve predictions for patches containing recognizable structures. Prediction performance increased for images with such structures: for the images geckos, child swimming, and foxy, the CC improved from 0.68, 0.85, and 0.58 to 0.77, 0.87, and 0.63, respectively. Moreover, a subjective local-quality-assessment experiment showed that the masking maps predicted local quality scores correctly more than 95% of the time (within 5% of the subject scores) for thresholds above 15 dB. Finally, a block-based quantization scheme using the masking model was proposed for still-image compression under the High Efficiency Video Coding (HEVC) standard, yielding compression gains of approximately 23% at threshold and 30% at 1 dB beyond threshold.
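As an illustrative aid (not part of the thesis record), the sketch below shows the two quantities the abstract builds on: the RMS contrast of an image patch and a vertically oriented, 3.7 cycles/degree log-Gabor noise target. The pixels-per-degree value follows the 85-pixel, 1.9-degree patch size stated above; the function names, bandwidth parameters, seed, and the 20*log10 dB convention are assumptions for illustration, not details taken from the dissertation.

```python
# Minimal sketch, assuming grayscale patches and the 85-pixel / 1.9-degree
# geometry from the abstract; parameter defaults below are illustrative.
import numpy as np

def rms_contrast(patch):
    """RMS contrast: standard deviation of luminance divided by mean luminance."""
    patch = patch.astype(np.float64)
    return patch.std() / patch.mean()

def contrast_db(c):
    """Contrast expressed in dB (assumed 20*log10 convention)."""
    return 20.0 * np.log10(c)

def log_gabor_noise_target(size=85, peak_cpd=3.7, ppd=85 / 1.9,
                           sigma_ratio=0.55, angular_sigma=np.pi / 8, seed=0):
    """White noise filtered by a log-Gabor band centered at `peak_cpd`
    cycles/degree and tuned to the vertical orientation."""
    rng = np.random.default_rng(seed)
    # Frequency grid in cycles/degree (FFT ordering, so no fftshift needed).
    f = np.fft.fftfreq(size) * ppd
    fx, fy = np.meshgrid(f, f)
    radius = np.sqrt(fx**2 + fy**2)
    radius[0, 0] = 1.0                      # avoid log(0) at DC
    theta = np.arctan2(fy, fx)
    # Radial log-Gaussian band centered on the peak spatial frequency.
    radial = np.exp(-(np.log(radius / peak_cpd) ** 2)
                    / (2 * np.log(sigma_ratio) ** 2))
    radial[0, 0] = 0.0                      # zero DC response
    # Angular Gaussian selecting energy along the fx axis (vertical gratings).
    dtheta = np.arctan2(np.sin(theta), np.cos(theta))
    angular = np.exp(-(dtheta ** 2) / (2 * angular_sigma ** 2))
    lg = radial * angular
    # Filter white noise and return a zero-mean spatial target.
    noise = rng.standard_normal((size, size))
    target = np.real(np.fft.ifft2(np.fft.fft2(noise) * lg))
    return target - target.mean()
```

A masking map as described in the abstract would then record, for each patch location, the smallest target RMS contrast (in dB) at which observers detect such a target when it is added to the patch.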
dc.format: application/pdf
dc.language: en_US
dc.rights: Copyright is held by the author who has granted the Oklahoma State University Library the non-exclusive right to share this material in its institutional repository. Contact Digital Library Services at lib-dls@okstate.edu or 405-744-9161 for the permission policy on the use, reproduction or distribution of this material.
dc.title: Visual masking in natural scenes: Database, models, and an application to perceptual image coding
dc.contributor.committeeMember: Fan, Guoliang
dc.contributor.committeeMember: Hagan, Martin T.
dc.contributor.committeeMember: Rhinehart, R. Russell
osu.filename: Alam_okstate_0664D_13953.pdf
osu.accesstype: Open Access
dc.type.genre: Dissertation
dc.type.material: Text
thesis.degree.discipline: Electrical Engineering
thesis.degree.grantor: Oklahoma State University

