
dc.contributor.advisor: Chandler, Damon
dc.contributor.author: Alam, Md Mushfiqul
dc.date.accessioned: 2016-09-29T18:30:27Z
dc.date.available: 2016-09-29T18:30:27Z
dc.date.issued: 2015-05
dc.identifier.uri: https://hdl.handle.net/11244/45128
dc.description.abstract: Studies of visual masking have provided a wide range of important insights into the processes involved in visual coding. However, very few of these studies have employed natural scenes as masks, and little is known about how natural scenes affect visual detection thresholds. This report describes a study designed to obtain local contrast detection thresholds for a database of natural images. Via a three-alternative forced-choice experiment, thresholds were measured for detecting 3.7 cycles/degree, vertically oriented log-Gabor noise targets placed within 85x85-pixel patches (1.9 degrees of visual angle) drawn from 30 natural images. Thus, for each image, a masking map was obtained in which each entry denoted the RMS contrast threshold for detecting the log-Gabor noise target at the corresponding spatial location in the image. Qualitative observations showed that detection thresholds were affected by several patch properties, such as visual complexity, fineness of textures, sharpness, and overall luminance. Quantitative analysis showed that, except for the sharpness measure (Pearson correlation coefficient, CC, of 0.7), the tested low-level mask features correlated only weakly (CC less than 0.52) with the detection thresholds. Three computational models of visual masking were used to predict the thresholds: a feature-regression model, an optimized gain-control model, and a three-layer convolutional neural network (CNN). In terms of CC and RMSE, the gain-control model performed best, with an overall CC of 0.83 and RMSE of 5.2 dB. In terms of execution time, however, the CNN model performed best, averaging 5 seconds per image compared to 40 seconds for the feature-regression model and 66 seconds for the gain-control model. Furthermore, a structural facilitation model is proposed to improve predictions for patches containing recognizable structures. Prediction performance increased for images with such structures: for the images geckos, child swimming, and foxy, the CC improved from 0.68, 0.85, and 0.58 to 0.77, 0.87, and 0.63, respectively. Moreover, a subjective local-quality-assessment experiment showed that the masking maps predicted local quality scores correctly more than 95% of the time (within 5% of the subject scores) for thresholds above 15 dB. Finally, a block-based quantization scheme using the masking model was proposed for still-image compression under the High Efficiency Video Coding (HEVC) standard, yielding compression gains of approximately 23% at threshold and 30% at 1 dB beyond threshold.
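As an illustrative aid (not part of the thesis record), the sketch below shows the two quantities the abstract builds on: the RMS contrast of an image patch and a vertically oriented, 3.7 cycles/degree log-Gabor noise target. The pixels-per-degree value follows the 85-pixel, 1.9-degree patch size stated above; the function names, bandwidth parameters, seed, and the 20*log10 dB convention are assumptions for illustration, not details taken from the dissertation.

```python
# Minimal sketch, assuming grayscale patches and the 85-pixel / 1.9-degree
# geometry from the abstract; parameter defaults below are illustrative.
import numpy as np

def rms_contrast(patch):
    """RMS contrast: standard deviation of luminance divided by mean luminance."""
    patch = patch.astype(np.float64)
    return patch.std() / patch.mean()

def contrast_db(c):
    """Contrast expressed in dB (assumed 20*log10 convention)."""
    return 20.0 * np.log10(c)

def log_gabor_noise_target(size=85, peak_cpd=3.7, ppd=85 / 1.9,
                           sigma_ratio=0.55, angular_sigma=np.pi / 8, seed=0):
    """White noise filtered by a log-Gabor band centered at `peak_cpd`
    cycles/degree and tuned to the vertical orientation."""
    rng = np.random.default_rng(seed)
    # Frequency grid in cycles/degree (FFT ordering, so no fftshift needed).
    f = np.fft.fftfreq(size) * ppd
    fx, fy = np.meshgrid(f, f)
    radius = np.sqrt(fx**2 + fy**2)
    radius[0, 0] = 1.0                      # avoid log(0) at DC
    theta = np.arctan2(fy, fx)
    # Radial log-Gaussian band centered on the peak spatial frequency.
    radial = np.exp(-(np.log(radius / peak_cpd) ** 2)
                    / (2 * np.log(sigma_ratio) ** 2))
    radial[0, 0] = 0.0                      # zero DC response
    # Angular Gaussian selecting energy along the fx axis (vertical gratings).
    dtheta = np.arctan2(np.sin(theta), np.cos(theta))
    angular = np.exp(-(dtheta ** 2) / (2 * angular_sigma ** 2))
    lg = radial * angular
    # Filter white noise and return a zero-mean spatial target.
    noise = rng.standard_normal((size, size))
    target = np.real(np.fft.ifft2(np.fft.fft2(noise) * lg))
    return target - target.mean()
```

A masking map as described in the abstract would then record, for each patch location, the smallest target RMS contrast (in dB) at which observers detect such a target when it is added to the patch.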
dc.format: application/pdf
dc.language: en_US
dc.rights: Copyright is held by the author who has granted the Oklahoma State University Library the non-exclusive right to share this material in its institutional repository. Contact Digital Library Services at lib-dls@okstate.edu or 405-744-9161 for the permission policy on the use, reproduction or distribution of this material.
dc.title: Visual masking in natural scenes: Database, models, and an application to perceptual image coding
dc.contributor.committeeMember: Fan, Guoliang
dc.contributor.committeeMember: Hagan, Martin T.
dc.contributor.committeeMember: Rhinehart, R. Russell
osu.filename: Alam_okstate_0664D_13953.pdf
osu.accesstype: Open Access
dc.type.genre: Dissertation
dc.type.material: Text
thesis.degree.discipline: Electrical Engineering
thesis.degree.grantor: Oklahoma State University

