Methods developed for ordinary 2D text detection do not work well for text rendered with decorative or 3D effects. This paper proposes a new method for classifying natural scene text images as 2D or 3D, so that an appropriate recognition method can be chosen from the classification result for better performance. The proposed method explores local gradient differences to obtain candidate pixels, which represent a stroke. To study the spatial distribution of candidate pixels, we propose a measure, called COLD, which is dense for pixels toward the center of strokes and scattered for non-stroke pixels. This observation leads us to introduce mass features for extracting the regular spatial pattern of COLD, which indicates a 2D text image. The extracted features are fed into a Neural Network (NN) for classification. The proposed method is tested on (i) a new dataset introduced in this work, (ii) a second dataset assembled from standard natural scene datasets, and (iii) a non-text image dataset that contains objects rather than text. Experimental results on text and non-text images show that the proposed method is independent of text. The proposed approach improves text detection and recognition performance significantly after classification.
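As an illustration of the candidate-pixel step only, the following is a minimal sketch under stated assumptions, not the method as defined in the paper. It assumes that the "local gradient difference" at a pixel can be approximated by the difference between the maximum and minimum intensity in a small window, and that candidate pixels are those whose difference exceeds the image-wide mean. The function names, the window radius, and the mean threshold are all hypothetical choices made for this example; COLD, the mass features, and the NN classifier are not sketched here.

```python
def local_gradient_difference(image, radius=1):
    """Return a map of (max - min) intensity over a square window per pixel.

    `image` is a grayscale image given as a list of equal-length rows.
    The max-min window is an assumed stand-in for the local gradient
    difference; the exact definition is not given in the abstract.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the window around (r, c), clipped at the image borders.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = max(window) - min(window)
    return out


def candidate_pixels(image, threshold=None, radius=1):
    """Return (row, col) positions whose local difference exceeds a threshold.

    If no threshold is given, the mean of the difference map is used
    (a hypothetical choice for this sketch).
    """
    diff = local_gradient_difference(image, radius)
    if threshold is None:
        flat = [v for row in diff for v in row]
        threshold = sum(flat) / len(flat)
    return [
        (r, c)
        for r, row in enumerate(diff)
        for c, v in enumerate(row)
        if v > threshold
    ]


# Usage: a 5x5 image with a one-pixel-wide bright vertical stroke.
stroke_image = [[255 if c == 2 else 0 for c in range(5)] for _ in range(5)]
stroke_candidates = candidate_pixels(stroke_image)
```

On this toy input the candidates cluster in a band around the stroke, while a uniform image yields none. The spatial distribution of such candidates is what a measure like COLD would then characterize.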