2. Garments in shades of red showed the highest average confidence score at every confidence threshold, indicating that customers interacted most with a red garment whenever one was detected as an active garment.
3. Customers found garments in shades of orange the least interesting and spent the least time on them. This minimal interaction is also reflected in these garments having the lowest average confidence scores.
The main advantage of the proposed approach is that it can indicate, from a surveillance video, which customer is interested in which garment. Its main limitation is that it struggles to identify garments of interest correctly in highly crowded scenes, either because the garments are partially or completely occluded or because a person's wrists cannot be detected when a garment obstructs them.
3.5 Highlights
In this section, we highlight the major findings of this study. We utilize the GMG background subtraction algorithm to extract, from the video of a surveillance camera, the foreground information that captures non-stationary garments. Using this foreground information, we detect the garment regions, which are then clustered to obtain the entire garment. We leverage the Mask R-CNN framework to detect customers through its detections of the “person” class. Subsequently, we obtain the wrist feature points of these customers using the OpenPose human pose estimation framework. Using these pieces of information, we propose a quantitative metric, known as the confidence score, which indicates the degree to which a customer is interested in a given garment. To calculate the confidence score for a given customer and garment pair, we take the Euclidean distance between the centroid of the garment and the coordinates of the customer's wrist feature points, and combine it with the area of the garment. We analyze how the average confidence score of the garments of interest varies with the confidence threshold to generate meaningful evaluations.
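The confidence score idea summarized above can be illustrated with a minimal sketch. The exact formula is not restated in this section, so the function below is only an assumption: it normalizes the smallest wrist-to-centroid distance by the square root of the garment area, so that a score near 1 means a wrist is very close to the garment relative to its size. The names `confidence_score`, `garments_of_interest`, and `average_score` are hypothetical and introduced only for illustration.

```python
import math


def confidence_score(centroid, wrists, area):
    """Hypothetical confidence score in (0, 1] for one customer-garment pair.

    ASSUMPTION: the distance is normalized by sqrt(area); the formula used
    in the chapter may combine distance and area differently.
    """
    if area <= 0 or not wrists:
        return 0.0
    # Euclidean distance from the garment centroid to the nearest wrist point.
    d_min = min(math.dist(centroid, w) for w in wrists)
    return 1.0 / (1.0 + d_min / math.sqrt(area))


def garments_of_interest(scores, threshold):
    """Keep only the active garments whose score reaches the threshold."""
    return {g: s for g, s in scores.items() if s >= threshold}


def average_score(scores):
    """Average confidence score over a collection of garments."""
    return sum(scores.values()) / len(scores) if scores else 0.0


# Illustrative values only (not taken from the dataset).
scores = {
    "red": confidence_score((0, 0), [(3, 4), (40, 40)], 25),
    "orange": confidence_score((0, 0), [(60, 80)], 100),
}
selected = garments_of_interest(scores, threshold=0.4)
```

Sweeping `threshold` over a range of values and calling `average_score` on each filtered collection gives the kind of threshold-versus-average-score analysis described above, under the stated assumption about the formula.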
3.6 Conclusion and Future Works
In this chapter, a robust framework for the detection of garments of interest is proposed. By using a suitable background subtraction algorithm in conjunction with a person detection framework, the foreground information comprising the garments is obtained. Individual color masks and morphological operations are applied to obtain garment regions, which could contain multiple detected contours within the same garment. A garment linking process is utilized to link contours belonging to the same garment, thereby obtaining the active garments. The active garments that the customers find interesting, referred to as garments of interest, are obtained by utilizing a confidence score metric. This confidence score is calculated from the Euclidean distance between a customer’s wrist landmarks and an active garment, together with the area of the active garment in the foreground of the video frames during sales interactions.
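As a rough illustration of the garment linking step, the sketch below groups contours that share a color mask and whose bounding boxes lie within a small pixel gap of each other. Both the proximity criterion and the `gap` parameter are assumptions made for this example, as are the names `boxes_close` and `link_contours`; the linking rule used in the chapter may differ.

```python
def boxes_close(a, b, gap):
    """True if two axis-aligned boxes (x, y, w, h) are within `gap` pixels."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Horizontal and vertical separation; 0 when the boxes overlap on that axis.
    dx = max(bx - (ax + aw), ax - (bx + bw), 0)
    dy = max(by - (ay + ah), ay - (by + bh), 0)
    return max(dx, dy) <= gap


def link_contours(contours, gap=10):
    """Group contour indices that plausibly belong to the same garment.

    contours: list of (color_label, (x, y, w, h)) tuples.
    ASSUMPTION: two contours are linked when they come from the same color
    mask and their bounding boxes are within `gap` pixels of each other.
    """
    parent = list(range(len(contours)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(contours)):
        for j in range(i + 1, len(contours)):
            (color_i, box_i), (color_j, box_j) = contours[i], contours[j]
            if color_i == color_j and boxes_close(box_i, box_j, gap):
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(contours)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Illustrative contours: two nearby red fragments, one orange, one distant red.
contours = [
    ("red", (0, 0, 10, 10)),
    ("red", (15, 0, 10, 10)),
    ("orange", (16, 0, 5, 5)),
    ("red", (100, 100, 10, 10)),
]
linked = link_contours(contours, gap=10)
```

Each resulting group stands for one candidate active garment; a union-find structure is used so that fragments linked through an intermediate contour end up in the same group.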
The framework was tested on a surveillance video dataset obtained from CCTV footage of an Indian garment store. It was found to be effective, as demonstrated by the high precision and recall values for the detection of active garments and by the ability of the confidence threshold to filter garments of interest from the collection of active garments. Furthermore, the framework successfully tracked the duration for which a customer was interested in a specific garment of interest.
We believe that a natural extension in the future would be to use the posture of the head, together with line-of-sight information, to improve the determination of the garments of interest for a given customer. Furthermore, visual customer demographics can be determined from the person masks obtained, so that the garments of interest can be filtered by customer group, enabling market segmentation. In addition to these improvements, a mapping between a given customer and the sales merchant can be established in order to determine garments of interest that are not always adjacent to the wrists of the customer under consideration.
Acknowledgements
We would like to express our gratitude to Aniruddha Joshi, Goutham Kanahasabai, and Keerthi Priyanka for giving us consent to extend their work, and to thank Dr. Earnest Paul Ijjina (Assistant Professor in the Department of Computer Science and Engineering, National Institute of Technology, Warangal) for his guidance while undertaking this research. Lastly, we thank our families for their constant moral support and encouragement.
References
1. McDonald, M. and Dunbar, I., Market segmentation: how to do it, how to profit from it, Wiley, New Jersey, 2012.
2. Chang, C.-C. and Wang, L.-L., Color texture segmentation for clothing in a computer-aided fashion design system. Image Vision Comput., 14, 9, 685–702, 1996.
3. Ridzuan, S., Omar, Z., Sheikh, U., A review of content-based video retrieval techniques for person identification. ELEKTRIKA - J. Electr. Eng., 18, 49–56, 2019.
4. Dong, T. and Li, J., Software design of cloth design and simulation system, in: Proc. of IEEE International Conference on Computer-Aided Industrial Design Conceptual Design, pp. 605–609, 2009.
5. Cychnerski, J., Brzeski, A., Boguszewski, A., Marmolowski, M., Trojanowicz, M., Clothes detection and classification using convolutional neural networks, in: Proc. of IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), pp. 1–8, 2017.
6. Yang, M. and Yu, K., Real-time clothing recognition in surveillance videos, in: Proc. of IEEE International Conference on Image Processing, pp. 2937–2940, 2011.
7. Huang, C., Chen, J., Pan, Y., Lai, H., Yin, J., Huang, Q., Clothing landmark detection using deep networks with prior of key point associations. IEEE Trans. Cybern., 49, 10, 3744–3754, 2019.
8. Sidnev, A., Trushkov, A., Kazakov, M., Korolev, I., Sorokin, V., DeepMark: One-shot clothing detection, in: Proc. of IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3201–3204, 2019.
9. He, K., Gkioxari, G., Dollár, P., Girshick, R., Mask R-CNN, in: Proc. of IEEE International Conference on Computer Vision (ICCV), pp. 2980–2988, 2017.
10. Cao, Z., Hidalgo Martinez, G., Simon, T., Wei, S., Sheikh, Y.A., OpenPose: Realtime multi-person 2D pose estimation using part affinity fields. IEEE Trans. Pattern Anal. Mach. Intell., 43, 1, 172–186, 2019.
11. Bu, Q., Zeng, K., Wang, R., Feng, J., Multi-depth dilated network for fashion landmark detection with batch-level online hard keypoint mining. Image Vision Comput., 99, 103930, 2020.
12. Yu, W., Liang, X., Gong, K., Jiang, C., Xiao, N., Lin, L., Layout-graph reasoning for fashion landmark detection, in: Proc. of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2932–2940, 2019.
13. Ge, Y., Zhang, R., Wang, X., Tang, X., Luo, P., DeepFashion2: A versatile benchmark for detection, pose estimation, segmentation and re-identification of clothing images, in: Proc. of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5332–5340, 2019.
14. Hara, K., Jagadeesh, V., Piramuthu, R., Fashion apparel detection: The role of deep convolutional neural network and pose-dependent priors, in: Proc. of IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1–9, 2016.
15. Kita, Y., Ueshiba, T., Neo, E.S., Kita, N., Clothes state recognition using 3D observed data, in: Proc. of International Conference on Robotics and Automation, pp. 1220–1225, 2009.
16. Sutoyo, R., Prayoga, B., Fifilia, D., Suryani, Shodiq, M., The implementation of hand detection and recognition to help presentation processes. Proc. Comput. Sci., 59, 550–558, 2015, International Conference