Robots still struggle to perform a wide spectrum of tasks effortlessly and smoothly, mainly because of current actuator technology: most robots rely on electric motors. Advances in artificial muscles and skin sensors that could cover the agent's entire body would be essential to fully imitate the human experience in the real world and eventually unlock the desired cognition [87].
3.5.2 Evolution
Another key component of cognition is the ability to grow and evolve over time [88, 90]. Evolving the agent's controller with an evolutionary algorithm is easy, but it is not enough. If we aim to have completely different agents, we should also give them the ability to evolve their embodiment and sensors. This again requires the artificial cell organisms mentioned above, which would encode different physical attributes and alter them slightly over time. Of course, we are far from making this a reality, but it is useful to know the ultimate step that will have to be taken one day.
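As a rough illustration of evolving the controller and the embodiment together, the following minimal sketch runs an elitist evolutionary loop over a genome that holds both controller parameters and a single body attribute. Everything in it is an assumption made for illustration: the `Genome` fields, the `limb_length` attribute, the target values, and the toy `fitness` function, which stands in for a real simulator rollout. It is not a method taken from the cited works.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Genome:
    """Hypothetical genome: controller gains plus one morphology attribute."""
    controller: tuple   # e.g. two gains of a simple controller (assumed)
    limb_length: float  # a single physical attribute of the body (assumed)


def fitness(genome):
    """Toy stand-in for a simulator rollout; higher is better.

    The optimum (controller = (0.5, -0.3), limb_length = 1.2) is arbitrary
    and chosen only so that the example has something to search for.
    """
    target_controller = (0.5, -0.3)
    error = sum((c - t) ** 2 for c, t in zip(genome.controller, target_controller))
    error += (genome.limb_length - 1.2) ** 2
    return -error


def mutate(genome, rng, sigma=0.1):
    """Perturb controller and body slightly with Gaussian noise."""
    controller = tuple(c + rng.gauss(0.0, sigma) for c in genome.controller)
    # Keep the limb physically meaningful (strictly positive length).
    limb_length = max(0.1, genome.limb_length + rng.gauss(0.0, sigma))
    return Genome(controller, limb_length)


def evolve(generations=200, population=20, elite=5, seed=0):
    """Elitist loop: keep the best `elite` genomes, refill with mutated copies."""
    rng = random.Random(seed)
    pop = [
        Genome((rng.uniform(-1, 1), rng.uniform(-1, 1)), rng.uniform(0.5, 2.0))
        for _ in range(population)
    ]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:elite]
        children = [mutate(rng.choice(parents), rng) for _ in range(population - elite)]
        pop = parents + children
    return max(pop, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(best, fitness(best))
```

Because the body attribute sits in the same genome as the controller, selection acts on both at once. In a real system the fitness call would be an expensive simulated or physical trial, and the morphology would have far more than one degree of freedom.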
3.6 Conclusion
Embodied AI is the field of study that takes us one step closer to true intelligence. It marks a shift from Internet AI toward embodied intelligence, which tries to exploit the multisensory abilities of agents, such as vision, hearing, and touch, together with language understanding and reinforcement learning, in an attempt to interact with the real world in a more sensible way. In this chapter, we gave a concise review of the field and its current advancements, subfields, and tools, in the hope that it will help accelerate future research in this area.
References
1 Park, J.H., Younas, M., Arabnia, H.R., and Chilamkurti, N. (2021). Emerging ICT applications and services‐big data, IoT, and cloud computing. International Journal of Communication Systems. https://onlinelibrary.wiley.com/doi/full/10.1002/dac.4668.
2 Amini, M.H., Imteaj, A., and Pardalos, P.M. (2020). Interdependent networks: a data science perspective. Patterns 1 100003. https://www.sciencedirect.com/science/article/pii/S2666389920300039.
3 Mohammadi, F.G. and Amini, M.H. (2019). Promises of meta‐learning for device‐free human sensing: learn to sense. Proceedings of the 1st ACM International Workshop on Device‐Free Human Sensing, pp. 44–47.
4 Amini, M.H., Mohammadi, J., and Kar, S. (2020). Promises of fully distributed optimization for IoT‐based smart city infrastructures. In: Optimization, Learning, and Control for Interdependent Complex Networks, M. Hadi Amini, 15–35. Springer.
5 Amini, M.H., Arasteh, H., and Siano, P. (2019). Sustainable smart cities through the lens of complex interdependent infrastructures: panorama and state‐of‐the‐art. In: Sustainable Interdependent Networks II, (M. Hadi Amini, Kianoosh G. Boroojeni, S. S. Iyengar et al.), 45–68. Springer.
6 Iskandaryan, D., Ramos, F., and Trilles, S. (2020). Air quality prediction in smart cities using machine learning technologies based on sensor data: a review. Applied Sciences 10 (7): 2401.
7 Batty, M., Axhausen, K.W., Giannotti, F. et al. (2012). Smart cities of the future. The European Physical Journal Special Topics 214 (1): 481–518.
8 Deng, J., Dong, W., Socher, R. et al. (2009). ImageNet: A large‐scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, pp. 248–255.
9 Lin, T.‐Y., Maire, M., Belongie, S. et al. (2014). Microsoft COCO: Common objects in context. European Conference on Computer Vision, Springer, pp. 740–755.
10 Xiao, J., Hays, J., Ehinger, K.A. et al. (2010). Sun database: large‐scale scene recognition from abbey to zoo. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, pp. 3485–3492.
11 Griffin, G., Holub, A., and Perona, P. (2007). Caltech‐256 object category dataset.
12 Zhou, B., Lapedriza, A., Xiao, J. et al. (2014). Learning deep features for scene recognition using places database. Advances in Neural Information Processing Systems 27: 487–495.
13 Rajpurkar, P., Zhang, J., Lopyrev, K., and Liang, P. (2016). SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250.
14 Wang, A., Singh, A., Michael, J. et al. (2018). GLUE: A multi‐task benchmark and analysis platform for natural language understanding. arXiv preprint arXiv:1804.07461.
15 Zellers, R., Bisk, Y., Schwartz, R., and Choi, Y. (2018). SWAG: A large‐scale adversarial dataset for grounded commonsense inference. arXiv preprint arXiv:1808.05326.
16 Krishna, R., Zhu, Y., Groth, O. et al. (2017). Visual genome: connecting language and vision using crowdsourced dense image annotations. International Journal of Computer Vision 123 (1): 32–73.
17 Antol, S., Agrawal, A., Lu, J. et al. (2015). VQA: Visual question answering. Proceedings of the IEEE International Conference on Computer Vision, pp. 2425–2433.
18 Shenavarmasouleh, F. and Arabnia, H.R. (2020). DRDr: Automatic masking of exudates and microaneurysms caused by diabetic retinopathy using mask R‐CNN and transfer learning. arXiv preprint arXiv:2007.02026.
19 Shenavarmasouleh, F., Mohammadi, F.G., Amini, M.H., and Arabnia, H.R. (2020). DRDr II: Detecting the severity level of diabetic retinopathy using mask RCNN and transfer learning. arXiv preprint arXiv:2011.14733.
20 Shenavarmasouleh, F. and Arabnia, H. (2019). Causes of misleading statistics and research results irreproducibility: a concise review. 2019 International Conference on Computational Science and Computational Intelligence (CSCI), pp. 465–470.
21 Held, R. and Hein, A. (1963). Movement‐produced stimulation in the development of visually guided behavior. Journal of Comparative and Physiological Psychology 56 (5): 872.
22 Moravec, H. (1984). Locomotion, vision and intelligence.
23 Hoffmann, M. and Pfeifer, R. (2012). The implications of embodiment for behavior and cognition: animal and robotic case studies. arXiv preprint arXiv:1202.0440.
24 Brooks, R.A. (1991). New approaches to robotics. Science 253 (5025): 1227–1232.
25 Collins, S.H., Wisse, M., and Ruina, A. (2001). A three‐dimensional passive‐dynamic walking robot with two legs and knees. The International Journal of Robotics Research 20 (7): 607–615.
26 Iida, F. and Pfeifer, R. (2004). Cheap rapid locomotion of a quadruped robot: self‐stabilization of bounding gait. In: Intelligent Autonomous Systems, vol. 8, 642–649. The Netherlands: IOS Press Amsterdam.
27 Yamamoto, T. and Kuniyoshi, Y. (2001). Harnessing the robot's body dynamics: a global dynamics approach. Proceedings 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems. Expanding the Societal Role of Robotics in the Next Millennium (Cat. No. 01CH37180), Volume 1, IEEE, pp. 518–525.
28 Bledt, G., Powell, M.J., Katz, B. et al. (2018). MIT cheetah 3: design and control of a robust, dynamic quadruped robot. 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, pp. 2245–2252.
29 Hermann, K.M., Hill, F., Green, S. et al. (2017). Grounded language learning in a simulated 3D world. arXiv preprint arXiv:1706.06551.
30 Tenney, I., Das, D., and Pavlick, E. (2019). Bert rediscovers the classical NLP pipeline.
31 Pan, Y., Yao, T., Li, H., and Mei, T. (2017). Video captioning with transferred semantic attributes. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6504–6512.
32 Amirian, S., Rasheed, K., Taha, T.R., and Arabnia, H.R. (2020). Automatic