MEETVERSE: A new way of Interaction on Online Meeting Platforms

Enhancing Interaction and Engagement on Online Meeting Platforms

by Kunal Chawla*, Yash Raj Gupta, Tushar Tyagi, Aditya Jangra

- Published in International Journal of Information Technology and Management, E-ISSN: 2249-4510

Volume 17, Issue No. 1, Feb 2022, Pages 56 - 63 (8)

Published by: Ignited Minds Journals


ABSTRACT

Online meeting platforms are widely used in today's era of Digital India, for online education, online dating, online business meetings, and more. Online meeting methods have developed considerably over the last decade. Present-day meeting applications handle almost everything, from sharing the screen and muting the microphone to disabling the camera and changing the background, yet meetings on them can still become boring. This article presents ways to make meeting applications more interesting: avatar formation, interaction through the avatar, and hand gesture controls to increase and decrease the volume of the meeting platform. Deep learning techniques are required to generate different avatars for different people. Machine learning and computer vision techniques, such as face recognition, are used to extract features from the face and apply them directly to the avatar. These methods and features are add-ons to existing meeting applications that make them more interactive.

KEYWORDS

online meeting platforms, interaction, avatar formation, hand gesture controls, deep learning techniques, machine learning, computer vision, face recognition, meeting applications, online education

1. INTRODUCTION

A few years ago, the creation of software and hardware was mainly limited to enhancing the user interface, which engaged most programmers from different companies and firms. This situation changed with the arrival of the Windows operating system, which led to a shift towards solving problems related to image processing. Avatar creation is the process of capturing the essence of an entity and using it as an avatar of the user's actual body. Through this, users are able to create an incarnation, an embodiment, or a physical manifestation of a more significant entity. This ability may let users interact with their own world in ways their actual forms cannot. Online meetings use a large number of object recognition and face recognition technologies. Image classification involves predicting the class of an image based on the object it contains. Applying facial features to the avatar is a separate task, needed to create a live, working avatar that can interact with others. MeetVerse combines the two tasks: it localizes as well as classifies one or more objects in a given image.

2. LITERATURE SURVEY

The first video conference was carried out by Bell Labs using its video phone in 1927. The word "video" was first adopted into the English language because of this historic event, which changed the perception of day-to-day calls. The history of video conferencing applications is a captivating example of technological advancement. In the modern world, video calls have become an integral part of our lives. From being used only for business calls to everyday interaction with family and friends, video calls have come a long way. To make these calls more exciting and fun, filters on videos have been used. The first motion picture special effect was created by Alfred Clark in 1895. Many video calling applications have already implemented filters on video calls. Our project, instead of using such filters as a mask over the video, uses an avatar to display the person's motions. The term "avatar" refers to the graphical representation of a user in either two dimensions (2-D) or three dimensions (3-D). It was first used by a computer game on the PLATO system in 1979. Not many applications have used avatars in video calls, but some that have are as follows. ChatGame is a highly interactive video calling and messaging application that allows all its users to make free video and phone calls, and also provides free texting services. It was released in

add and invite friends through ChatGame's Add Friends screen. ChatGame allows you to present yourself as nice and attractive. On September 15th, Loom.ai, whose avatar technology appears on eighty million Samsung devices, launched LoomieLive Pro, a desktop app that brings cartoon avatars to Zoom, Meet, Teams, and other video conferencing platforms. We have been using it to surprise students and amuse colleagues in situations where some levity might brighten a Zoom call. This fall, you may use it for free. "LoomieLive Pro uses voice, video, and AR to bring the avatar to life," explained Bhat, "lip-syncing the avatar is most critical, but what makes the simulation work is the gestures, things like laughter, and subtleties like blinking empathetically while listening." LoomieLive allows users to combine their avatar with an immersive 3D environment, or to use the virtual backgrounds featured on Zoom, which are found in the camera drop-down at the bottom of the screen. We have also been using Snap Camera to brighten our Zoom calls. Introduced in the fall of 2018, Snap Camera is a free application designed to let desktop users throw up rainbows and enjoy other signature Snap Camera effects.

Figure 1

Zoom itself has now implemented avatars on video calls. It uses a virtual animal that mirrors your head movements and facial expressions. When you turn on the Avatar feature during a meeting, Zoom's technology uses your device camera to detect where your face is on the screen and applies the selected avatar effect. Images of your face do not leave the device when using this feature; they are not stored or sent to Zoom. The feature can tell what is or is not a face, but it does not recognize or distinguish between individual faces. Similar to Apple's Memoji, the Avatars in Zoom act as a filter and replace the user's head with an animated character, adding a dash of fun to otherwise mundane video calls and making them far more interactive. Currently, there are about 22 Avatars to choose from, but with the BETA tag on the section, one can expect Zoom to add more in the future.

3. PROPOSED WORK

  • Human Detection in Video

Human detection and tracking consists of five parts: foreground detection, blob detection, human recognition, human tracking, and false object detection. In our system, the background subtraction approach is used for foreground detection. After background subtraction is complete, shadow detection is applied, and then morphological operations filter out camera noise and irregular detections; the result is the foreground mask image. Blobs are then segmented from the mask. Several blobs appear only because of noise, so blob segmentation is finalized only after blob merging has assembled each whole object. Human classification can be done in two ways: by using a codebook to determine whether the blob is a human or not, or by tracking the blob. If the blob is tracked consistently, it is taken to be a human. This approach is mainly used for blob and human tracking. False object detection is used to reduce false alarms and to refine the background model. The architecture is described in Fig. 1.
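
The foreground and blob stages described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the fixed difference threshold, and the minimum blob area are assumptions chosen for illustration, frames are small grayscale grids rather than camera images, and a simple area filter stands in for the shadow detection and morphological clean-up steps.

```python
from collections import deque


def foreground_mask(frame, background, threshold=30):
    """Background subtraction: mark pixels that differ strongly from the background."""
    return [
        [1 if abs(pixel - bg) > threshold else 0 for pixel, bg in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def find_blobs(mask, min_area=2):
    """Group 4-connected foreground pixels into blobs and drop blobs below min_area.

    The area filter is a crude stand-in (assumption) for morphological noise removal.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill collects one connected component.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                # bbox is (top, left, bottom, right), inclusive.
                blobs.append({"area": len(pixels), "bbox": (min(ys), min(xs), max(ys), max(xs))})
    return blobs


# Toy data: a flat background, one 2x2 bright object, and one isolated noisy pixel.
background = [[10] * 5 for _ in range(5)]
frame = [row[:] for row in background]
for y in (1, 2):
    for x in (1, 2):
        frame[y][x] = 200
frame[4][4] = 90  # single-pixel camera noise

blobs = find_blobs(foreground_mask(frame, background))
```

On this toy input, the isolated pixel passes the difference threshold but is rejected by the area filter, so only the 2x2 object survives as a blob. A later stage (codebook classification or tracking, as described above) would decide whether a surviving blob is a human.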

  • Human-computer Interface

With the continued development of Industry 4.0 and the growing demand for digitalization, developing safe human-computer interfaces is of the utmost importance. For this purpose, we present a new solution that entirely avoids physical contact between the human and the machine display. We created an image processing algorithm that works on a live video stream of the remote interaction between the human and the machine, acting much like an intelligent agent. The video is taken from an off-the-shelf camera, and object detection is used to track a marker that the user holds in one hand and waves from a distance to interact with the interface and the machine simultaneously. This has been evaluated successfully in real time.
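
As an illustration of contact-free control, the hand gesture volume control mentioned in the abstract could be sketched as follows. This is a hypothetical example, not the algorithm evaluated above: it assumes that a hand or marker tracker already supplies two tracked points in normalized image coordinates (0 to 1), for instance the thumb tip and index fingertip, and the function name and distance limits are illustrative choices.

```python
import math


def pinch_to_volume(thumb_tip, index_tip, min_dist=0.03, max_dist=0.30):
    """Map the distance between two tracked points to a volume level from 0 to 100.

    thumb_tip and index_tip are (x, y) pairs in normalized image coordinates.
    min_dist and max_dist are assumed calibration limits: distances at or below
    min_dist give volume 0, distances at or above max_dist give volume 100.
    """
    distance = math.hypot(index_tip[0] - thumb_tip[0], index_tip[1] - thumb_tip[1])
    ratio = (distance - min_dist) / (max_dist - min_dist)
    ratio = max(0.0, min(1.0, ratio))  # clamp so noisy tracking cannot overshoot
    return round(ratio * 100)


# Fingers pinched together, then spread wide apart.
quiet = pinch_to_volume((0.50, 0.50), (0.51, 0.50))
loud = pinch_to_volume((0.20, 0.50), (0.80, 0.50))
```

Clamping keeps the output within the valid range when the tracker reports an unusually large or small distance. In practice the limits would likely need calibrating to the user's distance from the camera.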

  • 3D character

3D modeling is increasingly a part of everyday life. It is used in most TV shows, movies, and animation. Maya and Blender are two of the leading 3D modeling programs. Maya was used in movies such as The Chronicles of Narnia, Harry Potter, and Transformers, as well as in the television show South Park. Blender was used for major movies such as Captain America and Wonder Woman. Blender's popularity is increasing day by day: it has reached about 500,000 downloads a month and approximately 6.5 million a year. This software has made it possible to bring characters into live-action movies. Video games generated about $30.4 billion in revenue in 2016 alone. This leads educators to ask what the aim of these programs is. For college teachers and students, Blender and Maya are very useful for learning 3D modeling and animation. A question then arises: which of the two programs is the handier and more helpful for college faculty and students? Both programs have their own unique sets of tools that help the user create impressive animations easily; the differences lie in their layouts and in the modes each provides. User experience and user interface are two vital components of 3D modeling and animation software. A robust user experience leads to good end results. The difference between the two is that user experience is the customer's overall experience with the software, while the user interface is the system through which the user interacts with the software and the machine. Maya strongly emphasizes the user interface and places less emphasis on experience. Blender maintains an equal balance between user experience and interface.

  • 3D character motion

Motion animation has become one of the most critical aspects of games, simulations, and advertisements, not only in virtual environments (Kolivand, 2012) but also in augmented reality systems (Kolivand, 2013a). Dynamic 3D character motions are the main effect of Generally, motions created by animators are mostly static. They are created to achieve one particular type of movement. As a result, the character keeps repeating the same type of motion in real-time animation. It therefore looks unrealistic, and the character cannot respond to physical interactions. The real-time animation of characters requires a combination of different motions and sources, such as motion capture, dynamic simulation, and manual keyframes (Hu, 2010). The development involves a character skeleton of joints and bones that simulates movement through a virtual environment. Using 3D software, animators design the character and optimize the models. After that, the skeleton is set up for character rigging. The primary process is to edit the input data for mapping onto the character before producing the movement animation. Character movement in movies is far better than in computer games. To achieve realistic-looking character animation, the movement should be adjusted to resemble the movements humans make in real life. Analysis of such movements is applied to these characters to give them feel and depth. One of the biggest challenges in the interactive computer games industry is producing dynamic character movements and reactions to physical interaction. Dynamic motion depends on the properties of three-dimensional objects, including mass, which determine how internal and external forces act on the object (Oshita, 2006). Dynamic input such as walking, running, or jumping makes the character more true to the real world.
For real-time characters, animators use solid sections connected by joints; together these sections and joints make up the skeleton. The motion of the skeleton is specified in terms of translation and rotation. Characters also have a kinematic description, namely position, velocity, and orientation. In the forward kinematics approach, geometric transformations are controlled directly. Inverse kinematics, by contrast, solves for the joint configuration that brings a part of the skeleton to a given target position. Generally, a dynamic motion control structure must have two core parts: a controller and a simulator. We

process is complete, the simulators update the character motion stage. The controller and the simulator are based on a human body model and external physical input. Different approaches have been developed for the study and implementation of character motion control. For example, a muscle strength model is used with the inverse kinematics method and helps in calculating the motion trajectory and speed using the joints of the skeleton. A recent survey of dynamic motion generation and control described a new simulation-based process for virtual character motion control. This structure combines active control torque (Kenwright, 2011) with other external physical interactions, and the physical simulator generates the output motions. It is important for the user to have basic knowledge of how to control the movement of a dynamically simulated character. The movements and the external actions can be categorized into maximal coordinate position and optimized parent-child joint trajectory. Research has mainly focused on creating short motion sequences, which makes it nearly impossible for the character to correctly replicate complicated motion patterns (Oshita, 2002) for game development and humanoid robots.
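
The distinction between forward and inverse kinematics drawn above can be made concrete with a small sketch for a planar chain of bones. This is a textbook illustration under simplifying assumptions (two dimensions, a chain rooted at the origin, joint angles in radians measured relative to the parent bone), not the motion control structure discussed in the text; the function names are illustrative.

```python
import math


def forward_kinematics(bone_lengths, joint_angles):
    """Return the position of every joint, given bone lengths and relative joint angles."""
    x, y, heading = 0.0, 0.0, 0.0
    positions = [(x, y)]
    for length, angle in zip(bone_lengths, joint_angles):
        heading += angle  # each joint rotates relative to its parent bone
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions


def two_bone_inverse_kinematics(target, upper_length, lower_length):
    """Return one (root_angle, middle_angle) solution that places the chain tip on target.

    Uses the law of cosines. If the target is out of reach, the cosine is clamped,
    so the chain stretches or folds as far as it can towards the target.
    """
    tx, ty = target
    cos_middle = (tx * tx + ty * ty - upper_length**2 - lower_length**2) / (
        2 * upper_length * lower_length
    )
    cos_middle = max(-1.0, min(1.0, cos_middle))
    middle_angle = math.acos(cos_middle)
    root_angle = math.atan2(ty, tx) - math.atan2(
        lower_length * math.sin(middle_angle),
        upper_length + lower_length * math.cos(middle_angle),
    )
    return root_angle, middle_angle


# Solve for the joint angles that reach (1, 1), then verify with forward kinematics.
angles = two_bone_inverse_kinematics((1.0, 1.0), 1.0, 1.0)
tip = forward_kinematics([1.0, 1.0], angles)[-1]
```

Forward kinematics maps joint angles to positions directly, while the inverse solver works backwards from a desired tip position to angles. Feeding the solver's output back into `forward_kinematics` is a simple way to check that the two agree.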

  • Control character

Using this technology, we can capture the three-dimensional motion of the body and simultaneously mimic it in real time on virtual skeleton characters. A question then arises: since the shapes and skeletons of insects and animals differ from human anatomy, how can those movements be mimicked? We control such characters without the rigging and skinning pipeline. We simply use well-defined pose correspondences that establish the mapping between arbitrary three-dimensional source and target mesh sequences. Our method offers an extremely fast, flexible, and intuitive interface for motion mapping that directly provides ways to control characters for real-time animation.

  • Meeting platform concept of cloud

During the global pandemic, faculty, staff, and students at colleges and universities experienced an increase in meetings held on video call platforms. They faced obstacles such as poor internet connectivity and distractions in the online environment caused by the transition from face-to-face meetings to video calls, which affected general stress levels and well-being during the pandemic. Research on the topic found that web-based video meeting platforms lead to frustration, less sleep, and higher stress levels. One study was conducted to examine the relationship between well-being and the frequency and duration of meetings. The study included 164 men, women, and non-binary participants over the age of 18 who worked as faculty or staff at colleges and universities in the US during the pandemic in 2020. They were recruited through social media campaigns or emails that gave them a link to the survey tools, which included demographic questions, questions about web-based meetings (comfort, stress, frequency, and length), and scales measuring subjective well-being and sleep quality. The study did not find a relationship between the frequency of meetings and a person's well-being. However, relationships were found between meeting duration and well-being (p=0.003) and between comfort with web-based video meeting platforms and well-being (p=0.030). The suggestions given were that meetings should be shorter than two hours and that attendees should be made proficient with the meeting platforms, to support the overall well-being of everyone.

4. PROJECT ANALYSIS

The meeting platform requires an application, either a mobile application or a web application. We have chosen to build a web application, which does not require any installation and runs in the browser itself.

The web application can be further divided into two parts: The client side and the server side.

Figure 2

Figure 3

(a) Client Side

In a simple client-server architecture, there are two parts: the client side and the server side. The client side is the view presented to the user. In simple words, whatever a user sees on the screen after opening an application is the client side. This part is also called the frontend; it provides the basic structure and appearance of the application so that it is displayed properly and can be used easily. In this project, the client side consists of the frontend of our meeting platform, which generates a user-friendly view. Many technologies are available for frontend development, of which HTML, CSS, and JavaScript are the core. HTML provides the basic structure of the application, CSS is used to style the page, and JavaScript provides the functionality of the web application. Many libraries are also available for development nowadays, and JavaScript libraries and frameworks such as React, Angular, and Vue are widely used. In this project, we have built our frontend using React.js. React.js is a JavaScript library mainly used for developing single-page web applications. In a single-page web application, navigating to another page does not reload the whole screen; instead, the application navigates directly to that page or section. Some of the most popular tech giants use this technology, such as YouTube, Facebook, and Instagram.

Figure 4

Figure 5

(b) Server Side

In the client-server architecture, the second part is the server side, which is the backbone of any application: it provides the application's functions and makes the application usable. This part is also called the backend, and it gives the application its meaning. If we are creating an application for registration, then the frontend is the registration form, and the backend starts working once the user has entered the details and clicked the submit button. It collects all the input data and stores it in the database. So, the backend is the part that a user cannot see, but an application can perform as designed only if it is there. In this project, the server side consists of our meeting platform's real-time database and the backend. Many database services and backend technologies are available in the market. For databases, there are MongoDB, MySQL, Firebase, etc. For backend technologies, there are PHP, Node.js, Django, etc. We have used React, a JavaScript library, for our frontend, and we have used Node.js as our backend framework because it is

Figure 6

(c) Meeting Service

The essential service MeetVerse provides is a meeting platform where people can create meetings and share the link with others so that they can also attend. The main functions of a meeting platform are sound sharing, video sharing, and screen sharing. These will be implemented on our platform with the help of WebRTC. WebRTC is an HTML5 specification that can be used to add real-time media communication directly between browsers and client-side devices.
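
Before two browsers can exchange media over WebRTC, they need some application-provided channel through which to pass session descriptions (an offer and an answer) and connection candidates; WebRTC itself leaves this signaling step to the application. The sketch below illustrates the idea of creating a meeting, sharing its link, and relaying signaling messages between participants. It is an in-memory stand-in written for illustration only: the class name, method names, and the `https://meet.example/` base URL are hypothetical, and a real deployment would replace the dictionary with a server or a real-time database.

```python
import secrets


class SignalingHub:
    """In-memory relay for signaling messages between participants of a meeting."""

    def __init__(self, base_url="https://meet.example/"):
        self.base_url = base_url  # hypothetical address used to build shareable links
        self.rooms = {}  # room id -> {peer id: list of pending messages}

    def create_meeting(self):
        """Create an empty room and return the link to share with other people."""
        room_id = secrets.token_urlsafe(6)
        self.rooms[room_id] = {}
        return self.base_url + room_id

    def join(self, link, peer_id):
        """Register a participant in the room named by the link and return the room id."""
        room_id = link.rsplit("/", 1)[-1]
        if room_id not in self.rooms:
            raise KeyError("unknown meeting link: " + link)
        self.rooms[room_id][peer_id] = []
        return room_id

    def send(self, room_id, sender, recipient, kind, payload):
        """Queue a signaling message (for example an offer or an answer) for a participant."""
        self.rooms[room_id][recipient].append(
            {"from": sender, "kind": kind, "payload": payload}
        )

    def receive(self, room_id, peer_id):
        """Return and clear all pending messages for a participant."""
        inbox = self.rooms[room_id][peer_id]
        messages = list(inbox)
        inbox.clear()
        return messages


hub = SignalingHub()
link = hub.create_meeting()              # the host creates a meeting and shares the link
room = hub.join(link, "host")
hub.join(link, "guest")                  # a guest opens the shared link
hub.send(room, "host", "guest", "offer", "<session description from host>")
pending = hub.receive(room, "guest")     # the guest picks up the offer and would reply with an answer
```

The payloads here are placeholder strings. In a browser they would be the session descriptions and candidates produced by the WebRTC API, after which audio, video, and screen sharing flow directly between the participants.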

Figure 7

Fig. 7 shows the growth of WebRTC during the pandemic. WebRTC also comes with a JavaScript API layer on top that can be used inside the browser. This makes it convenient to develop and integrate real-time communications anywhere. Internally, WebRTC is still primarily implemented in C/C++, but most developers who use WebRTC will not need to dig into these layers in order to develop their applications.

Browser and Operating System support for WebRTC

Figure 8

5. FUTURE SCOPE

As the world moves towards virtual reality, MeetVerse can be applied in virtual reality simulations such as the Metaverse. As the education sector has shifted to a blended mode, MeetVerse can be used for online classes and fun activities using the avatar simulation. Unlike other web meeting applications, MeetVerse lets you hold business meetings for unlimited time without interruption from ads or anything else. MeetVerse is an excellent way to interact with your loved ones, using a cool and exceptional-looking avatar to attract others. These avatars will act as a bridge between people to create excellent memories and happy moments.

6. CONCLUSION

Online meetings can sometimes be boring because there is little we can do while attending a meeting on a Zoom call or Google Meet. MeetVerse is a meeting platform with intelligent features based on AI and ML. It will have features such as gesture control and background change, as well as our avatar simulation. MeetVerse provides user-friendly and engaging meetings using new web technologies. It can be applied in significant fields such as education, which has shifted to a blended mode: MeetVerse can be used for online classes and fun activities using the avatar simulation. MeetVerse is an excellent way to interact with your loved ones, using a cool and exceptional-looking avatar to attract others. These avatars will act as a bridge between people to create excellent memories and happy moments. Unlike other web meeting applications, MeetVerse lets you hold business meetings for unlimited time without interruption from ads or anything else. A PC or laptop with an i3 processor or above, a webcam, and a microphone is needed in order to build or use MeetVerse. Various software, languages, and frameworks will be used in the making of MeetVerse. Software: Blender, VS Code; Languages: JavaScript; Frameworks: Node.js, Express.js, React.js, p5.js; Libraries: MediaPipe, OpenCV, WebRTC; Database: Firebase; Modules: ReactDOM, reportWebVitals, firepadRef, fontAwesomeIcons, Card.


Corresponding Author Kunal Chawla*

Research Scholar, Bachelor's in Engineering in Computer Science with Specialization in Artificial Intelligence & Machine Learning, Apex Institute of Technology, Chandigarh University, Gharuan, Punjab, 140413