Audio and Video Software Industry: 2023 in Review
Every day brings something new in the AV world, so we have prepared a review of 2023. We noticed an increase in AI usage, growing attention to the QUIC protocol, and AV1 integration in both software and hardware. As always, there were also WebRTC advancements. We have highlighted important conferences and exhibitions that took place throughout 2023, and collected a list of news articles - there may be some that you missed!
News
Top Stories
Twilio Programmable Video sunset
Improved video calling with faster AV1 encoding
Third time’s a charm: WebRTC Insights, 3 years in
How Leaders Can Balance Budget And Performance When Assessing Integrated Audiovisual Installations
AI Is Forever Changing How We Interact with Our Devices
Other Stories
Will WebRTC rock again in Firefox in 2024?
High Available WebRTC Media Servers on AWS
Cloud gaming, virtual desktops and WebRTC
Fitting WebRTC in the brave new world of webcams, security, surveillance and visual intelligence
The Dichotomy of AV Networks: Transition to IP versus Traditional Frameworks
WHIP & WHEP: Is WebRTC the future of live streaming?
For developers
WebRTC Server: What It Is and Why You Need One
Kubernetes: The next step for WebRTC
Example setup of a combined MPEG-DASH and WebRTC distribution
WebRTC conferences – to mix or to route audio
Product News
Chime
- Discover how Amazon Chime SDK has revolutionized telephony services by seamlessly incorporating voice bots - Add voice bots to your existing telephony services using Amazon Chime SDK [Dec 15, 2023]
- Explore the transformative potential of Amazon Chime SDK Call Analytics in enhancing call recording quality - Get higher quality call recordings using Amazon Chime SDK Call Analytics [Sept 14, 2023]
- Uncover the game-changing updates in Amazon Chime SDK for iOS and Android, revolutionizing mobile communication for developers and users alike - Amazon Chime SDK for iOS & Android [Sept 7, 2023]
- Discover how CloudWatch empowers monitoring Amazon Chime SDK voice connectors, optimizing performance and reliability in voice communication - Monitoring voice communications building a dashboard for Amazon Chime SDK Voice Connectors using CloudWatch [Aug 9, 2023]
8x8
- Explore the strategic advantages of adding video to contact center operations - Adopt Video in Your Contact Center [Nov 16, 2023]
- Discover the game-changing impact of Voice IVR on customer interactions - Evolve customer Interactions with the power of Voice IVR [Nov 14, 2023]
- Empowering Remote Communication: Exploring 8x8 Voice for Microsoft Teams Integration - 5 Must-Have Features in a 3rd-Party Telephone System for Microsoft Teams [July 28, 2023]
Discord
- Experimenting with End-to-End Encryption for Voice & Video. During this testing phase, developers aren't required to support end-to-end encryption. When a bot enters a voice channel, it will switch to standard protocols automatically, potentially causing a brief audio delay but ensuring a seamless user experience - Encryption for Voice and Video on Discord [Aug 11, 2023]
Dolby.io
- OBS Studio 30.0 integrates WHIP for ultra-low latency real-time streaming, enhancing security and performance in a user-friendly setup - OBS Studio Adds Native WebRTC Streaming with WHIP
- Emphasizing WebRTC's role in low-latency streaming, the spotlight is on a recent collaboration for real-time and rewind capabilities - Enabling Live DVR and Rewind with a WebRTC Real-Time Stream
- Integrate real-time graphics seamlessly into Dolby.io WebRTC streaming using Singular.live, offering client-side rendering for personalized overlays, language options, interactive elements, and targeted ads - Real-Time graphics to your WebRTC Stream with Singular Live
- Experience the global reach of Dolby.io's WebRTC, providing high-quality, low-latency live streams with seamless recording options for asynchronous viewing - Recording WebRTC Streams with Dolby.io
Microsoft
- AI in Microsoft Teams - Voice Isolation
- Spatial Audio in Teams Meetings
- Support Breakout Rooms on VDI
- Ultrasound Howling Detection
Netflix
- Detecting Scene Changes in Audiovisual Content
- Detecting Speech and Music in Audio Content
- All of Netflix HDR video streaming is now dynamically optimized
Meta
- Bringing HDR video to Reels
- Enhancing the Security of WhatsApp calls
- Meta's first ASIC for video transcoding
- Innovative AIs on Messenger and WhatsApp
RingCentral
- Introducing RingCentral's enhanced Direct Routing for Microsoft Teams Sep 21, 2023
- RingCentral reshapes its video collaboration future with the acquisition of Hopin Events and Session Platforms Aug 09, 2023
- Say hello to RingCentral's redesigned app May 05, 2023
- Introducing more intuitive and empowering user call handling Jul 20, 2023
TikTok
- TikTok Upgrades App Experience for Larger Devices Dec 18, 2023
- Create effects on the go with TikTok's mobile effect editor Nov 16, 2023
- Introducing more ways for creators to monetize their content authentically May 16, 2023
YouTube
- YouTube's approach to responsible AI innovation Nov 14, 2023
- Better audio control on mobile devices Oct 17, 2023
- 6 ways to level up Shorts with YouTube's newest creation tools Aug 01, 2023
Zoom
- Bring industry-leading video to your application with the Zoom Video SDK Oct 24, 2023
- Zoom Launches Video Conferencing Platform on Sony TVs June 21, 2023
- Connecting more rooms to more meetings with new partners like Google Meet Jan 06, 2023
- Meet Zoom Virtual Agent Jan 24, 2023
Webex
- Webex Integration Partners join the AI revolution at WebexOne in California Oct 27, 2023
- Introducing a new Webex App experience for Samsung Galaxy Foldables & Tablets Oct 11, 2023
- Driving Hybrid Work Forward with Audi Jun 1, 2023
- New innovations for the future-of-meetings Jun 6, 2023
100ms Live
- How to build WhatsApp like audio-video calling in Flutter using CallKit July 25, 2023
- Enhanced live classroom experience at scale with the WebRTC-HLS stack Jul 31, 2023
- How We Built Local Audio Streaming in Android Feb 5, 2023
Dialpad
Digital Samba
- Introducing the New Digital Samba Google Calendar Add-on Dec 5, 2023
- Digital Samba Introduces End-to-End Encryption for Private and Group Video Calls Aug 23, 2023
- Understanding and Preventing Packet Loss in WebRTC July 31, 2023
Google Meet
- Avoid video distraction in Meet April, 2023
- Use noise cancellation in Google Meet on Android March, 2023
Mux
- Track video playback progress and display a video heatmap with React Nov 10, 2023
- More pixels, fewer problems: Introducing 4K support for Mux Video Sep 13, 2023
- Faster video processing and cost controls with Mux's Upload SDKs for iOS and Android Aug 21, 2023
- Approaches to building video into your app Aug 1, 2023
RealTyme
- WhatsApp Video Call Encryption: How Secure is it Really? March 28, 2023
Telnyx
- Introducing noise suppression for Voice API and TeXML Nov 16, 2023
- Experience true HD Voice with Telnyx Oct 16, 2023
- Microsoft Teams adds Telnyx as a provider for Operator Connect Apr 12, 2023
TrueConf
- TrueConf 3.5 for iOS: Smart layouts and support for waiting rooms Sept 21, 2023
- TrueConf Room 4.2: TrueConf Room Service add-in and face tracking Sept 5, 2023
- TrueConf introduces a new solution to protect on-premises video collaboration systems from external threats June 2, 2023
Whereby
- How We Added Video Calls to a Unity-Powered VR Product in Just Five Hours Dec 11, 2023
- Revolutionizing Healthcare: The Power of Video Call Technology in Transforming Medical Services Oct 04, 2023
- P2P vs SFU Video Calls: Which is Best? June 14, 2023
Events
Demuxed, 24-25 Oct, USA
Demuxed is a conference and community founded in 2015 by video engineers, catering to the technical side of video technology. It offers a platform for engineers involved in encoding, delivery, playback, and related areas. This year we also participated - Alvaro Laserna Lopez, Director of R&D and Test Automation and Country Manager at TestDevLab, attended and discussed the challenge of finding reliable metrics for audio quality assessment, noting limitations in existing algorithms like POLQA and ViSQOL.
To address this, we have developed two new deep learning-based algorithms: SpeechQ for speech transcription and ASQ-ViT for audio classification using spectrogram transformers. The results showed a strong correlation with subjective scoring, demonstrating the effectiveness and simplicity of these algorithms in providing objective quality measurements for audio outputs.
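As a rough illustration of what "correlation with subjective scoring" means in practice, the sketch below computes the Pearson correlation between the outputs of an objective metric and mean opinion scores (MOS) from listeners. The score values and variable names are made up for the example; they are not results from SpeechQ or ASQ-ViT, and the actual evaluation may have used a different correlation measure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: one entry per audio sample.
objective_scores = [1.8, 2.4, 3.1, 3.9, 4.6]  # output of an objective metric
subjective_mos = [1.5, 2.6, 3.0, 4.1, 4.4]    # mean opinion scores from listeners

r = pearson(objective_scores, subjective_mos)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 means the objective metric ranks and spaces the samples much like human listeners do, which is the property a quality metric needs in order to stand in for subjective tests.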
Some topics included:
- Kyber, A New Approach for Real-Time Video and Controls Streaming Based on QUIC: Kyber is an open-source project that introduces a real-time video and controls streaming approach based on the QUIC protocol. It offers low-latency encoding and decoding and supports applications such as Remote Desktop, Cloud Gaming, and Drone/Robot control, with cross-platform compatibility on both the client and server side. It achieves 20-25 ms glass-to-glass latency using codecs like H.264, HEVC, VP9, and AV1, and leverages FFmpeg and VLC for implementation.
Video @Scale, 29-30 Nov
Video @Scale 2023 is a technical conference for engineers who develop or manage large-scale video systems. The @Scale community focuses on sharing people's experiences in creating innovative solutions. This year's conference featured speakers from Beamr, Eluvio, Google, HeyGen, Ittiam, Meta, and StreamShark discussing topics like managing media processing in real-time, personalized and intelligent content creation, and innovations driven by scaling challenges on both server and client sides. Some interesting presentations included:
- Virtual Video Files, Materialize Videos Out of Thin Air: Mike Starr introduced the concept of virtual video files, a cost-effective solution for managing diverse video content by enabling instant creation and manipulation, reducing physical storage costs. He explained the differences between Progressive and Fragmented MP4 formats, highlighting practical applications like serving Progressive MP4s to players without DASH support and distributed encoding for parallel transcoding. He also introduced a custom file protocol for FFmpeg to enhance data handling efficiency.
- Building a Performant and Reliable Real-Time Generative AI Video System: Rui Zhang discussed HeyGen - an AI-driven platform for instant video creation with many avatar and voice options in over 40 languages. The talk covered optimizing connection states, AI video generation processes, implementing a five-tier cache, enabling real-time audio and text streaming capabilities, and developing an SDK for easier integration. The platform is designed for marketing, sales, training, and onboarding initiatives.
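The difference between Progressive and Fragmented MP4 mentioned in the virtual video files talk can be seen in the file's top-level box layout: a fragmented file contains `moof` (movie fragment) boxes, while a progressive file keeps all sample metadata in a single `moov` box. The sketch below is a minimal illustration of that check, not code from the talk; the helper names (`top_level_boxes`, `is_fragmented`, `make_box`) and the synthetic byte strings are assumptions made for the example.

```python
import struct


def top_level_boxes(data: bytes):
    """Yield (type, size) for each top-level ISO BMFF box in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # box extends to the end of the data
            size = len(data) - offset
        if size < header:
            break  # malformed box; stop rather than loop forever
        yield box_type.decode("latin-1"), size
        offset += size


def is_fragmented(data: bytes) -> bool:
    """Treat the presence of a top-level 'moof' box as the sign of fragmentation."""
    return any(box_type == "moof" for box_type, _ in top_level_boxes(data))


def make_box(box_type: str, payload: bytes = b"") -> bytes:
    """Build a minimal box (hypothetical helper, used only for the demo data)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("ascii")) + payload


# Synthetic box layouts; these are not playable files.
progressive = make_box("ftyp", b"isom") + make_box("moov") + make_box("mdat", b"\x00" * 16)
fragmented = (
    make_box("ftyp", b"isom")
    + make_box("moov")
    + make_box("moof")
    + make_box("mdat", b"\x00" * 16)
)

print("progressive sample fragmented?", is_fragmented(progressive))
print("fragmented sample fragmented?", is_fragmented(fragmented))
```

To inspect a real file, the same function could be applied to bytes read from disk, although for large files it would be better to seek from box to box instead of loading everything into memory.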
Video Quality Experts Group (VQEG) Meetings
The Video Quality Experts Group (VQEG) holds face-to-face meetings approximately twice per year. These meetings are open to the public and hosted by VQEG members. They feature progress reports, presentations on state-of-the-art research, debates on best practices, and discussions among video quality experts regarding video quality research.
This year there was an in-person meeting hosted by Sony Interactive Entertainment (SIE) at their headquarters in San Mateo, CA, USA from June 26-30. There was also an online meeting hosted by the University of Konstanz from Dec 18-21. Some of the topics discussed included:
- Comparison of Conditions for Omnidirectional Video with Spatial Audio in Terms of Subjective Quality and Impacts on Objective Metrics Resolving Power: This discussion revolved around audiovisual quality metrics for video streaming services, particularly models like VMAF_ViSQOL and P.1203. The study evaluated the performance of these metrics, emphasizing resolving power and efficiency in assessing 360° videos with high-order ambisonics (HOA) audio across different setups. The conclusion suggested exploring the relationship between metrics' discriminability and resolving power, proposing new parameter-based models, and outlining future work analyzing additional data, including video-only scores and audio impacts across setups.
- QoE for Remote Control and Human Interaction in Industrial Tele-Operated Driving: Kjell Brunnström from RISE Research Institutes of Sweden presented RISE's mission to support small and medium enterprises. His focus was on Quality of Experience (QoE) for remote control scenarios, including measurements of perceived sound and eye movement. He also touched on human interaction in industrial tele-operated driving and perceptual aspects of real-time video streaming with drones.
Streaming Media Connect
There were three Streaming Media Connect events this year - Feb 14-17, Aug 22-24, and Nov 13-16. These conferences offered many valuable insights into streaming, covering live streaming, OTT, content delivery, and next-gen TV, including topics such as:
- The Future of Live Concert Streaming, a conversation with deadmau5: deadmau5 discusses the limitations and challenges of live streaming and where it is headed.
- Beyond ChatGPT, How AI Is Transforming Streaming Workflows and Businesses: The role of AI in transforming streaming workflows and business strategies is crucial in the rapidly evolving media landscape. The discussion explores AI's impact on the streaming and video production industry, covering marketing, content editing, and workflow optimization. Key points include AI's role in intelligent video, ethical concerns, customer-centric AI app development, and its transformative effect on software development.
- Quality and Quantity, Optimizing Live Streams at Scale: As streaming services continue to grow, optimizing live streams at scale becomes a key challenge. The discussion emphasizes the importance of choosing the right technology for high-quality and low-latency live streams, the evolution of streaming technology, and the transition of CDNs from stateful to stateless architecture.
Zoomtopia, 3-6 Oct
Zoom's award-winning event, Zoomtopia, unveils cutting-edge innovations and insights, showcasing the power of AI for enhanced collaboration and efficiency in the workplace.
This year new features to boost productivity in hybrid work settings were unveiled:
- Zoom Docs
- Zoom AI Companion
- Workvivo
- Productivity enhancements, for example a revamped Meetings tab and Zoom Notes
- Zoom Contact Center and Virtual Agent, including integrations with WhatsApp and Messenger
- Open ecosystem: Zoom App Marketplace with 2,500+ apps, third-party app support, and admin-authorized curated app lists for users
Zoomtopia Partner Connect, a component of the main Zoomtopia event, expanded beyond San Jose - hosting additional events in London (November 29, 2023), Singapore (December 5, 2023), and Tokyo (December 7, 2023). These events offered participants the opportunity to explore Zoom's transformative platform updates and strengthen partnerships for continued success. Attendees heard from executive leadership and engaged with industry peers, making it a valuable experience for networking and insights.
Understanding Latency, 11-13 Dec
Understanding Latency is a webinar series led by industry experts that explores the latest advancements in latency management, L4S, and low-latency networking, and offers insights into their impact on application performance.
International Conference on Signal Processing and Communication Systems (ICSPCS), 6-8 Sep, Poland
ICSPCS explores the evolution of communication systems through signal processing, covering AI, 5G, Green Communications, IoT protocols, medical and forensic applications, image, video and audio processing for multimedia.
International Workshop on Signal Processing and Machine Learning (WSPML), 22-24 Sep
WSPML seeks to create an open forum for researchers, engineers, scientists, and industrial delegates in the dynamic landscape of signal processing and machine learning. The conference facilitates the sharing of the latest insights and research findings through invited speeches, oral and poster sessions, and networking events, acknowledging the continuous evolution driven by technological innovations and scientific discoveries.
International Conference on Systems, Signals and Image Processing (IWSSIP), 27-29 Jun, North Macedonia
IWSSIP is a conference that unites researchers and developers from academia and industry to showcase the latest scientific advancements, discuss key issues, and present cutting-edge systems. The event, following successful editions in various global locations, invites papers on a range of topics for consideration.
Streaming Media East, 18-19 May, USA
Streaming Media East is a transformative event offering career and business insights for executives, managers, and technical professionals in the streaming media industry. With dedicated tracks, pre-conference workshops, and an Innovation Track, it provides comprehensive coverage of business strategies, trends, and technical advancements in streaming media.
IEEE Events
The Institute of Electrical and Electronics Engineers is a global association dedicated to advancing technology for humanity. With members spanning various technical disciplines, including electrical engineering, computer science, and electronics, IEEE publishes significant literature in these fields, develops industry benchmarks through standards, and hosts conferences, webinars, lectures, and forums to facilitate knowledge exchange and collaboration.
Conferences
- International Conference on Acoustics, Speech and Signal Processing (ICASSP) 4-10 Jun
- International Conference on Quality of Multimedia Experience (QoMEX) 20-22 Jun
- International Conference on Immersive and 3D Audio (I3DA) 5-7 Sep
- International Symposium on Image and Signal Processing and Analysis (ISPA) 18-19 Sep
- International Workshop on Signal Processing Advances in Wireless Communications (SPAWC) 25-28 Sep
- International Workshop on Multimedia Signal Processing (MMSP) 27-29 Sep
- International Conference on Image Processing (ICIP) 8-11 Oct
- Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 22-25 Oct
- Automatic Speech Recognition and Understanding Workshop (ASRU) 16-20 Dec
Exhibitions
Some exhibitions related to audio and video hardware and software products that took place during 2023 are:
- Integrated Systems Europe, Spain, 31 Jan - 3 Feb
- Music Inside Rimini, Italy, 2-4 Apr
- NAB Show, USA, 16-19 Apr
- Prolight + Sound, Germany, 25-28 Apr
- InfoComm Asia, Thailand, 22-26 May
- Prolight + Sound, China, 22-26 May
- Palm Expo, India, 25-27 May
- Augmented World Expo, USA, 31 May - 2 Jun
- InfoComm, USA, 10-16 Jun
- Augmented World Expo Asia, Singapore, 30-31 Aug
- Integrate, Australia, 30 Aug - 1 Sep
- PLASA Show, Great Britain, 3-5 Sep
- HKTDC Hong Kong Electronics Fair, China, 13-16 Oct
- InfoComm India, 25-27 Oct
- JTSE, France, 29-30 Nov
2024 is here, and with it come new conferences and exhibitions to attend. We are sure many more advancements will be made in the world of audio and video. Let's follow them closely as they unfold!