ELI5: What is WebRTC?
4 min read

ELI5: What is WebRTC?

Beginner's introduction to WebRTC

What is WebRTC?

Have you ever called anyone before? Like calling others on the phone or using facetime to directly talk with people that are thousands of miles away.

Whenever you call someone, there needs to be some way to move your voice and video to the person you're talking to. Your phone essentially takes a picture of your face every second, and gives it to the person, who can watch the pictures and talk with you!
Pretty interesting, right? But how do you think that works? How do we send those pictures and recordings to the opposite side of the earth?

The answer is WebRTC.

WebRTC is what allows you to see and talk with your friends and families from miles away.

It lets you to instantly see and hear people because the speed that the data moves is incredibly fast. This is partly because of its usage of p2p communication.

P2P?

P2P is also known as peer-to-peer networking. Okay, so it's just like talking to others directly. Normally, computers do not talk to each other directly but work like sending mail to someone. When you write a letter to another person, that letter goes through many different mailmen to deliver that letter to your friend, who can send another letter to you.

However, in P2P, you write the letter and hand it to your friend directly. There is no mailman that delivers your mail in a truck. This can be efficient as it saves time from moving those letters through the highways.

So in WebRTC, P2P works like that your computer hands the picture and recording of you directly to another computer you are communicating with. This makes things very fast and allows those pictures that you are taking to look like videos.

How is it used?

WebRTC was introduced years ago, and through many hard-works and constant improvement, it has become one of the most used tools for talking to others with computers.

Also, it is free and open to free any to use!

How does it work?

Okay, so we know that WebRTC allows us to talk with others instantly, allowing video and voice calls. But how does it all work? How do computers really communicate in such a fast manner?

Signaling

I explained earlier that computers talk directly with each other instead of relying on another mailman. However, computers still need a way to find others. Just like you have to know who you are going to talk to and find where they are, computers need to find where another computer is at to start talking.

Consider an instance, you want to talk with your mom but have no idea where she's at. So you go to your dad, who knows where your mom is, and ask him, "Where is mom?" He'll answer "She's at the garage," and you'll go to the garage to see her there!

Now, your dad is the "middle man" here. And we call this person "Signaling Server" in WebRTC. Signaling Server tells where different people are at and allows the computers to "signal", or find each other.

Ice Candidates

Now, we know why we do such thing as "signaling": to find where the other computer is at. Since we know where people are located, we can start sending and receiving letters. However, we want to reduce the time it takes to get to another person.

Let's go back to the previous example. Now that you know where your mom is, you probably want to get there as fast as you can. But there are many ways to get to the garage. You could go down the stairs and through the hallway and open the garage door, you could jump out of the window and enter the garage, or you could circle the house ten times before going into the garage. See, there are many ways to get to mom, and it would be pretty inefficient if we spend 10 minutes circling the house instead of spending 10 seconds to directly go to the garage.

Ice Candidates are like the map. It essentially tells you the paths to get to the computers. With different paths, we can find which one is the shortest and fastest and pick that route. From then, you will only walk down the stairs once to get to the garage.

Media Information / Session Description

Another information that we can use is Media Capability and Session Descriptions.

Okay, that's a little too complicated for a 5-year-old...

Media Capability allows you to specify what you want to send to your friend and where to send it from. If you want to just call your friend or want to send your video as well, you can specify that in your Media Information.

Session Description further allows you to specify the quality, type, and method that you might use when you send your video and audio.

It's just simple to understand these as the information and specifications about the connection you're about to make.

Connecting

Now, we know everything about your friend and you. You know where they are and how to get to them fastest. You know what you want to send them and what you're about to receive from them.

Now you just leave it to WebRTC and if you did the signaling part correctly, you will be able to talk with your friend instantly.

Communication

Now that we're connected, we can talk with our friends and see them in real-time, even if you're on the other side of the Earth!

Communicating Data

With WebRTC, we can send many different types of content to our friends. It can include anything from photos, videos, audios, documents, and so much more.

And WebRTC does this with a Data Stream.

Streams

In WebRTC, we can see many different types of streams from Media Stream, Data Stream, Video Stream, Audio Stream, and so much more.

Stream is just like the stream that we usually know. Like a stream of water, data streams send and receive data continuously, just like a river. While we actually receive each data in parts, because they are received almost simultaneously, it seems like a continuous "stream."

Think about it this way: flipbooks. Flipbooks are "animatable books" made out of many individual pictures that are often drawn in a series. However, when you "flip" them very fast, the consecutive images seem to be moving, creating animating effect.

Even though WebRTC receives a single part of a video or audio each time, because they are played so fast and smoothly, they get displayed as a complete video or audio.


There are many more things to talk about WebRTC such as Data Channels, Reconnecting, Video Statistics, but we'll look at it a little later.
{: .notice--success}