bloggeek

The leading authority on WebRTC

How WebRTC Works?

Mon, 11/20/2017 - 12:00

WebRTC has many moving parts in it.

When WebRTC works it seems like magic. You point your browser to a URL. Get someone else to point his browser to a URL – and – you now see each other.

How cool can that be?

If you look under the hood, there’s a lot going on in there.

Looking for a WebRTC course to dig deeper and build a solid architecture for your product?

Check out my WebRTC course

I’ll try to explain how WebRTC works from a few different angles here. Together, they should create a pretty good picture of what’s going on.

WebRTC Basic Concept

Here’s the first thing I usually say about WebRTC:

WebRTC is the means to drive real time communications (voice, video and arbitrary data) directly inside a web browser. No need for any plugin or download to do that.

From a different perspective, WebRTC is just a media engine with a JavaScript API on top of it, so everyone knows how to use it (although browser implementations still vary from one another).

Somehow, that’s not saying much.

So let’s start with what makes WebRTC truly unique from a browser perspective.

Up until now, when you thought of a web application, you were probably thinking client and server:

You have the browser as a client. It connects to the server to ask for stuff. Let’s call these things requests. And the server obliges by sending responses. We’ve grown beyond that using WebSockets, but it is still rather the same. If I want to send a message to a friend who is looking at his own browser just now, the message needs to go to the server and from there to my friend. Much like the post office works.

WebRTC is where browsers and HTML diverge from this paradigm:

While we still need to somehow signal from one browser to the other so we will be able to locate each other, once that signaling is over, we can send messages directly between the two browsers – without the web server ever touching the messages. Magic.

This is why many refer to WebRTC as a peer-to-peer technology. Or P2P in short. Because browsers can communicate directly.

Separation of Signaling and Media

When loading web pages, we are now used to the fact that the browser fetches a hundred different resources just to render a web page. These resources can come from various servers – the host of the page, a CDN holding static files and a few third party sites. That said, this will mostly boil down to three types of files:

  1. HTML and CSS, which make up the main content of the site and its style
  2. JS, which is usually there to run the interactive part of the website
  3. Image files and other similar resources

It ends up being a mixture of static stuff and a bit of code to hold it all together.

WebRTC is… different.

It requires two types of interactions that go over the network. Signaling and media.

Signaling takes place over an HTTPS connection or a WebSocket. It is implemented via JS code. What you do in signaling is decide how the users are going to find each other and start a conversation.

One important thing about signaling – it isn’t part of WebRTC itself. The developer is left to decide how to pass the information needed to create a WebRTC session. WebRTC will generate the bits of information it needs to send and process the ones it receives, but it won’t send or receive them over the network for you. These bits of information are packed into SDP messages by WebRTC today.
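
Since the signaling format is left to the developer, here is a minimal sketch of one possible envelope for carrying SDP and ICE information over whatever transport you pick. Everything in it (the SignalMessage shape, the field names, the encodeSignal and decodeSignal helpers) is a hypothetical illustration, not something WebRTC defines:

```typescript
// Hypothetical signaling envelope. WebRTC does not define this format;
// the application is free to choose any shape and any transport.
type SignalKind = "offer" | "answer" | "candidate";

interface SignalMessage {
  kind: SignalKind;
  from: string;    // sender id, assigned by the application (assumption)
  to: string;      // recipient id, assigned by the application (assumption)
  payload: string; // SDP text or a serialized ICE candidate produced by WebRTC
}

// Serialize a message so it can travel over HTTPS, a WebSocket, etc.
function encodeSignal(msg: SignalMessage): string {
  return JSON.stringify(msg);
}

// Parse and validate an incoming message; returns null when it is malformed.
function decodeSignal(raw: string): SignalMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (_err) {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const m = data as Record<string, unknown>;
  const kinds = ["offer", "answer", "candidate"];
  if (typeof m.kind !== "string" || kinds.indexOf(m.kind) === -1) return null;
  if (typeof m.from !== "string" || typeof m.to !== "string") return null;
  if (typeof m.payload !== "string") return null;
  return { kind: m.kind as SignalKind, from: m.from, to: m.to, payload: m.payload };
}
```

The signaling server only needs to route such messages by their "to" field; it never has to understand the SDP inside.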

The actual media goes off on a very different medium and connection. It goes through “media channels”. These use either SRTP (for voice and video) or SCTP (for the data channel).

Media takes a different route than signaling over the network and behaves very differently. This is true for the browser, the network AND the servers you need to make it work.

Audio and Video

Audio and video are the main things you’ll notice with WebRTC. They are also what gets showcased in almost all demos and examples of WebRTC.

The reason for that is simple – video is VERY visual and interactive.

Audio and video in WebRTC work by using codecs. These are known algorithms that are used to compress and decompress audio and video data. There are different codecs you can use in WebRTC and I won’t get into them now.

Audio and video also get interesting because they are sent with low latency in mind. If packets get lost along the way due to network issues – it might not be worth retransmitting them (another first for HTML).

WebRTC uses known VoIP techniques to get media processed and sent through the network, and this is all done over SRTP – the secure and encrypted version of RTP. WebRTC did make some minor changes by using specific mechanisms in SRTP that were not in wide use before, making it a bit harder to interoperate with if you have a VoIP service deployed already.

Data too

You can also send arbitrary data with WebRTC. This is done over what’s called the data channel in WebRTC.

The data channel can be used when what you want to do is send direct messages between browsers without going through any server (you may still need to relay it through a TURN server though).

NAT Traversal

Being able to communicate directly across browsers is great, but it doesn’t always work.

The internet was built on the client-server paradigm some 30-40 years ago. Since then it has changed somewhat. Today, most users access the internet from behind a firewall or a NAT. These devices usually change the IP address of the user’s device and mask it from the open web. This masking can be just that, or it can also offer some measure of “protection” where unsolicited traffic is not allowed towards the user’s device. The problem with this approach is that WebRTC uses different paths for signaling and media, so understanding what’s solicited and what’s unsolicited traffic isn’t easy.

Furthermore, there are enterprises who make it a point not to let any type of traffic into (or out of) their network without vetting it.

Which brings us to these types of scenarios:

The guy there on the left? He now might actually know the public IP address of the guy on the right due to that STUN request that was made. But the public IP address might only be opened to the STUN server and having anyone else try to connect through that “pinhole” that was created may still fail.

In such cases, a user’s device will not be able to communicate directly with another device located inside some other private network. The workaround is to relay the blocked media through a public server. This is the whole purpose of TURN servers.

You can expect anywhere between 5% and 20% of your sessions to require the use of TURN servers.

Due to this complication, a WebRTC session takes the following steps:

  1. Send out an SDP offer to a web server. This SDP message outlines which media channels the device wants to exchange and how to find them
  2. Receive an SDP answer via the web server from the other device. Remember that the other device may be a media server
  3. Initiate a procedure called ICE negotiation, meant to find out whether the devices can reach each other directly, peer-to-peer, or whether they require media relay via TURN. This process is best done using trickle ICE, but that’s for another day
  4. Once done, media flows between the devices, directly or through the TURN relay
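
The offering side of these steps can be sketched roughly as follows. To keep the sketch runnable outside a browser, it works against two small interfaces instead of the real objects: PeerLike uses the same method names as a browser’s RTCPeerConnection with simplified types, while SignalingChannel and startCall are hypothetical names for whatever signaling you build yourself:

```typescript
// Simplified stand-in for the session description objects WebRTC produces.
interface SessionDescription {
  type: "offer" | "answer";
  sdp: string;
}

// Same method names as a browser's RTCPeerConnection, with simplified types.
interface PeerLike {
  createOffer(): Promise<SessionDescription>;
  setLocalDescription(desc: SessionDescription): Promise<void>;
  setRemoteDescription(desc: SessionDescription): Promise<void>;
}

// Hypothetical signaling transport; the application decides how it works.
interface SignalingChannel {
  send(desc: SessionDescription): Promise<void>;
  waitForAnswer(): Promise<SessionDescription>;
}

// Runs steps 1 and 2 of the session setup and returns the steps it completed.
async function startCall(pc: PeerLike, signaling: SignalingChannel): Promise<string[]> {
  const steps: string[] = [];

  // Step 1: generate the SDP offer and hand it to the signaling layer.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer); // in a browser, ICE gathering starts after this call
  steps.push("offer-created");
  await signaling.send(offer);
  steps.push("offer-sent");

  // Step 2: wait for the SDP answer that comes back through signaling.
  const answer = await signaling.waitForAnswer();
  await pc.setRemoteDescription(answer);
  steps.push("answer-applied");

  // Steps 3 and 4 (ICE negotiation, then media) are driven by the
  // peer connection itself once both descriptions are set.
  return steps;
}
```

The answering side mirrors this flow: it applies the incoming offer as its remote description, creates an answer, and sends that back over the same signaling channel.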

All this mucking around requires asynchronous programming on the browser using JS code and can be done using JavaScript promises. On the server side, you can use whatever you want to manage media and signaling.

Oftentimes, developers won’t develop directly against the WebRTC APIs and will use third party frameworks and modules to do that for them – open source or commercial.

Quick Recap

  • WebRTC sends data directly across browsers – P2P
  • It can send audio, video or arbitrary data in real time
  • It needs to use NAT traversal mechanisms for browsers to reach each other
  • Sometimes, P2P must go through a relay server (TURN)
  • With WebRTC you need to think about signaling and media. They are separate from one another
  • P2P is not mandated. It is just possible. You can place media servers if and when you need them. It “breaks” P2P, but we’re looking to solve problems, not write an academic dissertation
  • Servers you’ll need in a WebRTC product:
    1. Signaling server (either as part of your application server or as a separate entity)
    2. STUN/TURN servers (that’s what gets used for NAT traversal)
    3. Media servers (optional. Only if your use case calls for it)

WebRTC API Viewpoint

WebRTC has 3 main API groups:

  1. getUserMedia
  2. PeerConnection
  3. Data Channel

getUserMedia

getUserMedia is in charge of giving the application access to the user’s camera, microphone and screen. It alone gives value for those who need to do things locally, without implementing real time conversations.

Here are a few uses of standalone-getUserMedia:

  • Take a user’s profile picture
  • Collect audio samples and send them to a speech to text engine
  • Record audio and video with no quality degradation due to packet loss

I am sure you can come up with more uses for it.
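
As a rough sketch, each of the uses above maps to a different constraints object passed to getUserMedia. The SimpleConstraints type, the CaptureUse names, the constraintsFor helper and the specific resolutions below are all assumptions made for illustration; only the audio/video/width/height/ideal shape follows what getUserMedia accepts:

```typescript
// Local type mirroring part of the shape getUserMedia() accepts,
// defined here so the sketch compiles without browser typings.
interface SimpleConstraints {
  audio: boolean;
  video: boolean | { width: { ideal: number }; height: { ideal: number } };
}

// Hypothetical labels for the standalone uses listed above.
type CaptureUse = "profile-picture" | "speech-to-text" | "local-recording";

// Resolutions are illustrative guesses, not recommendations.
const CONSTRAINTS_BY_USE: Record<CaptureUse, SimpleConstraints> = {
  // A still photo needs the camera only.
  "profile-picture": { audio: false, video: { width: { ideal: 640 }, height: { ideal: 480 } } },
  // Speech recognition needs the microphone only.
  "speech-to-text": { audio: true, video: false },
  // Local recording captures both, at a higher resolution.
  "local-recording": { audio: true, video: { width: { ideal: 1280 }, height: { ideal: 720 } } },
};

function constraintsFor(use: CaptureUse): SimpleConstraints {
  return CONSTRAINTS_BY_USE[use];
}
```

In a browser, the result would be handed to navigator.mediaDevices.getUserMedia(constraintsFor("speech-to-text")), which prompts the user for permission and resolves with a media stream.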

PeerConnection

PeerConnection is at the heart of WebRTC and the most complex to implement and to understand. In a way, it does EVERYTHING.

  • It handles all the SDP message exchange (not sending them through the network itself, but generating them and processing the incoming ones).
  • It implements ICE in order to connect the media channels, going through TURN relays if needed
  • It encodes and decodes the audio and video data in realtime
  • It sends and receives the media over the network
  • It handles network issues by employing adaptive jitter buffer, bandwidth estimation, packet loss concealment, forward error correction and other algorithms that you really don’t want to know, but eventually will need to learn
  • It handles local audio issues using algorithms such as acoustic echo cancellation

Much of what goes on inside the peer connection that affects the resulting media quality is based on heuristics. A specific set of arbitrary rules. Different implementations may have different behaviors and different media quality due to this.

DataChannel

I’ve discussed the data channel somewhat earlier.

The only thing to add here is that:

  1. Data channels can be configured to be reliable or unreliable. If you set them to unreliable then messages will not be automatically retransmitted on them. Sometimes, that would be your preference. They can also be configured to be ordered or unordered in the way they deliver messages
  2. Data channels were designed to work on the API level similar to WebSocket, so once you open it, you can think about it in a similar fashion.
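
To make the reliable/unreliable and ordered/unordered choices concrete, here is a small sketch that builds the options object for a data channel. The ordered, maxRetransmits and maxPacketLifeTime fields are the ones createDataChannel accepts in browsers; the Reliability type and the channelOptions helper are hypothetical names used only for this example:

```typescript
// Subset of the options accepted by RTCPeerConnection.createDataChannel().
interface ChannelOptions {
  ordered: boolean;
  maxRetransmits?: number;    // cap on retransmission attempts
  maxPacketLifeTime?: number; // cap on retransmission time, in milliseconds
}

// Hypothetical helper type: the two caps are mutually exclusive,
// and leaving both out gives a fully reliable channel.
type Reliability =
  | { mode: "reliable" }
  | { mode: "max-retransmits"; count: number }
  | { mode: "max-lifetime"; ms: number };

function channelOptions(ordered: boolean, reliability: Reliability): ChannelOptions {
  if (reliability.mode === "max-retransmits") {
    return { ordered, maxRetransmits: reliability.count };
  }
  if (reliability.mode === "max-lifetime") {
    return { ordered, maxPacketLifeTime: reliability.ms };
  }
  return { ordered }; // no cap set: messages are retransmitted until delivered
}
```

For example, channelOptions(false, { mode: "max-retransmits", count: 0 }) describes an unordered channel that never retransmits, which suits data that goes stale quickly, such as game state updates.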

You can find a few ideas of what people are doing with data channels here. There are more ways you can make use of it.

The WebRTC Implementer’s Viewpoint

If what you’re looking for is to implement an application that makes use of WebRTC, then here are some activities you’ll need to deal with:

  1. Client side
  2. Signaling
  3. NAT traversal
  4. Media

Before you continue, you may want to check out this article about programming languages in WebRTC.

Client Side

The client side can be a browser, mobile application, PC application or an embedded device.

For web browsers, you’ll be developing using JavaScript. Either using WebRTC’s APIs directly (unlikely) or by using an existing framework of sorts (GitHub is where many people start – just make sure you pick something popular that got updated recently).

For mobile applications, this is mostly about finding an SDK you’re comfortable with. There are again a few available on GitHub, along with the official ones coming from Google for iOS and Android. There are also some commercial mobile SDKs out there that are pretty good.

You can go for a PC application. Most do it by using Electron. And there’s also the embedded approach, which means either taking the official Google WebRTC codebase and porting it to whatever device you have or developing something on your own – I’ve seen both approaches work.

Signaling

You will need a signaling server. The first thing a WebRTC client will do is call the mothership. That is used to coordinate whatever session you have in mind for it.

The signaling server isn’t in the scope of the WebRTC specification so it is up to you to figure out what to use here. Most of the code you’ll find on GitHub for the browser client is actually going to be an implementation of a signaling server.

Remember that the signaling server can be separate from your web server or they can reside within the same process – up to you. And in any case, the first thing to do is to check if there’s already some kind of a signaling mechanism that you have in place for your application for things that aren’t WebRTC. You might be able to piggyback your SDP messages and other WebRTC related signaling over that mechanism (I know that’s what I’d try to do first).

NAT Traversal

For NAT traversal you will need to deploy STUN/TURN servers.

We’ll first start with what NOT to do:

  • Don’t assume you won’t be needing TURN
  • Don’t use public STUN servers
  • Don’t have a single server for everything
  • Don’t start by building a world-class global network of servers. You’ll get there, but it can wait

Now what you should do:

  • Deploy STUN and TURN on the same server, in the same process
  • Use coturn. That’s what everyone else is using
  • Or instead, just get a hosted NAT traversal service from someone. XirSys and Twilio are good alternatives
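
On the client side, the STUN/TURN deployment ends up as a list of ICE servers handed to the peer connection. Below is a minimal sketch assuming a single host that serves both STUN and TURN on the standard port 3478. The iceConfig helper and the local types are hypothetical; the iceServers, urls, username and credential fields are the ones the RTCPeerConnection constructor accepts:

```typescript
// Local types mirroring the configuration the RTCPeerConnection constructor accepts.
interface IceServer {
  urls: string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServer[];
}

// Builds a configuration for one host running STUN and TURN together.
// Port 3478 is the standard STUN/TURN port; your deployment may differ.
function iceConfig(host: string, username: string, credential: string): PeerConfig {
  return {
    iceServers: [
      // STUN needs no credentials.
      { urls: [`stun:${host}:3478`] },
      // TURN over UDP first, with TCP as a fallback for restrictive networks.
      {
        urls: [`turn:${host}:3478?transport=udp`, `turn:${host}:3478?transport=tcp`],
        username,
        credential,
      },
    ],
  };
}
```

In a browser this would be used as new RTCPeerConnection(iceConfig("turn.example.com", user, pass)), where the host name and credentials are placeholders for your own deployment or hosted service.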

Media

If you are planning on group voice and video sessions, connectivity to PSTN or other networks, recording or other fancy features, then media servers are in your immediate future.

Look for something that fits well with your use case.

I’d even say start here before picking anything else in your technology stack.

There are a few open source and commercial alternatives out there. They are different from one another in many ways.

Looking for a WebRTC Training?

The purpose of this article is to get you the most basic understanding of WebRTC if you’re a newb. I didn’t want to take the approach of building a “hello world” application – you can find many of these on the internet already. What I wanted to do instead is go somewhat higher and take a look at the bigger picture – you’ll be needing it soon enough.

In many cases, people start with a “hello world” implementation of WebRTC and try to fit it to their own scenario. I find that it is the wrong way in many cases, as it all depends on what it is you are trying to build – it will dictate the starting point you’ll need to make in your journey.

Spend the time to read this article, and then go read a “hello world” manual or two for WebRTC. It will make it a lot more effective if you do.

The post How WebRTC Works? appeared first on BlogGeek.me.

Jeff Lawson on the Past, Present and Future of Programmable Communications

Thu, 11/16/2017 - 12:00

An interview with Jeff Lawson, Co-founder and CEO of Twilio.

After going to the Twilio Signal event in London in September, I was asked by Twilio’s analyst relations about the event. I had already shared my thoughts in a lengthy article, so it was easy to send out a link.

I did one more thing.

I decided to ask her if I could interview Jeff Lawson in person the next time I was in San Francisco (which happened to be the following month during Kranky Geek). My expectation was to be ignored, or to just be declined.

But when she came back with an approval… I was clueless as to how to proceed.

We ended up deciding together on a recorded video interview. I was given free rein as to what questions to ask, with the request to share them if possible before the interview. No restrictions were placed. I reached out to a few friends asking for their thoughts on good questions, added a few of mine and prepared for the interview.

Jeff gave me his full attention for the better part of an hour. I ended up using everything we recorded – not removing any of the answers.

The result? A longish interview of around 37 minutes. I’ve added the transcript below the interview as well, if you’re more of a textual person.

I’d like to thank Jeff and the team at Twilio that made this one happen.

Transcript

Tsahi Levent-Levi: Good morning, Jeff.

Jeff Lawson: Good morning.

Tsahi: Okay. I’d like to start with something, a question that I was very interested in. You have two kids, right?

Jeff: Yeah.

Tsahi: Are they young?

Jeff: Yeah.

Tsahi: How do you explain to them what you do every day?

Jeff: That’s a great question. It’s hard to explain to a young kid what Twilio is, but here’s what I’ve found is they use their phones … They don’t use their phones. They steal our phones, but the only thing we really let them do is communicate. If you think about it, that’s the very first thing that a kid wants to do. Call Grandma, and I’ll FaceTime Grandma from the phone. I explain that Twilio … Twilio is a technology. We let everybody who wants to be able to build things that communicate, we let them do that.

Tsahi: Okay. So that’s CPaaS in a way, right?

Jeff: CPaaS. Yeah. In an essence, we let companies call Grandma.

Tsahi: Yes. Okay. Letting companies call Grandma. I’ll tell that to my daughter.

Jeff: If Grandma is your customer and you need to engage with her.

Tsahi: Yes. When you started Twilio, like nine or 10 years ago, what was the original vision behind it? I guess it was slightly different than what it is today.

Jeff: It’s actually pretty similar to what it is today, I have to say. We started Twilio because I’m a software developer. I’ve been a developer for 20 years, and I also started multiple companies prior to Twilio. At each company, a common thread arose. At every single one of those companies, first of all, we were using the power of software to build a customer experience that was better than anything in the industry that had come before us.

I had started a variety of companies. An academic content company for college students online, StubHub, the online ticket exchange for secondhand tickets, and a brick and mortar retailer, of all things. The common thread among all of these was we were using software to build a great customer experience. We were using software to build amazing web applications, to represent the business, to enable us to touch customers. StubHub is the whole ability just to be able to connect folks together to buy and sell tickets. Software was key to that, and the key of software is agility. The ability to constantly iterate, constantly listen to your customers, put something out there in the world that you think solves a problem for them, get feedback and iterate. Sprint over sprint, every couple of weeks, you’re putting out something better, learning from your customers. That’s the super power of software. In every one of those companies, I had another problem. At some point or another, I had always needed to reach out and communicate with my customers. Just makes sense. Every time it happened, I said, “Well, that’s neat, but I’m a software developer. What do I know about making the phone ring?” That’s like magic. I have no idea how that works.

So I’d go to the industry, and I’d say, “How are we supposed to build this idea that we have?” We want to integrate with these systems. I have this idea for how I want to touch our customers, and the industry would say, “Oh, okay. Yeah, yeah. We think we can help you with that. First thing, let’s pull a bunch of copper wires from the carrier to your data center. Then we’re going to rack up a bunch of carrier gear in your data center, and then, let’s see. None of this was designed to do this idea you have, so we’re going to bring in this professional services army. They need to come integrate it, and they’re going to beat up all that equipment and get it to work and do exactly what you want. That will take about two million bucks, and it will take a couple of years to build. Sign here.”

Every time, I remember thinking, “Huh. First of all, millions of dollars for this one part of my customer experience? That’s a lot of money. I don’t think I have that, but if it’s not for the money, though, what’s much more important? The time.” Think about it. Two years before I get version one in front of my customers, before I get that prototype in front of my customers? Get any feedback whatsoever? That’s insane. To software people, to spend two years before you get anything in front of a customer? That’s crazy.

After having that experience at three companies in a row over the course of 10 years, I realized, “Huh. The ethos of communications is diametrically opposed to the ethos of software.” It kind of makes sense. If I was shooting satellites into the air and laying down millions of miles of wire everywhere, I would operate slowly and methodically, and that’s what I would do. That’s what the communications industry has done for 100 years. The thing is, how you and I, how individuals, how companies, get value out of these networks has shifted. It’s no longer about the physical networks. It’s about the software that’s running that defines how we get value out of that network, what we can do, what’s possible. That’s all about software.

So we started Twilio in 2008 to solve the problem of bringing communications out of its legacy in hardware and physical networks and into its future, which is software. Now, we do that with a powerful set of APIs that run in the cloud that let any software developer be able to start building that future.

Tsahi: I’d say you succeeded in that.

Jeff: Oh, well, thank you. We feel like we’ve just started.

Tsahi: Okay. In all of these years, what would be one of the most surprising use cases that you can say that you’ve seen or come in front was like, “Whoa. That’s cool. That’s neat”?

Jeff: There’s so many. We build the platform. We never know what people are going to build. In fact, one of the little Easter eggs in Twilio’s history is that in every press release when we launch a new product, my quote ends with the words, “We can’t wait to see what you build.” Every press release, year after year after year, that was always the line. Nobody ever caught on.

There’s so many use cases. There’s the obvious ones. The whole on demand economy. Things like Uber and Lyft and Airbnb, where Twilio is not only notifying you that your car is arriving, but also connecting drivers and riders together. That whole idea that I would use the internet and my phone to get a stranger to pull a car up and get in the car, I was always told to not get in stranger’s cars. But now, that’s what we do every day, and use cases around how communications, and Twilio has made that safe, made that convenient, made that easy. I never would have thought of those the day we launched Twilio, because really, mobile phones, their current incarnation, smart phones, were just getting started, and that whole idea of it; the applications of it were still completely unknown.

But then there’s the crazy use cases that I still can’t imagine. One of my favorite crazy use cases is there’s some researchers in the United States who study the migratory habits of bears.

Tsahi: Okay.

Jeff: Right? It turns out that if you study the migratory habits of bears, you spend your days in a helicopter flying around looking for bears with binoculars. When you see a bear, you land your helicopter. You shoot the bear with a tranquilizer, then you climb up on the bear. You hope it’s tranquilized, and you put a collar on its neck that’s going to track its location. Then you run away very quickly, hopefully before the bear wakes up. Then a year later, you’re circling in your helicopter. You spot the bear again. You land. You shoot it with a tranquilizer again. You climb up on the bear again, hoping it’s actually tranquilized. You pull the data card out of the collar. You put a new one in, and you run away before the bear wakes up.

They’re like, “There’s got to be a better way. We would love to stop shooting bears with tranquilizers.” So they built a collar that had a 2G radio in it that collects all the data. When the bear wanders into an area with some cell service … They don’t exactly walk around in shopping malls. When it wanders in, it picks up coverage, and it texts all that data off the collar to a receptor they built on Twilio. That was, I thought, such a cool use case, because they’re using this technology, 2G radios. They’re low power. They’ve got maximum range, and it is texting the data off to build an app. You’re like, “Who would have thought of this?” We call this the internet of bears. I’m like, this is a use case I never would have imagined that there were people whose days were spent doing this. They found a use case for Twilio to solve this problem.

Here’s another crazy use case I love. There’s a researcher in the UK who built an app that allows you to call a phone number, and based on taking a recording of your voice, can detect with a very high degree of accuracy whether you’re likely to be predisposed to Parkinson’s disease.

Tsahi: I should use that one.

Jeff: You’ve done it?

Tsahi: No, but do you have the number?

Jeff: It’s a medical trial. They ran this trial. They found it to be an incredibly accurate way of assessing whether or not you are likely to develop Parkinson’s just by calling a phone number on Twilio and recording your voice for about 30 seconds. What’s amazing, as a researcher, he said trials like this would have usually cost millions of dollars to set up and run, because you would have needed all this sort of expertise and specialization. The doctor and his staff built it in a couple of weeks using Twilio for less than $1,000. They ran the whole trial, so it’s amazing.

Tsahi: Yes it is. I want to talk to you a little bit about the market itself and the different players in that market. The main ones that you would have thought that you would have lead or be part of that are the actual Telcos, the carriers, the ones that offer the phone service to the consumers. When you look at what they are doing in CPaaS and in APIs, they have services, but none of them are quite as successful as the other vendors out there. Why do you think that is?

Jeff: Well, I love the carriers. They have a very valuable product in that they are building out all the infrastructure that we all use every day to communicate in every way we can. I would say, though, that the carriers are not well situated to solve these software problems. Historically, carriers have not been software organizations. They’ve been very effective at ground operations, at getting infrastructure out in the field, repairing it, installing it. They’re very good at sales and marketing and servicing customers, but they historically have not been great software organizations, and that’s why I think a new type of company has been needed to come and solve this problem. A company that is a software company.

Twilio, half of our company is our software R&D group. That’s a different ethos. Building a world class software engineering organization, one that can ship and be agile and build resiliency with agility, which is what we call that process of having a high velocity of innovation but also achieving five nines of availability and things like that. That is a hard software problem, and so it takes a different kind of company to solve that.

Tsahi: Okay. What about all of the IaaS vendors? AWS, Google Cloud Platform, Microsoft Azure? They offer infrastructure. They give you compute and storage and databases today, and it’s like shouldn’t they also do communications? It’s the next step. Why do you think that they aren’t there yet or aren’t there today?

Jeff: I think two things. First is, these companies have been primarily focused in the communications for online consumers. A lot of them have a consumer play, whether it’s Microsoft with Skype or Google with Hangouts and things like that. Then on the infrastructure side, I think they’ve gone to the things that they do particularly well on the infrastructure to build, which is to say it’s compute and storage, the most common areas of software computation, which has been a huge meaty market to go after, which has meant that communications hasn’t been the focus of theirs.

I think companies like Twilio, we focus on communications all day every day. That’s what we wake up to do, and so I think we’re uniquely situated to be able to build out great services that target exactly the use cases of communications while the other platforms have been really focused more on compute and storage and the key areas of general purpose computation.

Tsahi: Okay. Another trend that I’ve seen in the last year or so is around UCaaS, Unified Communication as a Service. These companies that offer you desk phones, the video conferencing systems, the things that you need in order to run and operate your enterprise internally. Communication between people inside the enterprise. It seems that all or most of these vendors today start offering APIs. They bundle APIs on top of their service. When you go and talk to them, they usually say, “We’ve got APIs just like Twilio. When you use us, you don’t need to pay for blah, blah, blah, whatever.” It’s like they compare themselves and position themselves as direct competitors to Twilio. Where do you see these two markets going? UCaaS and CPaaS. Where do they meet?

Jeff: Yeah. It’s a very different thing. If you think about Unified Communications as a Service, you’ve got an application. When you build an application, you make all sorts of assumptions about how the world works. You have a domain. You’ve got models. You’ve got all the core components of unified communications. Then when you add APIs to it, which by the way, it makes a ton of sense. Every SaaS product has APIs. In fact, UCaaS has been a little late to that game, I actually believe. Most SaaS companies have had APIs for 10 years. But when you add APIs to a software application, those APIs bring with it all the assumptions that you made about that application. That’s both good for some things … If you want to extend the application in a certain way and you want APIs to do it, that’s what those kinds of APIs are good for.

Twilio is designed from the ground up to be a set of APIs, to be ultimate flexibility. To not make all those assumptions about the one application that the end user is going to use it for, but rather to say these APIs are designed like building blocks to be put together in any way you see fit. That’s why we can address a wide variety of use cases, whether it’s two-factor authentication, identity verification, call centers, anonymous communications, notifications, alerts, anything you can imagine, you can build with Twilio. That’s because we were created from the ground up for this recombination of these building blocks as opposed to taking something that’s already built and fixed in place and then saying, “We’re going to add APIs to it.” It’s just a different way of approaching the API problem. Both of them have merits, but I like our approach, because it gives us the ultimate flexibility to really enter any of these use cases in a really wide breadth of things.

Tsahi: Do you see a unified communication platform as a service, a vendor that does such a service, deciding not to build the whole communication infrastructure on its own, but instead using someone like Twilio, a communication platform as a service, to build on top what it is that he is doing?

Jeff: Yeah. I believe that companies whose primary business is communications can and definitely should and would get competitive advantage by using a platform like Twilio to build upon. The reason why is this. It used to be when those UC companies started, their core competency was making the phone ring. Then they’d add some software functionality on top of it, sure, but the vast majority of what they worried about was how do I make the phone ring? The problem is Twilio has democratized that ability.

Every developer … Every mobile developer, every web developer … now has the ability to make the phone ring in 100 countries around the world where we have phone numbers and touch every phone on the planet … Mobile, landline, et cetera … with an API that is reliable, that is scalable, that is global. Now, you’ve got developers out there who get to focus solely on customer experience, features, integration, UX, mobile. Build the things customers really care about and bring this core competency of focusing on user experience that software developers do so well. A one or two developer team can actually create a customer experience that is better than some large company that is focused purely on Unified Communications as a Service.

The existing UCaaS vendors, they would be wise to build on top of the same platform that any developer in the world can come and start to compete with them on. If they don’t, those independent software developers, they can actually start and build companies that are really compelling competitors, because they don’t have to focus on the low level bits. They’re focused on the things customers really care about, which is features, functionality, and the user experience that matters.

We have seen this play out, for example, in the call center market. We’ve seen … At our first conference back in 2011, Tiago was the founder of the company TalkDesk. One developer. Do you know Tiago?

Tsahi: Yes.

Jeff: Back in 2011, Tiago was the founder of TalkDesk. Single developer. He was a web developer. He knew web development really well and focused on building a product that he thought would be really compelling. Because of Twilio, he didn’t have to worry about any of the underlying infrastructure. Now, TalkDesk is hundreds of employees, has raised a lot of venture capital, has Fortune 1000 companies running call centers on them all because he was able to focus on the things customers really care about, is the features and functionality of the application. He did not have to worry about making the phone ring. That’s a really powerful competitive dynamic, as new players come in fundamentally uplevelled, because they’re building on platforms.

Tsahi: When I look at the feature set that you have at Twilio, the different types of functions that you offer, at the end of the day, that is something that is always commented when people talk about Twilio and they’re trying to attack Twilio as a company. They say, “All of the money comes at the end of the day from SMS and voice. That’s what they do, and at the end of the day, that’s too competitive as a market today.” If you actually look and search all of the CPaaS vendors, all of the direct competitors that you have, almost all of them have the same type of characteristics. They make most of their revenue today from SMS and voice and a lot less from the IP based services that they have, from the new things that come out. How do you as the leader in the CPaaS space deal with that and meet that challenge?

Jeff: I think there’s two things. First of all, most mature products for any company are generally going to be the largest contributors of revenue. Especially with developer products. We have a very long commitment to developers, and that takes a little longer than other products to adopt, because you launch a product, then developers have to see that product, understand it, and build their product, and then bring their product to market. You’ve got a little bit of an extra delay as a developer-focused company before products become commercially viable.

That is a long commitment, and that, quite frankly, is why a lot of companies don’t have the stomach to serve developers, because it’s a long commitment to developers to get those products to grow and be large. But we have that commitment. The way we look at developer products is that they have a slower start but then a fantastic ramp up capability. So I wouldn’t worry about the short term. We’re planning for the long term. In the long term, it is blatantly obvious that the software APIs and software communications are going to win. We’re there with all the products that developers need to build it. We see developers building amazing things using our software products, our video SDKs, Twilio Clients for Voice Over IP, the rest of our software products.

The other thing I’ll point out is that our software products often drive usage and adoption of our voice and SMS products as well. They don’t exist in a vacuum. When a customer builds a call center using Twilio’s TaskRouter product, which is a globally scalable cloud-based ACD … When you use TaskRouter to build a call center, guess what? It drives more voice revenue. When you use Twilio Client as the basis of your call center, it drives more PSTN revenue, generally, as well, because you’ve got an inbound phone number.

What’s interesting is that these new technologies, software-based communications, are actual drivers of competitive advantage for our customers who adopt them. If you think about the customers of ours who’ve adopted Twilio Client to allow any computer with a web browser to be able to now become a call center by just plugging in a headset and using our Twilio Client product that’s powered by WebRTC, that has leveled the playing field because you no longer have to manufacture or sell hardware phones or PBXs in a closet. These new software technologies have been huge drivers of a new set of players to arise in this industry who previously wouldn’t have been able to do it. That’s creating a new market dynamic here of new players entering the field and new products entering the field that wouldn’t have existed 10 years ago.

That’s really exciting, and it’s creating a huge market shift, but it also draws more usage of the PSTN right along with it. The same thing you can say for our Twilio Chat product. The same thing you can say for a number of our products, Twilio Studio. So all of these products together, you usually don’t use them in a vacuum. You use them together with other products. That’s part of the nature of APIs. But having them all together and being able to plug them in together to do these interesting things is fundamentally changing the landscape of the companies and the products that are out there that are really pushing the ball forward on communications.

Tsahi: I think I saw the first thing that you said when I worked at RADVISION years ago, but in the opposite sense. At RADVISION, you had two business units. One of them was a technology business unit. We sold SDKs to others to build their own products. The second business unit dealt with selling videoconferencing equipment. Whenever there was a downturn in the company because of the market, the CEO came out and said, “We have this business unit that sells videoconferencing. It’s now slow because of the market. Then the TBU, the technology business unit, we’re still going strong because we see that this will go upstream three years from now when developers actually launch it.”

There, the business model was flipped. We usually licensed the software in advance so developers had to invest when they started, and not when they saw the revenue. What you are saying is that today, in order to be in the developer space, you don’t make the money up front from developers that build stuff in the future. You wait and you grow with them. That waiting for that growth is what makes a company big at the end, is being patient.

Jeff: Exactly right. It’s the combination of our usage-based revenue model that tightly aligns us with our customer’s success. This is key. When we think about what is the driver of innovation, what makes developers be successful in building their next idea, it is experimentation. Experimentation is the prerequisite to innovation. Everything that we do is about lowering the barriers to a developer getting started and running as many experiments as they can for an idea that they want to try out. That’s why we have such a low upfront. You get started … Every developer who has used Twilio started by spending their first penny to make that first phone call, send that first text message, fire up that first video session.

You never know which one of these ideas that developers are building is going to be the next great big idea. Our job is to make it so developers can try as many of these ideas and run as many experiments as they can until they find product market fit with the thing that they’re building. That’s why it’s a long commitment to developers, because you need to give them the runway. You need to have that patience, but you also need to have that attitude that it’s not about, “Hey, a developer came to our door. I’m here to get all the money from you today.” You’re like, “No. We’ll do well if you do well. I’m just here to make sure you do well. I’m here to do everything I can to make you successful in building your ideas.” Ultimately, that’s how I’m going to be successful, but it’s a long commitment.

We like to say, though, it is a compounding interest business, essentially. You invest in developers, and they build. With the usage-based model, as they grow, as they’re successful, that, then, turns into our success. For us, that means customer success is the very first thing. It’s the prerequisite to our own success. Everyone at Twilio is always focused on customer success first.

Tsahi: I’ve been to two Twilio SIGNAL events, both very interesting events. I really loved them. What I noticed is that you know exactly what the product does. When there is a product launch, you play with it. You do it on stage. You use it. You’re a developer yourself. How can you do that and still be a CEO of more than 900 employees?

Jeff: I think as an API developer-first company, I have to do that. That’s how I can make sure that we’re building the right things, and that’s how I can make sure I’m close to our customers and I’m close to our products. I love playing around with the new Twilio products. I am the first person they give access to when we build stuff, or at least, I hope I am, because that’s how I love playing around. I just dive in there. I read the docs. I start building stuff. That’s really exciting.

Recently, I was building something for Halloween with my kids with some Arduinos. I love building internal things at Twilio. A few years ago, I built our goal-setting software that we were using at the time. I just dove in. They don’t let me touch production code anymore, which is probably a good thing, but I just love being a developer. Even though I’m a CEO, I love continuing to invest in that part of my life. Obviously, I don’t get to do it as much as I used to, but it would make me very sad if I had to stop. I’ve just arranged my schedule and arranged my life so that I always make sure I’ve got some time to stay current on new stuff, both inside Twilio and outside Twilio and build. I’ve always thought that just building, just having a project idea in mind and committing yourself to building it and picking even some new technologies you’ve never used before, that’s a great way to keep learning and keep building and keeping your skills up.

Tsahi: I can easily relate to that. Talking about products and what is it you do, the last year it seems that you have somewhat shifted. If up until now, you could have said that when Twilio launches a new product or introduces a new product, that would be yet another building block that you can use to do some kind of communication. A new communication service that you couldn’t build before. It seems that you’ve started moving upstream. There is the Engagement Cloud with Notify and Authy. Then there is even Twilio Studio that goes for me even one level above that. Why did you make that move? Why the shift?

Jeff: Well, we don’t see it as a shift, because to us, it’s always about having the right API for a developer to get the job done. As a platform, you start off with a set of building blocks that provide maximum flexibility, because you don’t necessarily know what developers are going to want to build. As you learn from developers what are the most common things that they want to get done, but also what was really hard? What did they think would be easy to build and it turned out was very hard?

We view our job as making our customers successful. When we see the things that we can do to make their lives easier, help them get the job done faster or not have to reinvent the wheel because they’re trying to figure out, “Hey, how do I figure out how to distribute calls?” and I see every other customer trying to figure that out, too, as they’re building a call center, it becomes obvious. You say, “Wow. My job is to make my customer’s life easier and make them more successful. Why don’t I build a product that does that thing?” So you end up with Twilio TaskRouter, for example.

In the case of Studio, we view it as making the developer’s job even easier and allowing more people to participate in the development and the maintenance of these applications they’re building. Why? Because we saw developers build an application, and certain parts of it are really exciting, like how do I figure out the exact experience I want? How do I integrate all this stuff? Then parts of it are really boring and become a tax to the developer and to the whole organization, such as when folks are saying, “Hey.” Product manager says, “Hey, can we update the text? We’re going to run an A/B test. Can you try 50% on this and 50% on that? Can you change the SMS text? Can you change how the call center greets the people coming in?”

The developers don’t see that as exciting. They see that as, “Oh, it’s continual maintenance. It keeps pulling story points off of me every week, because I’ve got to keep maintaining the thing.” We said, “Isn’t there a way that we can allow the developer to do the really important parts, the parts that are about integrating systems and things like that, and then take the other parts that are a little more standard and make it so not only the developer doesn’t have to write it … They can just drag and drop and build it easily … but they can also hand some of that off to other people in the organization.” Maybe the marketing people have ideas about how they want the content to work. Maybe the ops people want to change how the IVR call flow works. There’s all sorts of different people who are invested in these communications applications, because customer engagement touches so many parts of the company.

If we can offload a bunch of that work from the developer, that ultimately will accelerate our customer’s roadmap and make them more successful. Again, you go back. That’s our goal. By the way, when we make our customer successful, that makes us successful, so we’re all aligned in this. Studio is a great way to do that. So we keep listening to customers, hearing the things that they love about the API approach, the flexibility it gives them, the fact that they can now build things that they were never able to do in the past because pre-built software applications weren’t flexible enough. But then we say, “Great. How do I make it so that you can get that flexibility faster and easier than ever before?” You do that by listening to your customers and solving the most common pain points.

Tsahi: I really love Studio. I’ve played with it. It’s a great tool. Really.

Jeff: Awesome.

Tsahi: How do you make the definition of it? Going … Building a UI tool, an IDE that can mix and match stuff and do this logic is never easy. I’ve used tools before that are similar. Some of them are good. Most of them not that good. How did you nail that experience in a way that, at least for me, was just spot on?

Jeff: I think there have been fits and starts in the history of computation around visual designing of programming. Sometimes they work. Sometimes they don’t. To us, there were two things that were involved in that. Number one is working with a lot of customers and a lot of users. We actually started with paper and sticky notes and starting to design with them how they would want to design something like an IVR or an SMS bot or a chat bot, things like that. We actually did it with sticky notes before we wrote a single line of code. To us, that was the equivalent of for APIs, it’s writing the API docs first, putting them in front of a user and saying, “Hey, is this the API you would want?” We do that before we build the product. We did the same. We applied the same logic to building a user interface for drag and drop development.

Then the second thing was I think we constrained it down a bit to say, “This isn’t about general purpose computation,” because you get in all sorts of hairy things. We’re focused on the customer engagement. If we scope it down and we say, “We want to make the very best visual designer for Twilio for customer engagement. What are the things it should encompass?” I think that the key of building both power and simplicity is really understanding your domain that your customers are operating in and then designing the perfect thing for that domain.

I think that obviously, we’re just at the very beginning. We launched it just over a month ago, and so we’re continuing to learn from customers and get that feedback, but that’s our approach that I think has helped us to build something that customers find both powerful but also easy to adopt and easy to use. That comes from the same approach we’ve used to design APIs that I think customers would articulate in the same way. They’re powerful and easy to use.

Tsahi: What’s the feedback that you get about the engagement cloud? It’s out there for what, half a year now?

Jeff: Mm-hmm (affirmative). Look, when we talk to customers and we take a step back and we say, “What is Twilio all about? Why is Twilio important to you, ING Bank? Why is Twilio important to you, Morgan Stanley bank?” Some of these very large organizations, so obviously have a lot of options and a lot of legacy systems they could have kept using. The answer we get is, first of all, flexibility. With Twilio, we get this unprecedented flexibility.

When you think about the importance of customer engagement to a company, almost nothing is more important. When I talk to a CEO of a bank, and you ask them, “What’s important?” they are so concerned about, “How can I maintain my relationship with my customer?” That’s the biggest fear that C-level executives have. That is done with customer engagement. How do you keep up? If you think about the problem space here, it’s insane.

As consumers, the technology that we use has advanced incredibly rapidly in the last five to 10 years. We’ve got a wide variety of new applications that we use. We use video. I use video almost daily. I would have thought that was crazy 10 years ago. I would have thought that was stupid, and now here we are. We use video on a daily basis. We’ve got great chat applications. We’ve got apps in our chat and chat in our apps. It’s amazing. Yet, for companies to communicate to their customers, it is incredibly broken. Why? Because companies can’t keep up with the pace at which our expectations are changing for how communications is going to work and how great of an experience it’s going to be.

We’re still stuck in the days where you essentially call an IVR of a company and they don’t know who you are. You enter your 40-digit account number and then you talk to an agent. They’re still asking your name five times. You’re like, if I had that experience with a friend, if I called my friend and they asked me my name five times during the call, I would think there was something medically wrong with them. Yet when you call a company, that’s the experience you expect. Nothing is more broken about communications than how companies talk to their customers. We want to fix that.

When you talk to executives at companies and you say, “What keeps you up at night?” It’s, “Yeah. I’m worried about losing my connection to my customer. Being disintermediated by all these other technologies that are coming out. I need to keep the connection in order to stay top of mind and stay relevant to my customer.” When I think about how that works, it’s like, “Well, you’ve got rapidly proliferating ways in which you need to reach your customer.”

10, 15 years ago, talking to your customer generally meant you had a phone number and customers could call it. Now, you’ve got not just phone calls. You’ve got text messaging, you’ve got chat, you’ve got mobile apps with push notifications. You’ve got WeChat, WhatsApp, Facebook Messenger. You’ve got so many different … Now Alexa, Google Home, personal assistants. You have so many ways and very finite development resources to keep up with this changing world. By the way, it’s not just the ways in which you need to communicate that is proliferating. Think about all the departments in a company that need to actually keep up. You’ve got sales, marketing, customer support, onboarding, product teams. Every part of the company is trying to keep up with every part of this changing technology landscape. It is an unsolvable problem for most companies.

That’s what the engagement cloud is here to solve. We want to provide one system that allows companies to keep building, keep iterating, but to reduce the barriers, reduce the time to do that and give one tool to all these different teams who need to touch customers, to be able to keep up with this rapidly changing landscape and constantly iterating on those customer experiences with easy to use tools and infrastructure that they don’t have to worry about scaling. They don’t have to worry about reliability. They don’t have to worry about onboarding new platforms. We’re going to do that for them as the world is changing. They get all that stuff from us, and so they focus on, “Okay, what’s my special sauce? What’s the thing that makes my brand and my company engaging to my customer?” I’m going to focus on that last bit, and we’re going to iterate on that constantly, and I’m going to empower all these different teams inside the company to be able to have that at their fingertips. That’s what the engagement cloud vision is all about.

Tsahi: Thank you for your time, Jeff.

Jeff: Thank you, Tsahi.

Tsahi: I thoroughly enjoyed it.

The post Jeff Lawson on the Past, Present and Future of Programmable Communications appeared first on BlogGeek.me.

Vidyo and RTC in 2018

Mon, 11/13/2017 - 12:00

Vidyo has made several announcements in the past couple of weeks. Time to see why the time is right for RTC across markets.

It has been a busy month for Vidyo. It has made two interesting announcements:

  1. The introduction of VP9 into its products
  2. Streamlining its product line

Vidyo has been known for its video routing technologies for many years, well before WebRTC came into the ring. It is great to see how far it has come in merging the two, along with how it is trying to fit its business model to the realities of WebRTC.

Vidyo, WebRTC, VP9 and SVC

How do you compete in a world where WebRTC is becoming the dominant media engine? Especially when the baseline implementation is dictated by what you get by default in the browser?

Vidyo has always had its own proprietary codec implementations. Ones that are optimized for SVC – Scalable Video Coding. Alex Eleftheriadis guest posted here last year with an explanation of SVC. To simplify, SVC gives two big advantages:

  1. Better error resiliency on poor network conditions
  2. Better support for multiparty and broadcast interactions

In many cases, you can get these things done without SVC and the end result would be good enough. But there are times when this extra kick to quality and optimization of how the network gets used makes all the difference.
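
To make the multiparty advantage more concrete: with SVC the sender encodes a single layered stream, and a media server can forward a different subset of layers to each receiver without transcoding. Here is a minimal sketch of that selection step in Python. The layer names, the bitrates and the `pick_layer` function are assumptions made up for illustration; they are not Vidyo's values or any browser's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One SVC layer; the bitrate is cumulative (this layer plus all below it)."""
    name: str
    cumulative_kbps: int


# Hypothetical layering, ordered from the base layer upward.
LAYERS = [
    Layer("base 180p", 150),
    Layer("mid 360p", 500),
    Layer("top 720p", 1500),
]


def pick_layer(available_kbps, layers=LAYERS):
    """Return the highest layer that fits the receiver's bandwidth estimate.

    The base layer is always forwarded, even when the estimate is below it,
    so the receiver keeps getting some video.
    """
    chosen = layers[0]
    for layer in layers[1:]:
        if layer.cumulative_kbps <= available_kbps:
            chosen = layer
    return chosen


# Three receivers with different (made-up) bandwidth estimates in kbps.
receivers = {"alice": 2000, "bob": 600, "carol": 100}
for name, kbps in receivers.items():
    print(name, "->", pick_layer(kbps).name)
```

The sender's work is the same in all three cases; only the forwarding decision changes per receiver, which is where the savings over sending or transcoding separate streams would come from.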

When it comes to current browser implementations of WebRTC, the only video codec that has any kind of SVC support is VP9 and that takes place in Chrome. To take advantage of SVC, there are only two routes a company can take:

  1. Rely on the browser implementation and exposure of VP9/SVC features, and then implement these capabilities in its application
  2. Build its own XXX/SVC implementation into a non-browser application

Option (1) is great, but it assumes that:

  • Browsers prioritize VP9/SVC over other features. The challenge here is that things like aligning with the upcoming WebRTC 1.0 spec are most likely a lot more important
  • VP9/SVC will be implemented soon, and controlling its SVC capabilities will be exposed to the developers via JS APIs or additional SDP parameters
  • The existence of media servers that support SVC and optimize and fine-tune well for it

The reality is that on Chrome, the VP9 implementation in WebRTC supports SVC on the decoder side, but it doesn’t yet support SVC on the encoder side.

Vidyo took the middle ground here, trying to enjoy both worlds: It always had its own SVC implementation in H.264 but allowed using WebRTC. Now, with its VP9/SVC implementation, it gets the freedom to improve video quality of its sessions in ways that others can’t.

If you use Vidyo.io today (and its other products in the near future), then Vidyo will try to prioritize the use of VP9 over other video codecs. And if some of the users in the session are making use of Vidyo’s SDKs instead of the native browser WebRTC implementation (i.e. joining from mobile or a desktop app), they will encode VP9 with SVC capabilities, and Chrome will be able to decode the bitstream – though the browser’s own encoded bitstream won’t be using SVC (at least not for now).
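
How Vidyo prioritizes VP9 internally isn't described here, but the general idea of preferring one codec can be sketched as a simple reordering of the offered codec list. The `prioritize_codec` helper and the sample list below are hypothetical; in a browser application, a list ordered this way is the kind of input you would hand to `RTCRtpTransceiver.setCodecPreferences()`.

```python
def prioritize_codec(codecs, preferred="video/VP9"):
    """Return the codecs with `preferred` entries first, otherwise keeping order."""
    wanted = preferred.lower()
    first = [c for c in codecs if c["mimeType"].lower() == wanted]
    rest = [c for c in codecs if c["mimeType"].lower() != wanted]
    return first + rest


# Hypothetical list of codecs offered by an endpoint.
offered = [
    {"mimeType": "video/VP8"},
    {"mimeType": "video/H264"},
    {"mimeType": "video/VP9"},
]
print([c["mimeType"] for c in prioritize_codec(offered)])
# prints ['video/VP9', 'video/VP8', 'video/H264']
```

If the preferred codec isn't offered at all, the list comes back unchanged, so the session can still fall back to whatever the endpoints have in common.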

This places Vidyo ahead of the pack in SVC support that plays well with WebRTC.

Vidyo’s Product Line

Here’s the gist of the new product line view from Vidyo:

Vidyo has taken the approach of offering a single technical infrastructure to host and run all of its products. This is the right move forward and an embrace of the cloud. In a way, Vidyo is continuing its shift from on premise deployments towards a Vidyo hosted and managed cloud platform.

Vidyo.io can be defined as CPaaS, a Communication Platform as a Service; while its VidyoCloud can be defined as UCaaS, Unified Communications as a Service.

Vidyo started life in the UC business, moving to the cloud and then adding an API platform. In many other cases, UC / UCaaS vendors take the approach of adding an API on top of their UCaaS product and then just calling it CPaaS. Vidyo decided on “separating” the two which feels to me as the better approach. It casts a wider net over the potential target market and the types of use cases that Vidyo can now cater for.

Earlier this year, Vidyo added VidyoEngage to this product line, its answer to video based contact centers.

The end result? Vidyo can now be used in the 3 biggest domains for visual communications:

  1. Unified Communications, with its VidyoCloud offering; providing a complete video communications platform
  2. Contact Centers, with VidyoEngage; providing a higher level abstraction of the call center model to its customers
  3. All the rest, through its Vidyo.io platform for developers

You can use Vidyo.io to build a UC or a CC application if that’s your need, or you can just pick up VidyoCloud or VidyoEngage to get there.

What’s Next?

The challenge for Vidyo will be in competing on 3 different fronts at the same time, and the threat of losing focus. I am guessing this is one of the reasons for this streamlining – it is meant to simplify the internal infrastructure that is used in these 3 products on the technical level.

Managing these separate businesses and keeping abreast in all 3 markets will be hard, but Vidyo is off to a good start here.

When it comes to Vidyo.io, the addition of VP9/SVC support positions Vidyo as the technology leader in its space with the ability to offer the best media quality. Its competitors will require

The post Vidyo and RTC in 2018 appeared first on BlogGeek.me.

What’s New With the Jitsi Videobridge?

Mon, 11/06/2017 - 12:00

Jitsi is getting a boost in its development.

When a developer-focused company gets acquired, it is time to start worrying.

Was the acquisition due to the technology, the customers or the business model?

Will the product continue to grow and flourish in the new regime?

Are the current signed agreements going to be renewed?

For open source, there are even more questions.

How will the community that was created around the open source project be treated?

Will existing business models around support, customization and dual licensing be maintained or will they be killed?

Two and a half years ago or so we had 3 popular open source media servers for WebRTC: Janus, Jitsi and Kurento.

Kurento got acquired by Twilio and Jitsi got acquired by Atlassian. Janus is still independent.

The progress made around Kurento since its acquisition was minimal at best. My guess is that Twilio is just too busy getting its own multiparty video ready for GA to focus on the Kurento open source project itself. It also hasn’t quite acquired everything that is Kurento – parts of it were left for the community and the original parent company Naevatec. The time that has passed is making a lot of the Kurento adopters frustrated and in search of alternatives.

Best time to join my WebRTC Course? Today. Office hours are starting next week, and there’s a great bonus ebook of how meet.jit.si built its scalable infrastructure.


So time to ask –

How did Jitsi fare since its acquisition?

Surprisingly well.

And it seems to be getting a lot more interesting lately.

In the past 4 months, I’ve been adding almost on a weekly basis a post about Jitsi into the WebRTC Weekly. The team there has been continuously churning out new features into the project.

Here’s what was announced on the Jitsi blog since June when it comes to new features:

June

July

August

September

October

There’s a mix of announcements here. They range from the addition of UX features to some deep optimizations of the media server itself. And part of it is due to GSoC, Google Summer of Code, a project started by Google some years ago where university students can join open source projects as interns. Jitsi has been part of this project for some time now.

UX Improvements

In a way, these are the least interesting features when it comes to a media server, but the ones that make it easier to use.

What Jitsi did in this round was tweak the UI to be a bit more modern and easier to use. For video layouts, there was a decision to better cater for 1:1 scenarios and to move video thumbnails from the bottom of the page to the right side of the page. This is also what Google decided to do once they shifted away from Hangouts to Meet. This makes for a more modern approach that sits well with the wider displays we have in recent years.

An audio only button was added to the UI. I am assuming it is just a shortcut to muting incoming and outgoing video. Having this UI element there makes it easier for users to operate (and easier for adopters of the Jitsi Videobridge to customize).

The interesting addition to me is the speaker times one.

I am intrigued in this case to know how easy it would be for an application to get that information from the Jitsi Videobridge – is this supported via the signaling offered by Jitsi towards the web client or is it also available as a backend-to-backend REST API? I can see this being used later in various ways, assuming the API is detailed enough and easy to use.
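
Whichever way the data is exposed, the computation behind speaker times is simple once you have a stream of "dominant speaker changed" events. The sketch below assumes such events arrive as `(timestamp, speaker)` pairs; that event shape and the `speaker_times` function are my own assumptions for illustration, not the Jitsi Videobridge API.

```python
from collections import defaultdict


def speaker_times(events, end_time):
    """Sum up how long each participant was the dominant speaker.

    events: list of (timestamp_seconds, speaker_id) sorted by time, where each
    entry marks who becomes the dominant speaker; None means nobody is talking.
    end_time: timestamp at which the conference (or the measurement) ends.
    """
    totals = defaultdict(float)
    # Pair every event with the start of the next one to get its duration.
    following = events[1:] + [(end_time, None)]
    for (start, speaker), (next_start, _) in zip(events, following):
        if speaker is not None:
            totals[speaker] += next_start - start
    return dict(totals)


# Hypothetical event log for a 40 second conference.
events = [(0.0, "alice"), (12.5, "bob"), (20.0, None), (25.0, "alice")]
print(speaker_times(events, end_time=40.0))
# prints {'alice': 27.5, 'bob': 7.5}
```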

Integrations

A WebRTC media server is but a part of what you need to run a full application. While central and important, there are other aspects to it. In recent months, Jitsi have added a few additional integrations, making it easier to use and connect to.

Three such integration points were announced:

1. Mobile SDK

Jitsi had mobile applications for quite some time. While nice, it is different than having a mobile SDK.

Something I’ve been telling media server vendors for a few years now, is that they should offer a mobile SDK as part of their media server. In WebRTC, it is an important part of their offering and one that is hard to ignore.

In the case of Jitsi, users had to use the mobile application as a reference and modify it to their heart’s content. The problem with this approach starts when you need to maintain the codebase in the long run. When a new version of the mobile app comes out – how do you know which parts are critical to upgrade (=without them the app will break with the new Jitsi Videobridge) and which ones are just UI fixes that you can ignore or skip since you’ve created your own UI experience already?

This is exactly why an SDK is such an important aspect of the solution:

With a mobile SDK, application developers can now just use the Jitsi Meet mobile application as a reference or even write something from scratch on top of the mobile SDK itself. Each is independently updated and maintained, making it easier to upgrade to newer releases.

2. Speech to text

Translation and NLP seem to be all the rage these days.

The way you get these things connected to WebRTC varies, but follows a similar approach for media servers:

You somehow collect the audio streams on the media server, mix and process them to the format supported by a 3rd party speech-to-text engine (Google Cloud speech-to-text seems quite popular these days), and once you get the resulting text, you do something with it.
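The mixing step in that pipeline can be sketched roughly as follows. This is only an illustration, not how Jitsi does it: it assumes every participant's audio has already been decoded to mono 16-bit PCM at the same sample rate, and the function name mix_pcm16 is made up for this example. Handing the mixed audio to a speech-to-text engine is left out, since that depends on the engine you pick.

```python
import array


def mix_pcm16(streams: list[bytes]) -> bytes:
    """Mix mono 16-bit PCM streams (same sample rate) into a single stream.

    Hypothetical helper: samples are summed per position and clipped to the
    16-bit range. Shorter streams are treated as silence once they end.
    """
    if not streams:
        return b""

    decoded = []
    for raw in streams:
        samples = array.array("h")  # signed 16-bit, native byte order
        samples.frombytes(raw)
        decoded.append(samples)

    length = max(len(samples) for samples in decoded)
    totals = [0] * length
    for samples in decoded:
        for index, value in enumerate(samples):
            totals[index] += value

    # Clip after summing so loud overlapping speakers saturate instead of wrapping.
    clipped = (max(-32768, min(32767, total)) for total in totals)
    return array.array("h", clipped).tobytes()
```

A real deployment would likely also resample to whatever rate the engine expects and stream the audio in small chunks instead of whole buffers.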

In the case of Jitsi, this was a GSoC project. Information about its current status can be found on the developer’s website – Nik Vaessen.

This probably requires some more improvements and polish, but offers a good starting point for developers.

I’d wager that in GSoC 2018, the Jitsi team is planning on adding translation and text-to-speech to it.

3. Telephony

Telephony was already available in Jitsi before. It is implemented via a Jigasi server (JItsi GAteway to SIP). Now Atlassian is eating its own dogfood, not only with its internal HipChat service but also in its free meet.jit.si showcase service.

In the case of meet.jit.si, the length of calls was limited to 2 minutes, which is enough for hunting down meeting participants who haven’t joined the session.

This serves two purposes:

  1. Show that Jigasi works and showcase its use
  2. Work out the kinks of getting this into the UX

Media Server Optimizations

At the heart of Jitsi is the media server itself. This is what developers aim for to begin with and the additions there are quite interesting.

The first one is that Jitsi now supports peer to peer media for 1:1 sessions – in effect – no media server. The reasoning is that many calls end up being 1:1, and it is far easier and more cost effective to share media directly between the participants.

In the past, supporting such a thing with Jitsi required running a separate signaling mechanism for 1:1 sessions and then, once the need arose to grow, shifting and renegotiating everything in front of Jitsi. It was tedious at best.
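The decision itself is simple enough to sketch. The names below (Topology, choose_topology, needs_switch) are hypothetical and not taken from Jitsi; the sketch only assumes that two participants can talk directly and that anything larger goes through the media server.

```python
from enum import Enum


class Topology(Enum):
    PEER_TO_PEER = "p2p"          # media flows directly between the two browsers
    MEDIA_SERVER = "media-server"  # media is routed through the videobridge


def choose_topology(participant_count: int) -> Topology:
    """Pick where media should flow for a session of the given size."""
    if participant_count < 1:
        raise ValueError("a session needs at least one participant")
    if participant_count <= 2:
        return Topology.PEER_TO_PEER
    return Topology.MEDIA_SERVER


def needs_switch(old_count: int, new_count: int) -> bool:
    """True when a join or leave moves the session across the 1:1 boundary."""
    return choose_topology(old_count) != choose_topology(new_count)
```

The tedious part mentioned above is everything that has to happen when needs_switch returns True: media has to be renegotiated mid-call without the users noticing.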

The other work effort is way more interesting.

Bandwidth estimation is nasty. Network conditions are varying and dynamic. You can start a session with 2Mbps and have it considerably drop throughout the session, coming back up again and changing characteristics.

To get that right, WebRTC (and any other VoIP alternative) needs to use bandwidth estimation. This is a process where the device tries to understand how much bandwidth is available to it at any given point in time. The algorithm can be naive, smart, complex, whatever. And a lot of the perceived quality of a call relies on the quality of the algorithm used for bandwidth estimation.
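To make "naive" concrete, here is a minimal loss-based sketch. It is an assumption-laden illustration and not what WebRTC or Jitsi actually ship: the function name update_estimate, the 2% and 10% loss thresholds, the 5% probe step and the bitrate bounds are all picked for the example.

```python
def update_estimate(
    current_bps: float,
    loss_fraction: float,
    min_bps: float = 30_000,
    max_bps: float = 2_000_000,
) -> float:
    """Return a new bitrate estimate from the last reported packet loss.

    Illustrative rule only:
    - heavy loss (>10%): back off in proportion to the loss
    - light loss (<2%): probe upward by 5%
    - anything in between: hold the current estimate
    """
    if not 0.0 <= loss_fraction <= 1.0:
        raise ValueError("loss_fraction must be between 0 and 1")

    if loss_fraction > 0.10:
        new_bps = current_bps * (1.0 - 0.5 * loss_fraction)
    elif loss_fraction < 0.02:
        new_bps = current_bps * 1.05
    else:
        new_bps = current_bps

    # Keep the estimate inside the range the application can actually use.
    return max(min_bps, min(max_bps, new_bps))
```

Calling this once per feedback report gives the familiar sawtooth: the estimate creeps up while the network is clean and drops quickly when loss spikes. Smarter algorithms also look at delay, not just loss.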

WebRTC has its own built in bandwidth estimation mechanism. It works. But you need your own algorithm in a media server. Jitsi has its algorithm, and it is a work in progress.

The Jitsi team are now taking it to the next level, trying to not only understand availability of bandwidth but also what the best course of action should be – it is trying to discern if it is better to reduce bitrate or add forward error correction instead.

It also does that with the coolest set of tech tools available to us today – TensorFlow and Machine Learning.
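For contrast, this is roughly what a hand-written rule for the "reduce bitrate or add FEC" decision could look like. It is purely an assumption for illustration: the names NetworkSample and choose_action, the thresholds, and the idea of using RTT inflation as a congestion signal are mine, not Jitsi's. A learned model is meant to replace exactly this kind of guesswork.

```python
from dataclasses import dataclass


@dataclass
class NetworkSample:
    loss_fraction: float    # packets lost / packets sent in the last interval
    rtt_ms: float           # current round trip time
    baseline_rtt_ms: float  # round trip time seen when the link was idle


def choose_action(
    sample: NetworkSample,
    loss_threshold: float = 0.03,
    rtt_inflation: float = 1.5,
) -> str:
    """Return "keep", "reduce_bitrate" or "add_fec" for one sample.

    Hypothetical heuristic: loss together with a growing RTT looks like
    congestion, so sending less should help. Loss with a stable RTT looks
    like a lossy link, where redundancy (FEC) may work better.
    """
    if sample.loss_fraction < loss_threshold:
        return "keep"
    congested = sample.rtt_ms > sample.baseline_rtt_ms * rtt_inflation
    return "reduce_bitrate" if congested else "add_fec"
```

The weakness is obvious: the thresholds are guesses, and real networks don't split neatly into two cases.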

Here’s what Emil Ivov shared during our Kranky Geek event last month:

Where to Next?

Looking for an open source alternative for your media server?

The most popular approaches out there for you are Janus and Jitsi.

Which one to pick out of the two seems to be based on personal taste more than anything else.

Best time to join my WebRTC Course? Today. Office hours are starting next week, and there’s a great bonus ebook of how meet.jit.si built its scalable infrastructure.

Enroll now

 

The post What’s New With the Jitsi Videobridge? appeared first on BlogGeek.me.

Kranky Geek 2017: What Does the Pulse of WebRTC Tells Us?

Mon, 10/30/2017 - 12:00

Kranky Geek 2017 has been a roller coaster event for me. Time to discuss what I learned about WebRTC last week.

Yap. We had a full room.

Well… More like 2 full rooms.

When talking to Lawrence some time in the afternoon, he joked with me, saying that apparently we have a problem – the overflow room is overflowing.

The best problem an event organizer could ever ask for.

If you are looking for the event videos, then they are already on YouTube.

I want to share some of my thoughts prior to and during the event. And if possible, try and shed some light on where we’re headed from here.

Want to keep abreast of the WebRTC ecosystem? Join the WebRTC Weekly

Challenges Abound

Putting up an event is a stressful undertaking. There are a lot of aspects that need to be covered, with this constant worry that you’ll end up forgetting something or that something will screw you over. Both are guaranteed to happen no matter how much planning and effort you put into it.

This time, our challenges started early on. It was somewhat harder than usual to decide how to price the event to make it worthwhile doing. Kranky Geek events are expensive to run. From the beginning, we’ve aimed for events that are free to attend (I consider a $10 admission fee that gets donated as a free to attend event). This left covering our expenses, and making some revenue out of it, as something that relies on sponsors.

Kranky Geek is all about quality content. High quality content. Top notch. The best you can find.

Which means that we select the topics we want. We then hunt for the speakers that fit into that. And we work with our speakers to make them shine.

This process doesn’t always work with sponsors… it is sometimes hard to explain how we operate and why. And at times, sponsors can focus on hard selling their warez, which doesn’t fit into the Kranky Geek spirit (and definitely not to our audience).

This time, it took us slightly longer than usual to get the sponsors onboard and to be certain that we can pull off the event.

It also caused some more stress than usual among us partners. Kranky Geek is a joint effort of 3 people: Chris Koehncke (aka Chris Kranky), Chad Hart (the living spirit behind webrtcHacks) and me.

We don’t always agree, but somehow we fit well together, each one covering the other one’s shortcomings. We make a good team for getting these events done. I hope.

Why am I sharing all this?

To set the stage for what comes next for Kranky Geek, but also to explain the amount of work, effort, time, stress, pain and love that has been put into the Kranky Geek events in general and into this one in particular.

It hasn’t been all happy, but I am proud of the result and happy that we did this.

We Had a Fire Drill!

During the day, we’ve had our share of technical challenges.

The projectors in the main room didn’t work at the beginning (that was before we started the day), and then a few other issues cropped up on us.

Doing this event in Google’s San Francisco office meant we had the best A/V team in the world on site to help us. The crew Google is working with there is top notch. The best I worked with. They made the problems seem easy to solve.

We had this to deal with…

Great @KrankyGeek schedule at #webrtclive this year includes exercise and fresh air, with @Google providing simulated earthquakes & flames! pic.twitter.com/r0QHATG5Wj

— Lawrence Byrd (@LawrenceByrd) October 27, 2017

A week before the event we were told we would have a fire drill in the building on the day of the event. The time kept moving around, settling at 2pm. We scheduled our breaks and sessions around it, with a huge worry that people would leave once the fire drill started.

(that’s Kranky going down the staircase during the drill)

We decided to embrace the fire drill and tried to celebrate it with our audience, and I hope we succeeded. Back from the fire drill, we had almost everyone back.

We should probably make fire drills an integral part of Kranky Geek events.

Time to stop rambling.

The Event Recordings

The recordings are available online.

You can find them here.

We’ve had to reorder the sessions from our original agenda due to constraints we had with some of our speakers – late arrivals and early exits.

So I’ve reordered the sessions here. What follows are the 13 sessions we had, in the original order we wanted (not that it really mattered).

I added some of my commentary on what I liked and learned in each of the sessions.

Kranky Geek Team

Nothing to say here really, besides the fact that I envy Chad’s ability to create slides and present them.

Facebook

This is the first time we had Facebook join us and share a story at Kranky Geek. We had the pleasure of having Li-Tal Mashiach, an Engineering Manager at Facebook, do the talk.

The numbers there are impressive as hell. 400 million monthly active users doing voice and video calls on Facebook Messenger using WebRTC. 400 million.

The next one who asks me if WebRTC is being adopted – I’ll just say 400 million. And then he’ll complain that this isn’t an enterprise application…

Anyways, what I found really interesting is how Facebook is dealing with optimization. The effort placed in the decision making process around video codecs, bitrates, etc.

WebRTC comes in a neat open source package that anyone can use. But it needs a lot more love and care when it comes to making it work at scale – just like any other technology.

TokBox

Badri Rajasekar, CTO of TokBox, shared an experiment that TokBox has been running recently. It was about using head tracking technology to improve video quality.

The idea behind it is that you can scale up a region of interest in an image at the expense of other regions, which ends up putting more of the encoded pixels into the region of interest.

The great thing here is that you do it without touching the encoder or the decoder. Why do we want that? Because the more generic you can make an encoder, the easier it is to implement it in hardware.

VoiceBase

Walter Bachtiger, Co-founder and CEO of VoiceBase talked about NLP (Natural Language Processing), and how great insights can be derived out of voice.

It was a bit creepy, understanding how accurate machine learning can be at scale in a contact center.

The part I liked best in this one was how a contact center can decide within 30 seconds how likely you are to buy – if only the people who call me would have used it… it would have saved me a lot of time as a customer.

Atlassian

Emil Ivov, Chief Video Architect at Atlassian, and a serial speaker at Kranky Geek gave a very interesting talk about machine learning and bandwidth estimation.

The team at Jitsi now use TensorFlow to sift through the metadata they have of calls to try and understand how the network behaves and what strategy would work best in improving network quality.

It seems like reducing bitrate doesn’t always have the necessary effect on things, and FEC might end up working better.

Vidyo

Roi Sasson, CTO of Vidyo, talked about scale.

This wasn’t about how to scale a service, but rather how to scale a single call. Want 10 people on a call? You may not need to worry, but if you go to 100 or 1,000 – you need to think differently about it.

Which is where taking SFUs and cascading them, both within a single data center and geographically, starts making a lot of sense.
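Some back-of-the-envelope arithmetic shows why. The sketch below counts how many copies of a single speaker's stream each SFU has to send. It assumes a simple layout where the speaker's SFU relays the stream once to every other SFU, and each SFU then serves its own local participants. The function names are made up for this example, and this is not a description of Vidyo's architecture.

```python
def single_sfu_fanout(participants: int) -> int:
    """Copies of one speaker's stream that a lone SFU must send."""
    return max(participants - 1, 0)


def cascaded_fanout(participants_per_sfu: list[int], speaker_sfu: int) -> list[int]:
    """Copies of one speaker's stream sent by each SFU in a cascade.

    Assumes the speaker's SFU sends one relay copy to every other SFU.
    """
    sfu_count = len(participants_per_sfu)
    fanout = []
    for index, local in enumerate(participants_per_sfu):
        if index == speaker_sfu:
            # Local listeners (everyone but the speaker) plus one relay per remote SFU.
            fanout.append(max(local - 1, 0) + (sfu_count - 1))
        else:
            # Remote SFUs only serve their own local participants.
            fanout.append(local)
    return fanout
```

With 1,000 participants, a lone SFU sends 999 copies of the speaker's stream. Spread the same call over 4 SFUs of 250 participants each and the busiest one sends 252.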

WebKit

For the first time, we had a representative from Safari. We got to hear what Apple’s default browser does with WebRTC, and how, from Youenn Fablet, a contributor to WebKit.

It was great to have WebKit join us at Kranky Geek, and to hear their fresh thinking about privacy in WebRTC and how they’ve taken care of that in Safari.

Peer5

Hadar Weiss, Co-founder and CEO of Peer5 talked about P2P CDN and using the WebRTC data channel.

We never did have a talk focused on the data channel at Kranky Geek, so this was a first.

I found really interesting how Peer5 does things differently than the rest of the WebRTC community. Mostly because they care less about call setup times and TURN connectivity and a lot more about throughput.

Hadar showed a few techniques I really liked, like the simple compression of SDP messages (which starts to make sense when you process and send millions of these a day).
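I don't have the details of Peer5's technique at hand, so take the following as a generic sketch of the idea using plain DEFLATE from Python's zlib module. The sample SDP is a shortened, made-up example, and compress_sdp / decompress_sdp are hypothetical helper names.

```python
import zlib


def compress_sdp(sdp: str) -> bytes:
    """Compress an SDP blob before sending it over the signaling channel."""
    return zlib.compress(sdp.encode("utf-8"), 9)


def decompress_sdp(blob: bytes) -> str:
    """Restore the original SDP text on the receiving side."""
    return zlib.decompress(blob).decode("utf-8")


# Made-up, abbreviated SDP for illustration only.
SAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
    "s=-",
    "t=0 0",
    "a=group:BUNDLE data",
    "m=application 9 DTLS/SCTP 5000",
    "c=IN IP4 0.0.0.0",
    "a=ice-ufrag:abcd",
    "a=ice-pwd:abcdefghijklmnopqrstuvwx",
    "a=setup:actpass",
    "a=mid:data",
    "a=sctpmap:5000 webrtc-datachannel 1024",
    "a=candidate:1 1 udp 2113937151 192.168.1.10 50001 typ host",
    "a=candidate:2 1 udp 2113937151 192.168.1.10 50002 typ host",
    "a=candidate:3 1 udp 2113937151 192.168.1.10 50003 typ host",
    "a=candidate:4 1 udp 2113937151 192.168.1.10 50004 typ host",
]) + "\r\n"
```

SDP is repetitive text, so it compresses well. Saving a few hundred bytes per message is nothing for a single call, but it adds up when you send millions of them a day.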

Slack

From Slack we had Lynsey Haynes and Andrew MacDonald.

Two things interesting about this session:

  1. The shift they made from a custom WebRTC implementation towards the use of Electron with a vanilla WebRTC implementation in Chromium – all due to maintenance costs
  2. Switching from a custom Janus media server towards a self developed one written in Elixir

During the Q&A (which didn’t make it to the recording), Slack were asked about their support of Firefox. Andrew answered that support for Firefox is unlikely to come due to the shift of Slack towards focusing on fewer browsers and on their Electron-based desktop application. I see this thought process taking place elsewhere as well – it doesn’t bode well for the future of browsers.

Twilio

Rob Brazier from Twilio showed an AR (Augmented Reality) use case.

I’ve never been a fan of these acronyms such as IOT, AR, VR. Marrying them with WebRTC always seemed to me somewhat forced.

That said, Rob did a great job in making a case for AR in communication interactions. I am sure more exist.

Frozen Mountain

Anton Venema, CTO of Frozen Mountain was there to give an interesting demo.

He cobbled together text to speech, translation and speech to text on top of their media server platform, doing a demo of live language translation taking place in a WebRTC session.

Google

Niklas Blum, Huib Kleinhout and Justin Uberti from Google shared the progress made in WebRTC towards WebRTC 1.0.

This one had a lot of details for developers about things they need to know with the latest versions of Chrome and what to prepare for moving forward.

Appear.in

This year’s closing session was given by Philipp Hancke of appear.in. He’s a repeat speaker at Kranky Geek.

Philipp delved into NSFW (Not Safe For Work) related technologies, experimenting with recognizing such content and deciding what to do with it.

It was an interesting mix of technologies, human behavior and compromises.

Our Event Sponsors

Did I already say that Kranky Geek relies on its sponsors?

This year we had 6 of them:

I’d like to again thank our sponsors.

Diversity and Kranky Geek

For the first time, we had female speakers. Great female speakers.

I want more of this.

If you are a woman, or know of a woman. One that has technical WebRTC chops. And a desire to share your experiences. Contact me…

What’s Next for Kranky Geek?

We weren’t sure if we would have another Kranky Geek event. But due to the success of the one we just had, there’s a high probability that we will do another one next year.

So…

Get ready for Kranky Geek 2018.

With more great content, and maybe – a fire drill.

And while at it, if you want to increase your visibility in the market, know that sponsoring a Kranky Geek event is a great way to go about it. So put some budget aside for it. Q3/Q4 2018 is when it will take place.

Want to keep abreast of the WebRTC ecosystem? Join the WebRTC Weekly

The post Kranky Geek 2017: What Does the Pulse of WebRTC Tells Us? appeared first on BlogGeek.me.

6 Ways Vendors Sell WebRTC Developer Tools

Mon, 10/23/2017 - 12:30

How can you make a living from WebRTC? You offer WebRTC developer tools.

One of the interesting questions is around monetizing WebRTC. The truth is, it is hard to monetize a concept, or a piece of technology. Kranky said it well over 3 years ago – WebRTC Market Size (is 0).

What does this mean? That you can either make money by selling tools to developers who need WebRTC. Or you make money by offering a service that makes use of WebRTC, but we can now debate if that’s WebRTC or not.

Anything that isn’t WebRTC developer tools falls into other market niches – healthcare, education, gaming, … all these compete and create business far from the WebRTC core itself.

Want to learn who’s offering WebRTC Developer Tools? Check out my WebRTC Developer Tools Landscape infographic.

WebRTC developer tools though – that’s where a small WebRTC market niche exists. And there are several ways to make money in this market. Here are 6 different types of services you can offer to sell WebRTC to developers – some vendors will offer multiple services.

#1 – Sell a Managed Service (SaaS)

You can sell a managed service.

Find something that developers need.

Create a service that offers that solution.

Sell it in a XaaS model.

  • We do it at testRTC for testing and monitoring WebRTC services.
  • Callstats.io does that for monitoring.
  • XirSys and a few others offer a managed service for NAT Traversal (=someone else hosts the TURN and STUN servers that your application uses)
  • Mobilinq and others offer a customized hosted offering
  • And then there are CPaaS vendors. Many of them offering WebRTC as well (check out this report on WebRTC CPaaS)

This market is rather challenging, as the name of the game is scale, and getting there is hard. For some reason, this is also where most customers end up penny pinching.

#2 – License Software

You can develop a product that others need and offer it under a commercial license.

There are those who want or need to run their own service, not relying on managed services. And at times, they are happy to pay for a commercial license that comes with an SLA and someone you can shout at and threaten.

The best thing about most commercially licensed software is that the people behind it work on that software. And once they have paying customers, they are bound by contracts to support and maintain it, usually for long periods of time.

In this category, you can find companies such as Dialogic, Frozen Mountain and SwitchRTC.

#3 – Support and Customization of Open Source

Open Source doesn’t mean free.

People need to be able to make money out of their work – even if they are idealists who are just contributing to the community as a whole.

The way to go about doing that is by writing software that then gets distributed freely under an open source license. This allows anyone to take that software, use it, modify it and even try and contribute back to it and improve upon it.

For popular open source projects, this creates a nice feedback loop that everyone enjoys. For the most obscure projects, it remains the work of a single maintainer.

So how can someone make a living out of open source? By offering one of three different alternatives (usually a mix of them):

  1. Support contracts – if you’re the owner and main maintainer of the open source, then you can sell support contracts. Those who use your open source project may have questions, and giving them priority support can be an income source. For companies, having support available on the open source projects they use can be an important aspect of choosing one open source project over another
  2. Customization work – companies who adopt open source projects sometimes need modifications to these projects. They can attempt to do it on their own, or they can just have the main maintainer of the project do it for them at a price
  3. Commercial license – LGPL, GPL, AGPL and other open source licenses are often considered as cancerous licenses for commercial products. The reason for that is that they “contaminate” the code written around them forcing their license terms on that code as well. There are other open source licenses that are more tolerable to companies (more about it here). Which is why in many cases, a company would prefer paying to get a commercial license instead of using the free open source licenses of a project. Dual licensing is another way of making a living

Jitsi, for example, was distributed under an LGPL license. This allowed the team behind it to make a living through all 3 approaches: support contracts, customization work and offering commercial licenses. After its acquisition by Atlassian, it switched from LGPL to a more lenient Apache license. The main reason? Atlassian had other objectives for Jitsi and they weren’t about deriving direct monetary value from it. The Jitsi team no longer offers paid support or customization – it doesn’t mean they don’t support the code base, it just means that you can’t pay them for priority support.

Kurento got acquired by Twilio. Naevatec, the company behind Kurento made most of its direct revenue from Kurento by offering support and customization work. After the acquisition, Naevatec was left without its engineers that were experienced with Kurento and has since been struggling to maintain the Kurento codebase.

Janus is still an open source project. The company behind it offers support and customization work if someone needs it.

To be able to make a living out of an open source project, it needs to be one that is mission critical to the companies who use it, and it needs to be popular enough. If you plan on taking that route, remember that maintaining such a project can make you proud at the number of companies that end up adopting it, but may well frustrate you if you look at how many of these companies won’t be willing to pay for it at all.

#4 – Conduct Analysis

This is something I wasn’t aware of up until several months ago.

There’s this interesting market niche in WebRTC, and I am not sure how prevalent it is with other technologies.

It consists of companies and entrepreneurs who set out to build a product without enough knowledge and experience in WebRTC. They try to learn as they go along, floundering while at it. There are many reasons why this happens:

  • They are doing it with an internal team that doesn’t have the skill set
  • They outsourced the project to an outsourcing vendor who knows nothing about WebRTC, but knows how to build a mobile app, a website or even a VoIP service
  • They outsourced the project but didn’t scope it properly, getting a product that isn’t what they really wanted – and then blaming the outsourcing company for it

Need to beef up your WebRTC experience? Enroll your developers to the Advanced WebRTC Architecture course.

Enroll to the WebRTC course

When this happens, companies start looking for alternatives. And there really are only 4 things to do here:

  1. Close shop and go home. Consider this a failure and just move on to other projects
  2. Reboot. Look at all of it as sunk costs and start from scratch
  3. Fix. Get your team or pay the outsourcing vendor (or other outsourcing vendors) to continue working on the project until it is working
  4. Salvage. Get an expert to look at the existing codebase, analyse it, offer his advice and even let him do the fixing

Salvage is somewhat different from fixing, as it focuses on analyzing the whole architecture along with the implementation instead of just diving right in and continuing with the same approach that brought you to where you are in the first place.

And there are companies who offer such packaged services. Look at Blacc Spot Media and WebRTC.ventures for that if this is what you’re after.

#5 – Outsource Your R&D Skills

You’re good with coding and know WebRTC?

Great.

Outsource it to others.

Many of the people who contact me are after developers with WebRTC experience. Some of them want to have these developers work as freelancers. Others want to outsource to a company. Others still are looking to recruit skilled workers, but understand they may end up outsourcing anyway.

There are quite a few companies and individuals who offer their outsourcing services around WebRTC.

The known freelancers who do WebRTC work are usually fully booked. It is hard to get their attention and time for new projects, but it is worth a try.

The outsourcing companies come in different shapes and sizes. Many don’t have the relevant skillset. Some will place inexperienced developers on your project. Some will do the best work for you.

Quality here varies greatly, so you should take the time to pick the right outsourcing vendor to work with.

In many cases, my role in such projects is to assist in deciding on the exact requirements, selecting the outsourcing vendor and “translating” the requirements between the company and the outsourcing vendor.

#6 – Consult

There are those who simply offer consulting (I do that by the way).

Their role is to assist in the thought processes – be it the initial phases of helping in fleshing out the product’s roadmap and differentiation, assisting in the competitive analysis, in writing down the RFPs (or the response to an RFP), selecting vendors, suggesting architecture, etc.

Many of the experienced outsourcing vendors will usually add a consulting component into their service, and their customers will usually benefit from that consulting.

What’s Next?

Looking to start a WebRTC project? Trying to understand how to get that done? Know that the market is dynamic and always changes.

Which is why I am in the process of updating two resources on my site:

  1. Choosing a WebRTC API Platform report
    1. If you think a vendor that isn’t in the report needs to be added to it – tell me
    2. If you plan on purchasing this report, then the best time would be from now until the publication of the update (see below)
  2. WebRTC Developer Tools Landscape will be updated soon – if you miss vendors here – tell me

My WebRTC API Report is getting an update and you’re getting a discount. From now, until the report gets updated during December, there’s a 20% discount. The discount will include the upcoming update (and a full year of updates).

Get your discounted report

 

The post 6 Ways Vendors Sell WebRTC Developer Tools appeared first on BlogGeek.me.

Do We Need WebRTC Events?

Mon, 10/16/2017 - 12:00

Yes. We do need WebRTC events. Which is why you should join us at Kranky Geek next week.

I’ve been asked a few times in the past several months by people about events to go to.

Should I go to that event? Will it help me with my current WebRTC project?

What event should I go to, considering I am in need of WebRTC technology?

Where can I travel to learn about WebRTC? Is there a specific event?

Which event will guide me towards what I need with WebRTC? Have me understand the market dynamics? Be a place to mingle with the industry?

Register for a Kranky Geek AMA webinar – a week ahead of our event, Chad Hart will be joining me to discuss WebRTC statistics and what to expect from this year’s Kranky Geek event

Register to the pre-event AMA webinar

The problem with events and WebRTC

If you’re in telecom, then this is how you see WebRTC:

For telecom, WebRTC is just a piece of telecom. An evolution of it. Some way of getting the telecom and VoIP infrastructure into a web browser.

If you’re in web development, then this is how you see WebRTC:

For web developers, WebRTC seems just like another piece of the HTML5 technology stack. You learn a few JS APIs. Maybe some nifty CSS and a few HTML5 tags and you’re done.

And this is how I see WebRTC:

Now, most WebRTC related events so far have been initiated by people in the telecom industry. The end result is usually a very narrow prism of what WebRTC is and what it is capable of achieving. And the side tracks done in the web related events? Most of them end up explaining what WebRTC is, not going nearly deep enough.

The end result has been unsatisfying. At least for me.

This was one of the reasons I started Kranky Geek along with the help of Chris Koehncke some 4 years ago. We’ve since had Chad Hart join.

4 years into it, the question starts to crop up – do we still need WebRTC events?

Why do we still need WebRTC events?

Is there still room for an event with a WebRTC centric theme to it?

Shouldn’t WebRTC just be wrapped into all the telecom, communications and web events out there and be done with it?

I mean, we’ve got enough meetup groups around the world for this technology, but who wants to attend a longer event on WebRTC?

I think it boils down to that illustration up there – the one where WebRTC is smack in the middle of VoIP (telecom) and the web (internet). In a way, we’re still figuring out what that means exactly. How does the infrastructure of such a thing need to be designed; how do you scale it; what kind of monitoring mechanisms do you need to have in place; what are the team sizes, resources and time needed to get something from a proof of concept to production.

WebRTC might not be new, but the fact that it relies on a mix of technologies and disciplines makes for a rather complex and interesting ecosystem.

Join us at Kranky Geek SF 2017

Our next Kranky Geek event takes place on October 27 in San Francisco.

Kranky Geek is about WebRTC developers. Our role is to educate and share the experience coming from developers to developers.

The theme we’ve selected this time is twofold: implementation and beyond RTC.

  1. Implementation: Production ready systems. Those that have battle scars and live to tell their story. We have companies who’ve been running WebRTC in production, at scale for quite some time, and now they are here to explain what they are doing – the challenges they faced and the solutions they came up with
  2. Beyond RTC: You’ve probably heard a word or two about VR, AR, NLP, AI – acronyms that seem to be capturing the news and the imagination lately. We’ve decided to bring in a few experts in this field to explain how that fits into the story of WebRTC

We reached out to Youenn Fablet, who works on the WebKit WebRTC implementation. He will be speaking about iOS and Safari support of WebRTC.

Google will talk about their progress and roadmap of WebRTC.

Talking about implementations, we will have Atlassian, Facebook, Peer5, Slack and Vidyo, each talking about different aspects of implementations and scaling.

Affectiva, TokBox, Twilio and VoiceBase will cover issues beyond RTC.

For our end-of-day session, we will have a repeat speaker at Kranky Geek – Philipp Hancke from appear.in – working his way around NSFW. Knowing Philipp (and seeing his draft slides), you definitely want to stick around for this one.

There’s a token admission fee in place, to control headcount and showups (free events tend to be under-attended, and we’re shifting away from that). The way this event ends up being funded is by our sponsors, who make this thing happen at all. They are part of our speakers and play an important role in the event itself.

This time, we’ve got Frozen Mountain, Google, Tokbox, Twilio, Vidyo and VoiceBase as our sponsors.

See you at Kranky Geek.

 

The post Do We Need WebRTC Events? appeared first on BlogGeek.me.

Thoughts about Twilio Studio and the Future of CPaaS

Mon, 10/09/2017 - 12:00

How does Twilio Studio fit into Twilio’s Ask Your Developer campaign?

Last month I participated in Twilio’s Signal event that took place in London. I was invited to speak there on test automation in WebRTC. You can watch my video session on YouTube. That isn’t the point of this article though.

Signal is where Twilio announces most of its major new releases. Last time, earlier this year, it was all about the engagement cloud – a restructuring of how Twilio explains its services – and a migration from a single channel world into an omnichannel one. I’ve written at length about it in Is Twilio Redefining CPaaS (hint: it is). I wrote there:

Twilio has introduced a new paradigm for the way it is layering its product offerings.

In the process, it repositioned all of its higher level APIs as the Engagement Cloud. It stitched these APIs to use its lower Programmable Communications APIs, adding business logic and best practices. And it is now looking into machine learning as well.

It is a powerful package with nothing comparable on the market.

Twilio are the best of suite approach of CPaaS – offering the largest breadth of support across this space. And it is making sure to offer powerful building blocks to make developers think twice before going for an alternative.

I think that at Signal London 2017, they outdid that with the introduction of Twilio Studio.

Trying to figure out the best approach for developing your application? Check out this free WebRTC Development Paths Matrix to understand your alternatives

Get your WebRTC Development Paths Matrix

Before We Begin

You might want to take the time to watch Signal London 2017 keynote by Jeff Lawson.

A large part of the London keynote was a rehash of what was said in San Francisco earlier this year. It was about the shift towards omnichannel and the engagement cloud. The words that stuck with me when explaining the engagement cloud were BEST PRACTICES, BUSINESS PROCESSES, REINVENT THE WHEEL (=what not to do).

In this article, I’d like to touch on a few main themes and approaches that Twilio is taking, which are shaping its vision and execution at the moment.

“Ask Your Developer” is The Wrong Approach

I’ll start with where I think Twilio is missing the mark.

Ask Your Developer took center stage. Jeff Lawson wanted companies and the business people inside them to go ask their developers what they can do. How they can improve the business.

It gives us developers a great feeling of being in control. Of being valued. But for the most part, and for most developers, this is probably the wrong approach.

Most developers would be happy to work by spec.

The few that aren’t will be promoted quite fast to system architects, managerial roles in development or god forbid to product managers. Why? Because they can see the big picture.

They are the people that get asked. Or the people that answer without asking.

We should be asking our developers, but it should not be our strategy.

Which is where the miss came.

Later on in the keynote, Twilio announced Twilio Studio. A tool that takes some of that control from developers, putting it in the hands of decision makers.

You no longer have to ask your developer. You can work with him. Together.

More about this later.

The Code that Counts

Some 20 minutes into the keynote, Jeff Lawson invited Patrick Malatack. He started with this:

It was core to how Twilio approaches its customers. Patrick explained that this is the most important code – it is the code that counts.

The idea being that your life as a developer should be made easy, so Twilio is adding not only APIs that serve the functions you need, but also a runtime behind it to facilitate rapid development and deployment – from helper libraries, to logging and debugging facilities, the new Twilio Functions, etc.

I think the code that counts here is developers focusing on their specific business problem – abstracting everything else.

It ended up being a concept of what Twilio Runtime is:

The yellow parts in that screenshot above are the newest announcements. The rest were there earlier. Twilio isn’t only adding more features to its platform – it is beefing up its runtime, making it another competitive advantage over many others when it comes to pure SMS and voice capabilities.

The message here is an interesting one, but it wasn’t polished enough. I think this is where we will see more in future Signal events from Twilio.

Twilio Studio

At about 1:24:00 of the keynote, Jeff Lawson introduces Twilio Studio.

It starts by explaining that building is fun but maintaining isn’t (he is correct).

The goal, according to Jeff Lawson, is to massively accelerate the roadmaps of Twilio’s customers.

I think it is a lot more than that.

Because this is so new and fresh, still in developer preview (and something I’ve started playing with a bit), it is hard to write this in an ordered fashion. Which means I’ll be going for a bulleted list instead:

  • This is a really cool tool. From the demos and the time I’ve spent with Twilio Studio, it is really powerful
  • Getting UI tools that handle state machines for developers is not easy. The Twilio Studio experience has a nice feel to it – I liked the experience
  • Twilio Studio reminds me of Zapier. But where Zapier has a 1D linear approach to tooling and integration, Studio is its big brother, offering 2D visualization to communication state machines
  • There’s no support for the visible communication parts in Twilio Studio. Yet
    • You can send and receive programmable SMS and voice with it
    • A bit of messaging as well
    • But you can’t connect it to the voice in your SDK or manage a video chat room with it
    • This will need to be added later at some point to complete the puzzle
  • Is Twilio Studio the centerpoint of a customer’s flow or a corner piece of it?
    • Twilio Studio can be used to express your whole business process, fleshing out the important parts and branching away to your integrations
    • It can also be used to solve a minor piece of your bigger puzzle
    • It is up to you to decide how you use it
  • In the hands of an experienced architect, Twilio Studio will offer super powers
    • There are many ways to define and template what you need
    • Some approaches will work better, offering more flexibility
    • The focus should be around inclusion of as many stakeholders in the company as possible – being able to show them and interact with them by looking at a Twilio Studio Flow
  • Here’s a question: Is Twilio Studio a tool for Developers? Designers? Implementers? Analysts?
    • Twilio Studio today is fit for developers, but it won’t stay that way long
    • It can be used by implementers that know a bit about code but aren’t developers
    • It can be used to open a discussion between a developer and a business analyst
    • This is a way of expanding the target market within a Twilio customer from solely one of developers towards a larger audience. The motto is no longer “Ask your developer”
  • Twilio Studio can be enhanced
    • It is a great first step, but the next ones are a lot more interesting
    • They are also a lot more threatening to competitors
    • If Twilio succeeds here, it will dominate this space with the companies that matter the most
  • Twilio Studio is the ultimate vendor lock-in
    • Enterprises will adopt it, due to its many benefits
    • They will find it hard to switch because of these benefits
    • Enterprises won’t want to switch… Twilio Studio will be too valuable. Too transformative

This tool can do to contact centers what marketing automation is doing to email newsletters. If I were a contact center vendor… I’d consider Twilio Studio my biggest threat moving forward.

Pricing

There were 3 price points for Studio:

  1. FREE – up to 1,000 Engagements. To get developers hooked up to this tool and make them not bother with actually “developing” using “code”. It is also a great way of getting developers to NOT look at other competing vendors
  2. The minimal plan, at +$100/month price point. Covers up to 20,000 Engagements. This is probably where most small companies will be “living”, which is just fine
  3. The enterprise, unlimited plan, at $10,000/month or more. Expensive, but it depends how much traffic you’re handling

Then there’s the question of what an Engagement is exactly. Is it a flow of a single event in a Flow? Is it a widget being accessed inside a Flow? In a 2-way bot conversation, I am assuming each message exchange is probably an Engagement – the more talkative your app, the more Engagements it will eat up.

Not sure if I am missing a tier between PLUS and ENTERPRISE here. There seems to be too big of a gap in there.

Positioning

One last thing – Twilio Studio has been positioned by Jeff Lawson inside the Engagement Cloud, below all of its current logical components:

I’d place it as a vertical bar next to the whole Twilio stack. Probably adding Functions right next to it:

My guess? Product management had a lot of internal discussions on this one, trying to decide where to place Studio – inside the engagement cloud, above it, right next to it. They ended up picking inside it.

A Word About GDPR

GDPR stands for General Data Protection Regulation. It is a piece of legislation that will become effective May 2018, in less than a year. A period of two years of grace has been given to reach that date.

It deals with the protection and processing of private information of citizens of the EU, which practically covers any global player out there, and even many who aren’t.

In a nutshell, it is a headache. Especially if you’re making use of analytics, personalization, automation, chat bots, AI or any other big data related technology. It is also relevant if you just hold an SQL database of your customers.

If you were working in a specific regulated vertical, such as healthcare or finance, then you might be used to such things. If you’re not, then you should start paying attention. Especially with the communication part of whatever it is that you do – this is where personal information gets passed along with the metadata that needs to be handled with care.

Twilio pushing GDPR this early on means two things to me:

  1. They are looking at the enterprise, and making sure their platform is fit for their purpose (large multinational enterprises will be the first to adopt and adhere to something like GDPR)
  2. They are making sure that they are leading the CPaaS pack here. I am unaware of any other CPaaS vendor who has been pushing GDPR besides stating that they will be ready by May 2018. Twilio is trying to make sure it is synonymous with “GDPR compliant CPaaS”.

It also means that communication – telecom or IP based – is becoming slightly harder to handle. Something that works well for a vendor like Twilio whose purpose in life is simplifying complexity (=the more complexity the more value derived by Twilio).

Where do we go from here?

Twilio was and still is the undisputed CPaaS king. They are bigger than anyone else by a large margin and they are working hard on maintaining a technology edge on everyone else.

Twilio’s stock has been somewhat volatile lately with Uber’s announcement and later Amazon’s text messaging announcement (which ended up being about Amazon using Twilio). Twilio seems vulnerable.

The two main announcements here were Studio and GDPR. Studio brings Twilio to a larger audience and increases their vendor lock-in, thereby reducing the effectiveness of their competition. GDPR is put in place as another headache Twilio solves for its customers – the more regulation and bureaucracy like GDPR, the better for a company like Twilio. It reduces the competition from in-house developers – which is doubly important now.

These two announcements are there to deal with its perceived vulnerability. They make developing using Twilio easier than ever – almost risk-free. And they make it harder for competition to succeed in future land grabs trying to go after Twilio’s bigger accounts.

It will be interesting to see how competitors react to this in the long run, and even more interesting to see what Twilio Studio will grow into.

The post Thoughts about Twilio Studio and the Future of CPaaS appeared first on BlogGeek.me.

H.264 or VP8 in Your WebRTC Application?

Mon, 10/02/2017 - 12:00

No simple answer.

Apple recently announced that Safari will be supporting WebRTC. That support isn’t there yet to the point where it is stable enough, but we already know one thing:

Safari supports only the H.264 video codec.

Codec wars are over? 2 MTI (mandatory to implement) codecs in the form of VP8 and H.264?

Who cares?

Reality is that Apple decided at this stage not to support VP8 – and it hasn’t said anything about plans to support or not support VP8 in the future. That said, all signals indicate that support for VP8 in Safari is unlikely to happen.

This brings us to a simple yet challenging question:

When writing a WebRTC application, should you make use of VP8 or H.264?

The answer isn’t a simple one. Choosing VP8 will leave you without Safari. Choosing H.264 will leave you without other important features and capabilities, as well as create a potential legal headache.

This is why I decided to create a new free video mini course – to guide you through the process and help you make the best decision here.

This video course, Picking a WebRTC Video Codec, is free and includes 4 lessons and a cheat sheet.

Find out which codec to use: VP8 or H.264

The post H.264 or VP8 in Your WebRTC Application? appeared first on BlogGeek.me.

What’s in my Online WebRTC Course?

Mon, 09/25/2017 - 12:00

Looking for a WebRTC training? Search no more. My online WebRTC course is here.

I will be relaunching my Advanced WebRTC Architecture Course next week, so it is time to see what you’ll find in this WebRTC training program I’ve created and fine-tuned for over a year now.

Prefer watching and listening more than reading? Join my free webinar on Wednesday for a quick lesson on WebRTC architecture related topics, where I’ll also be explaining the WebRTC training course and its contents.

Register and Grok media in WebRTC

The sections below explain the various parts of this unique WebRTC training. These are decidedly focused on delivering the best learning experience possible.

WebRTC Training Main Modules

The course is designed and built around 7 main modules:

Each module includes multiple lessons, and each lesson is a recorded video session of anywhere between 10-40 minutes in length. Most lessons also include additional links and some written content in them.

Module 1 gives you the baseline information about what WebRTC is. Consider it your introduction to the topic.

Modules 2-3 focus on signaling. They’ll take you from an understanding of UDP and TCP up to deciding what signaling protocol to use in each case and why.

Modules 4-5 are all about media. They explain voice and video codecs – in the context of their relevance to WebRTC. They also deal with the various media architectures available in group calling and recording scenarios.

Module 6 is all about the ecosystem. It lists the different strategies developers have in front of them when designing a WebRTC application, and then goes into details of each one of these.

Module 7 brings it all together. It takes different scenarios and use cases, analyzes them and builds the necessary architectures to support each use case. This is where the theory comes into practice.

The total length of the recordings in all modules and lessons? Over 15 hours.

You progress with the material at your own pace, jumping between lessons as you see fit, or through the original order they were laid out in.

If you’re looking for something to print and share, there’s a PDF version of the WebRTC course syllabus available.

Get Your WebRTC Questions Answered in the Course Forum

The course itself is supplemented with an online forum.

I’ve been contemplating making that forum a Slack channel or a Facebook group. Decided against it. While that may change with time, the course does have a forum built into it.

When you enroll in the course, you also gain access to the forum, which is where you can ask questions and get answers to them.

At any point in time.

Be it about a specific lesson, or a challenge you have in what you’re currently doing at work with WebRTC.

And if sharing openly isn’t your thing, you can always just email me directly.

WebRTC Training Office Hours

Twice a year, a series of office hours are provided for the course.

There are 12 such live sessions, taking place on roughly a weekly basis. They happen at 2 different times of the day, to fit different timezones.

These office hours include two parts in them:

  1. Me rambling about a topic. Call it a live lesson. It can be something from the actual course, or just thoughts and updates on what’s been going on lately with WebRTC out there
  2. Q&A. In this part, those enrolled in the course can ask anything they want. It is a part of the course which not many use, but those that do seem to enjoy it and derive benefit from it

The office hours are recorded and available for playback as well, so if you miss a session – you can always return to it and play it back.

WebRTC Course Bonus Materials

Besides the 7 course modules, I’ve added a bonus module.

This one contains some extra lessons as well as cheat sheets and templates that are spread all over my site in an easy to reach location.

What lessons are in the bonus materials?

4 recorded lessons

  1. WebRTC standardization
  2. Writing RFP requirements for WebRTC
  3. Media algorithms
  4. Using testRTC

The media algorithms lesson is really important. It covers topics that I touch only lightly during the course, such as echo cancellation and jitter buffer.

2 recorded guest lessons

In my last round of the course, appear.in, who took the corporate plan, were also kind enough to share two new guest lessons:

  1. Video Quality in WebRTC
  2. Deploying (co)TURN on AWS

Philipp Hancke and Bradley T. Hughes were the instructors for these two and I found myself learning a lot in these lessons as well. Now, they are part of the course bonus materials.

What’s New in This Round of the WebRTC Course?

This is the third time I am running this course, and the second round of updates to it.

  1. I’ve updated some of the materials where appropriate (someone told me recently that Apple is doing something with WebRTC, so it had to find its way to the course)
  2. I also recorded a session from scratch because apparently, the audio recording of that one wasn’t the best
  3. The bonus materials (described above), are going to go away. They will be available only during course launch periods (=this week) or for corporate plans
  4. There’s a new eBook that is going to be added as a bonus to the course. It is called “Built to Scale”, and it is a look behind the scenes of how meet.jit.si is… built to scale

A Few Questions Answered About the WebRTC Course

I am now adding an option to take my WebRTC training as part of every consulting project I take. Sometimes, the customer takes me up on the offer, and other times they don’t. There are questions that get asked almost all the time about the course by these customers, so I decided to answer the most common ones here.

How long will it take to work through the WebRTC course?

It is entirely up to you.

There’s over 15 hours of recorded content in the course. More if you start going through the links, external slide decks and videos that I share in the course lessons.

But at the end of the day:

  1. You decide on the pace of your WebRTC studies
  2. You decide which lessons to start with first
  3. You decide if there are lessons you prefer skipping
  4. You decide if you want to watch a specific lesson again

If you take a lesson in each working day, then 2 months is approximately what you’ll need to get from start to end.

Is there any prerequisite to taking this WebRTC training?

This WebRTC training program assumes you have some good understanding of technology. The rest – it fills in with the various modules of the course.

You don’t need to have knowledge in VoIP to take this course. You don’t need to be a web developer either. What you do need, is to have some technical grasp and understanding.

If you already have prior knowledge, then that’s fine – this WebRTC course isn’t forcing you to take its modules and lessons by their order, so you can skip to the relevant topics that interest you.

Is there a certificate?

Like most online learning courses, the WebRTC course offers a certificate.

Once you’ve completed the course, you will be receiving a WebRTC certificate indicating you’ve passed the course.

For companies, there’s a separate plan, which enables them to hold a badge of the WebRTC course. You can find the vendors that have taken this plan in the corporate partners page.

What’s Next?

Want to learn more about media in WebRTC? Join this free webinar to see an analysis of a real case study I came across recently. What did the company have in mind to build, and how did they botch their architecture along the way?

Register and Grok media in WebRTC

And if you’re really serious, enroll in my Advanced WebRTC Architecture Course.

The post What’s in my Online WebRTC Course? appeared first on BlogGeek.me.

Grokking Media in WebRTC (a free webinar for my WebRTC Course)

Mon, 09/18/2017 - 12:00

Media in WebRTC.

What makes it so challenging?

I guess it can be attributed to the many disciplines and different areas of knowledge that you are expected to grok.

My last two articles? They were about the differences between VoIP, WebRTC and the web.

By now, you probably recognize this:

If you’ve got some VoIP background, then you should know how WebRTC is different than VoIP.

If you’ve got a solid web background, then you should know why WebRTC development is different than web development.

When it comes to media, media flows and media related architectures, there seems to be an even bigger gap. People with VoIP background might have some understanding of voice, but little in the way of video. People with web background are usually clueless about real time media processing.

The result is that in too many cases, I see WebRTC architectures that make no sense for what the vendor had in mind to create.

Want to learn more about media in WebRTC? Join this free webinar to see an analysis of a real case study I came across recently. What did the company have in mind to build, and how did they botch their architecture along the way?

Register and Grok media in WebRTC

Here are 4 reasons why media is so challenging:

#1 – Media is as Real Time as it Gets

Page load speed is important. People leave if your site doesn’t load fast. Google incorporates it as an SEO ranking parameter.

This is how it is depicted today:

So… every second counts. And the post slug is “your-website-design-should-load-in-4-seconds”.

From a WebRTC point of view, here’s what I have to say about that:

If I were given a full second to get things done with WebRTC I’d be… (fill in the blank)

Seriously though, we’re talking about real time conversations between people.

Not this conversation:

But the one that requires me to be able to hold a real, live one. With a person that needs to listen to me with his ears, see me with his eyes, and react back by talking to me directly.

400 milliseconds of a roundtrip or less (that’s 200 milliseconds to get media from your camera to the display on the other side) is what we’re aiming for. A full second would be disastrous and not really usable.

Real time.

For real.

#2 – Media Requires Bandwidth. Lots and Lots of Bandwidth

This one seems obvious but it isn’t.

Here’s a typical ADSL line:

Most people live in countries where this is the type of connection you have into your home. You’ll have 20, 40 or maybe 100Mbps downlink – that’s the maximum bitrate you can receive. And then you’ll have 1, 2 or god forbid 3Mbps uplink – that’s the maximum bitrate you can send.

You see, most of the home use of the internet is based on the premise that you consume more than you generate. But with WebRTC, you’re generating media at all times (if it isn’t a live streaming type of a use case). And that media generation is going to eat into your bandwidth.

Here’s how much it takes to deliver this page to your browser (text+code, text+code+images) versus running 5 minutes of audio (I went for 40kbps) and 5 minutes of video (I went for 1Mbps). I made sure the browser wasn’t caching any page elements.
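The audio and video side of that comparison is simple arithmetic. Here’s a minimal sketch in Python, using the 40kbps and 1Mbps figures above. The function name is mine, packet overhead is ignored, and the page-weight side is left out since it depends on the page itself:

```python
def media_megabytes(bitrate_kbps: float, seconds: float) -> float:
    """Payload size in megabytes for a constant-bitrate stream, one direction only."""
    kilobits = bitrate_kbps * seconds
    return kilobits / 8 / 1000  # kilobits -> kilobytes -> megabytes


FIVE_MINUTES = 5 * 60  # seconds

audio_mb = media_megabytes(40, FIVE_MINUTES)    # 40kbps audio
video_mb = media_megabytes(1000, FIVE_MINUTES)  # 1Mbps video

print(f"audio: {audio_mb} MB, video: {video_mb} MB per direction")
```

That works out to 1.5MB of audio and 37.5MB of video for 5 minutes, in each direction.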

There’s no competition here.

Especially if you remember that with the page it is you who is downloading it, while with audio and video you’re both sending and receiving. And it is relentless: as long as the conversation goes on, the data use will grow.

Three more things to consider here:

  1. Usually, the assumption is that you need twice the bandwidth available than what you’re going to effectively send or receive (overheads, congestion and pure magic)
  2. You’re not alone on your network. There are more activities running on your devices competing over the same bandwidth. There can be more people in your house competing over the same bandwidth
  3. If you’re connecting over WiFi, you need to factor in stupid issues such as reception, air interferences, etc. These affect the effective bandwidth you’ll have as well as the quality of the network
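
The first point can be turned into a quick sanity check. This is only a sketch of the rule of thumb, not a real bandwidth estimator: the function name and the default factor of 2 are my own shorthand, and the bitrates are the 40kbps audio and 1Mbps video used earlier, tested against 1, 2 and 3 megabit uplinks:

```python
def fits_uplink(send_kbps: float, uplink_kbps: float, headroom: float = 2.0) -> bool:
    """True if the uplink covers the send bitrate multiplied by the headroom factor."""
    return uplink_kbps >= send_kbps * headroom


send_kbps = 1000 + 40  # 1Mbps video plus 40kbps audio

for uplink_mbps in (1, 2, 3):
    ok = fits_uplink(send_kbps, uplink_mbps * 1000)
    print(f"{uplink_mbps}Mbps uplink: {'ok' if ok else 'too tight'}")
```

Under that rule of thumb, only the 3 megabit uplink leaves enough room for a single 1Mbps video stream.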

#3 – Media is a Resource Hog

So it’s real time and it eats bandwidth. But that’s only half the story.

The second half involves anything else running on your device.

To encode and decode you’re going to need resources on that device.

CPU. Something capable. Usable hardware acceleration for the codecs is welcome.

Memory. Encoding and decoding are taxing processes. They need lots and lots of memory to work well. And also remember that the higher the resolution and frame rate of the video you’re pumping out – the higher the amount of memory you’ll be needing to be able to process it.

Bus. Usually neglected, there’s the device’s bus. Data needs to flow through your device. And video processing takes its toll.
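
To get a feel for the memory and bus numbers, here’s a rough sketch of how much raw (uncompressed) video has to move through the device before the encoder and after the decoder. It assumes 8-bit I420 frames, which take 1.5 bytes per pixel. That format is my assumption for illustration; actual buffer formats and the number of copies made along the way vary by implementation:

```python
def raw_frame_bytes(width: int, height: int) -> int:
    """One 8-bit I420 frame: a full-size luma plane plus two quarter-size chroma planes."""
    return width * height * 3 // 2


def raw_bytes_per_second(width: int, height: int, fps: int) -> int:
    """Raw video data rate for a single stream, in bytes per second."""
    return raw_frame_bytes(width, height) * fps


for width, height in ((640, 480), (1280, 720), (1920, 1080)):
    rate_mb = raw_bytes_per_second(width, height, 30) / 1_000_000
    print(f"{width}x{height} @ 30fps: {rate_mb:.1f} MB/s of raw video")
```

Going from 640x480 to 1280x720 triples the size of every frame, and that is before counting any extra copies, reference frames or additional streams.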

Doing this in real time, means opening dedicated threads, running algorithms that are time sensitive (acoustic echo cancellation for example), synchronizing devices (lip syncing). This is hard. And doing it while maintaining a sleek UI and letting other unrelated processes run in the background as well makes it a tad harder.

So if you’re thinking of running multiple encoders and decoders on the device, working in mesh topologies in front of a large number of other users, or any other tricks, your plans need to account for these challenges. And you need to keep in mind that browser vendors need to be aware of these topologies and use cases and take their time to optimize WebRTC to support them.
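
Mesh is a good example of how quickly this adds up. In a full mesh, each participant sends media to every other participant and receives a stream from each of them. A rough sketch, assuming one encode per outgoing stream and 1Mbps of video per stream (both are assumptions for illustration; the function name is mine, and real implementations differ):

```python
def mesh_load(participants: int, video_kbps: int = 1000) -> dict:
    """Per-device load in a full mesh call, where everyone sends to everyone else."""
    peers = participants - 1
    return {
        "encoders": peers,                   # assumes one encode per outgoing stream
        "decoders": peers,                   # one incoming stream per peer
        "uplink_kbps": peers * video_kbps,   # video only, audio ignored
        "downlink_kbps": peers * video_kbps,
    }


for n in (2, 4, 6):
    print(n, mesh_load(n))
```

With 4 participants, that is already 3Mbps of video uplink per device, before any headroom.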

#4 – Media is Just… Different

Then there’s this minor fact of media just being so darn different.

It isn’t TCP, like HTTP and Websocket.

It requires 3 (!) different servers to just get a peer to peer session going (and they dare call it peer to peer).

Here’s how most websites would indicate their interaction with the browser:

And this is what a basic one would look like for WebRTC:

We’ve got here two browsers to make it interesting. Then there’s the web server and a STUN/TURN server.

It gets more complicated when we want to add some media servers into the mix.

In essence, it is just different than what we’re used to in the web – or in VoIP (who decided to do signaling with HTTP anyway? Or rely on STUN and TURN instead of placing an SBC?).

What’s Next?

Those reasons why media is challenging? Real time, bandwidth-needy, a resource hog and just different. And that’s on the browser/client side only. Servers that need to process media suffer from the same challenges and a few more. One that comes to mind is handling scale.

So we’ve only touched the tip of the iceberg here.

This is why I created my Advanced WebRTC Architecture Course a bit over a year ago. It is a WebRTC training that aims at improving the WebRTC understanding of developers (and the semi-technical people around them).

In the coming weeks, I’ll be relaunching the office hours that run alongside the course for its third round. Towards that goal, I’ll be hosting a free webinar about media in WebRTC.

I’ll be doing something different this time.

I had an interesting call recently with a company moving away from CPaaS towards self development. The mistake they made was that they made that decision with little understanding of WebRTC.

Here’s what we’ll do during the webinar:

  1. Introduce the requirements they had
  2. Explain the architecture and technology stack they selected
  3. Show what went wrong
  4. Suggest an alternate route

Similar to my last launch, there will be a couple of time limited bonuses available to those who decide to enroll for the course.

Want to learn more about media in WebRTC? Join this free webinar to see an analysis of a real case study I came across recently. What did the company have in mind to build, and how did they botch their architecture along the way?

Register and Grok media in WebRTC

And if you’re really serious, enroll in my Advanced WebRTC Architecture Course.

 

The post Grokking Media in WebRTC (a free webinar for my WebRTC Course) appeared first on BlogGeek.me.

Why Developing With WebRTC is Different than Web Development?

Mon, 09/11/2017 - 12:00

Soda and Mentos.

Last week I wrote about the difference between WebRTC and VoIP development. This week let’s see how WebRTC development is different from web development.

Let’s start by saying this:

WebRTC is about Web Development

Well, mostly. It is more about doing RTC (real time communications). And enabling it over the web. And elsewhere. And not necessarily RTC.

WebRTC is quite powerful and versatile. It can be used virtually everywhere and it can be used for things other than VoIP or web.

When we do want to develop WebRTC for a web application, there are still differences – in the process, tools and infrastructure we will need to use.

Why is that?

Because real time media is different and tougher than most of the rest of the things you happen to be doing on the browser itself.

It boils down to this illustration (from last week):

So yes. WebRTC happens to run in the web browser. But it does a lot of things the way VoIP works (it is VoIP after all).

WebRTC dev != Web dev. And one of the critical parts is the servers we need to make it work. Join my free mini video WebRTC course that explains the server story of WebRTC.

Join the free server side WebRTC course

If you plan on doing anything with WebRTC besides a quick hello world page, then there’s lots of new things for you to learn if you’re coming from a web development background. Which brings me to the purpose of this article.

Here are 10 major differences between developing with WebRTC and web development:

#1 – WebRTC is P2P

Seriously. You can send voice, video and any other arbitrary data you wish directly from one browser to another. On a secure connection. Not going through any backend server (unless you need a relay – more on that in #6).

That triangle you see there? For VoIP that’s obvious. But for the web that’s magical. It opens up a lot of avenues for new types of services that are unrelated to VoIP – things like WebTorrent and Peer5; the ability to send direct private messages; low latency game controllers. The alternatives here are endless.

But what does this triangle mean exactly?

It means that you are not going to send your media through a web server. You are going to either send it directly between the browsers. Or you are going to send it to a media server – dedicated to this task.

This also means that a lot of the things you’ll need to keep track of and monitor don’t even get to your servers unless you do something about it to make it happen.

#2 – It isn’t all Javascript and JSON

Yes. I know last time I said it is all Javascript.

But if what you know is limited to Javascript then life is going to be a world of pain for you with WebRTC.

Media servers for example are almost always developed using C/C++ or Java. If you need to debug them (and the serious companies do that), then you’ll need to understand these languages as well.

The second part is more JSON and less Javascript related – there’s one part of WebRTC that is ugly as hell but working. That’s the SDP that is used in the offer-answer negotiation process.

Besides being hard to interpret (different people understand SDP differently which later means they develop parsers and code for it differently), SDP is also hard to parse using Javascript. It isn’t built as a JSON blob, so the code to fetch a field or modify a field in SDP isn’t trivial (doable, but a pain).
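To make that pain concrete, here is a minimal sketch of fetching fields out of SDP by hand. It is written in Python for brevity, but the same line-by-line splitting is what you end up doing in Javascript. The SDP fragment is shortened and made up, and the function names `parse_sdp` and `codecs_for` are mine, not part of any WebRTC API.

```python
from typing import Dict, List

# A shortened, made-up SDP fragment for illustration only.
SAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
    "s=-",
    "t=0 0",
    "a=group:BUNDLE audio video",
    "m=audio 9 UDP/TLS/RTP/SAVPF 111 0",
    "a=mid:audio",
    "a=rtcp-mux",
    "a=rtpmap:111 opus/48000/2",
    "a=rtpmap:0 PCMU/8000",
    "m=video 9 UDP/TLS/RTP/SAVPF 96",
    "a=mid:video",
    "a=rtcp-mux",
    "a=rtpmap:96 VP8/90000",
]) + "\r\n"


def parse_sdp(sdp: str) -> Dict[str, list]:
    """Split SDP into session-level lines and one entry per m= section."""
    session: List[str] = []
    media: List[dict] = []
    for line in sdp.splitlines():
        if not line:
            continue
        if line.startswith("m="):
            # m=<kind> <port> <proto> <payload types...>
            kind, port, proto, *payloads = line[2:].split(" ")
            media.append({"kind": kind, "port": int(port), "proto": proto,
                          "payloads": payloads, "attributes": []})
        elif media:
            # Everything after an m= line belongs to that media section.
            media[-1]["attributes"].append(line)
        else:
            session.append(line)
    return {"session": session, "media": media}


def codecs_for(section: dict) -> Dict[str, str]:
    """Map payload type -> codec name using the a=rtpmap lines."""
    codecs: Dict[str, str] = {}
    for attr in section["attributes"]:
        if attr.startswith("a=rtpmap:"):
            payload_type, encoding = attr[len("a=rtpmap:"):].split(" ", 1)
            codecs[payload_type] = encoding.split("/")[0]
    return codecs
```

Even this toy version has to know which lines are session level, which belong to a media section, and how each attribute is laid out. Real SDP has many more attribute formats than the one handled here.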

#3 – There’s This Thing Called UDP

I guess this is the start of the following points as well, so here we go.

Today, the web is built on top of TCP. It started with HTTP. Moved to WebSockets (also on top of TCP). And now HTTP/2 (also TCP).

There are attempts to allow for UDP type of traffic – QUIC is an example of it. But that isn’t there yet. And for most web developers that’s just under the hood anyway.

With WebRTC, all media is sent over UDP as much as possible. It can work over TCP if needed (I sent you to #6 didn’t I?), but we try to refrain from it – you get better media quality with UDP.

The table above shows the differences between UDP and TCP. This lies at the heart of how media is sent. We use unreliable connections with best effort.

#4 – Compromise is the Name of the Game

That UDP thing? It adds unreliability into the mix. Which also means that what you send isn’t what you get. Coupled with the fact that codecs are resource hogs, we get into a game of compromise.

In VoIP (and WebRTC), almost any decision we make to improve things in one axis will end up costing us in another axis.

Want better compression? Lose quality.

Don’t want to lose quality? Use more CPU to compress.

Want to lower the latency? Lose quality (or invest more CPU).

On and on it goes.

While CPUs are getting better all the time, and available bandwidth seems to be getting higher as well, our demands on our media systems are growing too. At times even a lot faster.

That ends up with the need to compromise.

All the time.

You’ll need to know and understand media and networking in order to be able to decide where to compromise and where to invest.

#5 – Best Effort is the Other Name

Here’s something I heard once in a call I had:

“We want our video quality to be a lot better than Skype and Hangouts”.

I am fine with such an approach.

But this is something I heard from:

  • 2 entrepreneurs with no experience or understanding of video compression
  • For a use case that needs to run in developing countries, with choppy cellular reception at best
  • And they assumed they will be able to do it all by themselves using WebRTC

It just doesn’t work.

WebRTC (and VoIP) are a best effort kind of a play.

You make do with what you get, trying to make the best of it.

This is why WebRTC tries to estimate the bandwidth available to it, and will then commence eating up all that available bandwidth to improve the video quality.

This is why when the network starts to act up (packet loss), WebRTC will reduce the bitrate it uses and reduce the media quality in order to accommodate what is now available to it.
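A rough sketch of that back-and-forth is below. It is loosely modeled on the loss-based rule described in the Google Congestion Control draft (back off when loss is above roughly 10%, probe upwards when it is below roughly 2%). The function name, the thresholds and the min/max limits are assumptions for illustration. What browsers actually do also involves delay-based estimation and is considerably more involved.

```python
def adjust_bitrate(current_bps: float, loss_fraction: float,
                   min_bps: float = 50_000, max_bps: float = 2_500_000) -> float:
    """Return the next target bitrate given the reported packet loss (0.0 to 1.0)."""
    if loss_fraction > 0.10:
        # Heavy loss: back off in proportion to how much was lost.
        target = current_bps * (1 - 0.5 * loss_fraction)
    elif loss_fraction < 0.02:
        # Clean network: probe for more bandwidth.
        target = current_bps * 1.05
    else:
        # In between: hold steady.
        target = current_bps
    # Never go below or above what the application considers usable.
    return max(min_bps, min(max_bps, target))
```

Calling this once per feedback report gives the "eat up what is available, retreat when the network complains" behavior: a few clean reports in a row ratchet the bitrate up, and a single report with 20% loss knocks it down by a tenth.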

Sometimes these approaches work well. Other times not so well.

And yes. A lot of the end result will be reliant on how well you’ve designed and laid out your infrastructure for the service.

#6 – NAT Traversal Rules Your Life

Networks have NATs and Firewalls. These are nothing new, but if you are a web developer, then most likely they never made life difficult for you.

That’s because in the “normal” web, the browser will reach out to the server to connect to it. And being the main concept of our current day web, NATs and Firewalls expect that and allow this to happen.

Peer to peer communication directly between browsers, which is how WebRTC operates, and over UDP no less (again, something that isn’t usually done in the web browser)… these are things that firewalls and the IT personnel configuring them usually don’t need to contend with.

For WebRTC, this means the addition of STUN/TURN servers. Sometimes, you’ll hear the word ICE. ICE is an algorithm and not a server. ICE makes use of STUN and TURN. STUN and TURN are two protocols for NAT traversal, each using its own server. And usually, STUN and TURN servers are implemented in the same code and deployed using a single process.

WebRTC makes a lot of effort to ensure its sessions will get connected. But at the end of the day, even that isn’t always enough. There are times when sessions just can’t get connected – whoever configured the firewall made sure of it.

#7 – Server Scaling is Ridiculous

Server scaling with WebRTC is slightly different than that of regular web.

There are two main reasons for that:

  1. The numbers are usually way smaller. While web servers can handle 5 digit connections or more, their WebRTC counterparts will often struggle with the higher end of 3 digits. There’s a considerable cost of hosting HD video and media server processing
  2. WebRTC requires statefulness. Severing a connection and restarting it will always be noticeable – a lot more than in most other web related use cases. This makes high availability, fault tolerance, upgrading and similar activities harder to manage with WebRTC

You’ll need to understand how each of the WebRTC servers works in order to understand how to scale them.

#8 – Bandwidth is Expensive

With web pages things are rather simple. The average web page size is growing year to year. We’ve got above 2.3MB in 2016. But that page is constructed out of different resources pulled from different servers. Some can be cached locally in the browser.

A 5 minute HD video at 2Mbps (not unheard of and rather common) will take up 75 MB during those 5 minutes, and that is for a single direction.

If you are just doing 1:1 video calls with a 10% TURN relay factor, that can be quite taxing – running just 1,000 calls a day with an average of 5 minutes each means 100 relayed calls, each carrying two such 75 MB streams, which will eat up 15 GB a day in your TURN server bandwidth costs. You probably want more calls a day and you want them running for longer periods of time as well.
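The arithmetic behind those figures can be sketched as follows. The function names are mine, 1 GB is taken as 1,000 MB, and I am assuming that each relayed 1:1 call carries two streams (one per direction) and that each stream is counted once. Depending on how your provider bills ingress and egress, the real number may be double that.

```python
def stream_megabytes(bitrate_mbps: float, minutes: float) -> float:
    """Megabytes carried by one stream: megabits/s * seconds / 8 bits per byte."""
    return bitrate_mbps * minutes * 60 / 8


def daily_turn_gigabytes(calls_per_day: int, relay_fraction: float,
                         bitrate_mbps: float, minutes: float,
                         streams_per_call: int = 2) -> float:
    """Gigabytes per day passing through TURN, counting each relayed stream once."""
    relayed_calls = calls_per_day * relay_fraction
    megabytes = relayed_calls * streams_per_call * stream_megabytes(bitrate_mbps, minutes)
    return megabytes / 1000


one_stream_mb = stream_megabytes(2, 5)                  # 75 MB
turn_gb_per_day = daily_turn_gigabytes(1000, 0.10, 2, 5)  # 15 GB
```

Plugging in your own call volume, call length and relay factor is a quick way to see how fast this grows.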

Using a media server for group calling or recording makes this even higher.

As an example, at testRTC we can end up with tests that run into the 100’s of GBs of data per test. Easily…

When you start to work out your business model, be sure to factor in your bandwidth costs.

#9 – Geography is Everything for Media Delivery

For the most part, and for most services, you can get away with running your service off a specific data center.

This website of mine is hosted somewhere in the US (I don’t even care where) and hooked up to CDN services that take care of the static files. It has never been an issue for me. And performance is reasonable.

When it comes to real time live media, which is where WebRTC comes in, this won’t always do.

Getting data from New York to Paris can easily take 100 milliseconds or more, and since one of the things we’re striving for is real time – we’d like to be able to reduce that as much as we can.

Which gets us to the illustration above. Imagine two people in Paris having a WebRTC conversation that gets relayed through a TURN server in New York. Not even mentioning the higher possibility of packet losses, there’s clearly a degradation in the quality of the call just by the added delay of this route taken.
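For a feel of the numbers, here is a back-of-the-envelope sketch of the best case delay that geography alone imposes. It assumes light in fiber covers roughly 200 km per millisecond and that the cable follows the great circle, which real routes never do, so actual delays are higher. The coordinates are approximate city centers and the function names are mine.

```python
import math

EARTH_RADIUS_KM = 6371.0
FIBER_KM_PER_MS = 200.0  # assumption: roughly 2/3 of the speed of light


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_one_way_ms(path_km: float) -> float:
    """Lower bound on one-way delay over a path of the given length."""
    return path_km / FIBER_KM_PER_MS


NEW_YORK = (40.7128, -74.0060)
PARIS = (48.8566, 2.3522)

ny_paris_km = great_circle_km(*NEW_YORK, *PARIS)
# Paris -> TURN in New York -> Paris: the media crosses the ocean twice.
detour_ms = min_one_way_ms(2 * ny_paris_km)
```

Under these assumptions the relay in New York adds well over 50 milliseconds to every packet in each direction before any routing, queuing or processing delay is counted, while a relay in Paris would add close to nothing.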

WebRTC, even for a small scale service, may need a global deployment of its infrastructure servers.

#10 – Different Browsers Behave Differently

Well… you know this one.

As a web developer, I am sure you’ve bumped into browsers acting differently with your HTML and CSS. Just recently, I tried to use <button> outside of a form element, only to find out the link that I placed inside it got ignored by Firefox.

The same is true for WebRTC. The difference is that it is a lot easier to bump into and it messes things up on two different levels:

  1. The API behavior – not all browsers support the exact same set of APIs (WebRTC isn’t really an official standard specification yet – just a draft; and browser implementations mostly adhere to recent variants of that draft)
  2. The network behavior – WebRTC means you communicate between browsers. At times, you might not get a session connected properly from one browser to another if they are different. They process SDP differently, they may not support the same codecs, etc.

As time goes by, this should get resolved. Browser vendors will shift focus from adding features and running after the specification towards making sure things interoperate across browsers.

But until then, we as developers will need to run after the browsers and expect things to break from time to time.

#11 – You Know More Than You Think

The majority of WebRTC is related to VoIP. That’s because at the end of the day, it is a variant of VoIP (one of many). This means that VoIP developers have a huge head start on you when it comes to understanding WebRTC.

The problem for them is that they have a different education than you do. Someone taught them that a call has a caller and a callee. That you need to be able to put a call on hold. To transfer the call. To support blind transfer. Lots and lots of notions that are relevant to telephony but not necessarily to communications.

You aren’t “tainted” in this way. You don’t have to unlearn things – so that nagging part of an ego telling you how things are done with VoIP – it doesn’t exist. I had my share of training sessions where most of my time was spent on this unlearning part.

This means that in a way you already know one important thing with WebRTC – that there’s no right and wrong in how sessions are created – and you are free to experiment and break things with it before coming to a conclusion of how to use it.

That’s powerful.

What’s Next?

If you have a web development background, then there’s much you need to learn about how VoIP is done in order to understand WebRTC better.

WebRTC looks simple when you start with it. Most web developers will complain after a day or two of how complex it is. What they don’t really understand is how much more complicated VoIP is without WebRTC. We’ve been given a very powerful and capable tool with WebRTC.

Need to warm up to WebRTC? Try my free WebRTC server side mini course.

And if you’re really serious, enroll in my Advanced WebRTC Architecture Course.

 

The post Why Developing With WebRTC is Different than Web Development? appeared first on BlogGeek.me.

Why Developing With WebRTC is Different than VoIP Development?

Mon, 09/04/2017 - 12:00

Water and oil?

Let’s start by saying this:

WebRTC is VoIP

That said, it is different than VoIP in the most important of ways:

  1. In the ways entrepreneurs make use of it to bring their ideas to life
  2. In the ways developers wield it to build applications

Why is that?

Because WebRTC lends itself to two very different worlds, all running over the Internet: The World Wide Web. And VoIP.

And these two worlds? They don’t mix much. Besides the fact that they both run over IP, there’s not a lot of resemblance between them. Well, that and the fact that both SIP and HTTP have a 200 OK message.

Everyone is focused on the browser implementation of WebRTC. But what of the needed backend? Join my free mini video WebRTC course that explains the server story of WebRTC.

If you ever developed anything in the world of VoIP, then you know how calls get connected. You’re all about ring tones and the many features that comprise a Class 5 softswitch. The truth of the matter is that this kind of knowledge can often be your undoing when it comes to WebRTC.

Here are 10 major differences between developing with WebRTC and developing with VoIP:

#1 – You are No Longer in Control

With VoIP, life was simple. All pieces of the solution were yours.

The server, the clients, whatever.

When something didn’t work, you’d go in, analyze it, fix the relevant piece of software, and be done with it.

WebRTC is different.

You’ve got this nagging thing known as the “browser”.

4 of them.

And they change. And update. A lot.

Here’s what happened in the past year with Chrome and Firefox:

A version every 6-8 weeks. For each of them.

And these versions? They tend to change how the browsers behave when it comes to WebRTC. These changes may cause services to falter.

These changes mean that:

  1. You are not in control over the whole software running your service
  2. You are not in control of when pieces of your deployment get upgraded (browsers will upgrade without you having a say in it)

VoIP doesn’t work this way.

You develop, integrate, deploy and then you decide when to upgrade or modify things. With WebRTC that isn’t the case any longer.

You must continuously test against future browser versions (beta, unstable, Canary and nightly should become part of your vocabulary). You need to have the means to easily and quickly upgrade a production service – at runtime. And be prepared to do it rather frequently.

#2 – Javascript is King

My pedigree comes from VoIP.

I am a VoIP developer.

I did development, project management, product management and then was the CTO of a business unit where what we did was develop VoIP software SDKs that were used (and are still used) in many communication products.

I am a great developer. Really. One of the best I know. At least when it comes to coding in C.

VoIP was traditionally developed in C/C++ and Java.

With Javascript I know my way around, but by no means am I even an average developer. My guess is that a lot of VoIP engineers have a similar background to mine.

WebRTC is all about Javascript.

Yes. WebRTC has a Javascript API. But that’s half the story. Many of the backend systems written for use with WebRTC end up using Node.js. Which uses Javascript.

WebRTC isn’t limited to Javascript. There are systems written in C, Java, Python, C#, Erlang, Dart and even PHP that are used. There are .NET systems. On mobile, native apps use Objective-C, Swift or Java in their implementations of client-side WebRTC SDKs.

But the majority? That’s Javascript.

Three main reasons I can see for it:

  1. Fashion. Node.js is fashionable and new. WebRTC is also new, so there’s a fit
  2. Asynchronous. The signaling in WebRTC needs to be snappy and interactive. It needs to have a backend that can fit nicely with its model of asynchronous interactions and interfaces. Node.js offers just that and makes it easier to think of signaling on the frontend and backend at the same time. Which leads us to the third and probably most important reason –
  3. Javascript. You use it in the frontend and backend. Easier for developers to use a single language for both. Easier to shift bits and pieces of code from one side to the other if and when needed

#3 – A Big Island

VoIP is all about interoperability. A big happy family of vendors. All collaborating and cooperating. The idea is that if you purchase a phone from one vendor, you *should* be able to dial another vendor’s phone with it via a third vendor’s PBX. It works. Sometimes. And it requires a lot of effort in interoperability testing and tweaking. An ongoing arduous task. The end result though is a system where you end up testing a small set of vendors that are approved to work within a certain deployment.

VoIP and interoperability abhor the idea of islands. Different communication services that can’t connect to each other.

WebRTC is rather different. You no longer build one VoIP product or device that is designed to communicate with VoIP devices of other vendors. You build the whole shebang.

An island of sorts, but a rather big one. One where you can offer access through all browsers, operating systems and mobile devices.

You no longer care about interoperability with other vendors – just with interoperability of your service with the browsers you are relying on. It simplifies things some while complicating the whole issue of being in control (see #1 above).

#4 – It is Cloudy

It seems like VoIP was always meant to run in local deployments. There are a few cases where you see it deployed globally, but they aren’t many. Usually, there’s a geography that goes into the process.

This is probably rooted in the origins of VoIP – as a replacement / digital copy of what you did in telecom before. It also relates to the fact that the world was bigger in the past – the cloud as we know it today (AWS and the many other cloud providers that followed) didn’t really exist.

Skype is said to have succeeded as much as it did because it had a great speech codec for its time, one that was error resilient (it had FEC built in at a time when companies were still bickering in the IETF and ITU standards bodies about adding FEC to the RTP layer). It also had NAT traversal that just worked (again, when STUN and TURN were just ideas). The rest of the world? We were all happy enough to instruct customers to install their gatekeepers and B2BUAs in the DMZ.

Since then VoIP has evolved a lot. It turned towards the SBC (more on this in #10).

WebRTC has bigger challenges and requirements ahead of it.

For the most part, and with most deployments of WebRTC, there are three things that almost always are apparent:

  1. Deployments are global. You never know where the users will be joining from. Neither where in the world nor on what type of network
  2. Networks are unmanaged. This is similar to the above. You have zero control over the networks, but your users will still complain about the quality (just check out any of Fippo’s analysis posts)
  3. We deploy them on AWS. All the time. On virtual machines. Inside Docker containers. Layers and layers of abstraction. For a real time service. And it seems to work

#5 – Bring Your Own Signaling

VoIP is easy. It is standardized. Be it SIP, H.323, XMPP or whatever you bring to the table. You are meant to use a signaling protocol. Something someone else has thought of in the far dark rooms in some standards organization. It is meant to keep you safe. To support the notion and model of interoperability. To allow for vendor agnostic deployments.

WebRTC did away with all this, opting to not have a signaling protocol at all out of the box.

Some complain about it (mostly VoIP people). I’ve written about it some 4 years ago – about the death of signaling.

With WebRTC you make the decision on what signaling protocol you will be using. You can decide to go for a standards based solution such as SIP over WebSocket, XMPP over BOSH or WebSocket – or you can use a newly created signaling protocol invented only for your specific scenario – or use whatever you already have in your app to signal people.
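To show how little a "newly created signaling protocol" can be, here is a minimal sketch of a proprietary JSON message envelope. Everything in it (the message types, the field names, the `SignalingMessage` class) is invented for illustration. WebRTC itself defines none of this, which is exactly the point. How the resulting string travels (WebSocket, HTTP long polling, your existing chat channel) is left out.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical message types for a made-up protocol.
ALLOWED_TYPES = {"offer", "answer", "candidate", "bye"}


@dataclass
class SignalingMessage:
    type: str        # one of ALLOWED_TYPES
    sender: str      # application-level user or session id
    recipient: str
    payload: dict    # e.g. {"sdp": "..."} or {"candidate": "..."}


def encode(message: SignalingMessage) -> str:
    """Serialize a message to the JSON text sent over the signaling channel."""
    if message.type not in ALLOWED_TYPES:
        raise ValueError(f"unknown message type: {message.type}")
    return json.dumps(asdict(message))


def decode(raw: str) -> SignalingMessage:
    """Parse and validate a message received from the signaling channel."""
    data = json.loads(raw)
    if data.get("type") not in ALLOWED_TYPES:
        raise ValueError(f"unknown message type: {data.get('type')}")
    return SignalingMessage(**data)
```

A standards based option such as SIP over WebSocket gives you far more than this out of the box. Whether you need any of it is the question the list below asks.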

As with anything in WebRTC, it opens up a few immediate questions:

  1. Should you use a standards based signaling protocol or a proprietary one?
  2. Should you build it on your own from scratch or use a third party framework for it?
  3. Should you host and manage it on your own or use it as a service instead?

All answers are now valid.

#6 – Encryption and Privacy are MANDATORY

With VoIP, encryption was always optional. Seldom used.

I remember going to these interoperability events as a developer. The tests that almost never really succeeded were the ones that used security. Why? You got to them last during the week long event, and nobody got that part quite the same as others.

That has definitely changed over the years, but the notion of using encryption hasn’t. VoIP products are shipped to customers and deployed without encryption. The encryption piece is an optional configuration that many skip. Encryption makes it hard to use Wireshark to understand what goes on in the network, it takes up CPU (not much anymore, but conceptually it still does), it complicates things.

WebRTC on the other hand, has only encryption configured into it. No way to use it with clear RTP. Even if you really really want to. Even if you swear all browsers and their communications run inside a secure network. Nope. You can’t take security out of WebRTC.

#7 – If it is New, WebRTC Will be Using it

When WebRTC came out, it made use of the most recent RFCs that were VoIP related in the media domain.

Ability to bundle RTP and RTCP on the same stream? Check.

Ability to multiplex audio and video on the same stream? Check.

Ability to send FIR commands over RTCP and not signaling? Check.

Ability to negotiate keys over DTLS-SRTP instead of SDES? Check.

There are many other examples for it.

And in many cases, WebRTC went to the extreme of banning the other, more common, older mechanisms of doing things.

VoIP was always made with options in mind. You have at least 10 different ways in the standard to do something. And all are acceptable.

WebRTC takes what makes sense to it, throwing the rest out the window, leaving the standard slightly cleaner at the end of it.

Just recently, a decision was made to stop supporting non-multiplexed streams. This forced Asterisk and all of its users to upgrade.

VoIP and SIP were never really that important to WebRTC. Live with it.

#8 – Identity Management and Authorization are Tricky

There’s no identity management in WebRTC.

There’s also no clear authorization model to be heard of.

Here’s a simple one:

With SIP, the way you handle users is giving them usernames and passwords.

The user types that into the client and this gets used to sign up to the service.

With regular apps, it is easy to set that username/password as your TURN credentials as well. But doing it with WebRTC inside a browser opens up a world of pain with the potential of harvesting that information to piggyback on your TURN servers, costing you money.

So instead you end up using ephemeral passwords in TURN with WebRTC. Here’s an explanation of how to do just that.
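One common way to do it is the shared secret scheme from the "REST API For Access To TURN Services" draft, which TURN servers such as coturn can be configured to accept, as far as I know. Below is a minimal sketch of the server side under that assumption. The username is an expiry timestamp plus a user id, and the password is an HMAC-SHA1 of that username keyed with a secret known only to your backend and your TURN server. The function names, the secret and the user id are placeholders.

```python
import base64
import hashlib
import hmac
import time
from typing import Dict, Optional


def ephemeral_turn_credentials(shared_secret: str, user_id: str,
                               ttl_seconds: int = 3600,
                               now: Optional[float] = None) -> Dict[str, str]:
    """Build short-lived TURN credentials the browser can safely receive."""
    issued_at = time.time() if now is None else now
    expiry = int(issued_at) + ttl_seconds
    username = f"{expiry}:{user_id}"
    digest = hmac.new(shared_secret.encode(), username.encode(), hashlib.sha1).digest()
    return {"username": username, "credential": base64.b64encode(digest).decode()}


def is_valid(shared_secret: str, username: str, credential: str,
             now: Optional[float] = None) -> bool:
    """What the TURN side conceptually checks: signature matches and not expired."""
    current = time.time() if now is None else now
    expiry = int(username.split(":", 1)[0])
    digest = hmac.new(shared_secret.encode(), username.encode(), hashlib.sha1).digest()
    expected = base64.b64encode(digest).decode()
    return hmac.compare_digest(expected, credential) and current < expiry
```

Your backend hands the resulting username and credential to the page after the user has authenticated, and the page passes them along in its TURN server configuration. If someone harvests them, they stop working once the expiry passes.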

In many other cases, you simply don’t care. If the user already logged into the page, and identified and authenticated himself in front of your service, then why have an additional set of credentials for him? You can just as easily piggyback on a mechanism such as Facebook Connect, Twitter, LinkedIn or Google accounts to get the authentication part going for you.

#9 – Route. Don’t Mix

If you come from VoIP, then you know that for more than two participants in a call you mix the media. You do it usually for audio, but also for the video. That’s just how things are (were) done.

But for WebRTC, routing media through an SFU is how you do things.

It makes the most sense for a multitude of reasons:

  1. For many use cases, this is the only thing that can work when it comes to meeting your business model. It strikes that balance between usability and costs
  2. This in turn, brings a lot of developers and researchers to this domain, improving media routing and SFU related technologies, making it even better as time goes by
  3. In WebRTC, the client belongs to the server – the server sends the client as HTML/JS code. With the added flexibility of getting multiple media streams, comes an added flexibility to the UI’s look and feel as well as behavior

There are those who are still resistant to the routing model. When these people have a VoIP pedigree, they’ll lean towards the mixing model of an MCU, calling it superior. It will usually cost 10 times or more to deploy an MCU instead of an SFU.
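One way to see where the difference comes from is to count streams and codec operations per topology. The sketch below does only that. It does not price anything, and it assumes an MCU that decodes every incoming stream and encodes one mixed stream per participant, which is a simplification. The function name and the dictionary keys are mine.

```python
from typing import Dict


def stream_counts(participants: int) -> Dict[str, Dict[str, int]]:
    """Per-client stream counts and server codec work for an N-party video call."""
    if participants < 2:
        raise ValueError("a call needs at least two participants")
    others = participants - 1
    return {
        # Every client sends to and receives from every other client.
        "mesh": {"client_up": others, "client_down": others,
                 "server_decodes": 0, "server_encodes": 0},
        # SFU: send once, receive everyone else; the server only routes packets.
        "sfu": {"client_up": 1, "client_down": others,
                "server_decodes": 0, "server_encodes": 0},
        # MCU: one stream each way, but the server decodes and re-encodes media.
        "mcu": {"client_up": 1, "client_down": 1,
                "server_decodes": participants, "server_encodes": participants},
    }
```

The SFU row is the one with zero decodes and encodes on the server. That CPU work is what the mixing model pays for on every call.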

Be sure to know and understand SFUs if you plan on using WebRTC.

#10 – SBCs are Useless

Or at least not mandatory anymore.

Every. SBC. vendor. out. there. is. adding. WebRTC.

And I get it. If you’re building an SBC – a Session Border Controller – then you should also make sure it supports WebRTC so all these pesky people looking to get access through the browser can actually get it.

An SBC was an abomination added to VoIP. It was a necessary evil.

It served the purpose of sitting in the DMZ, making sure your internal network is protected against malicious VoIP access. A firewall for VoIP traffic.

Later, people bolted onto that SBC the ability to handle interoperability, because different vendor products never really worked well with one another (we’ve already seen that in #3). Then transcoding was added, because we could. And then other functions.

And at some point, it was just obvious to place SBCs in VoIP infrastructure. Well… WebRTC doesn’t need an SBC.

VoIP needs an SBC that handles WebRTC. But if you’re planning on doing a WebRTC based application that doesn’t have much of VoIP in it, you can skip the SBC.

#11 – Ecosystem Created by the API and Not the Specification

Did I say 10 differences? So here’s a bonus difference.

Ecosystems in VoIP are created around the network protocol.

You get people to understand the standard specification of the network protocol, and from there you build products.

In WebRTC, the center is not the network protocol (yes, it is important and everything) – it is the WebRTC APIs. The ones implemented in the browsers that enable you to build a client on top. One that theoretically should run across all browsers.

That’s a huge distinction.

Many of the developers in WebRTC are clueless about the network, which is a shame. On the other hand, many VoIP developers think they understand the network but fail to understand the nuanced differences between how the network works in VoIP and in WebRTC.

What’s Next?

If you have a VoIP background, then there are things for you to learn when shifting your focus towards WebRTC. And you need to come at it with an open mind.

WebRTC seems very similar to VoIP – and it is – because it is VoIP. But it is also very different. In the ways it is designed, thought of and used.

Knowing VoIP, you should have a head start on others. But only if you grok the differences.

Need to warm up to WebRTC? Try my free WebRTC server side mini course.

And if you’re really serious, enroll in my Advanced WebRTC Architecture Course.

 

The post Why Developing With WebRTC is Different than VoIP Development? appeared first on BlogGeek.me.

Taking a Breather. Be Back in September

Mon, 08/21/2017 - 12:30

See you in September.

Time for some downtime for me.

Not from work – got too many projects going on at the moment – updating my course, testRTC and some interesting customer projects I am involved with. I am also working on an offering around APIs. More on that later.

This means – no new writing here for the next couple of weeks.

See you all once I am back.

In the meantime, if you have any questions or needs around the things I write about, feel free to contact me. I’ll gladly help you find your way around this tech (and even focus my writing in the areas you are interested in).

Until September

The post Taking a Breather. Be Back in September appeared first on BlogGeek.me.

How do you Upgrade Your WebRTC Media Servers?

Mon, 08/14/2017 - 12:00

I say it doesn’t matter what the technique is as long as you go through the motion of upgrading your WebRTC Media Servers…

Here’s the thing. In many cases, you end up with a WebRTC deployment built for you. Or you invest in a project until its launch.

And that’s it.

Why Upgrade WebRTC Media Servers?

With WebRTC, things become interesting. WebRTC is still a moving target. Yes. I am promised that WebRTC 1.0 will be complete and published by the end of the year. I have been hearing that promise since 2015. It might actually happen in 2017, but it seems browser vendors are still moving fast with WebRTC, improving and optimizing their implementations. And breaking stuff at times as they move along.

Add to that the fact that media servers are complex, and they have their own fixes, patches, security updates, optimizations and features – and you find yourself with the need to upgrade them from time to time.

Upgrade as a non-functional feature is important for your WebRTC requirements. I just updated my template, so you don’t forget it:

Download the WebRTC Requirements How To

I’ll take it a bit further still:

  1. With WebRTC, the browser (your client) will get upgraded automatically. It is for your own safety. This, in turn, may force you to upgrade the rest of your infrastructure. And the piece most prone to that?
  2. Your WebRTC media server needs to be upgraded. First to keep pace with the browsers, but also, and no less important, to improve. And then there is
  3. The signaling server you use for WebRTC. That one may need some polish and fine tuning because of the browser. It may also need to get some care and attention – especially if and when you start expanding your service and need to scale out – locally or geographically
  4. Your TURN/STUN servers. These tend to go through the least amount of updates (and they are also relatively easy to upgrade in production)

Great. So we need to upgrade our backend servers. And we must do it if we want our service to be operational next year.

Talking Production

But what about a production system? One that is running and has active users on it.

How do you upgrade it exactly?

Gustavo García, in a recent tweet, listed the techniques available and asked which of them are the most popular:

Just curious about how do you upgrade your #WebRTC mediaservers?

— Gustavo García (@anarchyco) August 4, 2017

I’d like to review these alternatives and see why developers opt for “Draining first”. I’ll be using Gustavo’s naming convention here as well. I will introduce them in a different order though.

#1 – Immediate Kill+Reconnect

This one is the easiest and most straightforward alternative.

If you want to upgrade WebRTC media servers, you take the following steps:

  1. Kill the existing server(s)
  2. Upgrade their software (or outright replace their machines – virtual or bare metal)
  3. Reconnect the sessions that got interrupted – or don’t…

This is by far the simplest solution for developers and DevOps. But it is the most disruptive for the users.

That third step is also something of a choice – you can decide to not reconnect existing sessions, which means users will now have to reconnect on their own (refresh that web page or whatever), or you might have them reconnected, either by invoking it from the server somehow or by having the clients implement some persistence that makes them automatically retry on service interruption.
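A minimal sketch of the client side "automatically retry" option is shown below, using exponential backoff so that a fleet of disconnected clients doesn’t hammer the freshly restarted servers all at once. The `connect` and `sleep` callables are placeholders for whatever your client uses to rejoin a session and to wait. The delays and attempt count are assumptions, not recommendations.

```python
import random
from typing import Callable, Iterator


def backoff_delays(base: float = 0.5, factor: float = 2.0, cap: float = 30.0,
                   attempts: int = 6, jitter: float = 0.0,
                   rng: Callable[[], float] = random.random) -> Iterator[float]:
    """Yield waiting times that grow exponentially up to a cap."""
    delay = base
    for _ in range(attempts):
        # Optional jitter spreads clients out so they don't retry in lockstep.
        yield min(cap, delay) * (1 + jitter * rng())
        delay *= factor


def reconnect(connect: Callable[[], bool], sleep: Callable[[float], None],
              **backoff_options) -> bool:
    """Try to rejoin immediately, then keep retrying with growing pauses."""
    if connect():
        return True
    for delay in backoff_delays(**backoff_options):
        sleep(delay)
        if connect():
            return True
    return False
```

In a real client, `sleep` would be `time.sleep` (or a timer in the browser), `jitter` would be set above zero, and a `False` result would be the cue to show the user a "please refresh" message.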

This is also the easiest way to maintain a single version of your backend running at all times (more on that later).

#2 – Active/Passive Setup

In an active/passive setup you’ll have idle machines sitting and waiting to pick up traffic when the active WebRTC media servers are down (usually for whatever reasons and not only on upgrades).

This alternative is great for high availability – offering uptime when machines or whole data centers break, as the time to migrate or maintain service continuity will be close to instantaneous.

The downside here is cost. You pay for these idle machines that do nothing but sit and wait.

There are variations of this approach, such as active-active and clustering of machines. I am not going to go into the details here.

In general, there are two ways to handle this approach:

  1. Upgrade the passive machines (maybe even create them just before the upgrade). Once all are upgraded, divert new traffic to them. Kill the old machines one by one as the traffic on them wanes
  2. Employ rolling upgrade, where you upgrade one (or more) machines each time and continue to “roll” the upgrade across your infrastructure. This will reduce your costs somewhat if you don’t plan on keeping 1:1 active/passive setup at all times

(1) above is the classic active/passive setup. (2) is somewhat of an optimization that gets more relevant as your backend increases in its size – it is damn hard to replace everything at the same time, so you do it in stages instead.
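The cost difference between the two can be sketched with a bit of arithmetic. The example assumes that in a rolling upgrade the replacement machines for a batch are brought up before that batch is retired, so the peak is the active fleet plus one batch. The server names and function names are made up.

```python
from typing import List


def rolling_upgrade_plan(servers: List[str], batch_size: int) -> List[List[str]]:
    """Split the fleet into the batches that get upgraded one after another."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [servers[i:i + batch_size] for i in range(0, len(servers), batch_size)]


def peak_machines(active: int, batch_size: int) -> int:
    """Machines running at the worst moment of the upgrade.

    batch_size == active is the classic active/passive switch (a full second
    fleet); smaller batches only ever need one extra batch on top.
    """
    return active + min(batch_size, active)


fleet = ["ms1", "ms2", "ms3", "ms4", "ms5"]
plan = rolling_upgrade_plan(fleet, batch_size=2)
```

With five media servers, the classic approach peaks at ten machines while rolling in batches of two peaks at seven, at the price of the upgrade taking three rounds instead of one.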

Note that in all cases from here on you are going to have at least two versions of your WebRTC media servers running in your infrastructure during the upgrade. You also don’t really know when the upgrade is going to complete – it depends on when people will close their ongoing sessions.

In some ways, the next two cases are actually just answering the question – “but what do we do with the open sessions once we upgrade?”

#3 – Sessions Migration First

Sessions migration first means that we aren’t going to wait for the current sessions to end before we kill the WebRTC media server they are on. But we aren’t going to just immediately kill the session either (as we did in option #1).

What we are going to do, is have some means of persistency for the sessions. Once a new upgraded WebRTC media server machine is up and running, we are going to instruct the sessions on the old machine to migrate to the new one.

How?

Good question…

  • We can add some control message and send it via our signaling channel to the clients in that session so they’ll know that they need to “silently” reconnect
  • We can have the client persistently try to reconnect the moment the session is severed with no explanation
  • We can try and replicate the machine in full and have the load balancer do the switchover from old to new (don’t try this at home, and probably don’t waste your time on it – too much of a headache and effort to deal with anyways)
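The first option in the list, a control message over the signaling channel, could look something like the sketch below. The message type, the field names and the `reconnect` callback are invented for this example; your signaling protocol is your own, so treat this as one possible shape and not a standard.

```python
import json


def build_migrate_message(session_id, new_server_url, deadline_s=30):
    """Server side: build a (hypothetical) 'migrate' control message."""
    return json.dumps({
        "type": "migrate",
        "sessionId": session_id,
        "target": new_server_url,
        "deadlineSeconds": deadline_s,
    })


def handle_signaling_message(raw, reconnect):
    """Client side: silently reconnect when told to migrate.

    Returns True if the message was a migrate request, False otherwise.
    """
    message = json.loads(raw)
    if message.get("type") == "migrate":
        reconnect(message["target"])  # hypothetical client reconnect hook
        return True
    return False


# Hypothetical usage: the "client" just records where it was told to go.
targets = []
raw = build_migrate_message("room-42", "wss://media-new.example.com")
handle_signaling_message(raw, targets.append)
print(targets)
```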

Whatever the technique, the result is that you are going to be able to migrate rather quickly from one version to the next – simply because once the upgrade is done, there won’t be any sessions left in the old machine and you’ll be able to decommission it – or upgrade it as well as part of a rolling upgrade mechanism.

#4 – Draining First

Draining first is actually draining last… let’s see why.

What we are going to do here is bring up our new upgraded WebRTC media servers, route all new traffic to them and… that’s about it.

We will keep the old machines up and running until they drain out of the sessions that they are handling. This can take a couple of minutes. An hour. A couple of hours. A day. Indefinitely. The type of service you have and how users interact with it will determine how long, on average, it takes for a WebRTC media server to drain its sessions with no service interruption.

A few things to ponder about here (some came from the replies to that original tweet):

  • WebRTC media servers can’t hold too much traffic (they don’t scale to millions of sessions in parallel)
    • With a large service, you can easily get to hundreds of these machines
    • Having two installations running in parallel, one with the new version and one with the old will be very expensive to operate
    • The more servers you’ll have, the more you’ll want to practice a rolling upgrade, where not all servers are upgraded at the same time
  • You can have more than two versions of the WebRTC media server running in parallel in your deployment. Especially if you have some really long lived sessions
  • You can be impatient if you like. Let sessions drain for an hour. Or two. Or more. And then kill what’s left on the old WebRTC media server
  • Media servers might be connected to other types of services – not only WebRTC clients. In such a case, you’ll need to figure out what it means to kill long lived sessions – and maybe decouple your WebRTC media server into smaller servers
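The "impatient" variant boils down to a small decision rule: kill an old machine when it is empty, or when it has been draining for longer than you are willing to wait. Here is a minimal sketch of that rule. The `MediaServer` fields and the two hour default are assumptions for the example, not something taken from a real media server.

```python
from dataclasses import dataclass


@dataclass
class MediaServer:
    name: str
    active_sessions: int
    drain_started_at: float  # seconds; when new traffic stopped being routed here


def should_terminate(server, now, max_drain_seconds=2 * 3600):
    """Return True when an old, draining server can be shut down."""
    if server.active_sessions == 0:
        return True  # drained naturally, nobody gets hurt
    # Out of patience: remaining sessions will be cut (or migrated).
    return now - server.drain_started_at >= max_drain_seconds


# Hypothetical usage: one server still busy after 30 minutes of draining.
old = MediaServer("media-old-1", active_sessions=3, drain_started_at=0.0)
print(should_terminate(old, now=1800.0))
```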
Why Most Developers Lean Towards Draining First?

Gustavo’s poll garnered only 6 answers, but they somehow feel right. They make sense from what I’ve seen and heard from the discussions I’ve had with many vendors out there.

And the reasons for this are simple:

  1. There’s no additional development on the client or WebRTC media servers. It is mostly DevOps scripts that need to reroute new incoming traffic and some monitoring logic to decide when to kill an empty old WebRTC media server
  2. There’s no service disruption. Old sessions keep running until they naturally die. New sessions get the upgraded WebRTC media servers to work on
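The DevOps side of those two reasons can be sketched in a few lines: send new sessions only to upgraded machines, and keep an eye out for old machines that have emptied. The dictionary keys and version labels below are hypothetical; a real deployment would read them from its own monitoring.

```python
def pick_server_for_new_session(servers, target_version):
    """Route a new session to the least loaded server on the new version."""
    candidates = [s for s in servers if s["version"] == target_version]
    if not candidates:
        raise LookupError("no upgraded media server available")
    return min(candidates, key=lambda s: s["sessions"])


def empty_old_servers(servers, target_version):
    """Names of old-version servers with no sessions left (safe to remove)."""
    return [
        s["name"]
        for s in servers
        if s["version"] != target_version and s["sessions"] == 0
    ]


# Hypothetical fleet snapshot in the middle of an upgrade.
fleet = [
    {"name": "media-old-1", "version": "1.0", "sessions": 0},
    {"name": "media-old-2", "version": "1.0", "sessions": 4},
    {"name": "media-new-1", "version": "1.1", "sessions": 7},
    {"name": "media-new-2", "version": "1.1", "sessions": 2},
]
print(pick_server_for_new_session(fleet, "1.1")["name"])
print(empty_old_servers(fleet, "1.1"))
```

Nothing in this touches the client or the media server code itself, which is a big part of why draining first is popular.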
What’s next?

If you are planning on deploying your own infrastructure for WebRTC (or have it outsourced), you should definitely add into the mix the upgrade strategy for that infrastructure.

This is something I overlooked in my WebRTC Requirements How To – so I just added it into that template.

Need to write requirements for your WebRTC project? Make sure you don’t miss out on the upgrading strategy in your requirements:

Get my WebRTC Requirements How To

The post How do you Upgrade Your WebRTC Media Servers? appeared first on BlogGeek.me.

Is WebRTC Safe?

Mon, 08/07/2017 - 12:00

Yes.

In recent years, we’ve seen a lot of hysteria going on around WebRTC. Mainly it being unsafe to use. So much so, that there are tutorials out there explaining how to disable it in every conceivable browser out there.

This all reminds me of the past (and present?) hysteria around running JavaScript code inside the browser – and again – how to disable it.

If you are developing a WebRTC application AND you care about the security of your service and the privacy of your users, make sure to review my WebRTC Security Checklist.

Get the WebRTC Security Checklist

Why is WebRTC Perceived as Dangerous?

WebRTC is a real time communication technology that is embedded in the browser. It can access your camera and your microphone as well as share the contents of your screen. As such, it gives a browser (and web developers) access to a lot more resources on the device of an end user.

This boils down to two main risks:

  1. Your data can be stolen by nefarious people
  2. Your privacy can be breached by knowing more about your device

1. Your data can be stolen by nefarious people

Here are a few scary ideas:

  • If I can access your microphone, I’ll be able to record all of your conversations
  • If I can access your camera, I’ll be able to snoop on you. Maybe take a nice recording of your intimate moments
  • If I can access your screen remotely, I’ll be able to record what you’re doing. Maybe even control your mouse and keyboard remotely while at it?

With all the goodness WebRTC brings, who wants to be spied on by his own device?

Now, that said, we also need to understand two things here:

  1. The browser isn’t the only game in town for gaining this access to your data and actions
  2. There are measures put in place to limit the ability to conduct such activities

2. Your privacy can be breached by knowing more about your device

This one I guess is mostly about tracking you over the internet. Which is what ad networks are doing most of the time.

WebRTC gives access to more elements that are unique, which makes fingerprinting of the device (and you) a lot more accurate. Or so they say.

The main concerns here are around the exposure of private IP addresses to web servers. There are many out there who see these “IP leaks” as a serious threat. For most of humanity, I believe it isn’t, which is why I’ll gladly publish my private IP address here: 10.0.0.9.
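If you wonder why publishing that address is harmless: it sits in one of the private address ranges, which are reused inside countless home and office networks and are not routable on the public internet. A quick check with Python's standard `ipaddress` module shows the difference (the other two addresses are just examples picked for contrast):

```python
import ipaddress


def is_private_address(text):
    """True if the address belongs to a private (non-public) range."""
    return ipaddress.ip_address(text).is_private


for candidate in ["10.0.0.9", "192.168.1.20", "8.8.8.8"]:
    print(candidate, is_private_address(candidate))
```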

There are other, more nuanced ways in which WebRTC can be used for fingerprinting, such as enumerating the device list as part of your device’s unique identity. Which is a concern, until you review the accuracy of fingerprinting without even using WebRTC. Here are two resources for you to enjoy:

  1. Panopticlick – EFF’s fingerprinting check up and research. If you are not unique – comment below – I have a feeling your browser is as unique as mine. Their TL;DR? Disable JavaScript (which will be too much work) or use a more “common” browser. I am NOT making this up:
  2. Fingerprintjs2 – one of the many libraries available to fingerprint your browser. It doesn’t use WebRTC, although there’s an “intent” in there to add support to it

In this area, Apple with their new WebRTC support in Safari is leading the way in maintaining privacy. You can read about it in a recent article in the WebKit blog. Look specifically at the sections titled “ICE Candidate Restrictions” and “Fingerprinting”.

Why is WebRTC the Safest Alternative?

If you are a developer looking for a real time communications technology to use in your application, or you are an IT person trying to decide what to deploy in your company, then WebRTC should be your first alternative. Always.

Here’s why.

1. Browser vendors take security seriously

There are 4 major browser vendors: Apple, Google, Microsoft and Mozilla

All of these vendors are taking care of security and patching their browsers continuously. In some cases, they even roll out new versions at breakneck speeds of 6-8 weeks, with security patches in-between.

If a security threat is found, it gets fixed fast.

While many other vendors can say that they are fixing and patching security threats fast – do they deploy them fast? Do they have the means to do so?

Since browsers get updated and upgraded so frequently, and to hundreds of millions of users, getting a security patch to the field happens rather fast. Philipp Hancke showed and explained some browser upgrade stats last year. This is from real users conducting appear.in sessions. I asked him to share a more recent graph, and here’s what they’ve had in the last two large browser version cycles for Chrome:

Look at the point in time when each Chrome version got ramped up from less than 30% to over 80% in a span of a couple of days. Chrome 59 is especially interesting. Also note that there are at most 2 versions of Chrome out there accounting for over 95% of use. Since they routinely do it, patching security issues and deploying the fixes is “easy”.
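If you log the browser version of every session on your own service, you can compute the same kind of numbers yourself. The sketch below shows the arithmetic only. The sample is made up for illustration and is NOT the appear.in data behind the graph.

```python
from collections import Counter


def version_shares(versions_seen):
    """Share of sessions per browser version, most common first."""
    counts = Counter(versions_seen)
    total = sum(counts.values())
    return {version: count / total for version, count in counts.most_common()}


def top_n_share(versions_seen, n=2):
    """Combined share of the n most common versions."""
    counts = Counter(versions_seen)
    top = sum(count for _, count in counts.most_common(n))
    return top / sum(counts.values())


# Invented sample: one entry per session, value is the Chrome major version.
sample = [59] * 60 + [58] * 36 + [57] * 3 + [56] * 1
print(version_shares(sample))
print(top_n_share(sample))
```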

The only other vendors who can roll out and deploy patches so fast? Operating system vendors (again we end up with Apple, Google and Microsoft), and application developers, through mobile app stores (which sums up to Apple and Google).

Nothing comes close to it.

Takeaway: Assume there will be security breaches or at the very least the need to patch security issues. Which means you should also plan for upgrade policies. Browsers are the best at upgrading these days.

2. You don’t need to redeploy the client software

Let’s face it – most users don’t disable the automatic update policy of their browsers. If you’re even remotely interested in security, you shouldn’t disable automatic update policies of ANYTHING.

Manual updates bring with them a world of pain:

(a) When do you upgrade?

Here’s the thing.

How do you know an upgrade is in order? Are you on the list of threat alerts of all the software and middleware you are using in your company? Once a threat is announced and a patch is available – do you immediately upgrade?

When we leave this decision to a human, then he might just miss the alert. Or fail to upgrade. Or decide to delay. Just because… he’s human.

Most software can get updated, but usually won’t do it automatically or won’t do it silently. There is automation in this area that is done externally, such as the Kaspersky Software Updater. It works, but only up to a point, and it adds another headache to contend with and manage.

If a browser does that for you freely, why not use it?

(b) What if this fails?

Did you ever get a software update to fail?

What about doing that in a company with 100+ employees?

If software fails to update 1% of the time, it means that every time you update something – someone will complain or just fail to update, making you revert back to a manual process.
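The arithmetic behind that claim is worth a quick look. Assuming each machine fails independently with the same probability (a simplification, since real failures tend to cluster around specific drivers or OS versions), the chance that at least one update fails is 1 - (1 - p)^n:

```python
def p_at_least_one_failure(devices, failure_rate):
    """Probability that at least one of `devices` updates fails."""
    return 1 - (1 - failure_rate) ** devices


def expected_failures(devices, failure_rate):
    """Average number of failed updates per rollout."""
    return devices * failure_rate


# 100 employees, 1% failure rate per update.
print(round(p_at_least_one_failure(100, 0.01), 3))
print(expected_failures(100, 0.01))
```

So with 100 machines you expect one failure per rollout on average, and roughly 63% of rollouts will have at least one. At 500 machines it is practically guaranteed.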

There are tons of reasons why these processes fail, and most are due to the fact that we all have different firmware, software and device drivers on our machines (see fingerprinting above). This fact alone means that if a software isn’t running on millions of devices already, it will fail for some. I’ve seen this too many times when the company I worked for developed a plugin for browsers.

Anyone not using WebRTC and deploying via software installation will cause you grief here. If this is only in front of employees, then maybe that’s fine. But often times this is also with end user devices – and you don’t want to mess there.

Browser upgrades will fail a lot less often, so better use that and just make use of WebRTC instead of rolling your own proprietary solution.

(c) What about edge cases?

You can’t control your employees and their whereabouts for your upgrades.

People working from home.

People traveling abroad.

People using BYOD and… not having tight enterprise policies on their own home laptop.

If you want less headache in this department, then again – using WebRTC will give you peace of mind that security patches get updated.

Why?

Look at it this way, the engine of WebRTC will always stay secure when you rely on browser and browser updates.

You have control over the backend (or rely on a cloud service provider with an SLA you are paying for exactly for this reason). The backend gets updated for security patches all the time (or as much as you care). The browsers get updated automatically so you can think less about it.

Using proprietary software or legacy VoIP vendor software means you’ll need to patch both backend and client software. This is harder to do and maintain – and easier to miss.

3. WebRTC has inherent security measures in place

This should probably be the first reason…

One thing you hear many complain about is that WebRTC is always encrypted. Somehow, developers decided that sending media in the clear is a good thing. While there might be some reasons to do that, most of them are rather irrelevant for something like WebRTC, meant to be used on unmanaged networks.

WebRTC took the approach of placing its security measures first. This means:

  1. There’s no way to send media in the clear. Everything is always encrypted. In other VoIP solutions, you can configure encryption on and off (if encryption is even there)
  2. There’s no way to use WebRTC in websites that aren’t served over HTTPS. This means WebRTC forces developers to use secure connections for signaling – and for the whole site. And no. Using iframes won’t work either
  3. Users are asked to allow access to their media inputs. Each browser handles this one slightly differently, and these models also change over time, but suffice to say that the idea here is again – to balance privacy of the users and the usability of the service
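The HTTPS requirement in (2) can be approximated with a simple check on the page URL. The sketch below is a rough simplification of what browsers call a "secure context": it is my own approximation, not the browsers' actual algorithm, and it ignores things like the iframe rules mentioned above. Browsers also commonly treat localhost as secure so that developers can test locally, which the sketch mirrors.

```python
import ipaddress
from urllib.parse import urlsplit


def looks_like_secure_origin(url):
    """Rough approximation: would a browser allow camera access on this page?"""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.scheme in ("https", "wss"):
        return True
    if host == "localhost" or host.endswith(".localhost"):
        return True  # local development exception
    try:
        return ipaddress.ip_address(host).is_loopback  # 127.0.0.1, ::1
    except ValueError:
        return False  # a regular hostname over plain HTTP


for page in ["https://example.com/call", "http://example.com/call",
             "http://localhost:8080/"]:
    print(page, looks_like_secure_origin(page))
```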

Me? I’d rather rely on the security measures placed in browsers. These go through the scrutiny of lots of people who are all too happy to announce these security flaws. Software from vendors that is specific to communications? A lot less so.

And yes. This isn’t enough. WebRTC is the building block used to build an application. A lot of what goes to the security of the finished service will rely on the developers who developed the application – but at least they got a head start by using WebRTC.

Ads and WebRTC

There’s an angle that isn’t much discussed about WebRTC. And that’s the uses it finds in the ad business.

The Bad

Two main scenarios that I’ve seen here:

  1. Fingerprinting. You get better means to know more about who the user behind the browser is
  2. Serving ads themselves. Theoretically, you might be able to serve ads via WebRTC, and that at the moment has the potential to circumvent ad blockers

The Good

There’s the second part of it. When ads are served today, the companies paying for these ads being served like to get their ROI. On the other hand, there are those who would like the money spent on ads to be wasted. So they use bots to click ads. Probably by automating Selenium processes.

This is similar in concept to the “I am not a robot” type of entry measures and captchas out there. WebRTC gives another layer of understanding about the user and its behavior – and enables us to know if he is a human or a bot inside that browser. And yes. We can use it for things other than ad serving.

Where do we go from here?

There are two main approaches to security:

  1. Security by obscurity – relying on people not knowing the protocol in place. It works great when you’re small and insignificant, so no one is going to care about you anyway. It falls apart when you become popular
  2. Kerckhoffs’ principle – a system needs to be secure even when we know everything about the system. It works best when many people scrutinize, analyze and try to hack such systems, making it better and more robust through time

WebRTC is in the second category (the first one – security by obscurity – is often criticized for being insecure by nature).

With all the resources put into WebRTC from all angles, security is also being taken care of and not left behind.

WebRTC is safe to adopt as developers. IT and security people in the enterprise shouldn’t shy away from it either – just make sure the vendor you pick did a decent job with his implementation.

Are you doing what it takes to improve the security of your WebRTC application?

Get the WebRTC Security Checklist

The post Is WebRTC Safe? appeared first on BlogGeek.me.

Sound Gurus Finding a Home in WebRTC

Mon, 07/31/2017 - 12:00

When it comes to different verticals and market niches, it seems like WebRTC can fit anywhere.

6 years in, and there are many who still question if WebRTC is the way to go with their use case. This is one of the reasons why I started the WebRTC Dataset. The idea behind it all was to showcase all the variations and services where WebRTC is being used.

Here’s an example for you.

Musicians of all kinds make use of WebRTC. They have services today that are geared towards their specific needs. And I am not talking only about replacing Skype with a marketplace or a searchable directory of experts that can help you take private guitar lessons online.

When I bumped into Profound Studio, I knew this is an area I’d like to write a bit more about, so here it goes.

What I will be doing in this article, is go over some of the vendors found in the WebRTC Dataset, collected over the years, who are playing a role in the sound/music industry in one way or another.

I won’t be picking favorites here – my own experience with music is rather dull – I like to hear music just like anyone else, but I don’t consider myself an expert or a fan of anything really. This means that we’ll be going in alphabetic order of the vendors.

Care 2 Rock

Care 2 Rock is that we-teach-guitar-lessons use case with a twist.

The basic premise is teaching music lessons of any kind online, through a video call. The twist is that this is a paid/voluntary act on the side of the teacher, who ends up teaching and mentoring a foster care kid in his community.

Profound Studio

Profound Studio connects musicians with recording experts.

This is a marketplace for professionals – not for hobbyists. You can run live classes there or do consultation calls.

sofasession

sofasession is all about musicians making music together online. The majority of it is done asynchronously, where each musician contributes and edits tracks of the final masterpiece, but they don’t need to be live together at the same time. And still, this kind of a use case can use WebRTC.

Here’s a job posting they had from two years ago:

We will use Kurento as media server and extending the service for multitrack mixing and reducing latency by developing latency reducing algorithms for serving content to clients connected via the webRTC protocol.

For the layman, handling audio in realtime in the browser to handle that mixing module that’s inside sofasession requires low latency. The best way to get there is to have something like WebRTC manage it – we are talking about real time here.

I am not sure if and how far they went with their WebRTC support. They do support live jam sessions, but there’s a need to download dedicated software for that. It does use UDP to work, so there still might be some WebRTC in there.

Soundtrap

Soundtrap is similar to sofasession. It too focuses on musicians collaborating online.

Being based in Stockholm, where some of the Google WebRTC team are located, even got it to appear at Google I/O in 2014:

StreetJelly

StreetJelly is about live performances. It allows artists to stream their music live to a global audience.

At the moment, performing and viewing is free. Viewers can tip performers if they want.

On the technical side, StreetJelly uses HTML5 video playback for the viewers and either Flash (now dead) or WebRTC (the new method for StreetJelly) to be able to broadcast the performances. They explain this further here.

Shots in the dark

Other vendors such as Rapt.fm and unltd.fm came and went.

As with any set of startups, the vendors in this space don’t always succeed.

How will others fare? Time will tell.

The Appeal

Music and WebRTC. There’s an appeal there.

VoIP was crap most of the time up until recently. There were two main reasons for it:

  1. The selection of voice codecs, with the dominant ones being narrowband codecs like G.711 or the G.729. When you did get a better sounding codec, it was either protected by patents, not suitable for real time or just stuck in wideband but still focusing on speech and not music
  2. These codecs aren’t usually error resilient. The moment you introduce packet loss (and these happen regularly), the audio quality suffers. So something had to be done on that front

WebRTC comes with Opus “out of the box”. A wideband codec suitable for music – not only speech, which is royalty free, low latency and error resilient. To top it all – it is mandatory in WebRTC and one of only two such codecs (the other one being the ridiculously crappy G.711). What’s not to like as a musician?
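To see why G.711 is a poor fit for music, a bit of arithmetic helps. G.711 samples audio 8,000 times a second at 8 bits per sample, and a sampled signal cannot carry frequencies above half its sampling rate. Opus can work with sampling rates of up to 48 kHz. The helper names below are mine, used only to show the numbers:

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Raw bitrate of a sampled audio stream, in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def nyquist_limit_hz(sample_rate_hz):
    """Highest audio frequency a given sampling rate can represent."""
    return sample_rate_hz / 2


print(pcm_bitrate_kbps(8000, 8))   # G.711 bitrate in kbit/s
print(nyquist_limit_hz(8000))      # upper bound for G.711 audio, in Hz
print(nyquist_limit_hz(48000))     # upper bound at Opus' top sampling rate
```

A ceiling of 4 kHz is fine for understanding speech. Most of what makes a guitar or a cymbal sound like itself lives above it.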

Why is this important?

Well… here’s the kicker.

None of it is new.

VoIP and video calling could have done this all before WebRTC.

But it didn’t.

Why?

Costs. Barriers of entry. Finding talent.

WebRTC solves all that, which is why I categorize all of these vendors and many others as WebRTC vendors.

They don’t care about WebRTC as a technology – for them it is just a means to an end, which is just fine.

But what about you?

Want to learn more about a specific market niche where real time communications can be of use? Want to instantly find who’s there already and what they are doing?

You’d better take a look at the WebRTC Dataset. Especially today, before the earlybird discount ends.

Get access to the WebRTC Dataset

The post Sound Gurus Finding a Home in WebRTC appeared first on BlogGeek.me.

Drilling into the WebRTC Dataset

Mon, 07/24/2017 - 12:00

Knowledge=Power. Which is why the WebRTC Dataset might be just what you need.

A Quick History Lesson

You see numbers flying around about WebRTC all the time. One of them is the number of vendors using WebRTC. 1,200 might sound familiar in that context. Well… it comes from the WebRTC Dataset that I am maintaining.

It all started ages ago. I think it was Alan Quayle who made a shortlist of the companies that were using WebRTC that he knew about. That was somewhere in 2012. Which made me start my own Excel sheet. Which was then converted into a Google sheet. Which was then converted into a whole operation of how to find, catalog and update a dataset.

The reason? One of the main companies who are influencers in WebRTC wanted access to it and were willing to pay, so I made it into a product. Since then it has had a few more customers who got exclusive ongoing access to this dataset, and now, I decided to repackage it in a different fashion, making it more accessible to more companies.

What’s in the WebRTC Dataset?

The WebRTC Dataset itself is a collection of vendors and projects who are making use of WebRTC in one way or another. It can be anything from a healthcare service to an outsourcing vendor to a live streaming service or a contact center.

The list today includes around 1,200 vendors and counting – it grows and gets updated on a monthly basis.

You’ll find in the dataset vendors large and small. Anything from Google, Cisco and Facebook to small startups and even individual projects that are popular or interesting enough.

You’ll find there acquisitions made in the industry, with reasons behind them and my own indication of how related they are to WebRTC.

You’ll find there vendors who have shut down. Those who have pivoted and changed their focus.

When the information is publicly known, available or can be found online – the suppliers that are used by a vendor are also indicated.

Here’s an example vendor’s information from the WebRTC Dataset:

The page is split into several parts:

  1. The top part, where general information about the company/project can be found. Including things like size, ranking, status and external sources such as Crunchbase
  2. Then there’s the verbal description and notes, which get updated through time as the offering evolves
  3. After that, different classifications. These are parameters that you can easily use to filter out or find similarly typed vendors in the dataset
  4. Then come links from other sources as well as my own blog and the latest tweets from that vendor
  5. Last but not least, a quick form in the end allows you to ask anything you have about this vendor and get direct answers from me

Where does this data come from?

All over the web.

Since I am actively working on projects like WebRTC Index and WebRTC Weekly, I get to keep tabs on anything related to WebRTC. I go over the blogs of all the vendors using WebRTC and investigate anything that looks like RTC that I bump into in whatever it is that I am reading. On top of that, I use additional sources like Google Alerts and a few other trade secrets.

And I’ve been doing this since 2013.

The data in the WebRTC Dataset got created along the way. First as a resource for me to use whenever I need research information on certain domains. And then because it made sense to package it as a distinct product of its own.

Whatever is in the WebRTC Dataset is something you can go and find on your own. But it will take you time. Lots and lots of time.

What can you DO With the WebRTC Dataset?

Lots of things actually. It all depends on what it is you’re trying to gain.

Here are a few ideas and ways people have been using it already:

  1. Mark potential companies as leads for your salespeople – if you have a cool solution or service that can fit a certain segment, then you can find some interesting companies who might be interested in what you have to offer in this dataset
  2. Check out your competitors – find who they are. See who their known customers are. Compare them to your own company
  3. Find target markets and their size – need to decide where to put your focus? Should it be Healthcare or should it be Education? Should you offer a click to dial button as a service or go for a video chat widget instead? Who are the competitors in the niche you’re trying to carve for yourself? What are they doing?
  4. Understand market trends – here’s what Serge Lachapelle, who was Group Product Manager at Google heading their WebRTC efforts, had to say of the time he used the WebRTC Dataset:

The dataset enables me to understand where the WebRTC platform is going and make strategic roadmap decisions based on where the innovation and heavy usage lies. Being able to get an updated complete view of the market at any given point in time over a large set of criteria makes it easy to see trends in different industries and verticals that make use of WebRTC.

I am sure you’ll be able to find other ways to use it if you only think about it.

And me?

I use this WebRTC Dataset all the time. One of the things I use it for is my annual “WebRTC State of the Market” infographic.

Here’s the one for 2016 and the one for 2017 that I created.

How about a sneak peek?

If you want to see what the WebRTC Dataset feels like to use, then here’s a short video:

I’m interested. Now what?

Access to the WebRTC Dataset comes at $2,400.

The WebRTC Dataset access gives you 1 month of access to all the vendors there. You’ll be able to download the main worksheet and use it after that month is up.

You can decide to purchase it at any point in time, just head to the WebRTC Dataset page.

While we’re at it – if you decide to purchase before the end of July (even if you plan on using it later on), there’s an early bird discount of $400. Just use coupon code DATASET-EARLYBIRD.

Get the dataset

The post Drilling into the WebRTC Dataset appeared first on BlogGeek.me.

Google and WebRTC. An Interview with Niklas Blum

Thu, 07/20/2017 - 12:00

Where are we headed with WebRTC?

Google made an interesting announcement recently. It was about WebRTC 1.0 and Google’s own commitment to it. It seems we’ve come to a point in time when WebRTC is considered a done deal and the rest are just details – getting bugs fixed and polishing its performance.

I wanted to understand a bit more where we are headed, from the point of view of the company who led the effort up until now. So I reached out to Niklas Blum, who is leading product management for WebRTC at Google, to answer a few of my questions.

What is it like to manage something like WebRTC at Google?

WebRTC is an exciting project. It is one of these kinds of projects that are only possible at companies like Google and a few other places when you think of scale and impact of the technology. We started about 6 years ago as an open source project in Chrome and now WebRTC is providing the stack for an ecosystem for real-time communication services on the Web. From a product management perspective there are tons of requirements impacting the platform – ranging from enterprise multi-party communications to p2p video calling on bad networks and even streaming services. It’s a very challenging and exciting time, with so many opportunities to further evolve the product.

What metrics do you use to gauge WebRTC’s success?

We have very practical metrics like number of API requests and amount of media/data being consumed in Chrome from users that opt in to share this data with us. From a product perspective, I like to measure the impact of the technology on the Internet. You are tracking for example the number of projects and services that build with WebRTC. The latest update I got from you was around 1200 projects and companies. I think this is a great metric reflecting the success of WebRTC and the impact we achieved by open sourcing it.

You recently made an announcement in discuss-webrtc around WebRTC 1.0. Why now?

We are reaching our goal of having all the standards defined, and the technology is now stable enough for everyone to use. The web-based RTC ecosystem is becoming mature as more and more services that build on top of WebRTC are getting massive reach.

With Chrome, Edge, Firefox and Safari supporting WebRTC, about 80% of all installed browsers globally now have WebRTC built in. This is a big milestone for us as we are achieving our initial goal of making audio and video available in all browsers, through a uniform standardized set of APIs. Additionally, formerly application-focused communication services are transitioning towards the Web platform and adopting WebRTC.

We believe that interoperability between different WebRTC browsers is now of key importance to continue growing the adoption of WebRTC. It’s also of key importance to provide stability and a common ground to services and companies to continue growing a user base and eventually a flourishing business.

6 years in. What would you say worked great with WebRTC and what needs some improvement?

Our original mission to bring secure p2p real-time communication to the web has become real. This by itself is a major contribution to the Web platform and the team is incredibly proud of this achievement. Our current efforts can be split into two main categories:

  1. Finalize the specs in Chrome
  2. Provide enterprise-grade reliability

We are working very hard on performance and to iron out remaining reliability issues in Chrome to make WebRTC the solution of choice for enterprise-grade communication services. These efforts address bugs like missing audio input from the microphone or the camera not being detected. We are also getting close to launching a completely new echo canceller in Chrome for desktop. This should significantly improve the call quality when no headset is used on various devices. Additionally, we have major projects aiming at removing glitches in the audio and video capture and rendering processes. We are porting these time- and resource-critical processes to Mojo, a new process framework in Chrome. This will allow us to have much better real-time performance in Chrome.

 

Looking 2 years ahead. What should we expect to see coming to WebRTC? AV1? Support for AR? …

Google is a founding member of AOMedia and very active in defining the AV1 bitstream. Once AV1 is finalized we will start work on adding it to WebRTC. AR/VR/Mixed Reality is a completely new technology space with the potential to fundamentally change how we consume services and media. But the platform and overall product/market-fit is still unclear. Still, adding AR/VR functionality to WebRTC is definitely an interesting idea.

An interesting opportunity for evolving WebRTC is to replace RTP with QUIC. Experimenting with QUIC as a media transport protocol could reduce the transport-layer protocol overhead and provide integrated congestion control. We are also considering using QUIC for the DataChannel, which is being used a lot by p2p CDNs for content distribution. Generally, we believe that there are still quite a few opportunities for reinventing real-time communications.

Looking a bit further ahead, a new mobile network generation (5G) is emerging. Which role WebRTC will play here still needs to be identified. But generally, having more bandwidth and lower latency available will open the door to explore video resolutions >4K and massive parallel connections. Additionally, having new software-defined networking functionality exposed to the application-layer seems to be an interesting option to improve real-time communication services. It will be very interesting to see the opportunities for WebRTC here.

 

During your time as the product manager of WebRTC at Google. What was the thing that surprised you the most?

I am still surprised every day by the creativity of developers building great services on top of WebRTC and the value those provide to users. A company called Qbtech, for example, uses WebRTC in a product that assesses symptoms of ADHD. While traditional methods for assessing ADHD typically use subjective rating scales from physicians, Qbtech provides objective measurements by analyzing motion tracking over video. After implementing WebRTC, they went from specialized hardware to a web application that could run on a normal computer, opening up access to this technology to smaller clinics, schools, and even rural providers that might not have the resources for more specialized solutions.

Of course, there are many other great services that use WebRTC, but it’s this kind of out of the box thinking to apply WebRTC beyond its original audio/video calling use case and the value that is created by it that always surprises me.

How can developers contribute to WebRTC?

We have received thousands of user feedback reports and feature requests in the WebRTC and Chromium trackers from the growing WebRTC developer community. This feedback has been extremely valuable to improve WebRTC overall and especially to make it more stable for production usage. Generally, developers can provide feedback at bugs.webrtc.org by filing bugs or feature requests. And if you want to do more – you can become a contributor and help us polish the codebase – either as an individual or a company.

 

The post Google and WebRTC. An Interview with Niklas Blum appeared first on BlogGeek.me.

Prepping for that Big WebRTC Product Launch

Mon, 07/17/2017 - 12:00

Are you sure you’re ready with your WebRTC product launch?

Here’s the thing. If you want to have a successful launch at the end of the project, you should focus on the planning phase in the beginning. Oh – and your plan should be different if you are going to self develop it all in-house or have the communication parts of it outsourced to external vendors.

Too often people contact me when they have already budgeted the project, spent the money, have a “product” in hand but it is lacking.

Two extreme cases recently:

  1. A startup hired a development company who said they know WebRTC. They were so good that they said there is no need to use a media server for 8 participants in a session unless the session is being recorded (if you think that way too, then you are more than welcome to take my WebRTC course)
  2. Another startup got delivered a finished product. Just to find out it didn’t even come with a TURN server
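Both cases can be sanity-checked with a few lines of code. What follows is a rough sketch, not a sizing tool: `meshLoad` and `hasTurnServer` are hypothetical helper names, the 1500 kbps per-stream bitrate is an assumed figure for illustration only, and the `IceServer` shape merely mirrors the `iceServers` entries you would hand to a peer connection. The first function counts what a full mesh (no media server) asks of every participant; the second checks whether a list of ICE servers contains any TURN entry at all.

```typescript
// Hypothetical helpers for illustration; names and numbers are assumptions.

interface MeshLoad {
  participants: number;
  uplinkStreamsPerUser: number;   // one outgoing stream per remote peer
  downlinkStreamsPerUser: number; // one incoming stream per remote peer
  totalPeerConnections: number;   // n * (n - 1) / 2 across the whole session
  uplinkKbpsPerUser: number;      // assumed bitrate multiplied by outgoing streams
}

// Full mesh: every participant connects directly to every other participant.
function meshLoad(participants: number, kbpsPerStream: number): MeshLoad {
  if (Math.floor(participants) !== participants || participants < 2) {
    throw new RangeError("participants must be an integer of at least 2");
  }
  const peers = participants - 1;
  return {
    participants,
    uplinkStreamsPerUser: peers,
    downlinkStreamsPerUser: peers,
    totalPeerConnections: (participants * peers) / 2,
    uplinkKbpsPerUser: peers * kbpsPerStream,
  };
}

// Minimal stand-in for an ICE server entry (urls may be a string or a list).
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

// True if at least one entry uses the turn: or turns: URI scheme.
function hasTurnServer(iceServers: IceServer[]): boolean {
  return iceServers.some((server) => {
    const urls = Array.isArray(server.urls) ? server.urls : [server.urls];
    return urls.some((url) => /^turns?:/i.test(url));
  });
}

// Example: the 8-participant session from case 1, at an assumed 1500 kbps per stream.
console.log(meshLoad(8, 1500));

// Example: a STUN-only configuration, as in case 2 (placeholder hostname).
console.log(hasTurnServer([{ urls: "stun:stun.example.org:3478" }]));
```

With 8 participants, each browser has to encode and send 7 streams and receive 7 more, over 28 peer connections in total. At the assumed bitrate that is roughly 10.5 Mbps of uplink per user, which is why a media server usually enters the picture long before recording does.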

We see this even more at testRTC, where I am a co-founder. Companies come to us because they think the product a third party developed for them doesn’t work as expected, and too often it breaks in ways that are unacceptable (like when stressing the service with only 20 browsers).

The problem is finding these issues too late in the game, and paying dearly for them.

There are lots and lots of images out there that illustrate the issue. I’ll use the one from Raygun:

We can dispute the multipliers. They don’t really matter. But here’s how a typical WebRTC product with outsourced development takes place:

  1. Someone writes down the requirements (amount of detail varies wildly here, and of course, you can use my requirements template)
  2. He sends a few development companies the RFP. Mostly, these will be sent to local developers, but sometimes to other vendors as well
  3. Once responses are back, he will pick one. Almost at random… (I know it isn’t random, but it will mostly lean towards cost, as the one in charge knows little in this domain anyway)
  4. Now we wait. Just like placing a cake in the oven and waiting for it to cook. Once the big day arrives, the customer plays with what he gets back and finds out there are some holes in it. Areas left uncovered, or just an impression of poor media quality

Here’s the problem. Steps (1), (2) and (3) are the Design phase. And no one took any real ownership of it in this scenario.

Step (4) is probably Development, Testing and Staging. And they were all left to an outsourcing vendor. Who is most likely looking at this project from a cost perspective as well – but he doesn’t really care if this gets launched or not. Not really.

The customer got to step (5) immediately. With no milestones along the way. No checkpoints to see if everything is done correctly.

And please don’t sell me the story of agile development and how that will save the day for this customer. With agile he sees the “results” every week or two. In every sprint. But does he really know if that first Design phase was done properly?

Do you think you are getting a stable product that can scale to the millions (or even thousands) of users you plan on having? Are you sure your contractor feels the same way and didn’t build you a proof of concept instead?

Two things to do NOW about your WebRTC project:

#1 – Make sure you have a solid WebRTC architecture

Do you trust your vendor to build you the architecture you need?

Don’t.

You should do your homework or bring someone on “your side” that knows what he is doing.

Go now and look at the architecture you’ve been promised.

Take that architecture you’ve been given by the vendor and get a second opinion. It will be worth the time and effort.

#2 – Register to this free webinar

At testRTC we’ve decided to host a free webinar that deals with these issues exactly. And me? I decided to announce it here as well because I think it fits my readers AND it gets more people to know about testRTC, which is another thing I am a part of.

WebRTC: How NOT to Fail in Your BIG Launch

The webinar will take place on July 25 at 14:30 EDT.

It will be recorded and available for playback, so register now.

See you then!

 

The post Prepping for that Big WebRTC Product Launch appeared first on BlogGeek.me.
