A SIP server has a central role in any modern voice or video communications infrastructure. It performs many tasks, and each of them is crucial for reliable end-to-end communication. In this blog post we explain what a SIP server is, how it works and why it is such a crucial part of a SIP or WebRTC-based real-time communications infrastructure. The post is part of a mini-series of technical blog posts looking at the key capabilities needed for mission critical, real-time voice and video communications services. A separate post on NAT traversal dives into the challenges that private and public IPv4 addresses cause and how NAT traversal mechanisms address them.

What is a SIP Server?

A SIP server is the base of a SIP infrastructure. It includes features such as SIP registrar, SIP proxy and usually also some sort of NAT and firewall traversal mechanisms, like media relay. The SIP server may also include, or be fronted by, a policy-based security and border control function, a Session Border Controller (SBC).

The SIP infrastructure allows SIP or WebRTC-enabled devices to successfully register their identities with the SIP server, enabling them to receive and initiate inbound and outbound calls, assuring successful end-to-end communications.

A conceptual view of the SIP server and its location in the network.


Next comes a description of each SIP server feature mentioned above, followed by an illustration showing the signaling and media flow between the features.

SIP Registrar

The SIP registrar is a SIP server entity combined with a user database. The database includes provisioned usernames and credentials used for authentication of the registration and session establishment. During the registration process the SIP-enabled device is authenticated towards the SIP registrar service. After a successful authentication, the SIP registrar stores an association between the user identity and the current contact information (IP address or domain name (FQDN)) of the device for the time specified in the registration. The device contact information is then used by the SIP proxy to contact the user and updated by the SIP device using the same procedure.
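The authentication step can be sketched as follows. The text above does not say which scheme is used; this sketch assumes HTTP Digest authentication, which SIP commonly uses, in its original MD5 form without the optional qop parameter. The function names, the realm and the credentials are hypothetical.

```python
import hashlib
import hmac


def _md5_hex(text):
    """Hex-encoded MD5 of a string, as used by classic Digest authentication."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Compute the Digest response (assumed variant: MD5, no qop)."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")
    ha2 = _md5_hex(f"{method}:{uri}")
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")


def verify_register(provisioned_password, username, realm, uri, nonce, client_response):
    """Registrar side: recompute the response from the provisioned
    credentials and compare it with what the device sent."""
    expected = digest_response(username, realm, provisioned_password,
                               "REGISTER", uri, nonce)
    return hmac.compare_digest(expected, client_response)
```

The registrar never receives the password itself. It only checks that the device could compute the same hash from the nonce the server issued.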

If the user is not registered with the SIP registrar, no information is available about where to reach the SIP device, and sessions to that user cannot be established. In summary, the SIP registrar is used to “publish” the current contact information of a specific user, and that information is stored in a location service.
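To make the registrar and location service more concrete, here is a minimal in-memory sketch. It is an illustration under simplifying assumptions, not how any particular SIP server stores its data. The class and method names are hypothetical, and it keeps a single contact per user, whereas real deployments may hold several.

```python
import time
from dataclasses import dataclass


@dataclass
class Binding:
    contact: str       # where the device can be reached right now
    expires_at: float  # clock value after which the binding is stale


class LocationService:
    """Hypothetical location service populated by a registrar."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._bindings = {}  # user identity -> Binding

    def register(self, user, contact, expires):
        """Store or refresh a binding; expires <= 0 removes it."""
        if expires <= 0:
            self._bindings.pop(user, None)
            return
        self._bindings[user] = Binding(contact, self._clock() + expires)

    def lookup(self, user):
        """Return the current contact, or None if unknown or expired."""
        binding = self._bindings.get(user)
        if binding is None:
            return None
        if binding.expires_at <= self._clock():
            del self._bindings[user]  # drop the stale entry
            return None
        return binding.contact
```

A device that moves to a new address simply registers again, and the next lookup returns the new contact. If it stops refreshing, the binding expires and the user becomes unreachable.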

SIP Proxy

The SIP proxy is the core of the SIP server. All session signaling (SIP or WebRTC) passes through the SIP proxy. By querying the location service, previously populated by the SIP registrar, the SIP proxy can find out where different users and devices are currently located.

So, when an incoming call is received, the SIP proxy queries the location service for the current contact information of the user who is supposed to receive the call. If the receiving user is registered, the location service returns the current contact information (e.g., IP address or FQDN) and the SIP proxy can relay the signaling request to the device. Once the signaling session is established, the communicating endpoints can negotiate the media they agree to send to each other.
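The lookup-and-relay decision described above can be sketched in a few lines. This is a simplified illustration with hypothetical names. The location service is modeled as a plain dictionary, and the status codes follow the usual SIP convention of 404 for an unknown user and 480 for a known user who is not currently registered.

```python
def route_request(location, provisioned_users, target):
    """Decide what a proxy does with a request addressed to `target`.

    location: dict mapping user identity -> current contact
    provisioned_users: set of user identities known to the server
    """
    if target not in provisioned_users:
        # The user does not exist in this domain.
        return {"action": "reject", "status": 404}
    contact = location.get(target)
    if contact is None:
        # The user exists but has no active registration.
        return {"action": "reject", "status": 480}
    # Relay the signaling request to the registered device.
    return {"action": "forward", "next_hop": contact}


users = {"sip:alice@example.com", "sip:bob@example.com"}
location = {"sip:alice@example.com": "sip:alice@198.51.100.7:5060"}
decision = route_request(location, users, "sip:alice@example.com")
```

The caller only ever addresses the user identity. Where the request actually ends up is decided by the proxy at the moment the call arrives.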

The SIP proxy is a record-routing, transaction-stateful proxy and stays in the signaling path throughout the entire session. However, depending on the discovered network topology, the negotiated media flow may go directly between the communicating endpoints, or via the included NAT traversal and media relay function.
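Record-routing works by having the proxy add a Record-Route header carrying its own address to the initial request, so that later requests in the same session are sent back through it. A minimal sketch of that step, with a hypothetical function name and proxy address, and headers modeled as a simple list of name/value pairs:

```python
def add_record_route(headers, proxy_uri):
    """Return a new header list with this proxy's Record-Route entry
    placed ahead of any existing ones. The ';lr' parameter marks the
    proxy as a loose router."""
    value = f"<{proxy_uri};lr>"
    return [("Record-Route", value)] + list(headers)


incoming = [
    ("Record-Route", "<sip:edge.example.net;lr>"),
    ("To", "<sip:alice@example.com>"),
]
outgoing = add_record_route(incoming, "sip:proxy.example.com")
```

The sketch only covers the signaling path. Whether the media also passes through the server is decided separately, as described above.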

NAT Traversal and Media Relay

The NAT traversal and media relay functions are covered in more detail in a separate blog post, but a short description follows here.

The NAT traversal function makes sure that negotiated media streams (voice, video) can flow successfully between the communicating endpoints even if one or both of them are behind a NAT device. Endpoint devices behind a NAT use the private IPv4 address space, which is not reachable from the public internet. Consequently, end-to-end IP connectivity cannot easily be established.
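As a small illustration of what "private IPv4 address space" means in practice, the check below tests whether an address falls in one of the three private ranges defined in RFC 1918. The function name is hypothetical, and the sketch does not claim to show how any specific SIP server detects NAT.

```python
import ipaddress

# The three private IPv4 ranges defined in RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_private_ipv4(address):
    """Return True if `address` lies in an RFC 1918 private range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)
```

A device that advertises such an address as its contact cannot be reached there from the public internet, which is why the server needs a traversal mechanism.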

The task of the NAT device is to translate the private internal IPv4 addresses and ports to a single public IPv4 address, using a distinct public port for each session. This means that many internal devices may share one single public IPv4 address. The NAT device needs to keep state, a pinhole, for each and every outbound session established from the internal network to make sure that inbound packets are sent to the correct internal private IPv4 address. Because a pinhole only exists after an internal device has sent traffic outwards, it is very hard for real-time end-to-end IP communications to work. This is where the NAT traversal mechanism comes into play. Read more about this in the NAT traversal blog.
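The pinhole behavior can be modeled with a small translation table. This is a deliberately simplified sketch with hypothetical names. Real NAT devices also expire idle mappings and may filter inbound packets by remote address, both of which are left out here.

```python
class Nat:
    """Toy model of a NAT device sharing one public IPv4 address."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._outbound = {}  # (private_ip, private_port) -> public_port
        self._inbound = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Open (or reuse) a pinhole and return the public address/port."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            port = self._next_port
            self._next_port += 1
            self._outbound[key] = port
            self._inbound[port] = key
        return (self.public_ip, self._outbound[key])

    def translate_inbound(self, public_port):
        """Return the internal destination, or None if no pinhole exists
        (the packet would be dropped)."""
        return self._inbound.get(public_port)
```

In this model an inbound packet to a port that no internal device has opened is simply dropped, which is exactly the situation a media stream from an external caller runs into.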

Session Border Controller (SBC)

The SBC is usually situated at the border of a public network, e.g., the Internet, and acts like a firewall for SIP signaling, making sure that the functions of the SIP server are protected. The SBC can also be used to allow SIP-enabled devices and SIP trunks on private networks to communicate securely with the SIP server. The SBC normally also includes NAT traversal capabilities if it faces devices on the Internet.

Signaling and media flow between SIP server features

The illustration below shows the signaling and media flow between the features included in the SIP server.

Signaling and media flow between SIP server features.


Why is a SIP Server important?

The SIP server is the heart and core of an end-to-end SIP or WebRTC communications infrastructure. It is the hub of signaling and the rendezvous point of user identities.

In use cases with few devices and users, or when the endpoint connections are well-known, IP based voice and video communications can, in most cases, still establish end-to-end communication even without the SIP server.

However, in most real-world scenarios there are many devices and at least one of the endpoints does not have a public IPv4 address. In practice this means that the SIP server capabilities are key to reliable end-to-end communication in commercial scenarios.

Instead of keeping a local “address book” on each device, the SIP server accepts registrations from all devices and maintains a single point of contact for all connected devices. A local “address book” would have to be updated on every device as soon as one of the addresses changes. With a SIP server, a change in a device’s address or location takes effect as soon as that device registers again, without the need for any other device or user to make an update. Each device simply sends its signaling to the SIP server as usual, and the SIP server knows where to relay the request.

What are the pitfalls to avoid when setting up a SIP Server?

SIP is an open IETF protocol and there are several good SIP server implementations of different flavors and functionalities available, free to use.

But running, maintaining, securing, and scaling a SIP infrastructure for today’s and tomorrow’s needs are big tasks that require key expertise. The SIP infrastructure must always be reliable and secure. Any service downtime must be avoided.

The end-to-end media streams must be successfully connected by implementing hosted NAT traversal and media anchoring. The SIP server also needs to quickly scale to handle any increase in the number of devices or to handle new scenarios or use-cases.

iotcomms.io SIP Server as a Service

iotcomms.io delivers a cloud-native SIP Server infrastructure to any provider of mission critical real-time voice and/or video communications services. Provided as a managed service from the cloud, it sets up, runs and maintains a SIP server infrastructure including all the vital tasks described above which are fundamental for reliable end-to-end communications.

The reliability and availability provided by a cloud-native approach are crucial for the safety and security of the service that runs on top of the SIP infrastructure. Because it uses the global presence, multiple locations and multiple instances of public cloud datacenters, the service is easy to scale and ready to deliver outstanding performance.

Avoid the complexity around NAT traversal, database backends or SIP server scalability, and benefit from a ready-to-use SIP server service from iotcomms.io.


A SIP server is an essential part of critical, real-time voice and/or video communications. The tasks of the SIP server are many, and the complexity of assuring seamless end-to-end communications should not be underestimated.

iotcomms.io offers full SIP Server functionality as a managed service. The complexity of running the infrastructure with many devices and advanced use cases is packaged in feature-rich and easy-to-use APIs.

iotcomms.io runs your SIP or WebRTC-based real-time communication services by providing cloud-native scalability, reliability, and security, assuring that critical voice and video services and applications work as your customers expect them to.

Avoid complexity: use iotcomms.io SIP Server for mission critical voice & video communications services


