based on a connectionless protocol such as UDP.

The figure above shows the basics of VoIP. Every communication consists of three basic
steps:
1. When a caller wants to communicate with another party, it initiates a communication
either with the remote party or with a gateway/PBX (depending on the protocol
being used and on the local network setup) using a signaling protocol. This step is
responsible for:
• Verifying the party's credentials, credit (if applicable), and ability to call the specified
number.
• Negotiating the call (e.g. is it voice only or voice and video).
• Agreeing on a common codec (e.g. H.264).
• Negotiating the ports used for exchanging voice/video data.
2. The call takes place on the ports previously selected, and the payload is encoded
using the agreed codec. If a standard protocol is used, RTP is usually the one
selected. In a video call there are two independent RTP streams, one for
voice and one for video.
3. When one of the parties decides to end the call, the call is terminated using the
signaling protocol.
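As a rough illustration of step 2, the following Python sketch decodes the fixed 12-byte RTP header defined in RFC 3550. The function name parse_rtp_header and the sample packet are illustrative assumptions and are not taken from any tool described here; header extensions and CSRC lists are not handled.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (RFC 3550)."""
    if len(packet) < 12:
        raise ValueError("packet too short for an RTP header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronization source (SSRC).
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    version = b0 >> 6
    if version != 2:
        raise ValueError("not an RTP version 2 packet")
    return {
        "version": version,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,  # identifies the codec in use
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Hypothetical packet: version 2, payload type 0, sequence number 1,
# timestamp 160, followed by 160 bytes of dummy payload.
sample = (
    bytes([0x80, 0x00, 0x00, 0x01])
    + (160).to_bytes(4, "big")
    + (0x1234ABCD).to_bytes(4, "big")
    + b"\x00" * 160
)
header = parse_rtp_header(sample)
print(header)
```

The payload type field is what ties an RTP packet back to the codec agreed during signaling, while the SSRC distinguishes the independent voice and video streams of a video call.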
The following table lists some popular signaling and transport protocols used for VoIP.
Signaling                              Transport
• SIP (Session Initiation Protocol)    • RTP (Real-time Transport Protocol)
• Cisco Skinny                         • RTCP XR (RTP Control Protocol Extended Reports)
• H.323

Figure 2. - Popular Signaling and Transport Protocols
From the traffic monitoring point of view:
• The signaling protocol contains important information such as the parties' identities, the type of
call (voice or video call), codecs, duration, and information about the RTP session(s),
which usually do not take place on fixed ports. In general, without properly decoding the
signaling protocol, it is not possible to guess the ports used for RTP.
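To make the point about RTP ports concrete, here is a minimal Python sketch that extracts media ports and codecs from a session description. It assumes the signaling protocol is SIP carrying an SDP body, which is a common arrangement but not the only one; the function name parse_sdp_media and the sample SDP text are hypothetical, and real traffic needs far more robust parsing.

```python
def parse_sdp_media(sdp: str) -> list:
    """Return one entry per m= line: media type, port, and codec map."""
    streams = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            media, port, proto, *formats = line[2:].split()
            streams.append({
                "media": media,
                "port": int(port.split("/")[0]),  # ignore optional "/count"
                "proto": proto,
                "formats": formats,
                "codecs": {},
            })
        elif line.startswith("a=rtpmap:") and streams:
            # a=rtpmap:<payload type> <encoding name>/<clock rate>
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            streams[-1]["codecs"][pt] = encoding
    return streams


# Hypothetical SDP body as it might appear in a SIP INVITE.
SAMPLE_SDP = """v=0
o=alice 2890844526 2890844526 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 51372 RTP/AVP 96
a=rtpmap:96 H264/90000
"""

streams = parse_sdp_media(SAMPLE_SDP)
for s in streams:
    print(s["media"], s["port"], s["codecs"])
```

A monitoring probe could use the ports recovered this way to recognize the matching RTP flows, which it could not otherwise tell apart from other UDP traffic.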
[Figure: Caller and Called Party exchanging Signaling Protocol and Voice/Video Data]