forked from szechyjs/dsd
alring/dsd
Digital Speech Decoder 1.4
Copyright (C) 2010 DSD Author
GPG Key ID: 0x3F1D7FD0 (74EF 430D F7F2 0A48 FCE6 F630 FAA2 635D 3F1D 7FD0)

Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND ISC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL ISC BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

DSD is able to decode several digital voice formats from discriminator tap audio and synthesize the decoded speech. Speech synthesis requires mbelib, which is a separate package. DSD 1.4.1 requires mbelib 1.1 or later.

Supported formats in version 1.4.1:

P25 Phase 1
    Widely deployed radio standard used in public safety and amateur radio.
    Support includes decoding and synthesis of speech, display of all link control info, and the ability to save and replay .imb data files.

ProVoice
    EDACS Digital voice format used by public safety and amateur radio.
    Support includes decoding and synthesis of speech and the ability to save and replay .imb data files.
    Note: not enabled by default, use -fp to enable.

X2-TDMA
    Two slot TDMA system currently being deployed by several public safety organizations. Based on the DMR standard with extensions for P25 style signaling.
    Support includes decoding and synthesis of speech, display of all link control info, and the ability to save and replay .amb data files.

DMR/MOTOTRBO
    "Digital Mobile Radio" European two slot TDMA standard. MOTOTRBO is a popular implementation of this standard.
    Support includes decoding and synthesis of speech and the ability to save and replay .amb data files.

NXDN
    Digital radio standard used by NEXEDGE and IDAS brands. Supports both 9600 baud (12.5 kHz) and 4800 baud (6.25 kHz) digital voice.
    Support includes decoding and synthesis of speech and the ability to save and replay .amb data files.

Development (no speech) support only:

D-STAR
    Amateur radio digital voice standard.
    Development support only. DSD recognizes frames and can extract the voice bits, but speech is not yet decoded. D-STAR likely uses a version of AMBE not yet supported by mbelib. The voice bit interleave pattern also needs to be determined.
    Note: not enabled by default, use -fd to enable.

Unsupported formats in version 1.4 considered for future development:

P25 Phase 2
    This is not yet a published standard. Full support is expected once the standard is published and there are systems operating to test against. Phase 2 will use a vocoder supported by mbelib.

OpenSKY
    It is possible that the four slot version uses a vocoder supported by mbelib. The two slot version does not.

Supported demodulation optimizations in version 1.4:

C4FM
    Continuous envelope 2 or 4 level FSK with relatively sharp transitions between symbols. Used by most P25 systems.
    Optimizations include calibrating decision points only during sync, 4/10 sample window per symbol, and symbol edge timing calibration.

GFSK
    Continuous envelope 2 or 4 level FSK with a narrower Gaussian/"raised cosine" filter that affects transitions between symbols. Used by DMR/MOTOTRBO, NXDN and many others. Noisy C4FM signals may be detected as GFSK, but this is OK: the GFSK optimizations help with noisy signals.
    Optimizations are similar to C4FM except symbol transitions are only kept out of the middle 4 samples and only the middle two samples are used.

QPSK
    Quadrature Phase Shift Keying (and variants) used in some P25 systems and all known X2-TDMA systems. May be advertised under the marketing term "LSM".
    Optimizations include continuous decision point calibration, using the middle two samples, and using the symbol midpoint "spike" for symbol timing.

Installation

DSD should easily compile on any Linux or *BSD system with gcc. Just untar and run "make" or "make install". There are some debugging/development options in config.h that normal users will want to leave disabled, as they can severely impact performance.

Operation

There are two main operating modes, "Live scanner" and "Play files".

Usage: dsd [options]                Live scanner mode

Live scanner mode takes 48 kHz/16 bit mono audio samples from a sound card input and decodes speech in real time. Options are provided for controlling information display and saving mbe data files. The synthesized speech can be output to a soundcard and/or a .wav file.

Usage: dsd [options] -r <files>     Read/Play saved mbe data from file(s)

Play files mode reads mbe data from files specified on the command line (including wildcards) and synthesizes speech from those files. The synthesized speech can be output to a soundcard and/or a .wav file. The -r command line option is used to activate Play files mode.

Display modes

There are two main display modes in Live scanner mode: "Errorbars" and "Datascope".
Errorbars mode output for P25 Phase 1 looks like this:

    Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 0 tg: 32464 TDULC
    Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 0 tg: 32464 TDULC
    Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 0 tg: 32464 TDULC
    Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 0 tg: 32464 TDULC
    Sync: -P25p1 mod: C4FM inlvl: 38% nac: 5C2 src: 0 tg: 32464 TDU
    Sync: -P25p1 mod: C4FM inlvl: 38% nac: 5C2 src: 0 tg: 32464 HDU
    Sync: -P25p1 mod: C4FM inlvl: 42% nac: 5C2 src: 0 tg: 32464 LDU1 e:
    Sync: (-P25p1) mod: C4FM inlvl: 39% nac: 5C2 src: 52610 tg: 32464 (LDU2) e:
    Sync: -P25p1 mod: C4FM inlvl: 38% nac: 5C2 src: 52610 tg: 32464 LDU1 e:
    Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 52610 tg: 32464 LDU2 e:
    Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 52610 tg: 32464 LDU1 e:
    Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 52610 tg: 32464 LDU2 e:
    Sync: -P25p1 mod: C4FM inlvl: 39% nac: 5C2 src: 52610 tg: 32464 LDU1 e:

"Sync" indicates the frame type detected and whether the polarity is positive or negative. DSD automatically detects and handles either polarity except for DMR/MOTOTRBO/X2-TDMA, which unfortunately use both sync polarities. Most combinations of transmitter, receiver and soundcard show negative (-) polarity for X2-TDMA signals and positive (+) polarity for DMR/MOTOTRBO, so those are the defaults.

******** You may need to use the -x options (-xx or -xr, see Decoder options) to select the other polarity if you are not getting usable X2-TDMA/MOTOTRBO/DMR speech. As they use both normal and inverted sync it is not possible to detect polarity automatically. ********

"mod" indicates the current demodulation optimizations.

"inlvl" indicates the audio input level. QPSK signals tend to appear much "wider" than C4FM from a discriminator tap, so it is important to set your input gain using a QPSK signal if you plan to monitor them. It is neither necessary nor desirable to get to 100%; in fact your sound card may max out below 100%. It is best to use the Datascope mode for setting input gain (see below). Typical values with good results are 40% for C4FM and 66% for QPSK.

"nac" is the P25 Phase 1 Network Access Code. This is a 12 bit field in each P25 Phase 1 header. It should not be confused with the 16 bit System ID used in non-P25 trunking control channels.

"src" is the radio id of the transmitting subscriber unit.

"tg" is the talkgroup derived from link control information.

"HDU/LDU1/LDU2/TDU/TDULC" are P25 Phase 1 frame types, referred to as frame subtype within DSD.

"e:" is the beginning of the errorbars display. Each "=" indicates a detected error within the voice data. "R" and "M" indicate that a voice frame was repeated or muted due to excessive errors.

Values in parentheses () indicate an assumption (soft decision) was made based on the previous frame.

Errorbars mode output for X2-TDMA looks like this:

    Sync: -X2-TDMA mod: QPSK inlvl: 59% src: 17211 tg: 197 [SLOT0] slot1 VOICE e:
    Sync: -X2-TDMA mod: QPSK inlvl: 47% src: 17211 tg: 197 [SLOT0] slot1 VOICE e:
    Sync: -X2-TDMA mod: QPSK inlvl: 43% src: 17211 tg: 197 [SLOT0] slot1 VOICE e:
    Sync: (-X2-TDMA) mod: QPSK inlvl: 28% src: 17211 tg: 197 [SLOT0] slot1 VOICE e:

DMR/MOTOTRBO display is similar except it does not yet show source and talkgroup information.

As of version 1.2 DSD shows which specific TDMA slots are active (with capital SLOT letters) and which slot is currently being monitored (with square brackets []). Noisy/degraded signals will affect the accuracy of this display. The frame subtypes (Voice/LC etc) are shown based on the DMR standard types.
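For logging or post-processing, the Errorbars fields described above can be pulled apart with a regular expression. The following is only a sketch based on the sample lines shown in this README; the function name, the dictionary keys, and the assumption that positive polarity is printed with a leading "+" are illustrative and not taken from DSD itself.

```python
import re

# Pattern derived from the sample Errorbars lines in this README.
# "nac" is optional because the X2-TDMA sample lines do not show it.
LINE_RE = re.compile(
    r"Sync:\s+(?P<sync>\(?[+-][\w-]+\)?)\s+"
    r"mod:\s+(?P<mod>\w+)\s+"
    r"inlvl:\s+(?P<inlvl>\d+)%\s+"
    r"(?:nac:\s+(?P<nac>[0-9A-Fa-f]+)\s+)?"
    r"src:\s+(?P<src>\d+)\s+"
    r"tg:\s+(?P<tg>\d+)"
)


def parse_errorbars_line(line):
    """Return the fields of one Errorbars line as a dict, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    sync = m.group("sync")
    assumed = sync.startswith("(")      # parentheses mark a soft decision
    sync = sync.strip("()")
    nac = m.group("nac")
    return {
        "polarity": sync[0],            # "-" in the samples; "+" is assumed
        "frame_type": sync[1:],
        "assumed": assumed,
        "mod": m.group("mod"),
        "inlvl": int(m.group("inlvl")),
        "nac": int(nac, 16) if nac is not None else None,  # 12 bit hex field
        "src": int(m.group("src")),
        "tg": int(m.group("tg")),
    }


if __name__ == "__main__":
    sample = ("Sync: (-P25p1) mod: C4FM inlvl: 39% nac: 5C2 "
              "src: 52610 tg: 32464 (LDU2) e:")
    print(parse_errorbars_line(sample))
```

The frame subtype and the errorbars themselves are left unparsed here, since their layout differs between P25 and X2-TDMA output.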
Datascope mode output looks like this:

    Demod mode: C4FM Nac: 8C3
    Frame Type: P25 Phase 1 Talkgroup: 16528
    Frame Subtype: LDU1 Source: 0
    TDMA activity: slot0 slot1 Voice errors:
    +----------------------------------------------------------------+
    | # ^ !| ^ # |
    | * | * |
    | * | * |
    | * | * * |
    | * * | * * |
    | * * | ** * |
    | * ** | ** * |
    | ** ** | ** * |
    | ** ** | ** * |
    | ** ** | ** * |
    +----------------------------------------------------------------+
    C4FM Example

    Demod mode: C4FM Nac: 126
    Frame Type: P25 Phase 1 Talkgroup: 25283
    Frame Subtype: LDU2 Source: 0
    TDMA activity: slot0 slot1 Voice errors:
    +----------------------------------------------------------------+
    | # ^ ! ^ # |
    | * | |
    | * | |
    | ** | |
    | ** | * |
    | * ** | * * |
    | ** ** | * * |
    | *** ** | ** * |
    | *** ** | *** * |
    | *** **** | **** * * |
    +----------------------------------------------------------------+
    QPSK Example

At the top is various information about the signal, similar to the information provided in Errorbars mode. The large box is similar to a spectrum analyzer viewing the channel bandwidth. The horizontal axis is the input audio level, minimum on the left and maximum on the right. The vertical axis is the number of samples seen at each audio level. The "*" symbols represent the number of audio samples that were at each level during the aggregation period (default = 36 symbols). The -S option controls the aggregation period as well as the QPSK tracking symbol buffer, so changing it will affect QPSK performance as well as the Datascope display.

As you can see from the figures above, clean C4FM signals tend to have four very sharply defined audio levels. The datascope pattern also tends to be fairly stable, with minor shifts left and right as the receiver tries to frequency track any DC offset.

QPSK signals on the other hand tend to appear much broader (an artifact of how they are distorted by FM PLL discriminators). They also tend to vary wildly in width and centering. This is especially true when monitoring simulcast systems. Multiple QPSK signals interfere much more dramatically with an FM discriminator than C4FM signals.

******* For this reason it is important to isolate your receiver to one transmitter tower, _especially_ for QPSK signals. *******

The "#" symbols indicate the detected min/max values that are used to calibrate the symbol decision points. The decision points are indicated by "!" for the center decision point and "^" for the mid decision points.

Display Options

There are several options to control the type and quantity of information displayed in Errorbars mode:

  -e            Show Frame Info and errorbars (default)
  -pe           Show P25 encryption sync bits
  -pl           Show P25 link control bits
  -ps           Show P25 status bits and low speed data
  -pt           Show P25 talkgroup info
  -q            Don't show Frame Info/errorbars
  -s            Datascope (disables other display options)
  -t            Show symbol timing during sync
  -v <num>      Frame information Verbosity
  -z <num>      Frame rate for datascope

Most of these options are self explanatory. Symbol timing is a noisy option that allows you to view the quality of the frame sync samples and the accuracy of the symbol timing adjustments.

Symbol Timing display looks like this:

    Symbol Timing:
    ----------
    ----------
    ----------
    ----------
    ----------
    -+++++++++ 1
    +---------- 0
    ----------
    ++++++++++ 0
    ++++++++++
    ---------- 0
    ----------
    ++++++++++ 0
    ++++++++++
    ++++++++++
    ++++++++++
    ---------- 0
    ++++++++++ 0
    ---------- 0
    ++++++++++ 0
    ++++++++++
    ++++++++++
    ++++++++++
    ++++++++++
    C4FM example

    Symbol Timing:
    +---------
    ----------
    ----------
    ----------
    -----X---- 5
    --+++O++++- 4
    ----------
    ----X----- 4
    ++++O++--- 4
    --++O++++- 4
    ----X----- 4
    ----------
    ++++O+++-- 4
    -+++O+++-- 4
    --++O+++-- 4
    --++O+++-- 4
    ----------
    ++++O++++- 4
    ----------
    ++++O+++-- 4
    -+++O++++- 4
    -+++O+++++ 4
    -+++O++--- 4
    --++O+++-- 4
    QPSK example

Symbol timing is only displayed for symbols during the frame sync period. Each horizontal line represents the 10 audio samples for each symbol. "-" indicates an audio sample below the center reference level and "+" represents a sample above center. "X" indicates a low spike below a reference threshold (reference minimum for C4FM and 80% of reference minimum for QPSK). "O" represents a high spike above the high reference threshold.

The numbers to the right indicate the sample position at which the targeted transition occurred (+/- for C4FM or spike high/low for QPSK). The number of audio samples for the next symbol is adjusted to get this value closer to the target (0 for C4FM and 4 for QPSK). This shows how DSD maintains accurate symbol timing. Symbol timing adjustments are only made during sync, which is the only time reliable transitions can be observed.

In both examples above the symbol timing was off by one sample at the beginning of the frame sync period and was adjusted. Generally, if you see any spike values "X/O" in C4FM mode, or lots of them in QPSK mode, it indicates noise on the input signal.

Input/Output Options

  -i <device>   Audio input device (default is /dev/audio)
  -o <device>   Audio output device (default is /dev/audio)
  -d <dir>      Create mbe data files, use this directory
  -g <num>      Audio output gain (default = 0 = auto)
  -n            Do not send synthesized speech to audio output device
  -w <file>     Output synthesized speech to a .wav file

The audio in device can be a sound card OR a .wav file if the file is in the exact format 48k/16bits/mono/pcm. Audio in should be an unfiltered discriminator tap signal.

The audio out device should be a sound card (use the -w option to output to a .wav file).

If the audio in device is the same as the audio out device, the synthesized speech has to be upsampled to the 48k sample rate required for input. A fast upsample function is provided but still leaves some artifacts.
******** The best sound and minimum CPU usage is achieved with separate sound cards for input and output ********

If you specify different input/output devices DSD will use 8k as the output sample rate, and the lack of resampling results in much better audio as well as lower CPU consumption.

If you are using an onboard "AC97" sound device you may find that DSD uses much more CPU than expected, in some cases more than is available. This is because many AC97 sound devices are designed to rely on CPU processing power instead of hardware. You may also find that 8k sample rate output is upsampled in the driver using a very basic algorithm, resulting in severe distortion. The solution is to use a real hardware sound device (PCI card, USB device etc).

As of version 1.2 DSD automatically levels the output audio. This greatly improves readability and eliminates the painful effects of noise bursts. You can specify a fixed audio output gain with the -g option.

Scanner control options:

  -B <num>      Serial port baud rate (default=115200)
  -C <device>   Serial port for scanner control (default=/dev/ttyUSB0)
  -R <num>      Resume scan after <num> TDULC frames or any PDU or TSDU

On some P25 systems Packet Data Units (PDU) are sent on the same frequencies used for voice traffic. If done constantly this can be a severe hindrance to scanning the system in conventional mode. The -R option enables sending a "resume scan" command to a scanner connected to a serial port. Use -B and -C to set the baud rate and serial port device if necessary.
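The -R rule ("resume after <num> TDULC frames or any PDU or TSDU") can be pictured as a small counter. The sketch below only models that decision; the class and method names are hypothetical, it does not talk to a serial port, and the choice to reset the TDULC count whenever another frame type arrives is an assumption rather than documented DSD behavior.

```python
class ResumeScanTracker:
    """Decide when a "resume scan" command would be due (illustrative only)."""

    def __init__(self, tdulc_limit):
        self.tdulc_limit = tdulc_limit  # corresponds to the -R <num> value
        self.tdulc_count = 0

    def should_resume(self, frame_subtype):
        # Any PDU or TSDU triggers a resume immediately.
        if frame_subtype in ("PDU", "TSDU"):
            self.tdulc_count = 0
            return True
        # TDULC frames trigger a resume once the limit is reached.
        if frame_subtype == "TDULC":
            self.tdulc_count += 1
            if self.tdulc_count >= self.tdulc_limit:
                self.tdulc_count = 0
                return True
            return False
        # Assumption: any other frame type restarts the TDULC count.
        self.tdulc_count = 0
        return False


if __name__ == "__main__":
    tracker = ResumeScanTracker(tdulc_limit=4)
    for subtype in ["LDU1", "LDU2", "TDULC", "TDULC", "TDULC", "TDULC"]:
        print(subtype, tracker.should_resume(subtype))
```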
Decoder options

  -fa           Auto-detect frame type (default)
  -f1           Decode only P25 Phase 1
  -fd           Decode only D-STAR* (no audio)
  -fi           Decode only NXDN48* (6.25 kHz) / IDAS*
  -fn           Decode only NXDN96 (12.5 kHz)
  -fp           Decode only ProVoice*
  -fr           Decode only DMR/MOTOTRBO
  -fx           Decode only X2-TDMA
  -ma           Auto-select modulation optimizations (default)
  -mc           Use only C4FM modulation optimizations
  -mg           Use only GFSK modulation optimizations
  -mq           Use only QPSK modulation optimizations
  -u <num>      Unvoiced speech quality (default=3)
  -xx           Expect non-inverted X2-TDMA signal
  -xr           Expect inverted DMR/MOTOTRBO signal

  * denotes frame types that cannot be auto-detected.

ProVoice and NXDN48 are not auto-detected as they use different symbol rates (9600 and 2400) than most formats (4800).

D-STAR is not enabled by default as voice decode does not work and it has a short sync word that is prone to false triggering. It is included for development/testing only.

MBE speech synthesis is broken down into two main types of sounds, "Voiced" and "Unvoiced". Voiced speech bands are synthesized with a single sine wave centered in the frequency band with the appropriate phase and amplitude. Unvoiced speech is supposed to be generated with a noise source, a 256 point DFT, a number of band filters, followed by a 256 point inverse DFT. For computational simplicity mbelib uses a different method. For each unvoiced speech band, a number of sine waves are generated, each with a different random initial phase. The number of waves used per band is controlled by the -u option. A setting of 4 would approximate the performance of the 256 point DFT method, as the maximum number of voice bands is 56 and very low frequencies are not synthesized. Values less than 3 have a noticeable lack of unvoiced speech and/or artifacts. The default of 3 provides good speech quality with reasonable CPU use. Increasing the quality above the default rapidly consumes more CPU for increasingly diminishing returns.
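The unvoiced method described above (several sine waves per band, each with a random initial phase) can be sketched roughly as follows. This is not mbelib's code: the function name, the way frequencies are spread across the band, the amplitude scaling, the 160 sample frame length and the 8000 Hz rate are all assumptions chosen for illustration. Only the idea of summing `n_waves` random-phase sinusoids per band, with `n_waves` playing the role of the -u setting, comes from the text.

```python
import math
import random


def synth_unvoiced_band(center_hz, bandwidth_hz, amplitude,
                        n_waves=3, n_samples=160, sample_rate=8000,
                        rng=None):
    """Approximate band-limited noise with a few random-phase sinusoids.

    n_waves mirrors the -u option; everything else is hypothetical.
    """
    rng = rng or random.Random()
    out = [0.0] * n_samples
    step = bandwidth_hz / n_waves
    low_edge = center_hz - bandwidth_hz / 2.0
    for i in range(n_waves):
        # Assumed layout: waves spaced evenly across the band.
        freq = low_edge + (i + 0.5) * step
        phase = rng.uniform(-math.pi, math.pi)  # random initial phase
        for n in range(n_samples):
            angle = 2.0 * math.pi * freq * n / sample_rate + phase
            out[n] += (amplitude / n_waves) * math.cos(angle)
    return out


if __name__ == "__main__":
    frame = synth_unvoiced_band(1000.0, 100.0, 1.0, n_waves=3,
                                rng=random.Random(1))
    print(len(frame), max(abs(x) for x in frame))
```

The inner loop runs once per wave per sample, so the cost grows linearly with the -u value for every unvoiced band, which matches the README's remark that higher settings consume CPU quickly.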
Advanced decoder options

  -A <num>      QPSK modulation auto detection threshold (default=26)
  -S <num>      Symbol buffer size for QPSK decision point tracking (default=36)
  -M <num>      Min/Max buffer size for QPSK decision point tracking (default=15)

Encryption

******** Decryption of speech is NOT supported, even if you lawfully possess the encryption keys. Decryption support will not be added in the future as the authors wish to steer as far away from the legal issues associated with encryption as possible. ********

We realize that there are many legitimate and lawful uses of decryption software, including system/interoperability testing and lawful monitoring. This software is distributed under a liberal BSD license, so there is nothing to stop others from supplying patches, forking this project or incorporating it into a commercial product and adding decryption support.

There is support for displaying the encryption sync bits transmitted in the clear on P25 Phase 1 systems. These bits do not allow for the decryption of signals without the secret encryption keys. The encryption sync bits are useful for determining whether a signal is encrypted vs merely noisy or degraded. As the encryption sync bits typically include long strings of zeros when a transmission is not encrypted, they can also be used to visually estimate bit error rates.
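The visual estimate mentioned above can also be done numerically: if a run of encryption sync bits is expected to be all zeros, the fraction of ones approximates the bit error rate. The sketch below assumes the bits have been copied out as a text string of "0" and "1" characters; that input format and the function name are hypothetical, since the exact layout of the -pe output is not shown in this README.

```python
def estimate_ber(bits):
    """Estimate bit error rate from bits that are expected to be all zero.

    `bits` is a string; any character other than "0" or "1"
    (spaces, separators) is ignored. Returns None for empty input.
    """
    cleaned = [b for b in bits if b in "01"]
    if not cleaned:
        return None
    return cleaned.count("1") / len(cleaned)


if __name__ == "__main__":
    print(estimate_ber("0000 0100 0000 0000"))
```

This only makes sense for unencrypted traffic; on an encrypted signal the sync bits are not expected to be zero and the result is meaningless.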