Common Scrambling Algorithm
CSA was specified by ETSI and adopted by the DVB consortium in May 1994. It is being succeeded by CSA3, based on a combination of 128-bit AES and a confidential block cipher, XRC. However, CSA3 is not yet in any significant use, so CSA continues to be the dominant cipher for protecting DVB broadcasts.
CSA was largely kept secret until 2002. The patent papers gave some hints, but important details, like the layout of the so-called S-boxes, remained secret. Without these, free implementations of the algorithm were out of the question. Initially, CSA was to remain implemented in hardware only, which would have made it difficult to reverse engineer existing implementations.
In 2002 FreeDec was released, implementing CSA in software. Though released as binary only, disassembly revealed the missing details and allowed reimplementation of the algorithm in higher-level programming languages.
With CSA now publicly known in its entirety, cryptanalysts started looking for weaknesses.
Description of the cipher
The CSA algorithm is composed of two distinct ciphers: a block cipher and a stream cipher.
When used in encryption mode, the data is first encrypted using the 64-bit block cipher in CBC mode, starting from the end of the packet. The stream cipher is then applied from the start of the packet.
The block cipher processes 64-bit blocks in 56 rounds. It uses one byte from the expanded key in each round.
The first 32 rounds of the stream cipher are used for initialization and do not generate any output. The first 64 bits of data are used as the initialization vector during this phase and are left unchanged. The stream cipher then generates 2 bits of pseudo-random stream in each round, which are XORed with the data starting at bit 64 of the packet.
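The layering described above can be illustrated with a minimal Python sketch. It reproduces only the structure (reverse CBC with a block cipher, then a stream cipher keyed by the untouched first block). The primitives toy_block_encrypt, toy_block_decrypt and toy_keystream are hypothetical placeholders built from hashlib, not the CSA block or stream cipher, and the all-zero CBC starting value and the handling of payloads shorter than one block are assumptions made for the sketch.

```python
import hashlib

BLOCK = 8  # 64-bit blocks, as in the text


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(half: bytes, key: bytes, i: int) -> bytes:
    # Placeholder round function (NOT the CSA round function).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:4]


def toy_block_encrypt(block: bytes, key: bytes) -> bytes:
    """Placeholder 64-bit permutation: a 4-round Feistel network."""
    left, right = block[:4], block[4:]
    for i in range(4):
        left, right = right, xor(left, _round(right, key, i))
    return left + right


def toy_block_decrypt(block: bytes, key: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in reversed(range(4)):
        left, right = xor(right, _round(left, key, i)), left
    return left + right


def toy_keystream(key: bytes, iv: bytes, n: int) -> bytes:
    # Placeholder keystream generator (NOT the CSA stream cipher).
    return hashlib.shake_128(key + iv).digest(n)


def scramble(payload: bytes, key: bytes) -> bytes:
    n_blocks = len(payload) // BLOCK
    if n_blocks == 0:
        return payload  # assumption: too short to scramble
    body, residue = payload[:n_blocks * BLOCK], payload[n_blocks * BLOCK:]
    blocks = [body[i:i + BLOCK] for i in range(0, len(body), BLOCK)]
    # Pass 1: block cipher in CBC mode, chaining from the last block to the first.
    chained = [b""] * n_blocks
    prev = bytes(BLOCK)  # assumption: all-zero starting value
    for i in reversed(range(n_blocks)):
        prev = toy_block_encrypt(xor(blocks[i], prev), key)
        chained[i] = prev
    # Pass 2: stream cipher from the start; block 0 acts as IV and is left unchanged.
    iv = chained[0]
    tail = b"".join(chained[1:]) + residue
    return iv + xor(tail, toy_keystream(key, iv, len(tail)))


def descramble(data: bytes, key: bytes) -> bytes:
    n_blocks = len(data) // BLOCK
    if n_blocks == 0:
        return data
    iv, tail = data[:BLOCK], data[BLOCK:]
    tail = xor(tail, toy_keystream(key, iv, len(tail)))  # undo the stream layer
    body_len = (n_blocks - 1) * BLOCK
    chained = [iv] + [tail[i:i + BLOCK] for i in range(0, body_len, BLOCK)]
    residue = tail[body_len:]
    plain = []
    for i in range(n_blocks):
        nxt = chained[i + 1] if i + 1 < n_blocks else bytes(BLOCK)
        plain.append(xor(toy_block_decrypt(chained[i], key), nxt))
    return b"".join(plain) + residue
```

In this sketch, descramble(scramble(p, key), key) returns p for any payload p, and changing a byte in the last full block of the plaintext changes the first block of the output, because the chaining runs from back to front.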
Were CSA to be broken, encrypted DVB transmissions would be decipherable, regardless of any proprietary conditional access (CA) system used. This could seriously compromise paid digital television services, as DVB has been adopted as the standard for digital terrestrial television in Europe and elsewhere, and is used by many satellite television providers.
Most attacks on the pay-TV system have not targeted CSA itself, but instead the various key exchange systems responsible for generating the CSA keys (Conax, Irdeto, VideoGuard, etc.). These attacks work either by reverse-engineering and breaking the algorithms altogether, or by intercepting the keys in real time as they are generated on a legitimate decoder and then distributing them to others (so-called card sharing).
Software implementations and bit slicing
The stream cipher part of CSA lends itself to bit slicing, a software implementation technique that allows decryption of many blocks, or of the same block with many different keys, at the same time. This significantly speeds up a brute force search implemented in software, although the speed-up factor is too low to make a real-time attack practical.
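As a generic illustration of bit slicing (not of the CSA stream cipher itself), the Python sketch below evaluates a small Boolean step on 64 independent values at once. Bit j of every value is packed into one integer, so each bitwise operation acts on all 64 lanes in parallel. The function toy_step_scalar is a hypothetical 4-bit step invented for this example.

```python
LANES = 64
MASK = (1 << LANES) - 1


def to_slices(values, width):
    """Transpose: slices[j] holds bit j of every value, one lane per value."""
    slices = [0] * width
    for lane, v in enumerate(values):
        for j in range(width):
            slices[j] |= ((v >> j) & 1) << lane
    return slices


def from_slices(slices, lanes):
    """Inverse transpose: rebuild one value per lane."""
    return [
        sum(((s >> lane) & 1) << j for j, s in enumerate(slices))
        for lane in range(lanes)
    ]


def toy_step_scalar(x):
    """Hypothetical 4-bit nonlinear step on a single value (not from CSA)."""
    b0, b1, b2, b3 = ((x >> j) & 1 for j in range(4))
    return (
        ((b0 & b1) ^ b2)
        | ((b1 ^ b3) << 1)
        | (((b2 | b0) ^ 1) << 2)
        | ((b3 ^ b0) << 3)
    )


def toy_step_sliced(slices):
    """The same step on 64 lanes at once, using only bitwise logic."""
    s0, s1, s2, s3 = slices
    return [(s0 & s1) ^ s2, s1 ^ s3, (s2 | s0) ^ MASK, s3 ^ s0]


values = [(7 * i + 3) % 16 for i in range(LANES)]
sliced = from_slices(toy_step_sliced(to_slices(values, 4)), LANES)
scalar = [toy_step_scalar(v) for v in values]
```

Here sliced and scalar hold the same 64 results. The sliced version needs a handful of word-wide operations for all lanes, which is where the speed-up comes from when the logic is expressible as small Boolean formulas.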
The block cipher part is harder to bit slice, as the S-boxes involved are too large (8x8) to be efficiently implemented using logical operations, a prerequisite for bit slicing to be more efficient than a regular implementation. However, as all operations are on 8-bit subblocks, the algorithm can be implemented using regular SIMD, or a form of “byteslicing”. As most SIMD instruction sets, with the notable exception of AVX2, do not support parallel look-up tables, the S-box lookups are done as in a non-bytesliced implementation, but their integration into the rest of the algorithm is not markedly hampered by the byteslicing.
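The division of labour in a bytesliced implementation can be sketched roughly as follows. One byte from each of several independent cipher states is packed into a single wide integer, which stands in for a SIMD register. The XOR is done on all lanes at once, while the table lookup falls back to one access per lane. SBOX is a placeholder permutation chosen for the example, not the CSA S-box, and the single "XOR then look up" step is a hypothetical stand-in for a real round.

```python
# Placeholder 8x8 S-box: an arbitrary permutation of 0..255 (NOT the CSA S-box).
SBOX = bytes((7 * x + 3) & 0xFF for x in range(256))


def step_scalar(state_byte: int, key_byte: int) -> int:
    """Hypothetical step on one state: mix in a key byte, then substitute."""
    return SBOX[state_byte ^ key_byte]


def step_bytesliced(states: bytes, key_bytes: bytes) -> bytes:
    """The same step on len(states) independent lanes."""
    # One wide XOR covers every lane (stands in for a SIMD instruction).
    mixed = int.from_bytes(states, "little") ^ int.from_bytes(key_bytes, "little")
    # Without a parallel table lookup, the S-box is applied lane by lane.
    return bytes(SBOX[b] for b in mixed.to_bytes(len(states), "little"))


states = bytes(range(10, 26))        # 16 lanes, e.g. 16 candidate keys
key_bytes = bytes(range(100, 116))   # one key byte per lane
parallel = step_bytesliced(states, key_bytes)
reference = bytes(step_scalar(s, k) for s, k in zip(states, key_bytes))
```

Both parallel and reference contain the same 16 bytes.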
Both techniques are used in libdvbcsa, a free implementation of CSA.
Cryptanalysis is made more difficult by the fact that most data is protected by both the block cipher and the stream cipher. However, some parts are protected by only one of the ciphers: the first 64-bit block is encrypted with the block cipher only, and any excess bits after the last 64-bit block (zero to seven bytes) are protected by the stream cipher only. Nevertheless, the block cipher is applied from back to front and in CBC mode, which means that in the end every bit of the output depends on every bit of the input, and each 183-byte cell has to be decoded as a whole.
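The byte ranges covered by each cipher follow directly from the payload length. The helper below is a small illustrative sketch; the function name coverage and the half-open (start, end) convention are choices made for the example, and payloads shorter than one block are simply rejected.

```python
def coverage(payload_len: int) -> dict:
    """Return half-open byte ranges (start, end) protected by each cipher."""
    if payload_len < 8:
        raise ValueError("sketch assumes at least one full 64-bit block")
    full = payload_len // 8 * 8  # bytes covered by whole 64-bit blocks
    return {
        "block_only": (0, 8),               # first block: stream cipher IV
        "both": (8, full),                  # remaining full blocks
        "stream_only": (full, payload_len), # zero to seven excess bytes
    }
```

For a 183-byte cell this gives bytes 0 to 7 for the block cipher only, bytes 8 to 175 for both ciphers, and the last seven bytes (176 to 182) for the stream cipher only.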
Brute force approach
While the CSA algorithm uses 64-bit keys, most of the time only 48 bits of the key are unknown, since bytes 3 and 7 are used as parity bytes in CA systems and may be easily recalculated. This opens the door to known-plaintext attacks when combined with knowledge of the underlying plaintext structure. For instance, as the first three bytes of the PES header are known to always be 0x000001, it would be possible to launch a brute force attack. Such an attack would reveal millions of possible keys, but still few enough to make it practical to attempt decryption of other parts of the data with the same key in a second pass to recover the true key.
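A short sketch makes both points concrete. It assumes that each parity byte is the sum, modulo 256, of the three key bytes preceding it (the text only says the bytes can be recalculated, so this rule is an assumption), and that a wrong key produces effectively random plaintext, so that each wrong key matches the 24 known bits with probability 2^-24. The function name fill_parity is made up for the example.

```python
def fill_parity(free_bytes: bytes) -> bytes:
    """Expand 6 free key bytes into an 8-byte key.

    Assumption: bytes 3 and 7 are each the sum modulo 256 of the three
    preceding bytes.
    """
    if len(free_bytes) != 6:
        raise ValueError("expected exactly 6 free key bytes")
    first, second = free_bytes[:3], free_bytes[3:]
    return (
        first + bytes([sum(first) & 0xFF])
        + second + bytes([sum(second) & 0xFF])
    )


UNKNOWN_KEY_BITS = 48       # 64 bits minus two parity bytes
KNOWN_PLAINTEXT_BITS = 24   # the three known bytes 0x000001

# Expected number of wrong keys that still reproduce the known 24 bits,
# assuming wrong keys yield uniformly random output.
expected_candidates = 2 ** (UNKNOWN_KEY_BITS - KNOWN_PLAINTEXT_BITS)
```

Under these assumptions expected_candidates is 2^24, about 16.8 million, which is consistent with the "millions of possible keys" left over for the second pass.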
However, 48 bits, even if small by today's standards, is a significant amount of keyspace to search through. For most practical applications, one would want to break the key faster than it is changed, and as the key changes at a minimum of every 120 seconds, this would require scanning through on average at least half the keyspace in that period of time. As an implementation taking 1 µs for each try (i.e., testing a million keys per second; about what libdvbcsa can do on a single core of a modern x86 processor) would require 8.9 years to scan the entire keyspace, this makes a brute force approach impractical for decrypting the data in real time, even with a highly parallel implementation.
Furthermore, even if dedicated hardware or FPGA implementations made it possible to test the 2.35 trillion keys per second needed to scan the entire keyspace in two minutes, the key could simply be replaced more often. In addition, the parity bytes could be replaced by real key bytes, increasing the keyspace to 64 bits, which is 65,536 times as large.
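The figures quoted in this section can be reproduced with a few lines of arithmetic. The sketch below assumes a year of 365.25 days; the function names are chosen for the example.

```python
KEYSPACE_48 = 2 ** 48
KEYSPACE_64 = 2 ** 64
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_scan(keys_per_second: float, keyspace: int = KEYSPACE_48) -> float:
    """Time to try every key in the keyspace at the given rate."""
    return keyspace / keys_per_second / SECONDS_PER_YEAR


def rate_needed(seconds: float, keyspace: int = KEYSPACE_48) -> float:
    """Keys per second required to cover the keyspace within the deadline."""
    return keyspace / seconds


full_scan_years = years_to_scan(1_000_000)   # one try per microsecond
two_minute_rate = rate_needed(120)           # full scan within one key period
growth_factor = KEYSPACE_64 // KEYSPACE_48   # effect of dropping the parity bytes
```

This yields roughly 8.9 years for a full scan at one million keys per second, about 2.35 trillion keys per second for a two-minute scan, and a factor of 65,536 between the 48-bit and 64-bit keyspaces.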
In 2011, a group of German researchers released an attack on CSA as used in the DVB system. By noting that MPEG-2 padding frequently requires long series of zeroes, leading to entire 183-byte cells being encrypted with zeroes only, it is possible to build up a rainbow table recovering the key from such a known-zero block. (A block would be known to be zero if two blocks with the same ciphertext were found, since presumably both would be zero blocks.) The attack described would require about 7.9 TB of storage, and enable an attacker with a GPU to recover a key in about seven seconds with 96.8% certainty. However, the attack is only effective when such all-zero padding blocks are present (i.e., in scenes with very little movement or noise), and it could easily be mitigated, for instance, by sending such all-zero cells unencrypted.
Note that this differs from the brute-force attacks above in that the plaintext contents of the entire cell are known before the attack, not just parts of a block; it is this that enables the ahead-of-time computation of the rainbow table.
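The heuristic for spotting likely all-zero cells, namely looking for ciphertext cells that occur more than once, is simple to sketch. The function below assumes that all cells passed in were scrambled under the same key (that is, captured within one key period) and already split into equal-sized byte strings; it is an illustration of the idea, not the researchers' tool.

```python
from collections import Counter


def repeated_cells(cells):
    """Return the ciphertext cells that occur more than once.

    Identical ciphertext under one key implies identical plaintext, and the
    most common repeated plaintext is presumed to be all-zero padding.
    """
    counts = Counter(cells)
    return [cell for cell, n in counts.items() if n > 1]


captured = [b"\x8a\x11\x5c\x02", b"\x3f\x00\x91\x7e", b"\x8a\x11\x5c\x02", b"\xd4\x4d\x20\x19"]
candidates = repeated_cells(captured)
```

With the made-up sample data above, candidates contains only the cell that appears twice; such a cell would then be looked up in the precomputed table.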
In 2004, a fault attack was published on the block cipher. The basic idea was to introduce a few deliberate faults in the intermediate calculations, making it possible to deduce the last eight round keys. From this, the final key can be computed.
While very few faults are needed (on average two or three per round key recovered), fault attacks are usually impractical to carry out, since they require the ability to change bits at will inside the hardware of a decoder that already has the key in question.