11 July 2025

The Morse code is a noiseless coding method. The following is the order-0 histogram of letters in book1 from the Calgary corpus (English text):

Letter  Frequency    Letter  Frequency    Letter  Frequency
' '        125551    a           48803    b           10595
c           13265    d           26892    e           72875
f           12650    g           12878    h           38538
i           39906    j             721    k            5039
l           23491    m           14609    n           41421
o           45651    p           10025    q             534
r           33134    s           37638    t           51993
u           16134    v            5446    w           14824
x             866    y           12402    z             264
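
As a rough check on a histogram like this one, the sketch below computes its order-0 Shannon entropy and the average code-word length of a Huffman code built from the same counts. The function names are illustrative, and the Huffman length is obtained by summing the weights of the merged nodes rather than by building the code words explicitly.

```python
import heapq
import math

# Counts copied from the histogram above (27 symbols: space plus a-z).
COUNTS = {
    " ": 125551, "a": 48803, "b": 10595, "c": 13265, "d": 26892,
    "e": 72875, "f": 12650, "g": 12878, "h": 38538, "i": 39906,
    "j": 721, "k": 5039, "l": 23491, "m": 14609, "n": 41421,
    "o": 45651, "p": 10025, "q": 534, "r": 33134, "s": 37638,
    "t": 51993, "u": 16134, "v": 5446, "w": 14824, "x": 866,
    "y": 12402, "z": 264,
}


def shannon_entropy(counts):
    """Order-0 entropy of the histogram, in bits per symbol."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total)
                for c in counts.values() if c > 0)


def huffman_average_length(counts):
    """Average Huffman code-word length, in bits per symbol.

    Each merge of the two lightest nodes adds one bit to every symbol
    beneath the merged node, so the total encoded length equals the sum
    of the merged-node weights.
    """
    heap = list(counts.values())
    heapq.heapify(heap)
    total_bits = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total_bits += merged
        heapq.heappush(heap, merged)
    return total_bits / sum(counts.values())


entropy = shannon_entropy(COUNTS)
avg_len = huffman_average_length(COUNTS)
print(f"entropy: {entropy:.2f} bits per symbol")
print(f"Huffman: {avg_len:.2f} bits per symbol")
```

The Huffman average can never fall below the entropy and is always less than one bit above it, which gives a quick sanity check on both numbers.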

The order-0 Shannon entropy of this corpus is about 4.12 bits per byte, while the average code-word length of the Huffman code (optimal discrete noiseless encoding) is about 4.15 bits per byte. We assume the following properties of the Morse code:

  • A dash (-) is three dits, a dot (.) is one dit.
  • Code symbols within a letter are separated by a silence one dit long.
  • Letters of the same word are separated by a silence three dits long.
  • Words are separated by a silence seven dits long.

This implies that the Morse code requires, on average, 9.26 dits' worth of time to transmit each letter of English text.
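
The timing rules above can be turned into a small calculation. The sketch below is one possible accounting, not necessarily the one used for the figure in the text. It assumes the International Morse code table for a-z, charges every letter its marks, its internal one-dit gaps and a trailing three-dit gap, charges every space the seven-dit word gap plus a trailing three-dit gap, and averages over all characters including spaces. Under those assumptions the result appears to land close to 9.26 dits.

```python
# International Morse code for the 26 letters.
MORSE = {
    "a": ".-", "b": "-...", "c": "-.-.", "d": "-..", "e": ".",
    "f": "..-.", "g": "--.", "h": "....", "i": "..", "j": ".---",
    "k": "-.-", "l": ".-..", "m": "--", "n": "-.", "o": "---",
    "p": ".--.", "q": "--.-", "r": ".-.", "s": "...", "t": "-",
    "u": "..-", "v": "...-", "w": ".--", "x": "-..-", "y": "-.--",
    "z": "--..",
}

# Counts copied from the histogram in the text.
COUNTS = {
    " ": 125551, "a": 48803, "b": 10595, "c": 13265, "d": 26892,
    "e": 72875, "f": 12650, "g": 12878, "h": 38538, "i": 39906,
    "j": 721, "k": 5039, "l": 23491, "m": 14609, "n": 41421,
    "o": 45651, "p": 10025, "q": 534, "r": 33134, "s": 37638,
    "t": 51993, "u": 16134, "v": 5446, "w": 14824, "x": 866,
    "y": 12402, "z": 264,
}

DOT, DASH = 1, 3                            # mark lengths in dits
SYMBOL_GAP, LETTER_GAP, WORD_GAP = 1, 3, 7  # silences in dits


def letter_duration(code):
    """Dits needed for one letter: marks plus the gaps between them."""
    marks = sum(DASH if symbol == "-" else DOT for symbol in code)
    return marks + SYMBOL_GAP * (len(code) - 1)


def average_dits_per_character(counts):
    """Average transmission time per character, spaces included."""
    total = 0
    for ch, n in counts.items():
        if ch == " ":
            # Assumption: a space costs the word gap plus a letter gap.
            cost = WORD_GAP + LETTER_GAP
        else:
            cost = letter_duration(MORSE[ch]) + LETTER_GAP
        total += n * cost
    return total / sum(counts.values())


avg = average_dits_per_character(COUNTS)
print(f"{avg:.2f} dits per character")
```

If the seven-dit word gap is instead taken to replace the three-dit letter gap rather than add to it, the same calculation gives a lower average, so the result is sensitive to how the silences are counted.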
