Emil's Projects & Reviews

OpenHardware & OpenSource

CW Keyer/Decoder BlackPill/F401 23rd March 2022

I've started my own CW Keyer/Decoder project based on a BlackPill STM32F401 module because I'm not very good at Morse code (neither transmitting nor receiving) and I wanted something more sophisticated as an aid.

There is a similar, more mature nanoKeyer project: a CW keyer based on the Arduino platform that can be operated with a PS/2 keyboard or an iambic paddle, but it offers only a very limited decoder as an option.

By taking advantage of cheap modules available on Aliexpress, I propose the following schematic for a BOM of around 15$.

The design uses:


The project schematic

The device can be powered from any 8-20V power supply and wastes very little power thanks to the GW1584 switching supply module. The small pot on that module needs to be adjusted for a 5V output before soldering the other components. I also put a drop of nail polish on it to permanently fix the output voltage. Alternatively, if you want a portable device, it can be powered from a power bank through the USB-C connector of the CPU module. In that case you will need a USB-C cable without data lines (just 5V and Gnd) because the CPU has only one USB interface and a data connection would conflict with the USB keyboard.

2 layer PCB

The PCB above is a second iteration of this project. I've changed the following:


I've finished and released version 0.1 of the software. This version (written from scratch) completely suits my needs so let me know if you want any other feature added. Only the 16×2 LCD display has software support for now. The decoder uses a significant portion of the RAM but there is still plenty of memory available both in Flash and in RAM.

Memory Resources: No Decoder
Memory Resources: With Morse Decoder

To program the firmware, unplug the keyboard and connect the CPU module to a computer with a USB-C cable. Reset the CPU while keeping the Boot0 button pressed and it will enumerate as a DFU device. Use the command below to program your device.

    dfu-util -R -a 0 -D CW-Keyer_vXXX.dfu

I'm using Rowley Crossworks for development, but the Makefile and the linker (ld) file can be adapted for GCC-only development. All development and source files are here.

The only relevant files are in the Core sub-folder; all the others are generated by CubeMX. The DSP sub-folder is a trimmed-down version of the CMSIS DSP Library (v1.9.0), which is used for fast computation of trigonometric functions and of the 64-point complex FFT.

The program uses FreeRTOS and starts a few equal-priority threads with no pre-emption just after the hardware initialization. The memory allocation is dynamic, but only because the CubeMX USB host implementation cannot handle static allocation; I'm not explicitly allocating and freeing heap memory. There are no mutexes or semaphores. Instead I'm using task notifications, which are much faster and lighter on resources.

In addition to USB keyboards, all of the following keyers are supported: straight key, bug, single paddle, iambic A/B and Ultimatic. They are plugged into the stereo 3.5mm jack, and the dit/dah lines can be swapped if desired. While transmitting with a keyer, the time intervals of the clicks are used to decode in real time what is transmitted. The transmission speed can be adjusted in WPM or using Farnsworth WPM. Lead and tail delays can be added for PTT activation.
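The relation between WPM and element timing can be sketched as below. This is only an illustration, not the firmware's code: the type and function names are hypothetical. It assumes the usual PARIS convention (one unit lasts 1200 / WPM milliseconds) and the common Farnsworth scheme where characters are sent at one speed and only the gaps between characters and words are stretched to reach a lower overall speed.

```c
#include <stdint.h>

/* Durations of the basic Morse elements in milliseconds (hypothetical type). */
typedef struct {
    uint32_t dit_ms;       /* one unit: a dit, and the gap between elements */
    uint32_t dah_ms;       /* three units */
    uint32_t char_gap_ms;  /* gap between characters */
    uint32_t word_gap_ms;  /* gap between words */
} morse_timing_t;

/* Standard timing, PARIS convention: one unit = 1200 / WPM milliseconds. */
morse_timing_t morse_timing(uint32_t wpm)
{
    morse_timing_t t;
    if (wpm == 0u)
        wpm = 1u;                       /* avoid a division by zero */
    t.dit_ms      = 1200u / wpm;
    t.dah_ms      = 3u * t.dit_ms;
    t.char_gap_ms = 3u * t.dit_ms;
    t.word_gap_ms = 7u * t.dit_ms;
    return t;
}

/* Farnsworth timing: elements at char_wpm, gaps stretched so that the
 * overall speed is eff_wpm. The word PARIS has 31 units of elements and
 * 19 units of gaps (four 3-unit character gaps plus one 7-unit word gap). */
morse_timing_t morse_timing_farnsworth(uint32_t char_wpm, uint32_t eff_wpm)
{
    morse_timing_t t = morse_timing(char_wpm);
    if (eff_wpm == 0u || eff_wpm >= char_wpm)
        return t;                       /* nothing to stretch */
    uint32_t word_ms     = 60000u / eff_wpm;          /* time for one PARIS */
    uint32_t elements_ms = (31u * 1200u) / char_wpm;  /* unstretched part */
    uint32_t gaps_ms     = word_ms - elements_ms;     /* shared by 19 units */
    t.char_gap_ms = (3u * gaps_ms) / 19u;
    t.word_gap_ms = (7u * gaps_ms) / 19u;
    return t;
}
```

For example, morse_timing(20) gives a 60ms dit and a 180ms dah, while morse_timing_farnsworth(20, 10) keeps the 60ms dit but stretches the character gap to about 653ms.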

When using the keyboard the keys are pushed on the right side of the display and stored in a buffer. The cursor will blink under the character which is currently transmitted.

Short phrases which are used often are stored in the flash memory and can be played back with the F1-F12 keys from the USB keyboard. There is also a beacon mode where a chosen phrase is transmitted autonomously at an adjustable time interval.

The encoder and the on-board switches are read in a 500Hz timer interrupt. The frequency needs to be this high to catch fast encoder movements. The debouncing looks at the past 8 samples (8 / 500Hz = 16ms) to detect valid transitions.
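A minimal sketch of this kind of 8-sample debouncing is shown below. It is not taken from the firmware; the names are hypothetical and it assumes the simplest rule, where a transition is accepted only after 8 consecutive identical samples.

```c
#include <stdbool.h>
#include <stdint.h>

/* Debouncer state for one input (hypothetical type). */
typedef struct {
    uint8_t history;  /* last 8 raw samples, newest in bit 0 */
    bool    state;    /* current debounced level */
} debounce_t;

/* Call once per timer tick (every 2ms at 500Hz) with the raw pin level.
 * Returns true when the debounced state has just changed. */
bool debounce_update(debounce_t *d, bool raw)
{
    d->history = (uint8_t)((d->history << 1) | (raw ? 1u : 0u));

    if (d->history == 0xFFu && !d->state) {  /* 8 samples high in a row */
        d->state = true;
        return true;
    }
    if (d->history == 0x00u && d->state) {   /* 8 samples low in a row */
        d->state = false;
        return true;
    }
    return false;                            /* bouncing or no change */
}
```

A single glitch sample resets the count, so the input has to stay stable for the full 16ms before a change is reported.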

For sidetone generation a sine wave is computed every time the frequency or the amplitude is changed. At the same time a number of sine waves with decreasing amplitude are also computed. These are played back at the end of a sidetone to prevent an annoying popping sound. The playback uses the I2S peripheral in DMA mode with a double buffer configuration and a 16kHz audio rate. When the sidetone ends, the alternate buffer is pointed to the decreasing amplitude versions of the tone.
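The decreasing amplitude copies could be produced along these lines. This is a sketch under assumptions: the function name is hypothetical, the ramp is linear, and the input is an already computed tone buffer. The firmware may use a different envelope.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill `out` with `steps` copies of the `n`-sample tone, each quieter than
 * the previous one. Copy k (0-based) is scaled by (steps - 1 - k) / steps,
 * so the last copy is silence. `out` must hold n * steps samples. */
void make_fade_out(const int16_t *tone, size_t n, int16_t *out, size_t steps)
{
    for (size_t k = 0; k < steps; k++) {
        int32_t gain = (int32_t)(steps - 1u - k);
        for (size_t i = 0; i < n; i++)
            out[k * n + i] = (int16_t)((int32_t)tone[i] * gain / (int32_t)steps);
    }
}
```

Playing these copies one after another when the key is released lets the amplitude reach zero gradually instead of stopping abruptly in the middle of a cycle.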

All the configuration options and phrases are kept in the first sector of flash. I've chosen the first rather than the last because it has a smaller size of only 16KB instead of 128KB, and even 16KB is way too much for options. Since it is the first sector at address zero, every time I erase it (which is not often) I need to rewrite its first 0x30 bytes with a small bootstrap code which passes execution to 0x8004000, where the main code resides. There are two regions in this first sector, one for saved strings and one for options. Every time an option is changed and saved, a new copy of the options structure is written to the flash. Once the program runs out of options space it will erase the entire sector, reset the strings and copy only the last used options structure.
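The append-only scheme for the options region can be illustrated with the sketch below. It is not the firmware's code: the slot size, slot count and function names are hypothetical, the flash is simulated with a RAM array, and the strings region and the bootstrap bytes are not modelled. It also assumes that a valid options structure never consists only of 0xFF bytes, since that is what erased flash looks like.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE   32u  /* hypothetical size of the options structure */
#define SLOT_COUNT  8u   /* hypothetical number of slots in the region */

/* A slot is free if every byte still holds the erased value 0xFF. */
static int slot_is_erased(const uint8_t *slot)
{
    for (size_t i = 0; i < SLOT_SIZE; i++)
        if (slot[i] != 0xFFu)
            return 0;
    return 1;
}

/* Number of slots already written (SLOT_COUNT when the region is full). */
size_t options_used_slots(const uint8_t *region)
{
    size_t n = 0;
    while (n < SLOT_COUNT && !slot_is_erased(region + n * SLOT_SIZE))
        n++;
    return n;
}

/* The current options are in the last written slot; NULL if none exist. */
const uint8_t *options_latest(const uint8_t *region)
{
    size_t n = options_used_slots(region);
    return n ? region + (n - 1u) * SLOT_SIZE : NULL;
}

/* Append a new copy. When the region is full it is erased first and the
 * new copy lands in slot 0. On real hardware memset and memcpy would be
 * replaced by the sector erase and the flash programming routines. */
void options_save(uint8_t *region, const uint8_t *opts)
{
    size_t n = options_used_slots(region);
    if (n == SLOT_COUNT) {
        memset(region, 0xFF, SLOT_SIZE * SLOT_COUNT);
        n = 0;
    }
    memcpy(region + n * SLOT_SIZE, opts, SLOT_SIZE);
}
```

Writing a new copy instead of erasing on every change keeps the number of erase cycles low, because an erase is needed only once per SLOT_COUNT saves.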

Morse Decoding

To decode incoming Morse code the digital microphone is activated and sampled at 16kHz. The data is also acquired using a double buffer DMA mode, and while one buffer is filled the other one is processed. CubeMX provides no support for double buffer I2S capture (there is support for double buffered DMA but not for the I2S interface) so I had to write my own.

The microphone samples are 24bit mono and every second one is zero. The two 16bit halves of each 32bit mono sample have to be swapped (because the I2S/SPI peripheral is 16bit) and the value right aligned. The samples are then converted to floating point because the amplitude can vary a lot, and I perform the FFT on complex floating point numbers since the CPU has a hardware FPU and is fast. It would have been great to do these transformations in place, but while the buffer is under DMA control (even while not actively written) you cannot change its content, so I had to use a separate buffer for the FFT (another half kilobyte of RAM wasted).
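The sample conversion could look like the sketch below. The function names are hypothetical and the data layout is an assumption: the 24bit sample is taken to be left aligned in a 32bit slot, the two 16bit halves are stored in swapped order, and the non-zero samples are at the even positions of the buffer.

```c
#include <stdint.h>

/* Convert one raw 32-bit word, as stored by the 16-bit I2S DMA, to float. */
float mic_word_to_float(uint32_t raw)
{
    /* Swap the two 16-bit halves to restore the original 32-bit slot. */
    uint32_t swapped = (raw << 16) | (raw >> 16);

    /* Reinterpret as signed, then drop the 8 unused low bits. Dividing
     * by 256 right-aligns the 24-bit value and keeps the sign. */
    int32_t sample = (int32_t)swapped / 256;

    return (float)sample;
}

/* Convert n mono samples, skipping the empty slot after each one.
 * `dma` must hold 2 * n words and `out` n floats. */
void mic_frame_to_float(const uint32_t *dma, float *out, unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        out[i] = mic_word_to_float(dma[2u * i]);
}
```

Writing the result into a separate float array matches the constraint described above, since the DMA buffer itself is never modified.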

By using FFT on complex floating point numbers the decoding works for CW tones regardless of their frequency or amplitude.

I've chosen to process the audio samples every 4ms to get a good timing resolution, so I'm using the 64 real samples acquired in that interval to compute a 64 point FFT. This generates magnitude and phase in 32 bins, and each bin spans 250Hz (the 8kHz Nyquist bandwidth divided by 32). I first apply a Blackman window on the data and then keep only the bin with the highest magnitude from the first 8 (I'm assuming a maximum sidetone of 2kHz, and 2kHz / 250Hz = 8 bins). To detect tone on/off transitions I'm using a very simple algorithm which compares the magnitude with the average of all recorded magnitudes (4096 in total). The tone frequency information (the bin number) is not used for now. To compute the average fast I convert the (squared) magnitude back from float to integer. This allows me to keep the total sum of all magnitudes while doing just one subtraction (the oldest value) and one addition (the new value). The average is then just one shift away.
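The running average over the last 4096 magnitudes can be sketched as follows. The names are hypothetical and the plain "above the average" comparison is a simplification of whatever the firmware does. The input is assumed to be the squared magnitude already converted to an integer.

```c
#include <stdint.h>

#define HISTORY_LEN    4096u  /* number of stored magnitudes, power of two */
#define HISTORY_SHIFT  12u    /* log2(HISTORY_LEN) */

/* Circular history of magnitudes with their running sum (hypothetical). */
typedef struct {
    uint32_t mag[HISTORY_LEN];
    uint32_t head;            /* index of the oldest entry */
    uint64_t sum;             /* sum of all stored magnitudes */
} mag_history_t;

/* Store a new magnitude and return 1 if it is above the running average
 * (tone on), 0 otherwise. The structure must start zero-initialised. */
int mag_history_push(mag_history_t *h, uint32_t m)
{
    h->sum -= h->mag[h->head];      /* one subtraction: the oldest value */
    h->sum += m;                    /* one addition: the new value */
    h->mag[h->head] = m;
    h->head = (h->head + 1u) & (HISTORY_LEN - 1u);

    uint32_t avg = (uint32_t)(h->sum >> HISTORY_SHIFT);  /* divide by 4096 */
    return m > avg;
}
```

Because the length is a power of two, both the index wrap-around and the division reduce to a mask and a shift, so the cost per frame does not depend on the history length.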

I was impressed by how fast the FFT and the transition detection are, as you can see in the logic analyzer capture below. Once one buffer is filled, the interrupt sends a notification. The decoding thread then starts 8us later, and after just about 100us the FFT largest magnitude and the transition are recorded. There are 3900us until the next frame, so there is plenty of time to serve all the other threads.

DFT Timing

Below you can see the tone (and bin) detection for a typical Morse reception.


Detecting Morse symbols from tone transitions is not trivial. The best approach would be a class segmentation algorithm which groups the on and off intervals in classes. This is computationally intensive and I haven't tried it yet. Instead I'm relying on the fact that the dit/dah ratio is approximately 1:3. When an iambic or Ultimatic keyer is used, the space interval ratios are also approximately 1:3. Sorting the intervals and using these ratios works reasonably well in the get_dit_dah_timing function, but this could be further improved.
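One simple way to use sorting and the 1:3 ratio is sketched below. This is not the get_dit_dah_timing function from the sources, only a hypothetical illustration: it sorts the tone-on durations (counted in 4ms frames), looks for the largest jump between neighbouring values and places the dit/dah threshold in the middle of that jump.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparison for 16-bit durations, ascending order. */
static int cmp_u16(const void *a, const void *b)
{
    uint16_t x = *(const uint16_t *)a;
    uint16_t y = *(const uint16_t *)b;
    return (x > y) - (x < y);
}

/* Sorts `d` in place and returns a threshold separating dits from dahs,
 * or 0 when the durations do not split into two clear groups. */
uint16_t dit_dah_threshold(uint16_t *d, size_t n)
{
    if (n < 2u)
        return 0;
    qsort(d, n, sizeof d[0], cmp_u16);

    size_t   best = 0;       /* index just below the largest jump */
    uint32_t best_gap = 0;
    for (size_t i = 0; i + 1u < n; i++) {
        uint32_t gap = (uint32_t)d[i + 1u] - d[i];
        if (gap > best_gap) {
            best_gap = gap;
            best = i;
        }
    }

    /* The expected ratio is about 3; require at least 2 to accept it. */
    if (d[best + 1u] < 2u * d[best])
        return 0;
    return (uint16_t)((d[best] + d[best + 1u]) / 2u);
}
```

With durations of 14, 15, 15, 16, 44, 45 and 47 frames the largest jump is between 16 and 44, so the threshold is 30 frames. A sequence containing only dits has no such jump and the function returns 0.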

I keep all previous magnitudes and transitions in 4096 long buffers (16KB + 0.5KB of RAM), so the last 16 seconds of Morse reception (4096 x 4ms) are available for decoding. The algorithm adapts while it receives more symbols and can correctly decode and display even past symbols. This is also the case when the WPM speed changes after tuning to another station. With computer generated Morse code the decoder works with no errors at speeds up to 50WPM. With human operators the decoding is less accurate but still quite good (and it can be improved in software). I've tried some Morse decoding apps on smartphones and they are not better than this decoder. The decoder is very flexible and doesn't need any adjustments for WPM transmission speed, amplitude or tone frequency.

The STM32F401 processor CubeMX and its documentation


ARM Cortex-M4
3.3V LCD1
16×2 lines
128×64 pixels
I2S Microphone
I2S Speaker Amplifier
Switching Power Supply
1.5W 8Ohm
Rotary Encoder (5pcs, $2.88)
Switch with Caps
USB Type A
3 pin Socket
Stereo Socket
1.6mm 2 layers
Aluminum Enclosure (4pcs, $40.73)

The components cost (with shipping) for one CW keyer including the PCB and the aluminium enclosure is around 25$ (or 15$ without the enclosure).

Tags: black pill, cw.

Comments On This Entry

Richard Submitted at 07:16:46 on 23 March 2022
I am interested in this nice project. Can I have a look at the source code too? How can I order this PCB or a kit?
Emil Submitted at 08:47:22 on 23 March 2022
I've only just started this project and ordered the PCBs and components. It will take at least 3 weeks until I receive them and have the hardware built; then more weeks for the software side.
I don't plan to sell kits because I can't be bothered. All the design files and software are available in this page so maybe someone else will pick this task to make some cash.
Jacek Submitted at 10:50:19 on 23 March 2022
As soon as you share the source code I'll start assembling it.
It's a nice project.
Rafael Submitted at 08:35:44 on 24 March 2022
This project looks really nice.
Emil Submitted at 21:58:57 on 21 June 2022
Update: I've released version 0.1 of the code and the project sources.
Coskun MUTI Submitted at 07:55:18 on 27 January 2023
Excellent project. I want to build this, but I am a novice. I don't know how to load the *.dfu file into the processor on Windows. Is it possible to upload it to the STM32F401 processor as a *.bin file with the ST-Link software? If so, can you share the *.bin file?
Emil Submitted at 17:11:37 on 30 January 2023
Check this page for converters between DFU/HEX/BIN formats.
