It's dead[time], TIM

Just discovered another interesting feature of the "Advanced Timers" in the STM32G4: 3-phase combined PWM. Essentially it takes timer channels 1, 2, and 3 and ANDs each one with channel 5. At first glance this sounds kinda useless: why would you want to modify your PWM waveform during 3-phase commutation?
Top three traces: PWM for CH1-3. Bottom trace: output from CH5 (well, CH4, since CH5 doesn't have a pin-mapped output)
Turns out, it's a really interesting way to add deadtime to a PWM signal without worrying about maximum duty cycle limits. In fact, you can crank the duty cycle all the way up to 100% and you'll still get center-aligned deadtime!

CH3 is "100% duty cycle" according to its timer configuration, but it still has deadtime to do ADC sampling.
That's *super* useful for ADC sampling, which, as it turns out, is my next target. Stay tuned...
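To convince myself of the AND-ing behaviour described above, here's a rough host-side model in C. It is not register code for the chip: `oc_ref`, `combined_ref`, `updown_count`, and `low_ticks` are hypothetical names, and the "active while the counter is below the compare value" rule is a simplification of PWM mode 1. The compare values in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified PWM mode 1 reference: active while the counter is below CCR. */
static bool oc_ref(uint32_t cnt, uint32_t ccr) {
    return cnt < ccr;
}

/* Combined 3-phase idea: a channel's reference ANDed with channel 5's. */
static bool combined_ref(uint32_t cnt, uint32_t ccr_x, uint32_t ccr5) {
    return oc_ref(cnt, ccr_x) && oc_ref(cnt, ccr5);
}

/* Counter value at tick t of an up-down period that lasts 2 * arr ticks. */
static uint32_t updown_count(uint32_t t, uint32_t arr) {
    assert(arr > 0);
    t %= 2u * arr;
    return (t <= arr) ? t : 2u * arr - t;
}

/* Ticks per period during which the combined output is low. */
static uint32_t low_ticks(uint32_t arr, uint32_t ccr_x, uint32_t ccr5) {
    uint32_t low = 0;
    for (uint32_t t = 0; t < 2u * arr; ++t) {
        if (!combined_ref(updown_count(t, arr), ccr_x, ccr5)) {
            ++low;
        }
    }
    return low;
}
```

With a period value of 100 and a channel compare of 101 (always on in this model), setting the channel 5 compare to 90 carves out a low window of 21 ticks centered on the counter peak, while a channel 5 compare of 101 leaves the output high the whole time.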

It's about that time[r]

The next system of the STM32G4 I'm taking on is the Timer system. And no, it's not just because I get to see my name all over the place...

Dead simple amirite?
This chip has a whole host of different types of timers: 3x advanced-control timers TIM1/8/20, 4x general-purpose timers TIM2/3/4/5, 3x general-purpose timers TIM15/16/17 (huh? must be cut-down versions), 2x basic timers TIM6/7, a low-power timer LPTIM, and a high-resolution timer HRTIM that is actually seven timers in one: a master and 6x slaves. Not sure exactly what the master/slave relationship is just yet.

I got a pretty good head start understanding timers from this video by Eddie Amaya. In short, they're 16-bit counters that count up or down based on the peripheral clock divided by some prescaler. When the counter reaches the value set in the TIMx_ARR register it does... something: fires an interrupt, toggles a GPIO, etc. Then when the timer overflows, it resets back to 0 (in the case of the GPIO it toggles it back on again). Apparently they can be tied to DMA requests too, according to p1101 of the datasheet.
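The prescaler and TIMx_ARR relationship above boils down to one line of arithmetic for an upcounting timer: the counter ticks at the timer clock divided by (PSC + 1) and wraps every (ARR + 1) ticks. Here's a small sketch of that; `timer_update_hz` is a hypothetical helper, and the 170 MHz timer clock in the usage note is an assumption (it depends entirely on how the clock tree is set up).

```c
#include <assert.h>
#include <stdint.h>

/* Update-event rate of an upcounting 16-bit timer:
 * f_update = f_tim / ((PSC + 1) * (ARR + 1)). */
static uint32_t timer_update_hz(uint32_t f_tim_hz, uint32_t psc, uint32_t arr) {
    /* Both registers are treated as 16 bits wide here. */
    assert(psc <= 0xFFFFu && arr <= 0xFFFFu);
    uint64_t ticks_per_update = ((uint64_t)psc + 1u) * ((uint64_t)arr + 1u);
    return (uint32_t)(f_tim_hz / ticks_per_update);
}
```

For example, with an assumed 170 MHz timer clock, `timer_update_hz(170000000u, 169u, 999u)` works out to a 1 kHz update rate: the prescaler brings the counter down to 1 MHz, and it then wraps every 1000 counts.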

I'm particularly interested in the center-aligned timing modes that allow for up-down counting, which will be super useful for current sampling during BLDC commutation (more on that project in a later post).
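One thing worth noting about up-down counting: the counter runs from 0 up to ARR and back down again, so one PWM period lasts 2 * ARR counter ticks instead of ARR + 1. A quick sketch of the resulting arithmetic follows. The helper names are hypothetical, duty is expressed in permille to stay in integers, and the 170 MHz counter clock and 20 kHz PWM frequency in the usage note are assumptions picked for illustration, not values from this project.

```c
#include <assert.h>
#include <stdint.h>

/* Center-aligned mode: one PWM period is 2 * ARR counter ticks. */
static uint32_t center_aligned_arr(uint32_t f_cnt_hz, uint32_t f_pwm_hz) {
    assert(f_pwm_hz > 0);
    return f_cnt_hz / (2u * f_pwm_hz);
}

/* Compare value for a duty cycle given in permille (0..1000), rounded down. */
static uint32_t ccr_for_duty(uint32_t arr, uint32_t duty_permille) {
    assert(duty_permille <= 1000u);
    return (uint32_t)(((uint64_t)arr * duty_permille) / 1000u);
}
```

With those assumed numbers, `center_aligned_arr(170000000u, 20000u)` gives 4250, and a 50% duty cycle lands at a compare value of 2125. Because the pulse is symmetric about the counter's turning points, the middle of the pulse is a natural place to hang a current sample.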

H gates for the top 3 traces, and a spike indicating current sampling from Ben Katz's blog.
But since I'm still new to timers I figure I'll just try and get a PWM signal out first. 

STM's timers are ... weird? After reading about them for the better part of 3 days, I started trying to get a simple PWM signal up and running. Took me the better part of another day to figure out how to enable them (grrr... CubeMX's generated code is actually flat out wrong). Aaaand then about 10 minutes and as many lines of code to get three center-aligned PWM pulses synchronized?!

That was alarmingly easy...
I've still got to figure out things like duty cycle, frequency, and syncing the ADC to the dead-time, but so far so good.

SPI-nning up

As part of a bigger project that I'll go into in a future post, I'm beginning to mess around with the SPI peripheral on the STM32G4, a new-as-of-last-year microcontroller geared towards BLDC motor control. Instead of starting with a middleware like mBed, I figured if I really wanted to learn electrical engineering it'd probably be in my best interest to bite the bullet and start at the bare metal to understand exactly what's going on at the chip level.

After reading all 2075 pages of the datasheet (heh) I set about using the HAL drivers provided by ST's STM32CubeMX to get the SPI port up and running in order to talk to the LSM9DS1 9-DoF chip (datasheet). Adafruit has created a solid breakout board for it:

I had previously played around with Adafruit's other 9-DoF breakout board, but that requires communication with two different chips. On top of that, according to the datasheet for the LSM303DLHC chip the accelerometer registers are big-endian while the magnetometer registers are little-endian. Only lost a whole day to that, grrr...

To bootstrap learning, I'm using the NUCLEO-G474RE dev board (datasheet). CubeMX has built-in support for this board and an initial pinout for its peripherals, which is nice to get up and running quickly. On top of that, it's got a shiny new ST-LINK V3 on board (well, the V3E version anyway), which makes for super nice USB hi-speed debugging.

On to exploring the SPI interface. I configured the chip to use SPI2, since SPI1 uses PB3 that's tied to T_SWO.

Easy enough. Next, I exported the code and looked through the auto-generated source files created by CubeMX.

Yeah, no, not doing that. I guess I'll use it as an example of how to twiddle the registers, but that's about it. I ended up spending a few days getting a toolchain up and running using a combination of VSCode, Cortex-Debug, gcc-arm-.*, and the awesome build system Bazel. And once this patch to OpenOCD landed, I had a full debugging setup up and running. Never underestimate the utility of full IDE debugging support, even in embedded systems.

Here's the SPI port from the LSM9DS1 to SPI2 on the Nucleo, and my trusty (read: I barely know how to use it) Rigol DS1054

I initially tried to bit-bang the CSS (NSS according to ST) with the SPI clock at 5.3125 Mbit/s, polling the WHO_AM_I register. According to the datasheet, we're expecting to read 0x68.

Nice! If we expand the MOSI/MISO decoding lines we see that 0x8F is written first (0x0F register, with the 0x80 read bit set), and the chip responds with 0x68. Woo! Looking closer though, the CSS goes low, then about 3us later the clock begins to switch. The clock switching appears to be at the appropriate 5.3 MHz, but there's a *huge* delay between the initial clock pulses and the subsequent pulses. The total write-then-read takes almost 20us (!).
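The 0x8F on the wire is just the register address with the top bit set to mark a read: masking off 0x80 leaves 0x0F, which is WHO_AM_I. A tiny sketch of that encoding is below; the macro and function names are hypothetical, not taken from any driver.

```c
#include <assert.h>
#include <stdint.h>

#define LSM9DS1_WHO_AM_I 0x0Fu /* register address (hypothetical macro name) */
#define SPI_READ_FLAG    0x80u /* top bit of the first byte marks a read */

/* First byte of a read transaction: read flag plus 7-bit register address. */
static uint8_t spi_read_cmd(uint8_t reg) {
    assert(reg <= 0x7Fu);
    return (uint8_t)(SPI_READ_FLAG | reg);
}
```

`spi_read_cmd(LSM9DS1_WHO_AM_I)` evaluates to 0x8F, matching the first byte in the decode.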

Looking back at the code, I'm using the blocking version of the SPI HAL to write the register address and then do a separate read. My suspicion was that the delay in the clock was likely due to time spent between the completion of the first SPI frame, the remaining HAL code, and my next read call. The compilation target is in dbg mode though, so this isn't entirely unexpected.

I wondered whether replacing the two sequential write/read calls with a single call to the HAL function HAL_SPI_TransmitReceive would cut the CPU overhead.
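The buffer layout for a single full-duplex transfer looks roughly like the sketch below. It runs on a host machine, so the actual bus access is hidden behind a hypothetical `spi_xfer_fn` callback; on the target that callback would presumably wrap HAL_SPI_TransmitReceive. `read_reg` and `fake_lsm9ds1_xfer` are made-up names, and the fake device only knows about WHO_AM_I.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical full-duplex transfer callback: clocks out tx while filling rx.
 * Returns 0 on success. */
typedef int (*spi_xfer_fn)(const uint8_t *tx, uint8_t *rx, size_t len);

/* Read one register with a single two-byte transfer instead of write + read. */
static int read_reg(spi_xfer_fn xfer, uint8_t reg, uint8_t *value) {
    assert(xfer != NULL && value != NULL);
    const uint8_t tx[2] = { (uint8_t)(0x80u | (reg & 0x7Fu)), 0x00u };
    uint8_t rx[2] = { 0u, 0u };
    int status = xfer(tx, rx, sizeof tx);
    if (status == 0) {
        *value = rx[1]; /* rx[0] arrives while the address goes out; ignore it */
    }
    return status;
}

/* Host-side stand-in for the sensor, for trying the layout without hardware. */
static int fake_lsm9ds1_xfer(const uint8_t *tx, uint8_t *rx, size_t len) {
    if (len < 2 || (tx[0] & 0x80u) == 0u) {
        return -1; /* only two-byte reads are modeled */
    }
    rx[0] = 0x00u;
    rx[1] = ((tx[0] & 0x7Fu) == 0x0Fu) ? 0x68u : 0x00u;
    return 0;
}
```

Calling `read_reg(fake_lsm9ds1_xfer, 0x0F, &value)` leaves 0x68 in `value`. The point of the layout is that the dummy second byte keeps the clock running for the reply, so there's no gap for HAL bookkeeping between the address and the data.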

Much better. There's still a tiny bit of a delay between the write and the read, but it's better. There's also still a large delay between the NSS switching and the SPI transfers. Again, we're in debug here, so I wondered what it would look like if we switched to opt compilation mode.

Even better! Digging through the datasheet, I found out that the SPI interface does in fact have support for hardware chip-select switching. I switched that on, for opt+hardware NSS performance.

Wow. Yeah, I don't think there's any way bit-banging will be faster than hardware. That said, bit-banging is probably more than acceptable for reading/writing config registers. For doing multi-reads on the other hand, probably best to try a hardware implementation.

One other trick I tried is to take advantage of the fact that the LSM9DS1 does most transactions in 16-bit windows. The STM32G4 has support for SPI frames of up to 16 bits. Switching from the standard 8-bit DataSize to 16-bit resulted in even tighter timing, just about as optimal as you can get:
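With a 16-bit frame, the command byte and the dummy byte collapse into a single word. A sketch of the packing, assuming MSB-first transmission so that the high byte goes out first (both function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit read frame: command byte in the high half, dummy bits in the low half. */
static uint16_t read_frame16(uint8_t reg) {
    assert(reg <= 0x7Fu);
    return (uint16_t)((0x80u | reg) << 8);
}

/* The register value comes back in the low byte of the received word;
 * the high byte was clocked in while the address went out. */
static uint8_t frame16_data(uint16_t rx_word) {
    return (uint8_t)(rx_word & 0xFFu);
}
```

For WHO_AM_I, `read_frame16(0x0F)` is 0x8F00, and a received word of 0x0068 decodes to 0x68.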

Including overhead, this setup was able to transfer 16 bits in about 3.2us, or almost exactly 5 Mbit/s. Not too shabby!
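The arithmetic behind those numbers, as a quick sketch. The function names are hypothetical. The 170 MHz peripheral clock and divide-by-32 prescaler in the usage note are my guess at where 5.3125 comes from (170 / 32 = 5.3125), not something confirmed by the configuration shown here.

```c
#include <assert.h>

/* Effective throughput: bits per microsecond is the same thing as Mbit/s. */
static double effective_mbit_per_s(unsigned bits, double elapsed_us) {
    assert(elapsed_us > 0.0);
    return (double)bits / elapsed_us;
}

/* SPI clock for a given peripheral clock and power-of-two prescaler. */
static double spi_clock_mhz(double pclk_mhz, unsigned prescaler) {
    assert(prescaler > 0u);
    return pclk_mhz / (double)prescaler;
}
```

`effective_mbit_per_s(16, 3.2)` comes out at 5 Mbit/s, against a raw clock of `spi_clock_mhz(170.0, 32)` = 5.3125 MHz under the assumption above. That puts the measured transfer at roughly 94% of the line rate, with the rest going to chip-select setup and hold.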