Expand SPI API and Functionality for High Speed Devices. #189

May 16, 2023

greiman

I am the author of the SdFat library. Many users are disappointed with the performance of SdFat on boards using the ArduinoCore-API. The ArduinoCore-API SPI interface does not support high-speed transfers well.

The only block transfer function is void transfer(void *buf, size_t count). This is a poor API for SD use. It can't be used for receive unless you first fill buf with 0XFF, since the SD card examines the sent stream for commands. It can't be used for send unless you copy the block to a temporary buffer, since buf will be overwritten with the received data.
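
To make the cost of the in-place API concrete, here is a minimal host-side sketch of the two workarounds described above. InPlaceSpi, receiveBlock, and sendBlock are hypothetical names used only for illustration, and the fake port simply answers every byte with a fixed value; none of this is taken from SdFat or from any Arduino core.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a port that only offers the in-place block
// transfer: each byte of buf is sent, then overwritten by the reply.
struct InPlaceSpi {
  uint8_t lastSent = 0;      // last byte clocked out
  uint8_t replyByte = 0xA5;  // byte the fake device always answers with
  void transfer(void *buf, size_t count) {
    uint8_t *p = static_cast<uint8_t *>(buf);
    for (size_t i = 0; i < count; i++) {
      lastSent = p[i];
      p[i] = replyByte;
    }
  }
};

// Receive: the destination must be pre-filled with 0xFF so the card
// sees idle bytes instead of whatever was left in the buffer.
void receiveBlock(InPlaceSpi &spi, uint8_t *buf, size_t count) {
  std::memset(buf, 0xFF, count);
  spi.transfer(buf, count);
}

// Send: the caller's data is const, so it is copied chunk by chunk into
// a scratch buffer that the transfer is allowed to overwrite.
void sendBlock(InPlaceSpi &spi, const uint8_t *data, size_t count) {
  uint8_t scratch[512];
  while (count > 0) {
    size_t n = count < sizeof(scratch) ? count : sizeof(scratch);
    std::memcpy(scratch, data, n);
    spi.transfer(scratch, n);
    data += n;
    count -= n;
  }
}
```

Both helpers touch every byte once more than the transfer itself does: a memset before each read and a memcpy before each write.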

Often the implementation is just a loop with uint8_t transfer(uint8_t data). This means using transfer(buf, count) is slower than using single byte transfer.

Other board packages provide a suitable API. Adafruit, SparkFun, Teensy, STM32, particle.io, and many others provide this function:

void transfer(const void *tx_buffer, void* rx_buffer, size_t length);

tx_buffer: array of Tx bytes that is filled by the user before starting the SPI transfer. If NULL, default dummy 0xFF bytes will be clocked out.

rx_buffer: array of Rx bytes that will be filled by the slave during the SPI transfer. If NULL, the received data will be discarded.

length: number of data bytes that are to be transferred.
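
As an illustration of these parameter rules, the following sketch shows how receive-only and send-only calls might look from the caller's side. TwoBufferSpi is a hypothetical host-side fake, not any vendor's class; it records the last byte sent and answers every byte with a fixed value so the NULL handling can be observed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fake of a port exposing the two-buffer transfer.
struct TwoBufferSpi {
  uint8_t lastSent = 0;      // last byte clocked out
  uint8_t replyByte = 0xA5;  // byte the fake device always answers with
  void transfer(const void *tx_buffer, void *rx_buffer, size_t length) {
    const uint8_t *tx = static_cast<const uint8_t *>(tx_buffer);
    uint8_t *rx = static_cast<uint8_t *>(rx_buffer);
    for (size_t i = 0; i < length; i++) {
      lastSent = tx ? tx[i] : 0xFF;  // NULL tx_buffer: clock out 0xFF
      if (rx) rx[i] = replyByte;     // NULL rx_buffer: discard the reply
    }
  }
};

// Receive-only: the destination does not need to be pre-filled.
void receiveBlock(TwoBufferSpi &spi, uint8_t *dst, size_t count) {
  spi.transfer(nullptr, dst, count);
}

// Send-only: the source stays const and is never overwritten.
void sendBlock(TwoBufferSpi &spi, const uint8_t *src, size_t count) {
  spi.transfer(src, nullptr, count);
}
```

With separate buffers, neither direction needs a scratch copy or a fill pass.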

Some implementations return a status, and some add this asynchronous variant for DMA:

void transfer(const void *tx_buffer, void* rx_buffer, size_t length, std::function<void()> myFunction);

myFunction: user specified function callback to be called after completion of the SPI DMA transfer. It takes no argument and returns nothing, e.g.: void myHandler()

If NULL is passed as a callback then the result is synchronous i.e. the function will only return once the DMA transfer is complete.
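
The callback contract can be modeled on a host without any DMA hardware. In the sketch below, FakeDmaSpi and its finish() method are hypothetical: finish() stands in for the DMA-complete interrupt, and the fake device simply loops the transmitted bytes back. It is meant only to show when the buffers become valid and when the callback runs, not how a real driver is written.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical host-side model of a DMA-backed port (no real DMA).
struct FakeDmaSpi {
  const uint8_t *tx_ = nullptr;
  uint8_t *rx_ = nullptr;
  size_t length_ = 0;
  std::function<void()> done_;
  bool busy = false;

  void transfer(const void *tx_buffer, void *rx_buffer, size_t length,
                std::function<void()> myFunction) {
    tx_ = static_cast<const uint8_t *>(tx_buffer);
    rx_ = static_cast<uint8_t *>(rx_buffer);
    length_ = length;
    done_ = myFunction;
    busy = true;
    if (!done_) {
      finish();  // no callback: complete before returning
    }
  }

  // Stand-in for the DMA-complete interrupt handler.
  void finish() {
    for (size_t i = 0; i < length_; i++) {
      uint8_t out = tx_ ? tx_[i] : 0xFF;  // NULL tx: dummy 0xFF bytes
      if (rx_) rx_[i] = out;              // loopback device
    }
    busy = false;
    if (done_) {
      std::function<void()> cb = done_;
      done_ = nullptr;
      cb();  // notify the caller that the buffers are valid again
    }
  }
};
```

With a callback, transfer() returns while busy is still true, so the caller must leave both buffers alone until the callback runs. With a null callback, the call returns only after the data is in place.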

Many users are very disappointed with performance on SAMD. Here are SD performance results for the current SAMD SPI library on a MKR Zero with my bench example:

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
482.81,1623,1055,1057
482.86,1079,1055,1057

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
464.81,1108,1096,1098
464.81,1108,1096,1098

Here is the result with a SAMD driver I implemented with the above API:

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1366.49,394,368,372
1366.49,395,368,372

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1374.38,376,366,369
1374.38,376,366,369

Here is the function I implemented:

void SERCOM::transferDataSPI(const void *txBuf, void *rxBuf, size_t count) {
  const uint8_t *tx = reinterpret_cast<const uint8_t *>(txBuf);
  uint8_t *rx = reinterpret_cast<uint8_t *>(rxBuf);
  size_t ir = 0;  // number of bytes received so far
  size_t it = 0;  // number of bytes written to DATA so far
  if (rx) {
    // Full duplex: write up to two bytes so the transmitter stays busy.
    while (it < 2 && it < count) {
      if (sercom->SPI.INTFLAG.bit.DRE) {
        sercom->SPI.DATA.reg = tx ? tx[it] : 0XFF;
        it++;
      }
    }
    // Steady state: read one received byte, then write the next byte to send.
    while (it < count) {
      if (sercom->SPI.INTFLAG.bit.RXC) {
        rx[ir++] = sercom->SPI.DATA.reg;
        sercom->SPI.DATA.reg = tx ? tx[it] : 0XFF;
        it++;
      }
    }
    // Read the bytes that are still in flight.
    while (ir < count) {
      if (sercom->SPI.INTFLAG.bit.RXC) {
        rx[ir++] = sercom->SPI.DATA.reg;
      }
    }
  } else if (tx && count) {  // might hang if count == 0
    // Send only: disable the receiver so received bytes need not be read.
    sercom->SPI.CTRLB.bit.RXEN = 0;
    while (it < count) {
      if (sercom->SPI.INTFLAG.bit.DRE) {
        sercom->SPI.DATA.reg = tx[it++];
      }
    }
    // wait till all data sent
    while (sercom->SPI.INTFLAG.bit.TXC == 0) {
    }
    // Re-enable the receiver and wait until RXEN reads back as set.
    sercom->SPI.CTRLB.bit.RXEN = 1;
    while (sercom->SPI.CTRLB.bit.RXEN == 0) {
    }
  }
}

I added a call to this function in SPI.h and a virtual version of the API to HardwareSPI.h.

Jul 17, 2023

greiman
Author

The standard array transfer, void transfer(void* buf, size_t count), is slow for most Arduino core implementations.

When the standard array transfer is improved, it is easy to implement the above API with little extra code.

First implement void transfer(const void *tx_buffer, void* rx_buffer, size_t length). This usually takes little more code than the standard array transfer. Often vendor packages already provide this proposed API.

The standard API is then:

void className::transfer(void* buf, size_t count) {
  if (buf) {
    transfer(buf, buf, count);
  }
}

It is also easy to make sure all Arduino Core implementations support the API, maybe with lower performance, using uint8_t transfer(uint8_t data) in a loop.
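
A portable fallback along these lines might look like the sketch below. SpiBase and InvertingSpi are hypothetical names, not the actual HardwareSPI or SPIClass declarations; the only port-specific piece assumed here is the single-byte transfer, and the fake port answers each byte with its bitwise complement so the result can be checked on a host.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical base class: only the single-byte transfer is port-specific.
class SpiBase {
 public:
  virtual ~SpiBase() = default;
  virtual uint8_t transfer(uint8_t data) = 0;

  // Fallback for the proposed API, built on the single-byte transfer.
  // A core with a faster path would override this.
  virtual void transfer(const void *tx_buffer, void *rx_buffer, size_t length) {
    const uint8_t *tx = static_cast<const uint8_t *>(tx_buffer);
    uint8_t *rx = static_cast<uint8_t *>(rx_buffer);
    for (size_t i = 0; i < length; i++) {
      uint8_t in = transfer(static_cast<uint8_t>(tx ? tx[i] : 0xFF));
      if (rx) rx[i] = in;
    }
  }

  // Standard in-place array transfer expressed through the two-buffer form.
  void transfer(void *buf, size_t count) {
    if (buf) transfer(buf, buf, count);
  }
};

// Fake port for host testing: replies with the complement of each byte.
class InvertingSpi : public SpiBase {
 public:
  using SpiBase::transfer;  // keep the buffer overloads visible
  uint8_t transfer(uint8_t data) override {
    return static_cast<uint8_t>(~data);
  }
};
```

Because the in-place form reads tx[i] before it writes rx[i], passing the same pointer for both buffers is safe in this byte-at-a-time fallback.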
