Expand SPI API and Functionality for High Speed Devices. #189

greiman started this conversation in Ideas
I am the author of the SdFat library. Many users are disappointed with the performance of SdFat on boards using the ArduinoCore-API. This package does not support high-speed transfers well.

The only block transfer function is void transfer(void *buf, size_t count). This is a poor API for SD use. It can't be used for receive unless you first fill buf with 0xFF, since the SD card examines the sent stream for commands. It can't be used for send unless you copy the block to a temporary buffer, since buf will be overwritten.
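To make the two workarounds concrete, here is a minimal host-side sketch. MockSpi, receiveBlock, and sendBlock are hypothetical names invented for illustration; only the in-place transfer(buf, count) signature comes from the API discussed above, and the mock simply answers every byte with 0xA5.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical stand-in for a core's SPI class; only the in-place
// block transfer discussed above is modeled.
struct MockSpi {
  uint8_t lastSent[512];  // copy of the bytes clocked out in the last call
  uint8_t reply = 0xA5;   // byte the simulated device answers with

  // In-place transfer: buf is sent, then overwritten with received data.
  void transfer(void *buf, size_t count) {
    assert(count <= sizeof(lastSent));
    memcpy(lastSent, buf, count);
    memset(buf, reply, count);
  }
};

// Receive: buf must be pre-filled with 0xFF so the card does not see
// stale buffer contents as a command.
void receiveBlock(MockSpi &spi, uint8_t *buf, size_t count) {
  memset(buf, 0xFF, count);  // extra pass over the buffer
  spi.transfer(buf, count);
}

// Send: the caller's data must be copied to a scratch buffer first,
// because transfer() overwrites whatever buffer it is given.
void sendBlock(MockSpi &spi, const uint8_t *data, size_t count) {
  uint8_t tmp[512];
  assert(count <= sizeof(tmp));
  memcpy(tmp, data, count);  // extra copy and extra RAM
  spi.transfer(tmp, count);
}
```

Both helpers touch every byte one extra time before the bus is used at all, which is overhead a two-buffer API would avoid.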

Often the implementation is just a loop with uint8_t transfer(uint8_t data). This means using transfer(buf, count) is slower than using single byte transfer.

Other board packages provide a suitable API. Adafruit, SparkFun, Teensy, STM32, particle.io, and many others provide this function:

void transfer(const void *tx_buffer, void* rx_buffer, size_t length);

tx_buffer: array of Tx bytes that is filled by the user before starting the SPI transfer. If NULL, default dummy 0xFF bytes will be clocked out.

rx_buffer: array of Rx bytes that will be filled by the slave during the SPI transfer. If NULL, the received data will be discarded.

length: number of data bytes that are to be transferred.
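As an illustration of the parameter semantics, here is a small sketch. DemoSpi, readData, and writeData are hypothetical; the mock only records what would appear on MOSI and answers every byte with 0x5A, where a real driver would talk to the SPI peripheral.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical bus used only to illustrate the NULL-buffer rules.
struct DemoSpi {
  uint8_t wire[16];      // bytes observed on MOSI during the last call
  uint8_t reply = 0x5A;  // byte the simulated device returns on MISO

  void transfer(const void *tx_buffer, void *rx_buffer, size_t length) {
    assert(length <= sizeof(wire));
    const uint8_t *tx = static_cast<const uint8_t *>(tx_buffer);
    uint8_t *rx = static_cast<uint8_t *>(rx_buffer);
    for (size_t i = 0; i < length; i++) {
      wire[i] = tx ? tx[i] : 0xFF;  // NULL tx: clock out dummy 0xFF
      if (rx) rx[i] = reply;        // NULL rx: discard the received byte
    }
  }
};

// Receive only: no need to pre-fill the destination with 0xFF.
void readData(DemoSpi &spi, uint8_t *dst, size_t n) {
  spi.transfer(nullptr, dst, n);
}

// Send only: the source stays const and is never overwritten.
void writeData(DemoSpi &spi, const uint8_t *src, size_t n) {
  spi.transfer(src, nullptr, n);
}
```

With this shape, an SD read and an SD write each become a single call with no fill pass and no temporary buffer.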

Some implementations return a status, and some implement this async API for DMA:

void transfer(const void *tx_buffer, void* rx_buffer, size_t length, std::function<void(void)> myFunction);

myFunction: user-specified callback function to be called after completion of the SPI DMA transfer. It takes no arguments and returns nothing, e.g.: void myHandler()

If NULL is passed as the callback, the call is synchronous, i.e. the function returns only once the DMA transfer is complete.
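The callback rule can be modeled on a host without any DMA hardware. In this sketch AsyncSpiModel and onDmaComplete are hypothetical names, the callback type is assumed to be std::function<void(void)>, and the "DMA engine" is just a pending job that finishes when onDmaComplete() is called in place of a transfer-complete interrupt. Data is looped back from tx to rx purely for demonstration.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <functional>

// Hypothetical model of a DMA-backed bus.
class AsyncSpiModel {
 public:
  void transfer(const void *tx_buffer, void *rx_buffer, size_t length,
                std::function<void(void)> callback) {
    assert(!busy_);  // one transfer at a time in this model
    tx_ = static_cast<const uint8_t *>(tx_buffer);
    rx_ = static_cast<uint8_t *>(rx_buffer);
    length_ = length;
    callback_ = callback;
    busy_ = true;
    if (!callback_) {
      // No callback: behave synchronously and finish before returning.
      onDmaComplete();
    }
  }

  // Stands in for the transfer-complete interrupt of a real peripheral.
  void onDmaComplete() {
    if (!busy_) return;
    for (size_t i = 0; i < length_; i++) {
      if (rx_) rx_[i] = tx_ ? tx_[i] : 0xFF;  // loopback: MISO tied to MOSI
    }
    busy_ = false;
    // Clear the stored callback before invoking it, so the handler may
    // start a new transfer.
    std::function<void(void)> done = callback_;
    callback_ = nullptr;
    if (done) done();
  }

  bool busy() const { return busy_; }

 private:
  const uint8_t *tx_ = nullptr;
  uint8_t *rx_ = nullptr;
  size_t length_ = 0;
  std::function<void(void)> callback_;
  bool busy_ = false;
};
```

With a callback, transfer() returns while busy() is still true and the buffers must stay valid until the handler runs. With NULL, the buffers are ready as soon as the call returns.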

Many users are very disappointed with performance on SAMD. Here are SD performance results for the current SAMD SPI library on an MKR Zero with my bench example:

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
482.81,1623,1055,1057
482.86,1079,1055,1057

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
464.81,1108,1096,1098
464.81,1108,1096,1098

Here is the result with a SAMD driver I implemented with the above API:

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1366.49,394,368,372
1366.49,395,368,372

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
1374.38,376,366,369
1374.38,376,366,369
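For a quick comparison of the two runs, the ratio of the KB/Sec columns can be worked out directly. This is only arithmetic on the figures quoted above; the names speedup, kWriteSpeedup, and kReadSpeedup are made up for the sketch.

```cpp
// Speedup of the new driver over the current one, from the KB/Sec columns.
constexpr double speedup(double newKBps, double oldKBps) {
  return newKBps / oldKBps;
}

constexpr double kWriteSpeedup = speedup(1366.49, 482.81);  // about 2.83
constexpr double kReadSpeedup = speedup(1374.38, 464.81);   // about 2.96
```

So both directions come out a little under three times faster with the two-buffer transfer.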

Here is the function I implemented:

void SERCOM::transferDataSPI(const void *txBuf, void *rxBuf, size_t count) {
  const uint8_t *tx = reinterpret_cast<const uint8_t *>(txBuf);
  uint8_t *rx = reinterpret_cast<uint8_t *>(rxBuf);
  size_t ir = 0;  // number of bytes received so far
  size_t it = 0;  // number of bytes queued for transmit so far
  if (rx) {
    // Prime the transmitter with up to two bytes so the bus stays busy.
    while (it < 2 && it < count) {
      if (sercom->SPI.INTFLAG.bit.DRE) {
        sercom->SPI.DATA.reg = tx ? tx[it] : 0XFF;
        it++;
      }
    }
    // Steady state: for each byte received, queue the next byte to send.
    while (it < count) {
      if (sercom->SPI.INTFLAG.bit.RXC) {
        rx[ir++] = sercom->SPI.DATA.reg;
        sercom->SPI.DATA.reg = tx ? tx[it] : 0XFF;
        it++;
      }
    }
    // Drain the bytes that are still in flight.
    while (ir < count) {
      if (sercom->SPI.INTFLAG.bit.RXC) {
        rx[ir++] = sercom->SPI.DATA.reg;
      }
    }
  } else if (tx && count) {  // might hang if count == 0
    // Send only: disable the receiver so received bytes need no servicing.
    sercom->SPI.CTRLB.bit.RXEN = 0;
    while (it < count) {
      if (sercom->SPI.INTFLAG.bit.DRE) {
        sercom->SPI.DATA.reg = tx[it++];
      }
    }
    // wait till all data sent
    while (sercom->SPI.INTFLAG.bit.TXC == 0) {
    }
    // Re-enable the receiver and wait until the enable takes effect.
    sercom->SPI.CTRLB.bit.RXEN = 1;
    while (sercom->SPI.CTRLB.bit.RXEN == 0) {
    }
  }
}

I added a call to this function in SPI.h and a virtual version of the API to HardwareSPI.h.
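Roughly, the wiring between the layers could look like the following sketch. HardwareSpiSketch, SpiClassSketch, and FakeSercom are simplified, hypothetical stand-ins (the real HardwareSPI and SPI classes have many more members), and FakeSercom just loops data back so the example runs on a host.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Simplified stand-in for the HardwareSPI interface.
class HardwareSpiSketch {
 public:
  virtual ~HardwareSpiSketch() {}
  virtual uint8_t transfer(uint8_t data) = 0;
  // Proposed addition: separate transmit and receive buffers.
  virtual void transfer(const void *tx_buffer, void *rx_buffer,
                        size_t length) = 0;
};

// Hypothetical low-level driver playing the role of SERCOM (loopback).
class FakeSercom {
 public:
  size_t blockCalls = 0;  // how often the block function was used

  uint8_t transferDataSPI(uint8_t data) { return data; }

  void transferDataSPI(const void *txBuf, void *rxBuf, size_t count) {
    blockCalls++;
    const uint8_t *tx = static_cast<const uint8_t *>(txBuf);
    uint8_t *rx = static_cast<uint8_t *>(rxBuf);
    for (size_t i = 0; i < count; i++) {
      uint8_t out = tx ? tx[i] : 0xFF;
      if (rx) rx[i] = out;
    }
  }
};

// Board-level SPI class: forwards the new virtual method to the driver.
class SpiClassSketch : public HardwareSpiSketch {
 public:
  explicit SpiClassSketch(FakeSercom *s) : sercom_(s) {
    assert(s != nullptr);
  }
  uint8_t transfer(uint8_t data) override {
    return sercom_->transferDataSPI(data);
  }
  void transfer(const void *tx_buffer, void *rx_buffer,
                size_t length) override {
    sercom_->transferDataSPI(tx_buffer, rx_buffer, length);
  }

 private:
  FakeSercom *sercom_;
};
```

Because the method is virtual on the base interface, a library such as SdFat can hold a reference to the base class and still reach the fast driver path.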

Replies: 1 comment

The standard array transfer, void transfer(void* buf, size_t count), is slow for most Arduino core implementations.

When the standard array transfer is improved, it is easy to implement the above API with little extra code.

First implement void transfer(const void *tx_buffer, void* rx_buffer, size_t length). This usually takes little more code than the standard array transfer, and vendor packages often already provide this proposed API.

The standard API is then:

void className::transfer(void* buf, size_t count) {
  if (buf) {
    transfer(buf, buf, count);
  }
}

It is also easy to make sure all Arduino Core implementations support the API, maybe with lower performance, using uint8_t transfer(uint8_t data) in a loop.
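A minimal sketch of such a fallback follows. transferFallback, ByteOnlySpi, and demoInPlace are hypothetical names; the only thing assumed of the bus is a uint8_t transfer(uint8_t) member, and the demo bus answers with the bitwise complement of each byte so the result is easy to check.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Portable fallback: implements the two-buffer transfer on top of any
// object that offers uint8_t transfer(uint8_t).
template <typename Spi>
void transferFallback(Spi &spi, const void *tx_buffer, void *rx_buffer,
                      size_t length) {
  const uint8_t *tx = static_cast<const uint8_t *>(tx_buffer);
  uint8_t *rx = static_cast<uint8_t *>(rx_buffer);
  for (size_t i = 0; i < length; i++) {
    // The outgoing byte is read before the incoming one is stored, so
    // tx_buffer == rx_buffer (an in-place transfer) also works.
    uint8_t out = tx ? tx[i] : 0xFF;
    uint8_t in = spi.transfer(out);
    if (rx) rx[i] = in;
  }
}

// Hypothetical byte-only bus for demonstration.
struct ByteOnlySpi {
  size_t calls = 0;
  uint8_t transfer(uint8_t data) {
    calls++;
    return static_cast<uint8_t>(~data);
  }
};

// Usage check: an in-place transfer through the fallback.
void demoInPlace() {
  ByteOnlySpi spi;
  uint8_t buf[2] = {0x0F, 0xF0};
  transferFallback(spi, buf, buf, 2);  // tx and rx alias the same buffer
  assert(buf[0] == 0xF0 && buf[1] == 0x0F);
  assert(spi.calls == 2);
}
```

This gives every core the same API with byte-loop performance, and a core with a faster driver can replace the loop without changing callers.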
