Detecting Overflow Conditions with CANlib

November 23, 2016
Magnus Carlsson

When developing software applications that rely heavily on communication protocols, one of the key concerns is whether the application handles incoming data quickly enough that messages are not dropped. Dropped messages can occur when the application spends too much time processing a message, is paused waiting for user interaction, or is waiting on a shared system resource such as a data file. Regardless of the cause, the application developer should plan to detect these dropped messages to avoid searching for a system problem when the error was caused by the application’s behavior.

To detect this issue, CANlib provides several mechanisms for checking on the receive buffer state and determining if a CAN frame has been dropped.

canRead Flag Parameter

The first method for determining if a CAN frame has been dropped is to monitor the flag parameter returned by the canRead function. The flag parameter contains two bits which indicate if a software or hardware overflow has occurred between the last message returned by this function call and the current call. These bits are defined as canMSGERR_HW_OVERRUN for hardware overflows and canMSGERR_SW_OVERRUN for software overflows. You can also use the canMSGERR_OVERRUN mask to check both conditions at the same time. So, when the application sees one of these bits set, the application knows that a message was lost between the current CAN frame and the previous CAN frame received.
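As a rough illustration of the bit test described above, the following Python sketch classifies a flag word. The numeric values are written from memory of canlib.h and are an assumption here, as is the helper name overrun_kind; confirm the constants against the header in your CANlib SDK.

```python
# Bit values assumed to match canlib.h; verify against your SDK header.
canMSGERR_HW_OVERRUN = 0x0200
canMSGERR_SW_OVERRUN = 0x0400
canMSGERR_OVERRUN = canMSGERR_HW_OVERRUN | canMSGERR_SW_OVERRUN


def overrun_kind(flags):
    """Return which overrun bits are set in a canRead-style flag word."""
    if not flags & canMSGERR_OVERRUN:
        return "none"
    hw = bool(flags & canMSGERR_HW_OVERRUN)
    sw = bool(flags & canMSGERR_SW_OVERRUN)
    if hw and sw:
        return "hardware+software"
    return "hardware" if hw else "software"
```

In a receive loop, a result other than "none" for a frame would mean that at least one frame was lost just before it.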

To understand how this will appear to the software, imagine a receive buffer that holds 10 frames. (Of course, the default receive buffer size in CANlib is much larger than 10.) Ten CAN frames have been received by the hardware and placed in the buffer, filling it.

Another CAN frame is received by the hardware, but the receive buffer is full, so the frame is not added to the buffer.

A 12th CAN frame is received by the hardware, overwriting the frame that was not added to the buffer.

The application calls canRead removing the first received CAN frame and leaving a spot for the 12th frame to be added to the receive buffer.

CAN frames 1 through 10 will not indicate an overflow when retrieved by canRead. CAN frame 12 will indicate an overflow when retrieved from the buffer by canRead, because the 11th frame was dropped before it.
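The walkthrough above can be replayed with a small toy model. This is only a sketch of the described behavior, not CANlib's implementation: the RxBuffer class, its single "held" slot for a frame waiting on free space, and the per-frame overrun marker are all assumptions made for illustration.

```python
from collections import deque


class RxBuffer:
    """Toy model: a bounded queue plus one holding slot for a waiting frame."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.held = None      # frame waiting for a free slot (assumption)
        self.overrun = False  # set when a waiting frame is overwritten

    def receive(self, frame_id):
        if len(self.queue) < self.capacity:
            self.queue.append((frame_id, False))
            return
        if self.held is not None:
            self.overrun = True  # the previously waiting frame is lost
        self.held = frame_id

    def read(self):
        frame = self.queue.popleft()
        if self.held is not None:
            # The freed slot is taken by the waiting frame, marked if a
            # frame was lost before it.
            self.queue.append((self.held, self.overrun))
            self.held, self.overrun = None, False
        return frame


buf = RxBuffer(capacity=10)
for frame_id in range(1, 13):  # frames 1..12 arrive before any read
    buf.receive(frame_id)

results = [buf.read() for _ in range(11)]
flagged = [frame_id for frame_id, overrun in results if overrun]
print(flagged)  # [12]
```

Frame 11 never appears in the results, and only frame 12 carries the overrun marker, which matches the sequence described above.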

canReadStatus

The second method for determining if a CAN frame has been dropped is to call canReadStatus. The flags parameter returned by this function will indicate an overflow if the canSTAT_HW_OVERRUN or canSTAT_SW_OVERRUN bit is set. You can check if either of these overflow bits is set by using the canSTAT_OVERRUN mask.

This status information is updated asynchronously, meaning the values returned by canReadStatus were the last reported but are not necessarily the current status. To keep the reported data more current, call canRequestChipStatus at a periodic rate. canRequestChipStatus asks that the status information be updated, but the information is not yet current when the function exits. The status will be current a period of time after the call completes.

So let’s take the previous example where we have a full buffer and the eleventh message was received by the hardware. We are calling canRequestChipStatus twice a second and canReadStatus once a second.

At this point canReadStatus would not indicate an overflow. When the 12th CAN frame is received by the hardware, overwriting the frame that was not added to the buffer, the chip status changes to indicate an overflow.

On the next periodic canRequestChipStatus call, the process of reporting that status begins. When the process completes, the next call to canReadStatus will indicate an overflow. This means that, depending on when the 12th CAN frame arrives in the cycle of canRequestChipStatus and canReadStatus calls, the overflow may not be indicated by canReadStatus for up to a second after the event (in this example).

Once an overflow is indicated in the flags parameter of the canReadStatus call, the status will remain latched until you clear the status using the canIoCtl routine with the canIOCTL_CLEAR_ERROR_COUNTERS function. This is to prevent the application from missing a detected overflow due to race conditions between detection of the overflow and the polling of the status with the canReadStatus routine.
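The delayed and latched reporting described in this section can be pictured with a toy model. Everything here is hypothetical and only mimics the described behavior: the StatusModel class, its method names, and the explicit complete_request step that stands in for the asynchronous update. The canSTAT bit values are assumed from canlib.h and should be checked against your SDK header.

```python
# Bit values assumed to match canlib.h; verify against your SDK header.
canSTAT_HW_OVERRUN = 0x0200
canSTAT_SW_OVERRUN = 0x0400
canSTAT_OVERRUN = canSTAT_HW_OVERRUN | canSTAT_SW_OVERRUN


class StatusModel:
    """Toy model of asynchronously updated, latched status flags."""

    def __init__(self):
        self.chip_flags = 0       # what the hardware currently knows
        self.reported_flags = 0   # what a canReadStatus-style call returns
        self.request_pending = False

    def hardware_overrun(self):
        self.chip_flags |= canSTAT_HW_OVERRUN

    def request_chip_status(self):
        # Only starts the update; the reported value does not change yet.
        self.request_pending = True

    def complete_request(self):
        # Stands in for the update finishing some time after the request.
        if self.request_pending:
            self.reported_flags |= self.chip_flags  # bits are only added
            self.request_pending = False

    def read_status(self):
        return self.reported_flags

    def clear_error_counters(self):
        # Models the explicit clear; resetting both fields is an assumption.
        self.chip_flags = 0
        self.reported_flags = 0


model = StatusModel()
model.hardware_overrun()
before_request = model.read_status() & canSTAT_OVERRUN
model.request_chip_status()
right_after_request = model.read_status() & canSTAT_OVERRUN
model.complete_request()
after_update = model.read_status() & canSTAT_OVERRUN
```

In this model the overrun bit becomes visible only after the requested update completes, and it stays set on every later read until clear_error_counters is called.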

Request the Receive Buffer Level

A third method is to monitor the current depth of the receive buffer by using the canIoCtl routine with the function parameter set to canIOCTL_GET_RX_BUFFER_LEVEL. The value returned in the buffer is a count of the CAN frames currently stored in the receive buffer.

Keep in mind, time spent checking the buffer level may be better spent actually emptying the receive buffer. Checking the receive buffer level may be more useful when performing predefined block transfers where the application can wait until the entire block has been received before processing the frames.
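For the block-transfer case, the waiting logic might look like the following sketch. The wait_for_block helper and its get_level parameter are hypothetical: get_level stands for any callable that wraps the buffer-level query and returns the current frame count, and the timeout and polling interval are arbitrary example values.

```python
import time


def wait_for_block(get_level, block_size, timeout_s=1.0, poll_s=0.01):
    """Poll the buffer level until a whole block has arrived or time runs out.

    get_level is a caller-supplied function (hypothetical here) returning the
    number of frames currently in the receive buffer.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_level() >= block_size:
            return True
        time.sleep(poll_s)
    # One final check so a block completed at the deadline is not missed.
    return get_level() >= block_size
```

The block size should stay well below the receive buffer size, otherwise the wait itself can cause the overflow it is meant to avoid.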

Actions Have Consequences

You may notice that after using one of the last two methods (canReadStatus or canIOCTL_GET_RX_BUFFER_LEVEL), an overflow is indicated in the following call to canRead if the buffer is full and in an overflow state. To retrieve the data for these methods, the receive queue must be placed in a stable state so the entire queue can be examined. During this process, messages may be discarded between the driver and the application buffer due to lack of space. This discarding is indicated by an overflow status on the next CAN frame retrieved using canRead.

Conclusions

The application developer should always detect dropped messages during canRead handling to avoid searching for a system problem when the error was caused by the application’s behavior. This is your first indication that your application design may have an issue with the amount of traffic on the CAN bus. Monitoring with this method will help determine where the trouble lies when a handshake message is dropped or an expected periodic message times out. Though you can use a separate tool to confirm that the desired messages are on the bus, monitoring the overflow flags will indicate whether your application node is the culprit or at least at risk.

You could use the canRead overflow information to track the frequency of overflows. This could identify possible issues with the application when traffic bandwidth increases due to message bursts.

Use the canReadStatus method when indicating on a GUI the current overflow state or raising a warning to the user that an important message may have been dropped.

Checking the receive buffer level would be used when you are willing to stall your GUI or other processes in order to dedicate computing resources to immediately emptying the buffer once a certain size is reached – preventing the overflow from occurring. One such case would be flashing nodes.

Bug reports, contributions, suggestions for improvements, and similar things are much appreciated and can be sent by e-mail to [email protected].

Magnus Carlsson

Magnus Carlsson is a Software Developer for Kvaser AB and has developed firmware and software for Kvaser products since 2007. He has also written a number of articles for Kvaser’s Developer Blog dealing with the popular Python language.