False positive stall detection with Farmduino 1.4 and TMC2208

I downgraded from Farmduino 1.5 to 1.4 to reduce the noise level of the motors. Because the Farmduino 1.4 has changeable motor drivers, I could fit TMC2208 drivers with SilentStep mode set to 2 microsteps. With the encoders deactivated, the motors run very quietly and smoothly.

The z axis also runs perfectly with the encoder activated. However, after activating the encoders for x and y, the stall detection falsely triggers after a short time, that is, after a few millimeters of movement.

The encoder scaling is correct, as I can see in the movement tab. Most of the time the moved distance for the motor and the scaled measurement of the encoders match perfectly. Sometimes other values (very big numbers, negative numbers) show up for a small fraction of a second, but the displayed value always returns to the correct value.

I am trying to understand what exactly is happening. I observe two things: values that jump briefly in the web app (which could be a display problem only) and stall detection that triggers falsely for the x and y axes.
@RickCarlino Is there a way to gather more information on the problem, for example by connecting a notebook via USB to the Farmduino and sending G0 commands? Could there be an electrical problem (maybe wiring of the encoders) or can this be ruled out by the fact that the steps are counted and displayed correctly most of the time?

I spotted the wrong encoder value in the network transmissions to the web app (right side normal value most of the time, left side negative value displayed very briefly):
[Screenshot: NegativeEncoderValue]

It looks like the encoder steps are calculated wrongly sometimes. Could this be a bug in the firmware?

@pinae It is possible you have identified a firmware bug.

I am less involved with the firmware development these days, but I have made a note of it and forwarded this message to the rest of the team. I will let you know if they find anything. Our current priority for the firmware is to fix a longstanding issue with calibration and also to re-write the FBOS GCode handler that has been encumbered with technical debt.

Your idea of controlling the bot remotely via laptop is a good starting point. This would allow us to definitively pin down a firmware bug vs. a network (MQTT) bug. Please let us know how your experiments go. I’m happy to answer any questions you have about the firmware handler. Additionally, if you want to see how my firmware handler updates (on the Pi, not the MCU) are coming along, they are in the firmware branch of FBOS. That’s my main thing this week.

Look forward to hearing your findings. Will keep you posted about updates on our end.

Have you also set the microstepping setting (in the app) to 2, to match the configuration of the stepper drivers? In doing so, you may need to adjust the other settings such as steps per mm, encoder scaling, etc. to match everything back up.

I’m not sure why you are seeing the extremely large incorrect numbers briefly flash for raw/scaled encoder values. It kind of looks like an overflow issue, but overflow issues usually don’t resolve themselves perfectly. There is possibly a bug in the firmware related to these drivers or the microstepping, as this is not a use case we test against. Let us know if you find any more information, thanks!

I initially set the Microstepping to 2 and adjusted all the values. See my screenshot in this thread: Stepper driver change for no-noise bot - #56 by pinae
I also tested the suggestions of @Ascend (Stepper driver change for no-noise bot - #57 by Ascend) but 2.5 Steps/mm are too low and the distances are wrong. During these tests I also tried Microstepping 1 and doubling the Steps/mm to 10, which works exactly the same as the initial setting. So I currently know two sets of settings which lead to correct movement distances and matching encoder values.

Today I connected a notebook via USB to the Farmduino 1.4. I tried Pronterface for the connection (baudrate 115200) and received a lot of messages from the Farmduino. I was not able to send commands because Pronterface did not think it was connected to a 3D printer and refused to send them. Which serial terminal is suitable for communicating with the Farmduino (I’m using Linux)?

Interestingly, I could already see falsely scaled encoder positions in the serial output. Here is an excerpt of the messages sent by the Farmduino:

R82 X0.00 Y0.00 Z0.00 Q0
R84 X0.20 Y-0.40 Z-1.92 Q0
R85 X12 Y-21 Z-347 Q0
R81 XA1 XB1 YA1 YB1 ZA1 ZB1 Q0
R88 Q0
R00 Q0
R82 X0.00 Y0.00 Z0.00 Q0
R84 X0.20 Y-0.40 Z-1.92 Q0
R85 X12 Y-21 Z-347 Q0
R81 XA1 XB1 YA1 YB1 ZA1 ZB1 Q0
R88 Q0
R00 Q0
R82 X0.00 Y0.00 Z0.00 Q0
R84 X-59657096.00 Y-0.40 Z-1.92 Q0
R85 X-2147483636 Y-21 Z-347 Q0
R81 XA1 XB1 YA1 YB1 ZA1 ZB1 Q0
R88 Q0
R00 Q0
R82 X0.00 Y0.00 Z0.00 Q0
R84 X0.20 Y-0.40 Z-1.92 Q0
R85 X12 Y-21 Z-347 Q0
R81 XA1 XB1 YA1 YB1 ZA1 ZB1 Q0
R88 Q0
R00 Q0
R82 X0.00 Y0.00 Z0.00 Q0
R84 X-59657096.00 Y-0.40 Z-1.92 Q0
R85 X-2147483636 Y-21 Z-347 Q0
R81 XA1 XB1 YA1 YB1 ZA1 ZB1 Q0
R88 Q0
R00 Q0
R82 X0.00 Y0.00 Z0.00 Q0
R84 X0.20 Y-0.40 Z-1.92 Q0
R85 X12 Y-21 Z-347 Q0
R81 XA1 XB1 YA1 YB1 ZA1 ZB1 Q0
R88 Q0
R00 Q0
R82 X0.00 Y0.00 Z0.00 Q0
R84 X-59657096.00 Y-0.40 Z-1.92 Q0
R85 X-2147483636 Y-21 Z-347 Q0
R81 XA1 XB1 YA1 YB1 ZA1 ZB1 Q0
R88 Q0
R00 Q0
R82 X0.00 Y0.00 Z0.00 Q0
R84 X0.20 Y-0.40 Z-1.92 Q0
R85 X12 Y-21 Z-347 Q0
R81 XA1 XB1 YA1 YB1 ZA1 ZB1 Q0
R88 Q0
R00 Q0
R82 X0.00 Y0.00 Z0.00 Q0
R84 X0.20 Y-0.40 Z-1.92 Q0
R85 X12 Y-21 Z-347 Q0
R81 XA1 XB1 YA1 YB1 ZA1 ZB1 Q0
R88 Q0
R00 Q0

This proves that the problem is not in FBOS but in the Farmduino firmware. Any ideas for further debugging?
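One way to dig further is to scan a captured serial log for implausible encoder reports. The following is a minimal sketch, not part of any FarmBot tooling. It assumes a plain-text log with one report per line, treats `R85` as the raw encoder report as in the excerpt above, and uses an arbitrary plausibility threshold (`MAX_PLAUSIBLE`, a made-up constant). For each bad value it also prints the offset from the most negative 32-bit integer.

```python
import re

# Arbitrary cutoff for this sketch: raw counts beyond this are treated as glitches.
MAX_PLAUSIBLE = 10_000_000

# Matches axis fields such as X12 or Y-21; Q0 and R85 are not matched.
FIELD = re.compile(r"\b([XYZ])(-?\d+)\b")

def find_glitches(lines):
    """Return (line_number, axis, value) for implausible R85 raw encoder values."""
    glitches = []
    for number, line in enumerate(lines, start=1):
        if "R85 " not in line:
            continue
        for axis, text in FIELD.findall(line):
            value = int(text)
            if abs(value) > MAX_PLAUSIBLE:
                glitches.append((number, axis, value))
    return glitches

# Four lines copied from the excerpt above.
sample = [
    "R84 X0.20 Y-0.40 Z-1.92 Q0",
    "R85 X12 Y-21 Z-347 Q0",
    "R84 X-59657096.00 Y-0.40 Z-1.92 Q0",
    "R85 X-2147483636 Y-21 Z-347 Q0",
]

for number, axis, value in find_glitches(sample):
    # value + 2**31 is the distance from the minimum signed 32-bit integer.
    print(number, axis, value, value + 2**31)
```

For the excerpt above, the bad raw value -2147483636 is exactly 2^31 below the good value 12. In a signed 32-bit integer that is the same low bits with only the top bit set. This would be consistent with a single corrupted bit somewhere between the encoder counter and the report, but that is only a guess at this point.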

Have you noticed if the very large encoder values occur on a regular basis, for example every few seconds, or always when the bot reaches a certain coordinate position?

The Arduino IDE provides an easy way to send GCODE commands via Tools > Serial Monitor.

They occur regularly even if no movement is happening. Maybe the wrong values occur more frequently when the bot is moving, but that is hard to see in the web app.

Okay, thank you for the additional information.

Our firmware developer is currently working on ironing out a find-axis-length bug with the Genesis firmware. Once that bug is resolved, the Genesis firmware will spring forward several versions (from 6.5.11 to 6.5.32+) to catch up with the Express firmware, which may actually resolve your encoder issue.

If the encoder issue is not resolved though, our firmware developer’s next task is enabling StealthChop for the v1.5+ boards, at which point you may be able to switch back to your v1.5 board and have the quiet operation.

Thank you for your patience! You will be hearing from us in the #announcements channel as soon as these updates go live.

I tried sending G-code with the Arduino IDE. The Serial Monitor displays the messages from the Farmduino and there is no error message if I try to send a command. But there is also no movement.

I sent F22 P2 V1 Q0 and after that G00 X100 Y300 Z-5 Q0 but nothing moved. I also tried F11, F12, F13 and G28 - nothing. Am I missing something here?

Another observation that might be helpful for debugging: The negative encoder values seem to show up more frequently if the axis position is near 0. If I move the bot to X2600 Y1160 the negative values still show up, but with gaps of two to three seconds between them. The error also seems to be more frequent on the Y axis than on the X axis.
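To put numbers on "how often" and "which axis", a captured log with timestamps would help. Here is a minimal sketch, assuming the log was copied from the Arduino IDE Serial Monitor with timestamps enabled (lines of the form `HH:MM:SS.mmm -> R85 ...`) and using an arbitrary threshold for "implausible". The sample lines are made up for illustration and are not real measurements.

```python
import re
from collections import defaultdict
from datetime import datetime

LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) -> R85 (.*)$")
FIELD = re.compile(r"\b([XYZ])(-?\d+)\b")
MAX_PLAUSIBLE = 10_000_000  # arbitrary cutoff chosen for this sketch

def glitch_times(lines):
    """Map each axis to the timestamps at which its raw value was implausible."""
    times = defaultdict(list)
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        for axis, text in FIELD.findall(match.group(2)):
            if abs(int(text)) > MAX_PLAUSIBLE:
                times[axis].append(stamp)
    return times

def gaps_in_seconds(stamps):
    """Seconds between consecutive glitches on one axis."""
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]

# Made-up sample lines, only to show the expected input format.
sample = [
    "16:30:00.000 -> R85 X-2147483636 Y-21 Z-347 Q0",
    "16:30:00.500 -> R85 X12 Y-21 Z-347 Q0",
    "16:30:02.500 -> R85 X-2147483636 Y-21 Z-347 Q0",
    "16:30:03.000 -> R85 X12 Y-2147483627 Z-347 Q0",
]

for axis, stamps in sorted(glitch_times(sample).items()):
    print(axis, len(stamps), gaps_in_seconds(stamps))
```

Running this once with the bot parked near 0 and once at a far position would give per-axis counts and gaps that can be compared directly. The sketch does not handle a log that runs past midnight.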

@pinae First off, thank you so much for your help in isolating this bug. You’ve been a big help to us as we search for a root cause.

With regard to your questions about the serial monitor: perhaps there is a line endings issue? Here are the settings I use locally, and they seem to work fine:

[Screenshot: Arduino IDE Serial Monitor settings]

Please let me know if the settings shown in this screenshot (“Both NL & CR” + 115200 baud) do not work for you.
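For scripted tests outside the Arduino IDE, the same two settings matter: 115200 baud and a carriage return plus newline after each command. Below is a minimal sketch of the framing step, written against any binary stream so it can be tried without hardware. `frame_command` and `send_command` are hypothetical helper names, and the byte sequence CR then LF is an assumption about what "Both NL & CR" sends. With the third-party pyserial package, a port opened at 115200 baud could be passed as `stream` instead of the in-memory buffer.

```python
import io

BAUD_RATE = 115200      # from the settings above; needed when opening a real port
LINE_ENDING = b"\r\n"   # assumed byte sequence for "Both NL & CR"

def frame_command(command: str) -> bytes:
    """Encode one command as ASCII and append the line ending exactly once."""
    return command.strip().encode("ascii") + LINE_ENDING

def send_command(stream, command: str) -> int:
    """Write a framed command to a binary stream and return the bytes written."""
    data = frame_command(command)
    stream.write(data)
    return len(data)

# Dry run against an in-memory buffer instead of a real serial port.
buffer = io.BytesIO()
send_command(buffer, "F22 P2 V1 Q0")
send_command(buffer, "G00 X100 Y300 Z-5 Q0\n")
print(buffer.getvalue())
```

Stripping the command before appending the line ending avoids sending a doubled newline when a command string already ends with one.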

I managed to get the FarmBot moving with commands over USB. As I understand it, the Farmduino firmware starts with a default configuration, which was not overwritten by FBOS in my test. The output for the configuration was this:

R08 *F20
*
16:26:18.816 -> R01 Q0
16:26:18.816 -> R21 P0 V1 Q0
16:26:18.816 -> R21 P2 V1 Q0
16:26:18.816 -> R21 P3 V1 Q0
16:26:18.816 -> R21 P4 V0 Q0
16:26:18.816 -> R21 P5 V3 Q0
16:26:18.816 -> R21 P11 V120 Q0
16:26:18.816 -> R21 P12 V120 Q0
16:26:18.816 -> R21 P13 V120 Q0
16:26:18.816 -> R21 P15 V0 Q0
16:26:18.816 -> R21 P16 V0 Q0
16:26:18.816 -> R21 P17 V1 Q0
16:26:18.816 -> R21 P18 V0 Q0
16:26:18.816 -> R21 P19 V0 Q0
16:26:18.816 -> R21 P20 V0 Q0
16:26:18.850 -> R21 P21 V0 Q0
16:26:18.850 -> R21 P22 V0 Q0
16:26:18.850 -> R21 P23 V0 Q0
16:26:18.850 -> R21 P25 V0 Q0
16:26:18.850 -> R21 P26 V0 Q0
16:26:18.850 -> R21 P27 V0 Q0
16:26:18.850 -> R21 P31 V0 Q0
16:26:18.850 -> R21 P32 V0 Q0
16:26:18.850 -> R21 P33 V0 Q0
16:26:18.850 -> R21 P36 V1 Q0
16:26:18.850 -> R21 P37 V1 Q0
16:26:18.850 -> R21 P41 V300 Q0
16:26:18.850 -> R21 P42 V300 Q0
16:26:18.850 -> R21 P43 V300 Q0
16:26:18.850 -> R21 P45 V0 Q0
16:26:18.850 -> R21 P46 V0 Q0
16:26:18.850 -> R21 P47 V0 Q0
16:26:18.850 -> R21 P51 V0 Q0
16:26:18.850 -> R21 P52 V0 Q0
16:26:18.850 -> R21 P53 V1 Q0
16:26:18.850 -> R21 P55 V5 Q0
16:26:18.850 -> R21 P56 V5 Q0
16:26:18.850 -> R21 P57 V25 Q0
16:26:18.850 -> R21 P61 V50 Q0
16:26:18.850 -> R21 P62 V50 Q0
16:26:18.882 -> R21 P63 V50 Q0
16:26:18.882 -> R21 P65 V50 Q0
16:26:18.882 -> R21 P66 V50 Q0
16:26:18.882 -> R21 P67 V50 Q0
16:26:18.882 -> R21 P71 V400 Q0
16:26:18.882 -> R21 P72 V400 Q0
16:26:18.882 -> R21 P73 V400 Q0
16:26:18.882 -> R21 P75 V0 Q0
16:26:18.882 -> R21 P76 V0 Q0
16:26:18.882 -> R21 P77 V0 Q0
16:26:18.882 -> R21 P101 V0 Q0
16:26:18.882 -> R21 P102 V0 Q0
16:26:18.882 -> R21 P103 V0 Q0
16:26:18.882 -> R21 P105 V0 Q0
16:26:18.882 -> R21 P106 V0 Q0
16:26:18.882 -> R21 P107 V0 Q0
16:26:18.882 -> R21 P111 V5 Q0
16:26:18.882 -> R21 P112 V5 Q0
16:26:18.882 -> R21 P113 V5 Q0
16:26:18.882 -> R21 P115 V5556 Q0
16:26:18.882 -> R21 P116 V5556 Q0
16:26:18.882 -> R21 P117 V5556 Q0
16:26:18.882 -> R21 P121 V5 Q0
16:26:18.882 -> R21 P122 V5 Q0
16:26:18.882 -> R21 P123 V5 Q0
16:26:18.882 -> R21 P125 V0 Q0
16:26:18.916 -> R21 P126 V0 Q0
16:26:18.916 -> R21 P127 V0 Q0
16:26:18.916 -> R21 P131 V0 Q0
16:26:18.916 -> R21 P132 V0 Q0
16:26:18.916 -> R21 P133 V0 Q0
16:26:18.916 -> R21 P141 V0 Q0
16:26:18.916 -> R21 P142 V0 Q0
16:26:18.916 -> R21 P143 V0 Q0
16:26:18.916 -> R21 P145 V0 Q0
16:26:18.916 -> R21 P146 V0 Q0
16:26:18.916 -> R21 P147 V0 Q0
16:26:18.916 -> R21 P201 V0 Q0
16:26:18.916 -> R21 P202 V60 Q0
16:26:18.916 -> R21 P203 V1 Q0
16:26:18.916 -> R21 P205 V0 Q0
16:26:18.916 -> R21 P206 V60 Q0
16:26:18.916 -> R21 P207 V1 Q0
16:26:18.916 -> R21 P211 V0 Q0
16:26:18.916 -> R21 P212 V60 Q0
16:26:18.916 -> R21 P213 V1 Q0
16:26:18.916 -> R21 P215 V0 Q0
16:26:18.949 -> R21 P216 V60 Q0
16:26:18.949 -> R21 P217 V1 Q0
16:26:18.949 -> R21 P221 V0 Q0
16:26:18.949 -> R21 P222 V60 Q0
16:26:18.949 -> R21 P223 V1 Q0
16:26:18.949 -> R20 Q0
16:26:18.949 -> R02 Q0
R00 Q0
16:26:19.316 -> R82 X200.00 Y500.00 Z0.00 Q0
16:26:19.316 -> R84 X99.40 Y-247.80 Z0.04 Q0
16:26:19.316 -> R85 X3585 Y-8923 Z8 Q0
16:26:19.316 -> R81 XA1 XB1 YA1 YB1 ZA1 ZB1 Q0
R00 Q0
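
A dump like this is easier to compare against another configuration once it is turned into a table. Here is a minimal sketch, assuming each parameter report has the form `R21 P<number> V<value> Q<n>` as in the output above, optionally preceded by a Serial Monitor timestamp. The function names are hypothetical, and the second set of values in the example is made up.

```python
import re

REPORT = re.compile(r"R21 P(\d+) V(-?\d+)")

def parse_parameters(lines):
    """Collect {parameter_number: value} from R21 report lines, ignoring everything else."""
    params = {}
    for line in lines:
        match = REPORT.search(line)
        if match:
            params[int(match.group(1))] = int(match.group(2))
    return params

def diff_parameters(old, new):
    """Return {parameter: (old_value, new_value)} for parameters that differ."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}

# A few lines taken from the dump above.
dump = [
    "16:26:18.816 -> R21 P2 V1 Q0",
    "16:26:18.850 -> R21 P41 V300 Q0",
    "16:26:18.882 -> R21 P71 V400 Q0",
    "16:26:18.949 -> R20 Q0",
]
defaults = parse_parameters(dump)
print(defaults)

# Made-up second configuration, standing in for a dump taken after FBOS has configured the board.
print(diff_parameters(defaults, {2: 1, 41: 300, 71: 800}))
```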

I think the move command only works if A, B and C are specified. The default values seem to be wrong so the movement distance was too short and the speed was very slow. But it moved and here is the quite long output of my whole test: FarmduinoMoved.txt (83.3 KB)

Please let me know if I can help with another test.

@pinae

This sounds correct. FBOS always includes A,B,C values and I am unsure how the firmware will behave without them.
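
A small helper makes it harder to forget those fields when testing by hand. This is a sketch under assumptions: A, B and C are taken here to be the per-axis speeds for X, Y and Z, which is not confirmed in this thread. `build_move` is a hypothetical name, and the coordinates and speeds are placeholders, not recommended settings.

```python
def build_move(x, y, z, speed_x, speed_y, speed_z, queue=0):
    """Format a G00 move that always carries the A, B and C fields."""
    return (
        f"G00 X{x:.2f} Y{y:.2f} Z{z:.2f} "
        f"A{speed_x:.2f} B{speed_y:.2f} C{speed_z:.2f} Q{queue}"
    )

# Placeholder target and speeds for illustration only.
print(build_move(100, 300, -5, speed_x=400, speed_y=400, speed_z=400))
```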

Did you notice any issues aside from the initial bug report or the unspecified behavior when A/B/C are missing? I am doing an overhaul of the firmware handler on the FBOS side this month but can take a deeper look thereafter. Please let me know if you have any other questions between now and then.

Because I could not activate the encoders, I fitted limit switches for the X and Y axis today. I used these waterproof switches from Amazon and designed and 3D-printed parts for attaching them to the frames. If anybody is interested, the designs are in this copy of the FarmBot Genesis in Onshape: https://cad.onshape.com/documents/0c7a2ad4d38ff17d7fad60fa/w/a4eeb42e08bde05944e14301/e/c4938dd264d00089aafef39a

The Y axis is able to home and moves exactly as expected. The X axis is able to home but jumps back if I start a movement with the X-switch enabled. If I deactivate the X-switch it moves without a problem. I disconnected the encoder cable because I guessed that the encoder bug influences this. That’s not the case. After disconnecting both encoder cables the position tab simply displays 0 for the encoder position. “Find length” seems to not work with limit switches.

During these tests I made an interesting observation: Even with the encoders for both X motors disconnected I get weird negative values for the X position reported from the encoders. This basically proves that the encoder bug is not related to wiring.

In search of possible reasons for my problems I looked into the code for the Farmduino. As I understand it, the encoders are read out via SPI, which is a serial protocol. So I guess it could be possible that I have a wiring-related issue with the serial connection. The funny thing is that I have had these problems only since switching from Farmduino 1.5 to 1.4. Did the wiring of the encoders change between these versions in a way that could explain a serial connection error?

Completely unrelated to the encoders, I experienced problems with my newly installed endstops: The X axis started to move but jumped back to 0 in the interface, which led to a completely wrong position. I used unshielded wires to connect normally open limit switches and I did not add any capacitors. Could this jumping behavior be caused by a falsely triggered endstop signal? For CNC mills I read that induced currents may cause false triggers.

The wiring of the encoders did not change between v1.4 and v1.5. I suppose there could be a manufacturing defect in your v1.4 board that is causing errors, though I wouldn’t know how to diagnose that and haven’t seen that before.

Because none of our kits have ever shipped with limit switches, this does not surprise me. The limit switch functionality is not a use case we test against and is really just leftover functionality from very early prototypes. Unfortunately we have very limited firmware development resources currently to look into fixing that, but thank you for bringing it to our attention. If you are savvy with Arduino code, PRs are welcome!

This makes me think there is some errant signal making its way to the STM32 co-processor (which is responsible for the encoder tracking). The firmware installed on this processor could also possibly be corrupted? Here is the source for that firmware: GitHub - FarmBot-Labs/encoder-tracker: Tracks multiple quadrature encoder signals.

Yes, it could! My guess is that the limit switch input needs a pull to prevent unwanted triggering. Again, sorry that the limit switch functionality is not fully tested against and your mileage with using that functionality may vary.

I checked the code for the firmware in the STM32 and I could not find problematic code. As I understand it, the STM32 firmware uses the timer modules in the STM32 to count the encoder pulses in hardware. The timer modules in the STM32 are designed for this (see page 12 of this presentation). The firmware basically takes this value, switches the endianness and transfers it via SPI to the Atmel running the Farmduino firmware.
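
To illustrate what such a byte swap does to a 32-bit count, here is a minimal sketch. It is not the encoder-tracker code (which runs on the STM32 and is not written in Python). It only assumes a signed 32-bit counter, and shows that swapping the byte order twice returns the original value while a single swap gives a very different number.

```python
import struct

def swap_endianness(value: int) -> int:
    """Reinterpret a signed 32-bit integer with its four bytes reversed."""
    return struct.unpack(">i", struct.pack("<i", value))[0]

count = 12  # the good raw X value from the serial log
swapped = swap_endianness(count)

# Prints the original, the byte-swapped value, and the value after swapping back.
print(count, swapped, swap_endianness(swapped))
```

A missing or doubled swap of 12 would give 201326592, not the -2147483636 seen in the serial log, so a plain byte-order mix-up does not by itself reproduce the reported values.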

I also checked whether the SPI communication on the Farmduino side is somehow wrong, but that looks perfectly OK to me as well. There is only one thing that looks weird to me (which doesn’t need to be a bug): In line 77 of MovementEncoder.cpp the function MovementEncoder::setPosition(long newPosition) allows for an input other than 0, which might get used in Movement.cpp in Movement::setPositionX(), etc. If the passed value is something other than 0, the encoder is not updated. Maybe it’s just not documented why this is the desired behavior, but at the moment I don’t see why the if-statement in line 84 is limiting the functionality to 0.

Manufacturing defects in PCBs are relatively rare so I assume for now that the board is OK.

When my FarmBot moves, the portal and the z axis sometimes rock. This apparently increases friction a lot, and the axis sometimes stalls. The bot restarts movement after the stall, so I guess some kind of stall detection is going on in the TMC2208. I assume that the Farmduino 1.4 does not configure the drivers via UART, so the stall detection is probably at default values. It’s hard to pinpoint whether the motors are losing steps in these moments. With the encoders deactivated my bot is losing a lot of steps, which leads to positioning errors of several cm after watering 120 plants.

I am planning to redo the wiring for the limit switches. They do not work reliably at the moment. I am planning to use shielded wires, small capacitors and photocouplers. I will let you know if the new wiring makes them more reliable.

I updated FBOS to the beta of version 14. Since then the issue seems fixed. I did not see negative encoder values in the serial output of the Farmduino, and the Web App does not show false values either. Thank you very much for the quick bugfix!
