Genesis Z axis stall detection fails --> and a workaround

Hi all,

I had a good couple of weeks with my FarmBot and then, with no warning, I tried to home and the Z axis failed its stall detection.

I went through the normal procedures of checking cables and confirming that I had encoder data, and everything looked really good. The problem was independent of homing speed, and a reboot didn’t solve it.

Of course I went back over the encoder troubleshooting page in the FarmBot Genesis hardware documentation. Under potential solutions, item 4 caught my eye:

Reflash the firmware by factory resetting and configuring FarmBot OS again. (Alternatively, select a different FIRMWARE in the Device widget and then select the desired firmware again. You should see logs indicating that the firmware has been flashed.)

Thinking it a long shot, I reflashed my firmware and was able to home again. This started happening just about every day. Each time I would make sure that the problem was reproducible (i.e. it wasn’t going to just fail once and then work again). Each time, a reflash solved the problem.

The next time I encountered the problem, I tried a few other things, like enabling the use of encoders for positioning. Disabling the Z encoder and re-enabling it seemed to do the trick. Now, as the first step of my morning routine, before I home, I use asserts to disable and re-enable the encoder. I’ve gone about a week now without a failure:

image

I’d love to hear from anyone else who has encountered this, or who either:

  1. Wants to confirm that this makes sense as a workaround
    or
  2. Instead wants to make fun of me and suggest I also shake a chicken bone over the FarmBot because it sounds ridiculous.

Jack

Have you applied grease?

Can you post the message that said so? I’m curious :)
Oh, and also the bot type, h/w version, and the FBOS version that you’re flashing.

I’m also curious, since I had a similar issue today. When I rebooted the bot, the Z axis simply didn’t detect the stall and kept trying to home until it timed out. I manually tried to home the axis several times until I noticed that the ENABLE ENCODER toggle for the Z axis in the hardware settings was greyed out. So I checked the logs, and there were some “Invalid parameter ID” entries in the debug section. Setting the toggle to off and on again fixed the homing issue. Sadly, it has not happened again yet.

@Ascend @jrwaters I will take a look. It is very strange to see this issue happening so suddenly, since we haven’t done an FBOS release recently.

Am I correct in assuming that the issues started on Monday, 13 Jul? If that’s the case, it may be API-related, since we did just do a production deploy yesterday.

@aronrubin I have not yet applied grease. The axis moves smoothly and the problem seems specific to software, i.e. after I apply the workaround it’s solid. However, when all is said and done, I will apply grease, because I know little things like that can have a subtle impact.

And also @jsimmonds, I will post the message the next time it happens. I’m using a Genesis with latest 10.2 and have been since 10.2 came out.

Jack

@RickCarlino, the first time it happened was about two weeks ago. I don’t remember the date, but it was at night. I assumed a flaky encoder connection and turned things off until the weekend. After the first reflash, things worked for a few days. As of this past weekend it seemed to start happening more frequently (about once per day, discovered in the morning when it tried to home). I’ve had the workaround in place since Friday or Saturday.

@Ascend @jrwaters Would you be willing to try the beta release of 10.1.3 to see if the issue resolves itself? We’ve added some data-integrity fixes related to firmware settings management. After an internal discussion with Gabe, we’re suspicious (but not certain) that this could be related. Additionally, since this is a beta release, I would add the caveat that we have not finished QA on this version. If you would prefer to wait for the official release, that’s understandable.

@RickCarlino I don’t mind trying it at all. I love beta testing things. Just tell me how to get it.

Best
Jack

@jrwaters Our beta testing went better than expected so we should have a release ready in less than an hour. Sorry for the confusion.

@jrwaters Sorry for the delay. 10.1.3 is now ready for public use. It has already fixed at least one customer’s firmware-related issues. Please let me know if this version helps for your setup as well.

No worries. I have just removed my workaround. What I’ll do is see if I can reproduce the issue and get any logs/messages and then apply 10.1.3 and see what it looks like.

My FarmBot auto-upgraded, so I went ahead with 10.1.3 and no workaround (i.e. I’m not resetting the encoders). So far so good! All homing actions have completed successfully.

If I hit a week and have not seen an issue then I’m going to declare success even if I have to model it out with a binomial distribution.
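
As a rough illustration of that binomial reasoning: if the bug were still present and each day’s homing failed independently with probability p, the chance of seeing zero failures in n days would be (1 - p)^n. The sketch below is Python, and the failure rates in it are illustrative guesses, not measurements; the thread only says failures had been showing up about once a day. The function names are made up for this example.

```python
from math import comb


def prob_k_failures(n_days: int, k: int, p_fail: float) -> float:
    """Binomial probability of exactly k failed homings in n_days,
    assuming one independent homing attempt per day."""
    return comb(n_days, k) * p_fail**k * (1 - p_fail) ** (n_days - k)


def prob_clean_run(n_days: int, p_fail: float) -> float:
    """Probability of zero failures in n_days (the k = 0 binomial term)."""
    return prob_k_failures(n_days, 0, p_fail)


if __name__ == "__main__":
    # Hypothetical per-day failure rates if the bug were NOT fixed.
    for p in (0.9, 0.5, 0.2):
        print(f"p_fail={p:.1f}: P(7 clean days) = {prob_clean_run(7, p):.2e}")
```

For example, with a per-day failure probability of 0.5, seven clean days in a row would happen by chance with probability 0.5^7, which is under 1%. With a rarer failure (0.2 per day) a clean week is much weaker evidence, since 0.8^7 is about 0.21.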

That’s great to hear, @jrwaters. Thanks for the update.

On Saturday something odd started happening. Not sure if it is related to this or not. Sporadically, my FarmBot just reboots. The logs are below. I need to make some more observations. There are a couple of questions I want to answer, like whether it happens with any tool or whether the soil sensor is involved. Also, I want to see if it happens during movement other than in the Z axis. It should be easy to characterize, as this is happening frequently.

Confirmed that it happens with no tools mounted and no Z movement. Just moving around at a fixed height and taking pictures is enough. I have logging turned all the way up, but there is nothing interesting. Just more of the same:

Monitoring afterwards, it looks like my FarmBot is rebooting every 10 minutes, even when doing nothing. :(

I can’t see your bot the way @RickCarlino can, but in the meantime it would be useful to see all the app log records. Can you SSH into the bot over your local LAN and run RingLogger.attach at the IEx prompt?
Show us what’s logged around the stop/reboot.

@jrwaters I think this might be a really easy fix:

I have turned your “AUTOMATIC FACTORY RESET” setting to “OFF”.
For some reason, it was set to “ON”. Did you set it this way, or could this be a bug that needs to be investigated?

image

Please let me know if the problem goes away. I have been considering removing this feature from the app because it does seem to cause confusion among users. Also interested to know if this feature enabled itself without user intervention, as this would be a bug that needs to be fixed.

If the problem returns even after turning this setting off, I can continue investigating. Pretty sure this was the cause of the reboots, though. Please let me know.

Thanks @jsimmonds - I took a break and grabbed the logs. Odd. Sitting there with no activity and then my SSH connection disconnects as the FarmBot reboots.

Jack

Interactive Elixir (1.9.0) - press Ctrl+C to exit (type h() ENTER for help)
iex(farmbot@farmbot-00000000c6b8bd5b.local)1> RingLogger.attach
:ok
iex(farmbot@farmbot-00000000c6b8bd5b.local)2>
15:59:16.330 [info] Child {:server, :ssh_server_channel_sup, {0, 0, 0, 0}, 22} of Supervisor #PID<0.8967.0> (:ssh_subsystem_sup) started
Pid: #PID<0.8968.0>
Start Call: :ssh_server_channel_sup.start_link(%{pref_public_key_algs: [:“ecdsa-sha2-nistp384”, :“ecdsa-sha2-nistp521”, :“ecdsa-sha2-nistp256”, :“ssh-ed25519”, :“ssh-ed448”, :“ssh-rsa”, :“rsa-sha2-256”, :“rsa-sha2-512”, :“ssh-dss”], socket_options: [], auth_method_kb_interactive_data: :undefined, max_random_length_padding: 15, negotiation_timeout: 120000, minimal_remote_max_packet_size: 0, connectfun: #Function<42.74051636/3 in :ssh_options.default/1>, idle_time: :infinity, user_dir_fun: :undefined, transport: {:tcp, :gen_tcp, :tcp_closed}, vsn: {2, 0}, ssh_cli: :undefined, internal_options: %{lsocket: {#Port<0.113>, #PID<0.2317.0>}}, max_sessions: :infinity, tstflg: [], key_cb: {Nerves.Firmware.SSH.Keys, [authorized_keys: [{{:RSAPublicKey, REDACTED MY KEY’]}]]}, send_ext_info: true, preferred_algorithms: [kex: [:“ecdh-sha2-nistp384”, :“ecdh-sha2-nistp521”, :“ecdh-sha2-nistp256”, :“diffie-hellman-group-exchange-sha256”, :“diffie-hellman-group16-sha512”, :“diffie-hellman-group18-sha512”, :“diffie-hellman-group14-sha256”, :“curve25519-sha256”, :"curve25519-sha256@libssh.org", :“curve448-sha512”, :“diffie-hellman-group14-sha1”, :“diffie-hellman-group-exchange-sha1”], public_key: [:“ecdsa-sha2-nistp384”, :“ecdsa-sha2-nistp521”, :“ecdsa-sha2-nistp256”, :“ssh-ed25519”, :“ssh-ed448”, :“ssh-rsa”, :“rsa-sha2-256”, :“rsa-sha2-512”, :“ssh-dss”], cipher: [client2server: [:“chacha20-poly1305@openssh.com”, :"aes256-gcm@openssh.com", :“aes256-ctr”, :“aes192-ctr”, :"aes128-gcm@openssh.com", :“aes128-ctr”, :“aes256-cbc”, :“aes192-cbc”, :“aes128-cbc”, :“3des-cbc”], server2client: [:“chacha20-poly1305@openssh.com”, :"aes256-gcm@openssh.com", :“aes256-ctr”, :“aes192-ctr”, :"aes128-gcm@openssh.com", :“aes128-ctr”, :“aes256-cbc”, :“aes192-cbc”, :“aes128-cbc”, :“3des-cbc”]], mac: [client2server: [:“hmac-sha2-256”, :“hmac-sha2-512”, :“hmac-sha1”], server2client: [:“hmac-sha2-256”, :“hmac-sha2-512”, :“hmac-sha1”]], compression: [client2server: [:none, :"zlib@openssh.com", :zlib], 
server2client: [:none, :"zlib@openssh.com", :zlib]]], id_string: {:random, 2, 5}, auth_methods: ‘publickey,keyboard-interactive,password’, failfun: #Function<40.74051636/3 in :ssh_options.default/1>, dh_gex_limits: {0, :infinity}, infofun: #Function<44.74051636/3 in :ssh_options.default/1>, system_dir: ‘/srv/erlang/lib/nerves_firmware_ssh-0.4.4/priv’, password: :undefined, user_dir: false, unexpectedfun: #Function<15.74051636/2 in :ssh_options.default/1>, user_passwords: [], dh_gex_groups: :undefined, subsystems: [{‘sftp’, {:ssh_sftpd, []}}], profile: :default, user_options: [id_string: :random, key_cb: {Nerves.Firmware.SSH.Keys, [authorized_keys: [{{:RSAPublicKey, REDACTED MY KEY’]}]]}, system_dir: ‘/srv/erlang/lib/nerves_firmware_ssh-0.4.4/priv’, shell: {IEx, :start, []}], modify_algorithms: :undefined, parallel_login: false, pwdfun: :undefined, max_channels: :infinity, ssh_msg_debug_fun: #Function<17.74051636/4 in :ssh_options.default/1>, exec: :undefined, rekey_limit: {3600000, 1024000000}, recv_ext_info: true, shell: {IEx, :start, []}, disconnectfun: #Function<13.74051636/1 in :ssh_options.default/1>})
Restart: :temporary
Shutdown: :infinity
Type: :supervisor
iex(farmbot@farmbot-00000000c6b8bd5b.local)2>
15:59:16.337 [info] Child {:server, :ssh_connection_sup, {0, 0, 0, 0}, 22} of Supervisor #PID<0.8967.0> (:ssh_subsystem_sup) started
Pid: #PID<0.8969.0>
Start Call: :ssh_connection_sup.start_link(%{pref_public_key_algs: [:“ecdsa-sha2-nistp384”, :“ecdsa-sha2-nistp521”, :“ecdsa-sha2-nistp256”, :“ssh-ed25519”, :“ssh-ed448”, :“ssh-rsa”, :“rsa-sha2-256”, :“rsa-sha2-512”, :“ssh-dss”], socket_options: [], auth_method_kb_interactive_data: :undefined, max_random_length_padding: 15, negotiation_timeout: 120000, minimal_remote_max_packet_size: 0, connectfun: #Function<42.74051636/3 in :ssh_options.default/1>, idle_time: :infinity, user_dir_fun: :undefined, transport: {:tcp, :gen_tcp, :tcp_closed}, vsn: {2, 0}, ssh_cli: :undefined, internal_options: %{lsocket: {#Port<0.113>, #PID<0.2317.0>}}, max_sessions: :infinity, tstflg: [], key_cb: {Nerves.Firmware.SSH.Keys, [authorized_keys: [{{:RSAPublicKey, REDACTED MY KEY’]}]]}, send_ext_info: true, preferred_algorithms: [kex: [:“ecdh-sha2-nistp384”, :“ecdh-sha2-nistp521”, :“ecdh-sha2-nistp256”, :“diffie-hellman-group-exchange-sha256”, :“diffie-hellman-group16-sha512”, :“diffie-hellman-group18-sha512”, :“diffie-hellman-group14-sha256”, :“curve25519-sha256”, :"curve25519-sha256@libssh.org", :“curve448-sha512”, :“diffie-hellman-group14-sha1”, :“diffie-hellman-group-exchange-sha1”], public_key: [:“ecdsa-sha2-nistp384”, :“ecdsa-sha2-nistp521”, :“ecdsa-sha2-nistp256”, :“ssh-ed25519”, :“ssh-ed448”, :“ssh-rsa”, :“rsa-sha2-256”, :“rsa-sha2-512”, :“ssh-dss”], cipher: [client2server: [:“chacha20-poly1305@openssh.com”, :"aes256-gcm@openssh.com", :“aes256-ctr”, :“aes192-ctr”, :"aes128-gcm@openssh.com", :“aes128-ctr”, :“aes256-cbc”, :“aes192-cbc”, :“aes128-cbc”, :“3des-cbc”], server2client: [:“chacha20-poly1305@openssh.com”, :"aes256-gcm@openssh.com", :“aes256-ctr”, :“aes192-ctr”, :"aes128-gcm@openssh.com", :“aes128-ctr”, :“aes256-cbc”, :“aes192-cbc”, :“aes128-cbc”, :“3des-cbc”]], mac: [client2server: [:“hmac-sha2-256”, :“hmac-sha2-512”, :“hmac-sha1”], server2client: [:“hmac-sha2-256”, :“hmac-sha2-512”, :“hmac-sha1”]], compression: [client2server: [:none, :"zlib@openssh.com", :zlib], 
server2client: [:none, :"zlib@openssh.com", :zlib]]], id_string: {:random, 2, 5}, auth_methods: ‘publickey,keyboard-interactive,password’, failfun: #Function<40.74051636/3 in :ssh_options.default/1>, dh_gex_limits: {0, :infinity}, infofun: #Function<44.74051636/3 in :ssh_options.default/1>, system_dir: ‘/srv/erlang/lib/nerves_firmware_ssh-0.4.4/priv’, password: :undefined, user_dir: false, unexpectedfun: #Function<15.74051636/2 in :ssh_options.default/1>, user_passwords: [], dh_gex_groups: :undefined, subsystems: [{‘sftp’, {:ssh_sftpd, []}}], profile: :default, user_options: [id_string: :random, key_cb: {Nerves.Firmware.SSH.Keys, [authorized_keys: [{{:RSAPublicKey, REDACTED MY KEY’]}]]}, system_dir: ‘/srv/erlang/lib/nerves_firmware_ssh-0.4.4/priv’, shell: {IEx, :start, []}], modify_algorithms: :undefined, parallel_login: false, pwdfun: :undefined, max_channels: :infinity, ssh_msg_debug_fun: #Function<17.74051636/4 in :ssh_options.default/1>, exec: :undefined, rekey_limit: {3600000, 1024000000}, recv_ext_info: true, shell: {IEx, :start, []}, disconnectfun: #Function<13.74051636/1 in :ssh_options.default/1>})
Restart: :temporary
Shutdown: :infinity
Type: :supervisor
iex(farmbot@farmbot-00000000c6b8bd5b.local)2>
15:59:16.345 [info] Child #Reference<0.1468177200.806092801.84242> of Supervisor :“ssh_system_0.0.0.0_22_default_sup” started
Pid: #PID<0.8967.0>
Start Call: :ssh_subsystem_sup.start_link(:server, {0, 0, 0, 0}, 22, :default, %{pref_public_key_algs: [:“ecdsa-sha2-nistp384”, :“ecdsa-sha2-nistp521”, :“ecdsa-sha2-nistp256”, :“ssh-ed25519”, :“ssh-ed448”, :“ssh-rsa”, :“rsa-sha2-256”, :“rsa-sha2-512”, :“ssh-dss”], socket_options: [], auth_method_kb_interactive_data: :undefined, max_random_length_padding: 15, negotiation_timeout: 120000, minimal_remote_max_packet_size: 0, connectfun: #Function<42.74051636/3 in :ssh_options.default/1>, idle_time: :infinity, user_dir_fun: :undefined, transport: {:tcp, :gen_tcp, :tcp_closed}, vsn: {2, 0}, ssh_cli: :undefined, internal_options: %{lsocket: {#Port<0.113>, #PID<0.2317.0>}}, max_sessions: :infinity, tstflg: [], key_cb: {Nerves.Firmware.SSH.Keys, [authorized_keys: [{{:RSAPublicKey, REDACTED MY KEY’]}]]}, send_ext_info: true, preferred_algorithms: [kex: [:“ecdh-sha2-nistp384”, :“ecdh-sha2-nistp521”, :“ecdh-sha2-nistp256”, :“diffie-hellman-group-exchange-sha256”, :“diffie-hellman-group16-sha512”, :“diffie-hellman-group18-sha512”, :“diffie-hellman-group14-sha256”, :“curve25519-sha256”, :"curve25519-sha256@libssh.org", :“curve448-sha512”, :“diffie-hellman-group14-sha1”, :“diffie-hellman-group-exchange-sha1”], public_key: [:“ecdsa-sha2-nistp384”, :“ecdsa-sha2-nistp521”, :“ecdsa-sha2-nistp256”, :“ssh-ed25519”, :“ssh-ed448”, :“ssh-rsa”, :“rsa-sha2-256”, :“rsa-sha2-512”, :“ssh-dss”], cipher: [client2server: [:“chacha20-poly1305@openssh.com”, :"aes256-gcm@openssh.com", :“aes256-ctr”, :“aes192-ctr”, :"aes128-gcm@openssh.com", :“aes128-ctr”, :“aes256-cbc”, :“aes192-cbc”, :“aes128-cbc”, :“3des-cbc”], server2client: [:“chacha20-poly1305@openssh.com”, :"aes256-gcm@openssh.com", :“aes256-ctr”, :“aes192-ctr”, :"aes128-gcm@openssh.com", :“aes128-ctr”, :“aes256-cbc”, :“aes192-cbc”, :“aes128-cbc”, :“3des-cbc”]], mac: [client2server: [:“hmac-sha2-256”, :“hmac-sha2-512”, :“hmac-sha1”], server2client: [:“hmac-sha2-256”, :“hmac-sha2-512”, :“hmac-sha1”]], compression: [client2server: [:none, 
:"zlib@openssh.com", :zlib], server2client: [:none, :"zlib@openssh.com", :zlib]]], id_string: {:random, 2, 5}, auth_methods: ‘publickey,keyboard-interactive,password’, failfun: #Function<40.74051636/3 in :ssh_options.default/1>, dh_gex_limits: {0, :infinity}, infofun: #Function<44.74051636/3 in :ssh_options.default/1>, system_dir: ‘/srv/erlang/lib/nerves_firmware_ssh-0.4.4/priv’, password: :undefined, user_dir: false, unexpectedfun: #Function<15.74051636/2 in :ssh_options.default/1>, user_passwords: [], dh_gex_groups: :undefined, subsystems: [{‘sftp’, {:ssh_sftpd, []}}], profile: :default, user_options: [id_string: :random, key_cb: {Nerves.Firmware.SSH.Keys, [authorized_keys: [{{:RSAPublicKey, REDACTED MY KEY’]}]]}, system_dir: ‘/srv/erlang/lib/nerves_firmware_ssh-0.4.4/priv’, shell: {IEx, :start, []}], modify_algorithms: :undefined, parallel_login: false, pwdfun: :undefined, max_channels: :infinity, ssh_msg_debug_fun: #Function<17.74051636/4 in :ssh_options.default/1>, exec: :undefined, rekey_limit: {3600000, 1024000000}, recv_ext_info: true, shell: {IEx, :start, []}, disconnectfun: #Function<13.74051636/1 in :ssh_options.default/1>})
Restart: :temporary
Shutdown: :infinity
Type: :supervisor
iex(farmbot@farmbot-00000000c6b8bd5b.local)2>
15:59:19.677 [info] Child #Reference<0.1468177200.806092802.47744> of Supervisor #PID<0.8968.0> (:ssh_server_channel_sup) started
Pid: #PID<0.8971.0>
Start Call: :ssh_server_channel.start_link(#PID<0.8970.0>, 0, :ssh_cli, [{IEx, :start, []}], :undefined)
Restart: :temporary
Shutdown: 5000
Type: :worker
iex(farmbot@farmbot-00000000c6b8bd5b.local)2>
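
For anyone wading through a capture like the one above, a small filter can reduce it to just the timestamped entry headers, which makes it easier to see what was logged right before a disconnect. This is a minimal sketch in Python, assuming each entry starts with `HH:MM:SS.mmm [level]` as in the lines above. `Entry` and `entry_headers` are hypothetical names for this example, not part of FarmBot OS or RingLogger.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Matches entry headers such as: "15:59:16.330 [info] Child ... started"
ENTRY_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\w+)\] (.*)$")


class Entry(NamedTuple):
    time: str
    level: str
    message: str


def entry_headers(lines: Iterable[str], max_len: int = 100) -> Iterator[Entry]:
    """Yield one Entry per timestamped header line, skipping continuation
    lines (Pid:, Start Call:, prompts, etc.) and truncating long messages."""
    for line in lines:
        match = ENTRY_RE.match(line.strip())
        if match:
            time, level, message = match.groups()
            yield Entry(time, level, message[:max_len])


if __name__ == "__main__":
    # A few lines shaped like the capture in this thread.
    sample = [
        "15:59:16.330 [info] Child {:server, :ssh_server_channel_sup, {0, 0, 0, 0}, 22} started",
        "Pid: #PID<0.8968.0>",
        "Restart: :temporary",
        "15:59:19.677 [info] Child of Supervisor #PID<0.8968.0> (:ssh_server_channel_sup) started",
    ]
    for entry in entry_headers(sample):
        print(entry.time, entry.level, entry.message)
```

Run against the session above, this would keep only the four `[info]` headers between 15:59:16.330 and 15:59:19.677, all of which relate to the SSH connection itself starting up.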

Hi Rick,

Early on I had WiFi issues and I may have used that setting but can’t swear. I didn’t change it recently, that is for sure.

I’m using Ethernet now. I’ve powered the FarmBot down. Over lunch I’ll give it another try.

BTW, re-burning the FarmBot flash and re-flashing the Arduino do not help. I’ve factory defaulted all of my settings as well. Still getting spurious reboots every 5 or so minutes.

thanks
Jack

As a side note, FarmBot should drop use of RSA keys with the SHA-1 based “ssh-rsa” signature algorithm. From the OpenSSH release notes:

It is now possible to perform chosen-prefix attacks against the SHA-1 algorithm for less than USD$50K. For this reason, we will be disabling the “ssh-rsa” public key signature algorithm by default in a near-future release.
