Farmbot via Ethernet drops every 20 minutes like clockwork


#1

Troubleshooting report here: Report o68x4

Basically, a FarmBot hardwired via Ethernet is dropping off (rebooting?) every 20 minutes. The cycle looks like this (newest entries first):

Success|Farmbot was reconnected to the internet: [‘my.farm.bot’]|1006, 300, 0.2|Apr 14, 10:59am|1
Success|DNS resolution successful|1006, 300, 0.2|Apr 14, 10:59am|1
Success|Farmbot is up and running!|1006, 300, 0.2|Apr 14, 10:59am|2
Success|Synced|1006, 300, 0.2|Apr 14, 10:59am|2
Busy|Syncing|1006, 300, 0.2|Apr 14, 10:59am|1
Error|Error connecting to AMPQ: :unknown_host|1006, 300, 0.2|Apr 14, 10:59am|1
Error|Authorization failed: {:error, :nxdomain}. Trying again 2 more times.|1006, 300, 0.2|Apr 14, 10:59am|1
Error|Authorization failed: {:error, :nxdomain}. Trying again 3 more times.|1006, 300, 0.2|Apr 14, 10:59am|1
Error|Error connecting to AMPQ: :unknown_host|1006, 300, 0.2|Apr 14,|
Error|Authorization failed: {:error, :nxdomain}. Trying again 4 more times.|1006, 300, 0.2|Apr 14, 10:59am|1
Info|Forcing a token refresh.|1006, 300, 0.2|Apr 14, 10:59am|3
Warn|Farmbot was disconnected from the internet: :nxdomain|1006, 300, 0.2|Apr 14, 10:59am|1
Error|Error connecting to AMPQ: :unknown_host|1006, 300, 0.2|Apr 14, 10:58am|1
Error|Token failed to reauthorize: jwood@me.com - https://my.farm.bot :nxdomain|1006, 300, 0.2|Apr 14, 10:58am|1
Error|Authorization failed: {:error, :nxdomain}|1006, 300, 0.2|Apr 14, 10:58am|1
Error|Error connecting to AMPQ: :unknown_host|1006, 300, 0.2|Apr 14, 10:58am|1
Error|Authorization failed: {:error, :nxdomain}. Trying again 0 more times.|1006, 300, 0.2|Apr 14, 10:58am|

It’s worth noting that I see the farmbot looking up a few domains, all of which the DNS server answers positively for.

I also see it looking up local (nonexistent) domains like:

0.pool.ntp.org.my.domain
and a couple of other AMQP ones with bad local domains (where my.domain is replaced with my real LAN domain).
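
Those suffixed lookups look like ordinary search-domain expansion. Below is a rough Python sketch of the rule described in resolv.conf(5), where a name with at least `ndots` dots is tried as-is first and the search domains are appended only afterwards. This is an assumption about the behavior: FarmBot OS may use a different resolver with different rules. The function name `candidate_names` is made up for illustration.

```python
def candidate_names(name, search_domains, ndots=1):
    """Return the fully qualified names a stub resolver would try, in order.

    Follows the resolv.conf(5) convention: a name ending in "." is absolute;
    otherwise a name with at least `ndots` dots is tried as-is first and the
    search domains are tried afterwards. Names with fewer dots try the
    search domains first.
    """
    if name.endswith("."):
        return [name]
    absolute = name + "."
    suffixed = [f"{name}.{domain}." for domain in search_domains]
    if name.count(".") >= ndots:
        return [absolute] + suffixed
    return suffixed + [absolute]


if __name__ == "__main__":
    # "my.domain" is the placeholder LAN domain used in this post.
    for candidate in candidate_names("0.pool.ntp.org", ["my.domain"]):
        print(candidate)
```

If the resolver on the bot does follow this rule, a lookup for 0.pool.ntp.org.my.domain would only happen after the plain 0.pool.ntp.org lookup had already failed, so the odd local names may be a symptom rather than the cause.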

Any help would be greatly appreciated, no waterings work thanks to this :slight_smile:


#2

Is there a support number I can contact? Since I installed this device I’ve mostly watered manually. Now that I’m entering the drier season here I’m more inclined to return the setup…

Aside, is anyone having good experiences here? If so - what type of setups do people have? How is network connectivity and reliable watering working for people?


#3

Did you return to v7.0.1 or are you still using 7.0.2 or 7.0.3?
If your bot is still online, you can update in the device widget. If not, you would need to flash the SD card with the latest stable version (v7.0.1).


#4

I am still using 7.0.1; 7.0.2 and 7.0.3 were even worse :slight_smile:


#5

@terafin (Your “aside”)

We have been having a pretty good time, with a few hiccups and challenges along the way. So far, this seems par for the course and all great at educating the students, teachers, parents and myself.
We decided at the get-go to opt for wired Ethernet. From forum posts it seems like WiFi has a few issues, and the connection at the “Bot-Plot” was not strong.
As for reliable watering, I have to say that after working past the initial burn-in, all systems seem to be nominal. We are just about to plant (in the next few days). I have had the FarmBot performing a set of simulated tasks (watering dirt at this point), with no big obstacles.
I will say that the network configuration caused the most headache, but after working with the school district IT guy, we seem to be in the clear!
Luck!


#6

@terafin Are you using any sort of firewall software, or have a DNS server with special rules?


#7

To add to what Rick asked, FarmBot downloads a new auth token every 20 or so minutes, so it would make sense that you are seeing these log entries every 20 minutes.
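
One rough way to check that theory is to compare the gaps between the disconnect entries in the log against a 20-minute period. The sketch below is only an illustration: the function names, the 2-minute tolerance, and the sample times are all made up, not taken from the real logs.

```python
from datetime import datetime, timedelta


def gaps_between(timestamps):
    """Return the time gaps between consecutive events, oldest first."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def matches_period(timestamps, period=timedelta(minutes=20),
                   tolerance=timedelta(minutes=2)):
    """True if every gap between events is within `tolerance` of `period`."""
    gaps = gaps_between(timestamps)
    return bool(gaps) and all(abs(gap - period) <= tolerance for gap in gaps)


if __name__ == "__main__":
    # Made-up example times (arbitrary date), not taken from the real logs.
    events = [
        datetime(2000, 1, 1, 10, 18),
        datetime(2000, 1, 1, 10, 38),
        datetime(2000, 1, 1, 10, 58),
    ]
    print(matches_period(events))
```

If the real disconnect times line up with the token refresh, that would point at the refresh request (and the DNS lookup it triggers) rather than at a reboot.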


#8

I have a pretty stock UniFi setup, with no custom DNS beyond the built-in one. I actually set up another DNS VM yesterday and hard-coded the hosts via /etc/hosts to the IPs for FarmBot, AMQP and device metrics, but it made no difference. I verified that local DNS was answering positively, and got the same result.


#9

My UniFi setup doesn’t block anything outgoing, FYI. Do you require incoming ports?


#10

Added a new diagnostic report here: dtpps

This is a static IP configured to use the default DNS, which appears to be Google DNS?


#11

@terafin Thanks I will take a look now.


#12

@terafin Still investigating the issue here with @connor.

This is a static IP configured to use the default DNS, which appears to be Google DNS?

Is that in the WiFi settings for your router, or within the configurator? Also, just to be sure we’re on the same page: you’re referring to 8.8.8.8 and 8.8.4.4 as the default Google DNS, correct?

do you require incoming ports?

Forgot to answer this one earlier, sorry about that. FarmBot OS does not accept incoming connections. It’s all outbound.


#13

The last log I uploaded used a static IP set at configuration time. I believe I chose either Google DNS (8.8.8.8) or Cloudflare (1.1.1.1) at setup time that round (4-5 resets today). I can do another reset and choose a specific DNS if you’d like!


#14

@terafin Let’s try 1.1.1.1 and see if that helps. All of our bots here at the FarmBot warehouse are on that one, so it would help us weed out any differences. That being said, we have used 8.8.8.8 in the past without any issues, so my hunch here is that there might be some other router / local config issue at play. Do you have the model number of your router handy?

Also, we took a look at your diagnostic dump yesterday and did not see anything that was particularly concerning (aside from a large number of nxdomain errors in your log).


#15

Thanks, tried 1.1.1.1 and new diagnostic here: yixbu

My router is this: https://unifi-xg.ui.com/usg-xg-8

Infrastructure is all either fiber or 10G Ethernet to endpoints.


#16

@terafin I’m not seeing anything too out of the ordinary in that config. I ran some SQL analysis against all bots running on the server. This issue is only affecting a small percentage of devices, and when it does, the average occurrence rate is 1-3 times per day. Your device has hit the error 23 times in the last 24 hours, which is the highest occurrence rate across all devices on the server.

Some ideas at this point would be:

  1. Reset your account (delete the account and re-register to rule out a misconfiguration). I can do this for you if you give me permission to do so.
  2. Remove the RPi from the case and reconfigure it on a different network with all peripherals detached for 24 hours to see if the issue persists (to rule out a bad router config setting). 3G/4G networks are historically problematic, while Ethernet setups (via DHCP) have historically been the most reliable.

#17

Thanks for looking into it.

  1. I’m happy to do this, go for it. I just backed up my account data via the web UI so I can rebuild the regimen.

  2. In the last day or so I’ve replaced the RPi with another RPi 3 and, currently, an RPi B+. I’m happy to try again, but it’s already hardwired (10G CAT6e to the switch, CAT5e to the FarmBot) with a static IP (set at configuration time). There are two other devices on the same outdoor switch that remain connected just fine.
    For a different network, do you want me to try a hotspot? I have a cellular hotspot I can get ethernet out of I guess to try…

I’m a little confused, as my little ESP8266s sitting right beside it (handling some water management) are both completely fine with long-lived MQTT connections to both internal and external addresses, as well as one with a PoE hat (on the same switch).

Also, when I use my internal DNS, I can see the response is sent to the FarmBot Pi… I will pull out Wireshark this weekend to verify that DNS packets arrive on the Pi itself.
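
Alongside the packet capture, a small probe running on another machine on the same switch could show whether name resolution for my.farm.bot ever fails from that part of the network. This is a minimal Python sketch using the standard `socket.getaddrinfo` call. The `probe` and `watch` helpers, the port, and the interval are my own choices for illustration, and this only tests the resolver on the machine running it, not the one inside FarmBot OS.

```python
import socket
import time


def probe(hostname, resolve=socket.getaddrinfo, port=443):
    """Resolve `hostname` once and return (ok, detail).

    On success, detail is a comma-separated list of the addresses returned;
    on failure, it is the resolver's error message.
    """
    try:
        results = resolve(hostname, port)
    except socket.gaierror as err:
        return False, str(err)
    addresses = sorted({entry[4][0] for entry in results})
    return True, ", ".join(addresses)


def watch(hostname, interval_s=30, rounds=10,
          resolve=socket.getaddrinfo, sleep=time.sleep):
    """Probe repeatedly and return a list of (timestamp, ok, detail) tuples."""
    history = []
    for i in range(rounds):
        ok, detail = probe(hostname, resolve)
        history.append((time.strftime("%H:%M:%S"), ok, detail))
        if i < rounds - 1:
            sleep(interval_s)
    return history
```

Calling `watch("my.farm.bot", interval_s=30, rounds=60)` would cover roughly half an hour, which is enough to span one 20-minute cycle. If every probe succeeds while the bot still logs :nxdomain, that would suggest the problem is specific to the Pi or its resolver rather than the LAN DNS.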

Let me know what other information you’d like, and/or you’d like me to try - happy to get whatever traces you need. Also happy to try the cell setup tonight.