Problem sending RPC command

Hi FB community!

I’m running FarmBot OS 8.2.3 on a Genesis v1.4. I’m getting an error when I try to execute a sequence that takes more than one picture.

I can make a sequence moving the FB to a certain position, take a picture, and then send it back to home. No problems with that.

But when I add a second step that moves the FB to a different position and takes another picture, I get weird pop-up messages just before the “take picture” command is executed. These messages don’t show up in the log. (see image below)

[image: bugs_cameras_sequences]

I’ve tried making the FB wait more than 40 seconds in the sequence so the Farmware can receive the image and “save” it, but I still get the same problem.

Has anyone been through this before? Any comment would be greatly appreciated!

Cheers,
Nico

@NicoMolnar, Good post ! Welcome to the Forum.

I’m having an ( un-official ) look at this.

For me, even a simple TAKE PHOTO on the FARMWARE page ( PHOTOS link ) is not completely reliable and will sometimes error out at the finish ( even after a good image capture ).
I get the same "farmware catchall :exit: {:normal, { . . . }" that you posted.

This error would stop a running sequence at that point.

Here’s what I’m seeing on the IEx console . . note the Terminating farmware process log just after the 2nd take-photo Farmware execution.


I believe you’ve revealed bug(s) . . Are you able to submit a new issue on GitHub ?


@NicoMolnar Thanks for reporting this issue. My suspicion is that it is indeed a problem on our end rather than a local configuration issue. I will forward this to our team so they can take a look on Monday.


Hi @jsimmonds and @RickCarlino, thanks for looking at this post so quickly.

I’m a newbie in terms of using GitHub, so if @RickCarlino can forward it to the team we would probably flag the issue faster.

Hope to hear from you soon! Will keep you posted if I identify any new problems.


Hi,

I have just ‘upgraded’ to the new Farmbot software and I am having the same issue as described here. I can take one photo and the farmbot continues on in the sequence. It stops at the second photo and goes no further. This sequence was working just fine on Farmbot V7.0.1.

Also, my logs are not giving the sort of information that they did previously so I can’t tell whether the farmbot is moving without viewing it in person.

Thanks
Mark

Hi,

Just tested it with a new sequence that includes take a photo, move relative, take a photo, move relative. It stopped at the second take a photo command and never went to the move relative command.

Thanks,

Mark

Hi Mark,

I’m having the same exact problem. The sequence will stop at the second “take a photo” command.
No error is recorded in the log, which is why I had to take a screenshot to capture the error.

Hope someone can shed some light on this!

Thanks,
Nico

@RickCarlino, maybe one problem is in the try . . catch coding in https://github.com/FarmBot/farmbot_os/blob/staging/farmbot_os/lib/farmbot_os/sys_calls/farmware.ex

There are 2 cases handling thrown values, but the last case will match :throw, :exit and :error (?)

( but I suspect there’s a larger issue elsewhere :slight_smile: . . because there’s a timing-dependent behaviour underneath )


I have a fix for this ready. It should be released soon.


Thanks @connor. I’m running now on v8.2.4-rc1 from CI but the 2 big issues in this thread still exist. ( In a sequence, 2nd take-photo always fails. Simple take-photo will very intermittently exit with an uncaught throw ( but does capture an image ))
:confused:
I’ll dig deeper and see what I can uncover.


Hey Connor . . I’ve made an attempt at “reverse-engineering” the Farmware runner code . . I’m in dire need of some diagrams of how Sequences run other Farmwares as steps, where those individual Farmware steps have asynch. completion ( e.g. take-photo )

Are there sequence diagrams or process interaction maps that I can refer to ?

Thanks ! :confused:


I’ve tried making diagrams that map the relationship between sequences and farmware, but I’ve yet to find a way to make them presentable. Unfortunately it’s just too complex to put into a diagram that can be easily understood. Here’s the gist of how it works:

  1. CeleryScript is sent to FarmBot.
  2. FarmBot parses it and turns it into an AST that can be executed.
  3. FarmBot loops over the AST until it completes, hits a sub-sequence call, or hits a farmware call.
  4. If it was a farmware call, the sequence runner is paused.
  5. FarmBot opens a Unix socket that the farmware will communicate over.
  6. FarmBot executes the farmware while also monitoring the pipe.
  7. If any commands come over the pipe, FarmBot pauses the farmware, then goto(1).
  8. Repeat steps 3-7 until completion of the farmware.
  9. Repeat steps 3-8 until completion of the sequence.
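The steps above can be sketched roughly as follows. This is only a toy model in Python, not the FarmBot OS code (which is Elixir): the `Step` tuples stand in for AST nodes, a generator stands in for a farmware talking over the pipe, and every name here (`run`, `take_photo`, the `"farmware"` step kind) is made up for illustration.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Step = Tuple[str, str]                    # (kind, argument): stand-in for an AST node
Farmware = Callable[[], Iterator[Step]]   # yields commands back to the runner


def take_photo() -> Iterator[Step]:
    # Hypothetical farmware: sends two commands over the "pipe", then exits.
    yield ("send_message", "capturing")
    yield ("send_message", "uploaded")


def run(steps: List[Step], farmwares: Dict[str, Farmware], log: List[str]) -> List[str]:
    for kind, arg in steps:                        # step 3: loop over the AST
        if kind == "farmware":                     # step 4: sequence runner pauses here
            log.append(f"pause sequence for {arg}")
            # Steps 5-6: start the farmware and watch for its commands.
            # The generator is suspended at each yield, like a paused farmware.
            for command in farmwares[arg]():
                run([command], farmwares, log)     # step 7: execute the command
            log.append(f"resume sequence after {arg}")  # step 8: farmware finished
        else:
            log.append(f"{kind} {arg}")            # ordinary step
    return log


sequence = [("move", "A"), ("farmware", "take-photo"),
            ("move", "B"), ("farmware", "take-photo")]
for line in run(sequence, {"take-photo": take_photo}, []):
    print(line)
```

The sequence in this thread has the same shape: two farmware steps with a move in between, each one pausing the sequence runner until the farmware is done.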

The issue presented in this thread occurs during step 7. There is a race condition in the way Linux handles pipes. One of several things can happen while the farmware is executing:

  1. The farmware could send nothing at all. I think the pipe protocol requires at least one command every X seconds for the farmware to still be considered alive.
  2. The farmware could send a malformed command.
  3. The farmware could exit.
  4. The pipe could close.

Basically, the error is a race condition between options 3 and 4 in that list. I’ve been working on a refactor to fix this but got distracted by the holidays.
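To illustrate the kind of race between options 3 and 4: the "farmware exited" and "pipe closed" notifications can arrive in either order, so a handler that only expects one order will treat the other as an error. Below is a minimal Python sketch of one common way to make such a handler order-independent. It is an assumption for illustration only, not the actual refactor, and `FarmwareMonitor` and its methods are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FarmwareMonitor:
    """Tracks the two shutdown events, which may arrive in either order."""
    exit_code: Optional[int] = None
    exited: bool = False
    pipe_closed: bool = False

    def on_exit(self, code: int) -> str:
        # Event for option 3: the farmware process exited.
        self.exited = True
        self.exit_code = code
        return self.status()

    def on_pipe_closed(self) -> str:
        # Event for option 4: the pipe was closed.
        self.pipe_closed = True
        return self.status()

    def status(self) -> str:
        # Decide only once BOTH events have been seen, so arrival order is irrelevant.
        if not (self.exited and self.pipe_closed):
            return "waiting"
        return "ok" if self.exit_code == 0 else "failed"


# Exit first, then pipe close.
a = FarmwareMonitor()
print(a.on_exit(0), a.on_pipe_closed())

# Pipe close first, then exit.
b = FarmwareMonitor()
print(b.on_pipe_closed(), b.on_exit(0))
```

Both orderings end in the same final state, so neither one surfaces as an unexpected exit.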


Thanks for this most valuable gist !

Step 7 seems to imply that there can be, e.g., 1+ paused take-photo in parallel with one running take-photo . . would that be right ? In that case, the Farmware runner context would need to be self-contained and without side-effects ( ? idempotent ) :safety_pin:

Maybe best if I quietly back away while you consider your refactoring :wink:


Hi guys,

Was this fix included in the December 3 update by any chance?

I’m still getting the “Problem sending RPC command…” error when I try to execute a sequence with two Take photo commands in it, so I wanted to know whether the issue is now on my end or whether I should wait for a new release.

Thanks!

No, it wasn’t.

As @connor said above,

We will need patience, as a correct refactoring could likely be a substantial effort.

Time to be patient, then! Appreciate all that effort from your end.

Not a great effort for me ( retired programmer )

  • a) because I’m intrigued how so much real functionality can be packed into a Nerves+Elixir-based package running on a Raspberry Pi 3 !!
  • b) I haven’t contributed much to possible solutions of this issue ( :disappointed: )

I genuinely wish this FarmBot enterprise a long and prosperous life . . albeit, when I have my own bot hardware, I will attempt to speed up the movements enough to scare the snails :upside_down_face:
