
Email not sending - account works fine

Emails had been working fine for some time. For no reason that I can discern, messages have started piling up in the queue with a status of ‘Not Sent’. If I open one of the messages and hit ‘Send Now’, it sends immediately and successfully.

What process manages the email queue?

erpnext v13.8.0 deployed via helm chart

Sounds like the scheduler is not running.

Try this:

bench --site site1.local enable-scheduler

If your site is not site1.local then use the name of your site instead.

Hope this helps. :nerd_face:

BKM

This is a helm-based deployment (everything runs in Docker containers). Bench is not really an option. I’m not sure how the scheduler could fail in this environment, but anything’s possible.

Any idea what process I’d be looking for?

Answering my own question…

The {deployment name}-erpnext-scheduler pod is created for…well…scheduling. In my case, it’s in the CrashLoopBackOff state. Off to investigate kubernetes…

To close this out…I’d moved the location of my databases recently. I’d updated the configuration in each site’s site_config.json. I shut down the old database server and was operating successfully, or so I thought, on the new databases.

common_site_config.json - I just learned about this file tonight. At least one of the worker containers was consulting this file for its database configuration. The old database is offline and the pod was failing on startup. Overall, erpnext was up and mostly working fine. The scheduler seems to have been the only collateral damage.
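A quick way to catch this kind of leftover setting is to list every db_host value found under the sites directory. The following is only a sketch: find_db_hosts is a made-up helper, not part of frappe or bench, and it assumes the usual layout of one common_site_config.json next to one folder per site, each with its own site_config.json.

```python
import json
from pathlib import Path


def find_db_hosts(sites_dir):
    """Return {config file (relative path): db_host} for each file that sets db_host.

    Hypothetical helper for illustration; not a frappe or bench API.
    """
    sites_dir = Path(sites_dir)
    # The common file first, then every per-site config one level down.
    candidates = [sites_dir / "common_site_config.json"]
    candidates += sorted(sites_dir.glob("*/site_config.json"))

    hosts = {}
    for path in candidates:
        if not path.is_file():
            continue
        data = json.loads(path.read_text())
        if "db_host" in data:
            hosts[path.relative_to(sites_dir).as_posix()] = data["db_host"]
    return hosts


if __name__ == "__main__":
    # Run from inside the sites directory.
    for name, host in find_db_hosts(".").items():
        print(f"{name}: {host}")
```

Any file still pointing at the retired database server shows up in the output with the old host name.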

The bench_helper.py which is part of frappe is available in containers.

That means many bench commands do work

Try bench --help

The enable-scheduler command is available.

The only change here is that you run the bench commands from the sites directory and not the bench directory.

Workers and the scheduler run for all sites, so they read common_site_config.json.

If something is set in common, it’ll be inherited by site.
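The inheritance can be pictured as a dictionary merge. This is a minimal sketch, assuming the common file is loaded first and the site's own site_config.json is laid on top of it, so a key set in both places takes the site's value. effective_config and the host names are invented for illustration and are not frappe's API.

```python
import json
import tempfile
from pathlib import Path


def effective_config(sites_dir, site):
    """Merge common_site_config.json with one site's site_config.json.

    Hypothetical helper for illustration. Assumes site-level keys
    override keys inherited from the common file.
    """
    sites_dir = Path(sites_dir)
    config = {}
    for path in (sites_dir / "common_site_config.json",
                 sites_dir / site / "site_config.json"):
        if path.is_file():
            config.update(json.loads(path.read_text()))
    return config


if __name__ == "__main__":
    # Build a throwaway sites directory with made-up values.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "common_site_config.json").write_text(
            json.dumps({"db_host": "old-db.example", "db_port": 3306}))
        (root / "site1.local").mkdir()
        (root / "site1.local" / "site_config.json").write_text(
            json.dumps({"db_host": "new-db.example", "db_name": "site1"}))
        print(effective_config(root, "site1.local"))
```

In this sketch the site sees its own db_host, while anything that reads only the common file still sees the old one, which matches the failure described earlier in the thread.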

Try:

  • Play around with the liveness and readiness probes and their timeouts. After separating the db, ping may take more time to respond.
  • Use kubectl rollout restart deployment to restart crashing pods after the config is changed.