Helm database username issue

Deployed a new helm release using frappe/erpnext. The release deploys as expected, all containers are running. Then I use the add-site yaml to deploy a new site. In the yaml file, there’s a name-value pair with the name DB_ROOT_USER. The value in the example file is set to ‘root’. I’m deploying on rds so my default ‘root’ user is ‘admin’.

With the value left to root, site creation fails with 'Access denied for user ‘root’@‘xx.xx.xx.xx’. This is as expected since ‘root’ is not the correct username. However, setting the value to ‘admin’ and attempting site deployment results in (1049, “Unknown database ‘admin’”).

I’ve tried this a few times with consistent results. Has anyone run into this?

how did you set it?

you can either exec into the container and run bench new-site command don’t forget --mariadb-no-socket or set the environment variable DB_ROOT_USER https://github.com/frappe/frappe_docker/blob/develop/docs/site-operations.md#setup-new-site

make sure you follow this https://github.com/frappe/frappe/wiki/Using-Frappe-with-Amazon-RDS-(or-any-other-DBaaS)#setup-frappe

following need to be set in site config

"rds_db": 1,

I can setup AWS RDS and create new site if you can share credentials/kubeconfig

I’m using the ‘[create new site job](https://helm.erpnext.com/kubernetes-resources/create-new-site-job.html -)’. For name ‘DB_ROOT_USER’, I set the value to admin. With root, it seems to run normally (except for the expected permission issue). With admin, I get the aforementioned error where it looks like it’s trying to name the db ‘admin’.

try to exec into container and run the bench new-site --mariadb-no-socket ... command

also, did you set rds_db: 1?

Revisiting this issue with some more insight.

There is a problem here when it comes to RDS:

def delete_user(self, target, host=None):
	if not host:
		host = self.get_current_host()
	try:
		self.db.sql("DROP USER '%s'@'%s';" % (target, host))
	except Exception as e:
		if e.args[0] == 1396:
			pass
		else:
			raise

When db_manager.py -> create_database is originally called during site setup, it creates two users:
{db_name}@’%’
{db_name}@’{host from which the create function is called}’

If you subsequently run bench drop-site, it attempts to delete the first user entry ({db_name}@’%’). However, things don’t go as planned because of RDS permissions:

MariaDB [mysql]> DROP USER '_a8e0009cc24b5d9d'@'%';
ERROR 1396 (HY000): Operation DROP USER failed for '_a8e0009cc24b5d9d'@'%'
MariaDB [mysql]>

In a kubernetes environment, the specific host IP registered in create_user is also not necessarily going to be the source of the delete_user call. This means that both of the above users are orphaned. Future site-create attempts will fail with permission denied errors. This is the voice of experience speaking.

At minimum, a rds_db=1 condition is needed in that calls an appropriate query to clean up the users. The following query is a bit hack-ish (it inserts ‘1’ if @accounts is null to avoid an error) but it works:

SELECT GROUP_CONCAT(QUOTE(user),'@',QUOTE(host))  INTO @accounts  FROM mysql.user  WHERE user = '%s';
EXECUTE IMMEDIATE IF(ISNULL(@accounts), 'SELECT 1', CONCAT('DROP USER ', @accounts));

Edit: Clarification…delete_user is called by bench drop-site.

What I did for RDS,

  1. use AWS security features to allow connections to RDS only from cluster, then you can add users with wildcard hosts using MySQL shell.
  2. check bench new-site --help you can pass db_name, db_user, db_password here
  3. for clean up do the delete with a py/bash script with appropriate AWS api instead of relying on bench command.

to codify or automate these custom scripts, write them as Kubernetes resource yaml(s)

check this old gist that had fargate and aurora related scripts. https://gist.github.com/revant/1328e3367c3042fd91c447859fb23dd1#file-create-site-yaml