Nginx ERROR with CentOS 7 during install.py - fix SELinux -> Permissive

Well, I have issues with both Ubuntu 16.04 and CentOS 7 when trying to install ERPNext to a clean server from install.py.

Here is the problem with CentOS 7. The installation encounters a fatal error when trying to start nginx. Here is the recap:


TASK [nginx : Rename default nginx.conf to nginx.conf.old] *****************************************************
skipping: [localhost]
TASK [nginx : Copy nginx configuration in place.] **************************************************************
changed: [localhost]
TASK [nginx : Setup www redirect] ******************************************************************************
skipping: [localhost]
TASK [nginx : Ensure nginx is started and enabled to start at boot.] *******************************************
fatal: [localhost]: FAILED! => {“changed”: false, “failed”: true, “msg”: “Unable to start service nginx: Job for
nginx.service failed because the control process exited with error code. See "systemctl status nginx.service"
and "journalctl -xe" for details.\n”}
RUNNING HANDLER [dns_caching : restart network manager] ********************************************************
RUNNING HANDLER [mariadb : restart mysql] **********************************************************************
RUNNING HANDLER [nginx : restart nginx] ************************************************************************
to retry, use: --limit @/tmp/.bench/playbooks/production/install.retry
PLAY RECAP *****************************************************************************************************
localhost : ok=48 changed=36 unreachable=0 failed=1
Traceback (most recent call last):
File “install.py”, line 388, in
install_bench(args)
File “install.py”, line 114, in install_bench
run_playbook(‘production/install.yml’, sudo=True, extra_vars=extra_vars)
File “install.py”, line 326, in run_playbook
success = subprocess.check_call(args, cwd=os.path.join(cwd, ‘playbooks’))
File “/usr/lib64/python2.7/subprocess.py”, line 542, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command ‘[‘ansible-playbook’, ‘-c’, ‘local’, ‘production/install.yml’, ‘-e’, ‘@/tmp/extra_vars.json’, ‘--become’, ‘--become-user=erp_jmi’]’ returned non-zero exit status 2
[erp_jmi@centos7-test ~]$


It indicates that the process failed with an error code, but I do not know where to look in log files to find the specific error.

Not sure where to go from here.
The currently available local hosting company here only provides Ubuntu 16.04 and CentOS 7 as their cloud hosting OS choices. I cannot get ERPNext to install on either one!

Any ideas?

BKM

Hi @bkm

Have you checked the logs as it suggests?

"Unable to start service nginx: Job for
nginx.service failed because the control process exited with error code.

See “systemctl status nginx.service"
and “journalctl -xe” for details.\n”}

This will be a good start to see why nginx cannot be started. It’s probably because of a misconfiguration which won’t allow nginx to start cleanly.

I saw those listings, but have no idea where in the directory structure to find those files. It just called out file names but no path.

BKM

All you need to do is run

sudo systemctl status nginx.service

from the command prompt. You don’t need to run it from a particular location. systemctl is the command-line tool for systemd, the newer service manager that starts and tracks the services on the system; its status output includes the most recent log lines for a service, so in the case of nginx it should show why it did not start.

Likewise

journalctl -x

does something similar …
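To narrow the output down to nginx, a minimal sketch along these lines may help. It assumes a systemd host; the helper name `show_emerg` is made up for illustration, and it simply filters whatever log text it receives for nginx's `[emerg]` lines.

```shell
#!/bin/sh
# Hypothetical helper: keep only nginx "[emerg]" lines from log text on stdin.
show_emerg() {
    grep -F '[emerg]'
}

# Typical use on a systemd host (reading the journal may need sudo):
#   sudo systemctl status nginx.service -l
#   sudo journalctl -u nginx.service --no-pager | show_emerg
```

The `-u nginx.service` option limits journalctl to entries for that one unit, which avoids scrolling past unrelated start-up messages.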

Give it a try and report back. I’ll be off to bed shortly, but perhaps @clarkej can help. He’s in North America so will at least be closer to your time zone. There’s always the ERPNext Gitter chat too.

I didn’t realize that systemctl and journalctl were commands. I was busy looking for them as processes in the /bin directory so I could hopefully find the log files. It’s not very often that an error tells you exactly what command to use to read a log file, so I was overthinking it.

So, here is the listing from the nginx.service log:


HEY! USE SCREEN -bash-4.2$
HEY! USE SCREEN -bash-4.2$ sudo systemctl status nginx.service -l
● nginx.service - nginx - high performance web server
Loaded: loaded (/usr/lib/systemd/system/nginx.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Tue 2018-01-16 00:16:11 UTC; 7min ago
Docs: nginx documentation
Process: 780 ExecStartPre=/usr/sbin/nginx -t -c /etc/nginx/nginx.conf (code=exited, status=1/FAILURE)
Jan 16 00:16:11 centos7-test systemd[1]: Starting nginx - high performance web server…
Jan 16 00:16:11 centos7-test nginx[780]: nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
Jan 16 00:16:11 centos7-test nginx[780]: nginx: [emerg] chown(“/var/cache/nginx”, 994) failed (13: Permission denied)
Jan 16 00:16:11 centos7-test nginx[780]: nginx: configuration file /etc/nginx/nginx.conf test failed
Jan 16 00:16:11 centos7-test systemd[1]: nginx.service: control process exited, code=exited status=1
Jan 16 00:16:11 centos7-test systemd[1]: Failed to start nginx - high performance web server.
Jan 16 00:16:11 centos7-test systemd[1]: Unit nginx.service entered failed state.
Jan 16 00:16:11 centos7-test systemd[1]: nginx.service failed.
HEY! USE SCREEN -bash-4.2$


After running it the first time it indicated there were some lines that have been truncated and to use the -l switch to show everything. The listing above shows everything.

So now this next set comes from the journalctl command. The listing is pretty long, but the error does show up. I just cannot understand what it means:


Jan 16 00:16:11 centos7-test systemd[1]: Starting Google Compute Engine Instance Setup…
– Subject: Unit google-instance-setup.service has begun start-up
– Defined-By: systemd
– Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

– Unit google-instance-setup.service has begun starting up.
Jan 16 00:16:11 centos7-test systemd[1]: Starting Dynamic System Tuning Daemon…
– Subject: Unit tuned.service has begun start-up
– Defined-By: systemd
– Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

– Unit tuned.service has begun starting up.
Jan 16 00:16:11 centos7-test systemd[1]: Starting MariaDB 10.2.12 database server…
– Subject: Unit mariadb.service has begun start-up
– Defined-By: systemd
– Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

– Unit mariadb.service has begun starting up.
Jan 16 00:16:11 centos7-test systemd[1]: Starting Postfix Mail Transport Agent…
– Subject: Unit postfix.service has begun start-up
– Defined-By: systemd
– Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

– Unit postfix.service has begun starting up.
Jan 16 00:16:11 centos7-test nginx[780]: nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
Jan 16 00:16:11 centos7-test nginx[780]: nginx: [emerg] chown(“/var/cache/nginx”, 994) failed (13: Permission denied)
Jan 16 00:16:11 centos7-test nginx[780]: nginx: configuration file /etc/nginx/nginx.conf test failed
Jan 16 00:16:11 centos7-test systemd[1]: nginx.service: control process exited, code=exited status=1
Jan 16 00:16:11 centos7-test systemd[1]: Failed to start nginx - high performance web server.
– Subject: Unit nginx.service has failed
– Defined-By: systemd
– Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

– Unit nginx.service has failed.

– The result is failed.
Jan 16 00:16:11 centos7-test systemd[1]: Unit nginx.service entered failed state.
Jan 16 00:16:11 centos7-test systemd[1]: nginx.service failed.
Jan 16 00:16:11 centos7-test systemd[1]: Started Dynamic System Tuning Daemon.
– Subject: Unit tuned.service has finished start-up
– Defined-By: systemd
– Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

– Unit tuned.service has finished starting up.

– The start-up result is done.


If you can see where the problem might be, I would greatly appreciate any guidance. Nobody has commented on the Ubuntu16 error I posted so hopefully we can get CentOS7 working. It also seems that google is doing some of the “shameless self-promotion” in the journal listing. I have this as a testing area on Google Cloud Platform until I can get it to install properly. It is way more troublesome to delete an instance and start up another on the regular hosting account, so google is the CentOS 7 test bed.

Thanks in advance.

BKM

This is the pertinent bit …

chown(“/var/cache/nginx”, 994) failed (13: Permission denied)

So it’s trying to change the owner of /var/cache/nginx but can’t, as it does not have permission to do so.

From here you could change the permissions on that folder to allow your frappe user to do a chown.
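Before changing anything, it can help to look at who owns the directory and what mode it has. The sketch below is an illustration only: the function name `dir_owner` is made up, and it assumes GNU `stat` (as shipped on CentOS 7). On an SELinux host, `ls -ldZ` on the same path also shows the security context.

```shell
#!/bin/sh
# Hypothetical helper: print "owner:group mode path" for a directory.
dir_owner() {
    stat -c '%U:%G %a %n' "$1"
}

# Typical use against the directory named in the error message:
#   dir_owner /var/cache/nginx
#   ls -ldZ /var/cache/nginx    # adds the SELinux context column
```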

You get the gist.

Off to bed now … good luck.

BKM, something else you might try is to set SELinux to ‘permissive’, assuming it is set to ‘enforcing’, which is the default.

What command are you running to start the installer?

Well, thank you everyone for attempting to help with this. However, keep in mind that this is all happening during the execution of the install.py script when it is run for the first time on a bare metal server. In this case it is a CentOS 7 server.

Hmm… not sure why it would not have permission. The install.py script was run with sudo, so it should have permission to do anything it wanted because it was run as root.

OK John, that is a little beyond me. Not sure what selinux is or where to find settings for it in the install script. If it is something that is installed during the ERPNext install, then why would it not be handled properly in the script? This same script works just fine with Debian 8 and Ubuntu 14.04.

Or is selinux something that is already existing in the CentOS distribution that I need to alter before running the install script?

The exact command is as follows:

sudo python install.py --production --user myusername

The reason for running it with the --user switch is that Google Cloud Platform does not allow administrator logins, so creating the default frappe user is not really possible from the command prompt. So, I just insert the --user switch to force it to work with the only valid user login credentials available to the system. Having used this method for a long time now, I also use it on new servers where I have root admin access, because I can create a user other than “frappe” to install everything. This reduces the chances that some nefarious person would seek to disrupt a system using the default username. I take that out of the equation by using an alternate name. In this particular case I had no choice but to use the only login available to me.

Since the install script works correctly with Debian 8 and Ubuntu 14.04, I am left to think that there is something wrong in the install script when it decides on alternate paths through the script based on the OS detected.

The only other thing I can think of is there might be something that needs to be done to the server to prepare it better for the running of the install.py script. Regardless, I am left with a really tough problem. The hosting service I need to use only has CentOS 7 and Ubuntu 16.04 available as OS choices and neither of them work with the current version of the install.py script.

BKM

John, you are the mentor genius as always. I mistakenly thought selinux was something in the install.py script, so I started my search there. That led me to how it operates and then back to the core OS itself. So, I did more research to see what exactly it was and why it was in the OS. The dastardly NSA (National Security Agency) evidently pushed for this as a kernel module some time ago, and it gets in the way of several things.

So, I used the following commands to check and set it properly BEFORE running the install script:

sudo getenforce <-- this command returns the status

In my case the status came back as “Enforcing”

sudo setenforce 0 <-- this command sets the status to Permissive

Once this step is done the install.py script will in fact get past the NGINX errors I posted at the top of this thread.
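The check-then-set step can be wrapped in a small script. This is only a sketch: the function name `selinux_action` is invented here, and it relies on `getenforce` printing one of `Enforcing`, `Permissive`, or `Disabled`. Note that `setenforce 0` lasts only until the next reboot; making the change persistent would mean editing the `SELINUX=` line in `/etc/selinux/config`.

```shell
#!/bin/sh
# Hypothetical helper: decide what to do from the mode string getenforce prints.
selinux_action() {
    case "$1" in
        Enforcing)  echo "switch" ;;   # needs setenforce 0 before install.py
        Permissive) echo "ok" ;;       # already permissive, nothing to do
        Disabled)   echo "ok" ;;       # SELinux is switched off entirely
        *)          echo "unknown" ;;  # unexpected output; check by hand
    esac
}

# Typical use on the CentOS 7 host, before running install.py:
#   if [ "$(selinux_action "$(getenforce)")" = "switch" ]; then
#       sudo setenforce 0    # Permissive until the next reboot
#   fi
```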

However, the script then encounters another fatal error further down the line and still does not finish.

No matter. I still wanted to let @clarkej know his solution was the correct one for getting past the NGINX error.

BKM


Now that the solution from @clarkej got me past the NGINX errors, the install script encountered another error. It is listed here (and I will start a separate thread to make sure it gets attention as well).

TASK [Check if /tmp/.bench exists] *****************************************************************************
ok: [localhost]
TASK [Check if bench_repo_path exists] *************************************************************************
ok: [localhost]
TASK [move /tmp/.bench if it exists] ***************************************************************************
fatal: [localhost]: FAILED! => {“changed”: true, “cmd”: [“cp”, “-R”, “/tmp/.bench”, “/home/root/.bench”], "delta": “0:00:00.005924”, “end”: “2018-01-16 16:18:23.592925”, “failed”: true, “rc”: 1, “start”: “2018-01-16 16:18:23.587001”, “stderr”: “cp: cannot create directory ‘/home/root/.bench’: No such file or directory”, “stderr_lines”: [“cp: cannot create directory ‘/home/root/.bench’: No such file or directory”], “stdout”: “”, “stdout_lines”: }
to retry, use: --limit @/tmp/.bench/playbooks/production/install.retry
PLAY RECAP *****************************************************************************************************
localhost : ok=63 changed=45 unreachable=0 failed=1
Traceback (most recent call last):
File “install.py”, line 388, in
install_bench(args)
File “install.py”, line 114, in install_bench
run_playbook(‘production/install.yml’, sudo=True, extra_vars=extra_vars)
File “install.py”, line 326, in run_playbook
success = subprocess.check_call(args, cwd=os.path.join(cwd, ‘playbooks’))
File “/usr/lib64/python2.7/subprocess.py”, line 542, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command ‘[‘ansible-playbook’, ‘-c’, ‘local’, ‘production/install.yml’, ‘-e’, ‘@/tmp/extra_vars.json’, ‘--become’, ‘--become-user=erp_jmi’]’ returned non-zero exit status 2
[erp_jmi@cetnos7-test ~]$

I am not sure where this leads me, so I will start a new problem thread to try to find answers.

However, a moderator can now mark this thread closed as the solution to the title problem was found.

Thank you

BKM
