[Tutorial] Step by Step guide to build the Poor Man’s Backup System

Just curious whether there is any particular reason for not using the bench backup command (which does basically the same thing in the background, I believe). That would spare you the MySQL password.

It would create the backup in a fixed location, ~/sites/site1.local/private/backups, but you could tweak the script to use that as the current folder just the same, I guess.
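
For illustration, the command itself is just something like this (site name as in the example above):

cd ~/frappe-bench
bench --site site1.local backup        # database only; add --with-files to include attachments
# the dump lands in sites/site1.local/private/backups/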

Yup. When I originally put the bench command in the script and ran it from a crontab entry, there were occasions where running it in the background would time out, and since it had been run from a script, I would not know that the backup had failed.

If the backup fails in the script, then everything else in the script that depends on the backup to complete would also fail. Ultimately that meant that I had no valid backups to use in the event of an emergency.

Over time, I found that using mysql commands from the command line showed no such behavior and I never missed a backup.
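
Just to sketch the idea (the database name, credentials, and paths here are placeholders, not the ones from the tutorial), a command-line dump is essentially:

mysqldump --user=erpnext_user --password=erpnext_password \
    --single-transaction --quick erpnext_db \
    | gzip > /home/frappe/backups/erpnext_db-$(date +%Y%m%d_%H%M).sql.gz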

After this discovery I made a few more. Running a restore from bench also had issues once the database approached 1GB in size. There would be errors reported that didn’t make much sense. So, using the command line to do restores turned out to be a better and faster approach as well.

There just seems to be something about how bench handles the database backups and restores that is prone to problems.

That has been my experience. Your mileage may vary. :sunglasses:

BKM

thx for clarifying.

How do you go about attachments/files? Is it sufficient to restore the files in the ~/frappe-bench/sites/[mysite]/private/files and ~/frappe-bench/sites/[mysite]/public/files folders for a restored ERPNext instance to find them?

Funny you should bring that up just now.

Up to this point I was just taking the files at the end of the week and using scp manually to move any that were added over to the other servers.

However, during an offline conversation with another user this very topic came up last week and I began working on another method of running the backups that will also include the /public/files and the /private/files directories.

It is turning out to be a complete revamp of the process using tar and bundling everything together in a single file again so that moving it across the internet remains as simple as the current method published.
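
To give a rough idea of the bundling step (the paths below are placeholders, not the final v2 script), it will be along these lines, with the next step chained on tar’s exit status so nothing gets moved until the archive is complete:

SITE_DIR=/home/frappe/frappe-bench/sites/site1.local   # placeholder path
BUNDLE=/tmp/pmbs-bundle-$(date +%Y%m%d_%H%M).tar.gz

tar -czf "$BUNDLE" -C "$SITE_DIR" private/backups private/files public/files \
    && echo "Bundle complete: $BUNDLE" \
    || echo "tar failed, nothing to transfer"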

Look for that in the next few days. I have been working on it since Saturday and I am only about 25% finished with it. I want to make sure it is fully tested before I lay it out for everyone else to use. It will likely be called “Step by Step guide to the Poor Man’s Backup System ver2”

I want to keep it in the same line of actions so that anyone can do it using 2 different servers and not have to spend a ton of money on maintaining it. I am currently using the first published method on all of my client servers and it has been wonderful. However, maintaining all of those servers every Sunday afternoon by manually using ‘scp’ to move all of the files got to be time-consuming.

In the new instructions, the files will also be included with every database backup so that you can truly have a complete backup system. It still does not allow for automated restoration of everything, because that has its own set of issues that I have to figure out.

Keep watching this space for the update!!

BKM

Nice, looking forward to this. If you put that script on GitHub I may contribute if I can (i.e. using some VARIABLES here and there, which you haven’t done thus far).

1 Like

You know, that may be a good idea. I am not a programmer or developer and really do not know how to use the GitHub functions aside from reading the entries and posting bug reports on the repos that I follow.

I will make it a point to figure out how to do that and have my own place for storing all of these helpful things. I tried to pull them all together in a single thread yesterday and the moderators deleted the entire thread last night. So maybe a GitHub location would work.

I have no idea how to get information into or out of github at this point though because I have never had the time to learn how to use it.

Thanks for the wonderful idea. I will work on learning more about that after I get the backup instructions published here.

BKM

I am sure once you get started with git and a platform like GitHub, Bitbucket, … you won’t stop, because it makes exactly such efforts much more fun and efficient. Here is one (of many) places to get started.

And you are mistaken to think this is for developers. It’s for (collaboratively) working on content in a structured manner and even is very handy when you only cooperate with yourself. Can’t stress enough how helpful git can be.

Oh man now that you have said it I cannot get that out of my head. That really is a great idea.

I am not enough of a script writer to be good with variables so I will ultimately leave that up to you but it just created such a burn in me to get this all figured out now that I can’t get it out of my head.

Sometimes there are things that happen that spark me into a higher level of action. Usually it is the wisdom of people like @clarkej , @brian_pond and several others. Today it was your idea.

Thanks for the supportive ideas.

BKM

2 Likes

if you create a repo, I’ll provide the variables … promised.

Or let me even take this a little further … send me your GitHub id (in PM if you want). I’ll create a basic repository for this backup tutorial and transfer the ownership to you. Then there are not many excuses left to not start with git. I am sure this will spark you even more than the VARIABLES suggestion.

LOL… Very cool. Thanks

I am working away at the revision for the Poor Mans Backup v2 and have finally got over some of the major hurdles. Been working on it steady since this morning. By sometime late tonight I should have it working on one of my test servers to verify everything.

If I am unable to figure out the github thing, then I will certainly take you up on the offer tomorrow. I still want to try myself first.

BKM

So, I think I have worked out how to use GitHub. I created a place for the files for when I get everything working on my test servers. Getting the new additions to the backup system to work has been a struggle. I keep seeing the script run past commands before they finish, so only partial files get copied, etc. Still working on that part. Using ‘tar’ seems to take longer than using gzip, but gzip doesn’t handle multiple files on its own, and when the script executes a tar command it blasts past it once it is started and runs the next command before tar finishes.

I will get it figured out eventually. Once I do it will get published. I will post a thread here, but on GitHub I will have the updateable version of the text doc as well as separate files for the bash scripts. That should make it easier. Oh yeah, the repo is called ERPNext_Guides

More to be added once I get past some of the new script errors.

BKM

Well, here it is…

BKM’s ERPNext Guides

I also put a copy of the newest version of the Poor Man’s Backup System (PMBS) on the forum here:

Since the forum locks me out of editing the text after a few weeks, your idea of putting it on GitHub was the right way to go. It just took me several days to figure out GitHub enough to get it all published.

Thanks for the push. :grin:

BKM

1 Like

If you go the route of having a second server somewhere, you would neither need backups 4 times an hour nor risk losing even a few minutes of user work if you opted for master-slave replication.

Concerning brownouts:
Whenever there is a network failure, any updates on the master are not reflected to the slave. When the connection resumes, the slave quickly catches up to the right position in the log file and everything is good again.
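
For anyone curious, pointing the slave at the master is roughly this (host, user, and log coordinates are placeholders; check the MariaDB docs for your version). After an outage the slave simply resumes from the binlog position it has recorded:

mysql -u root -p <<'SQL'
CHANGE MASTER TO
  MASTER_HOST='master.example.com',
  MASTER_USER='repl_user',
  MASTER_PASSWORD='repl_password',
  MASTER_LOG_FILE='mysql-bin.000001',
  MASTER_LOG_POS=4;
START SLAVE;
SHOW SLAVE STATUS\G
SQL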

Interesting tutorial. I might reuse some of the code from here, like only keeping the last backup instead of all of them.

Currently what I have done is set cron jobs on my local Linux NAS device to SSH into the VPS server, run bench backup --with-files, and pull the files created during the hour when the backup was taken.

This is done on a daily basis, while another cron job removes all backups that are older than 7 days.

#!/bin/sh

# Set Variables
#################################
UsernameSite=Server1 #Used for local storage directory assigned to this server backup
Username=server1
Sitename=erp.server1.com
Ip=123.1.1.123
Baklocation=/home/nasbox/backups  #Add Path to local backup directory
#Set Current Time Def
CurrentDateTime=`date +"%Y-%m-%d-%H"`
BackupDateTime=`date +"%Y%m%d_%H"`
#################################

#Add Key to server - One time command
#ssh-copy-id -i /home/$(whoami)/.ssh/id_rsa.pub ${Username}@${Ip}

# Generate a new backup with files
ssh ${Username}@${Ip} 'cd /home/$(whoami)/frappe-bench/; /usr/local/bin/bench backup --with-files' > ${Baklocation}/${UsernameSite}/bak-${CurrentDateTime}.log 

#Sync Command - Not useful for multi-bench Servers
rsync -azzv -e ssh ${Username}@${Ip}:/home/${Username}/frappe-bench/sites/${Sitename}/private/backups/${BackupDateTime}*  ${Baklocation}/${UsernameSite}/${CurrentDateTime} >> ${Baklocation}/${UsernameSite}/bak-${CurrentDateTime}.log 

I will try to adjust these lines with a more sophisticated way of creating a new backup and pulling it locally.
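
For reference, the daily pull and the 7-day cleanup mentioned above are just two crontab entries on the NAS box, roughly like this (script name, paths, and times are illustrative):

# m h dom mon dow  command
30 1 * * *  /home/nasbox/scripts/pull_erpnext_backup.sh
0  3 * * *  find /home/nasbox/backups/Server1 -mindepth 1 -mtime +7 -delete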

Great to see others continue to innovate the concept further!

BTW… if you want the same thing with all of the required support files, I created an updated version of this here:

~BKM

@bkm I was able to use the script and modified it to back up to an S3 bucket instead of a failover server. However, sometimes the backup does not upload the full file size. I don’t know why this is so. BTW I used the V2 tutorial.

Hey Felix,

Did you use the “scp” command to execute the large file transfer, or some other file-moving tool?

The reason I ask is that I have found “scp” will wait almost forever, through all kinds of communications timeouts, to complete the transfer. Most other tools give up and time out.

Also, if you are using the “incron” tool to wait for a file to drop, it is important to use it with the exact syntax I have listed in the tutorial, otherwise it will leave more than half the file behind and it will be useless. You must use the IN_CLOSE_WRITE event with incron to make it wait until the complete file is finished before moving it.
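
For reference, the incrontab entry is along these lines (the watched directory and script name here are placeholders; the tutorial has the exact entry):

# edit with: incrontab -e
/home/frappe/backup_drop  IN_CLOSE_WRITE  /home/frappe/move_backup.sh $@/$#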

Hope that helps. :sunglasses:

BKM

@bkm The database backup is taking a lot of time. For example, my compressed DB size is 2.3GB and it takes close to 20 minutes to create the DB backup (*.sql.gz) file. Is there any way to reduce this time?

I’m afraid this backup time is going to increase with the DB size.

If we plan to do frequent backups, how do we reduce the backup time, and how do we mitigate any performance bottlenecks resulting from this?

We are running on 4 Core 16 GB RAM, with both APP and DB on the same server.

Thanks,
Saravana

Related to db backup:

Try the mariabackup command.

At 18:40 in the video you will find a slide on performance.
At 19:35 it gives you an idea of the performance: with a DB size of 64GB, the backup takes from 44s to 219s.
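
A rough sketch of using it (paths and credentials are placeholders): it copies the data files directly instead of dumping SQL, which is why it is so much faster on big databases.

TARGET=/var/backups/mariadb/$(date +%Y%m%d_%H%M)
sudo mariabackup --backup --target-dir="$TARGET" --user=backup_user --password=backup_password
# prepare the copy so it is consistent and ready to restore
sudo mariabackup --prepare --target-dir="$TARGET"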

For public/private files: How to backup with restic to S3 compatible storage

I am currently using the Poor Man’s Backup v2 (a different forum thread from this one) configuration on most of my servers, and the backup of a 7.3GB database takes about 2 minutes. This time is based on my VPS server with 8 CPU cores and 16GB of memory. I had similar performance (about two and a half minutes) with a 4.8GB ERPNext database on a server with 4 CPU cores and 8GB of memory as well.

It may be time to evaluate the performance of your VPS. I have never had a backup take more than 3 to 5 minutes on 5+GB databases using the mysqldump command.

Additionally, you may be able to improve the mariadb performance by tuning the number and size of the buffers used and the number of workers. Search the forum on this and you will find several threads related to mariadb tuning.
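
As a very rough illustration only (the right values depend entirely on your RAM and workload, so treat these numbers as placeholders), the usual knobs live under the [mysqld] section of the MariaDB config:

innodb_buffer_pool_size = 8G      # often around half the RAM on a dedicated DB host
innodb_log_file_size    = 1G
innodb_io_capacity      = 1000
max_connections         = 200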

Hope this helps.

BKM