I used AWS RDS to host the db of a multitenants ERPNext.
Then I backup and restore the db to a mariadb in EC2.
Comparing the size of the two databases, I found that the size of db in EC2 is larger than the RDS.
My questions:
- Does this happen because of the backup/restore process?
- Or because of some cache of new database?
- Or just because the RDS is more efficient in using the space?
Thank you.
Edit:
For example, if I run this:
SELECT table_schema "database", sum(data_length + index_length)/1024/1024 "size in MB" FROM information_schema.TABLES GROUP BY table_schema;
gives:
RDS EC2
+--------------------+--------------+ +--------------+
| database | size in MB | | size in MB |
+--------------------+--------------+ +--------------+
| information_schema | 0.18750000 | | 0.18750000 |
| _77xxxxxxxxxxxxxx | 89.53978348 | | 101.53166580 |
+--------------------+--------------+ +--------------+
But if I omit the index_length, these are the result:
RDS EC2
+--------------------+-------------+ +-------------+
| database | size in MB | | size in MB |
+--------------------+-------------+ +-------------+
| information_schema | 0.11718750 | | 0.09375000 |
| _77xxxxxxxxxxxxxx | 51.57591629 | | 58.26311111 |
+--------------------+-------------+ +-------------+
Then I run the table level:
SELECT table_schema "database", table_name, (data_length+index_length)/1024/1024 "table size" FROM information_schema.TABLES WHERE (data_length+index_length)/1024/1024 > 0;
And iterate the size to narrow down the result gives:
RDS EC2
+-------------------+-----------------+-------------+ +-------------+
| database | table name | table size | | table size |
+-------------------+-----------------+-------------+ +-------------+
| _77xxxxxxxxxxxxxx | tabVersion | 11.03125000 | | 24.06250000 |
| _77xxxxxxxxxxxxxx | __global_search | 8.68001938 | | 8.05545425 |
| _77xxxxxxxxxxxxxx | tabGL Entry | 6.67187500 | | 6.03906250 |
+-------------------+-----------------+-------------+ +-------------+
the tabVersion gives the most difference. Why?
Edit 2:
From running the command to get tables size above 1MB, I get this approximate calculation:
8+1.9+1.3+2.3+2.8+3.3+6+1.1+1.2+5.3+3+4+1.6+4.9+24 = 70.7
then adding the tables with sizes < 1MB (around 800 tables with avg 0.02-0.06MB):
70,7 + 800 * 0.04 = 102.7
So it seems it is not tabVersion after all (deleting versions more than 30 days old only give 1-1,5MB saving).
But still the question is: Why the difference between before and after restore? (system is not used at all between the process)