Database is using too much space

Rudder stores a lot of data in the PostgreSQL database, and most historical data is removed from it. You can configure how many days of logs you want to keep in the database.


However, due to the nature of PostgreSQL, when data is removed, space is not reclaimed on the storage system, it is simply marked as “free” for the database to write again in the removed rows. This space can be reclaimed by running a VACUUM FULL operation, but it needs at least as much free space on the drive as the database size. Moreover, if you are using PostgreSQL 8.3 (or to a lesser extent 8.4), you’ll be likely to experience indexes bloating, where the physical size of the indexes grows without real reason, and need to be regularly purged.


There are two ways to reclaim space, the fast one (which doesn’t completely reclaim all wasted space), and the complete one (which is unfortunately very slow).


Fast solution (especially for 8.x version of PostgreSQL): Simply reindexing the database will save some space; depending on the size of your database, it may take several minutes to a couple of hours:

# First, stop the Rudder server
rudder agent disable
service rudder-jetty stop

# Then log into PostgreSQL
psql -U rudder -h localhost
REINDEX TABLE ruddersysevents;

# Exit postgresql
\q

# Restart the Rudder server
rudder agent enable
service rudder-jetty start


Complete solution: this solution will reclaim all space that can be reclaimed, but is really really slow (can take several hours)

# First, stop the Rudder server
rudder agent disable
service rudder-jetty stop

# Then log into PostgreSQL
psql -U rudder -h localhost
drop index component_idx;
drop index composite_node_execution_idx;
drop index keyvalue_idx;
drop index nodeid_idx;
drop index ruleid_idx;
drop index executiontimestamp_idx;
vacuum full ruddersysevents; — this will take several hours

create index nodeid_idx on RudderSysEvents (nodeId);
CREATE INDEX executionTimeStamp_idx on RudderSysEvents (executionTimeStamp);
CREATE INDEX composite_node_execution_idx on RudderSysEvents (nodeId, executionTimeStamp);
CREATE INDEX component_idx on RudderSysEvents (component);
CREATE INDEX keyValue_idx on RudderSysEvents (keyValue);
CREATE INDEX ruleId_idx on RudderSysEvents (ruleId);

# Exit PostgreSQL
\q

# Restart the Rudder server
rudder agent enable
service rudder-jetty start

Is this article helpful for you?