I'm trying to fetch more than 1 million records in a microservice and I'm getting an error. I have to keep increasing the app memory for the service in order to fetch this much data. Is there a method I can use to fetch huge amounts of data instead of increasing the app memory?

2022-10-11T11:22:04.898+08:00 [APP/PROC/WEB/0] [ERR] Resource exhaustion event: the JVM was unable to allocate memory from the heap.
2022-10-11T11:22:04.898+08:00 [APP/PROC/WEB/0] [ERR] ResourceExhausted! (1/0)

@Value("${batch-size}")
private int batchSize;

public void archiveTableRecords(JdbcTemplate sourceDbTemplate, JdbcTemplate targetDbTemplate,
        ArchiveConfigDTO archiveObj) {
    try {
        String sourceTable = archiveObj.getSourceTable();
        String archiveMonths = archiveObj.getArchiveCriteriaMonths();
        List<Object> primaryKeyValueList = new ArrayList<>();
        String compareDate1 = getCSTDateNew(archiveMonths);
        logger.info("Archive criteria date: {}", compareDate1);
        List<Map<String, Object>> sourceRecords = sourceDbTemplate.queryForList(
                ArchiveSQLQueries.buildSQLQueryToFetchSourceRecords(sourceTable), compareDate1);
        int sourceRecordsSize = sourceRecords.size();
        logger.info("Fetched {} {} record(s)", sourceRecordsSize, sourceTable);
        if (sourceRecordsSize > 0) {
            int recordsInserted = copySourceRecords(targetDbTemplate, archiveObj.getTargetTable(),
                    archiveObj.getPrimaryKeyColumn(), sourceRecords, primaryKeyValueList);
            if (recordsInserted > 0)
                deleteSourceRecords(sourceDbTemplate, sourceTable, archiveObj.getPrimaryKeyColumn(),
                        primaryKeyValueList);
        }
    } catch (Exception e) {
        logger.error("Exception in archiveTableRecords: {}", e.getMessage(), e);
    }
}

public static String buildSQLQueryToFetchSourceRecords(String sourceTable) {
    return "SELECT * FROM " + sourceTable + " WHERE update_dts <= ?";
}

public int copySourceRecords(JdbcTemplate targetDbTemplate, String targetTable, String primaryKeyColumn,
        List<Map<String, Object>> sourceRecords, List<Object> primaryKeyValueList) {
    logger.info("Copying records to {}", targetTable);
    int[][] insertResult = targetDbTemplate.batchUpdate(
            ArchiveSQLQueries.buildSQLTargetRecordInsertionQuery(targetTable, sourceRecords.get(0),
                    primaryKeyColumn),
            sourceRecords, batchSize,
            new ParameterizedPreparedStatementSetter<Map<String, Object>>() {
                @Override
                public void setValues(PreparedStatement ps, Map<String, Object> argument) throws SQLException {
                    int index = 1;
                    for (Entry<String, Object> obj : argument.entrySet()) {
                        if (obj.getKey().equals(primaryKeyColumn))
                            primaryKeyValueList.add(obj.getValue());
                        else
                            ps.setObject(index++, obj.getValue());
                    }
                }
            });
    int result = getSumOfArray(insertResult);
    logger.info("Inserted {} record(s) in {}", result, targetTable);
    return result;
}

I have tried the code above, but when fetching the data I still get the error.

Best Answer


Judging by the method names, you are copying records from one table to another. Do it in batches using LIMIT and OFFSET. A simplified example:

SELECT *
FROM table
WHERE condition
ORDER BY column_which_ensures_last_added_data_is_fetched_last
LIMIT 5
OFFSET 0;

For the next batch, advance the offset by the number of records already processed. Refer to your database's documentation for more examples. ORDER BY a column so that you get the oldest data first and the newest last, ensuring you won't get duplicates. If the primary key is an auto-incremented id, it is a good choice for this purpose.
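A minimal sketch of how each batch's query could be built, with the offset advanced between batches. `PagedQueryBuilder` and the column names are hypothetical, and the LIMIT/OFFSET syntax shown is MySQL/PostgreSQL style (Oracle and SQL Server use different pagination clauses):

```java
public class PagedQueryBuilder {
    // Hypothetical helper: builds one page of the archive query.
    // ORDER BY a stable, indexed column (here an assumed "id")
    // so each page is deterministic and duplicates are avoided.
    public static String buildPagedQuery(String table, int limit, int offset) {
        return "SELECT * FROM " + table
                + " WHERE update_dts <= ?"
                + " ORDER BY id"
                + " LIMIT " + limit
                + " OFFSET " + offset;
    }

    public static void main(String[] args) {
        // First batch starts at offset 0; each later batch advances
        // the offset by the number of records already processed.
        System.out.println(buildPagedQuery("orders", 5000, 0));
        System.out.println(buildPagedQuery("orders", 5000, 5000));
    }
}
```

Each query still binds the same `update_dts` parameter; only the OFFSET changes between batches.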

If you need the total count for logging purposes, either use the COUNT function or calculate the number of processed entities from the number of batches.

SELECT COUNT(*)
FROM table
WHERE condition;
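If you go the other way and derive progress from the total count, the number of batches is a simple ceiling division. A small sketch (the class and method names are illustrative):

```java
public class BatchMath {
    // Ceiling division: how many LIMIT-sized pages are needed
    // to cover totalRows rows. Works for totalRows == 0 as well.
    public static int batchCount(int totalRows, int batchSize) {
        return (totalRows + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        System.out.println(batchCount(1_000_000, 5000)); // 200
        System.out.println(batchCount(1_000_001, 5000)); // 201
    }
}
```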

Algorithm pseudo code:

int processed = 0;
List<Map<String, Object>> list = // fetch using the query above, replacing the offset with processed
while (!list.isEmpty()) {
    // copy contents of list to target
    // handle exceptions, data not inserted in target, etc.
    processed += list.size();
    list = // fetch using the query above, replacing the offset with processed
}
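The pseudocode above can be sketched as a runnable loop. To keep the sketch self-contained, the database fetch is abstracted as a function `(offset, limit) -> page`; in the real service that lambda would call `sourceDbTemplate.queryForList(...)` with the paginated query. All names here are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

public class BatchArchiver {
    // Fetch one page at a time and hand it to the copy step until
    // an empty page signals there is nothing left. Only one batch
    // is held in memory at any moment.
    public static <T> int archiveInBatches(
            BiFunction<Integer, Integer, List<T>> fetchPage, // (offset, limit) -> page
            Consumer<List<T>> copyToTarget,                  // insert batch into target
            int batchSize) {
        int processed = 0;
        List<T> page = fetchPage.apply(processed, batchSize);
        while (!page.isEmpty()) {
            copyToTarget.accept(page); // then delete from source, handle failures, etc.
            processed += page.size();
            page = fetchPage.apply(processed, batchSize);
        }
        return processed;
    }

    public static void main(String[] args) {
        // Simulated source table of 12 rows, page size 5 -> three pages.
        List<Integer> source = new ArrayList<>();
        for (int i = 0; i < 12; i++) source.add(i);
        List<Integer> target = new ArrayList<>();
        int processed = archiveInBatches(
                (offset, limit) -> source.subList(offset, Math.min(offset + limit, source.size())),
                target::addAll, 5);
        System.out.println(processed);     // 12
        System.out.println(target.size()); // 12
    }
}
```

Because only `batchSize` rows are in memory at once, heap usage stays flat no matter how large the source table grows.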