Background
The system pulls data from an external interface and syncs it into a local database, but synchronization never fully succeeds.
Problem diagnosis
After reviewing the logs, the issue was traced to an OutOfMemoryError thrown while fastjson processed the JSON payload, which was simply too large; calling the interface directly showed the response was roughly 300 MB.
Error message:
java.lang.OutOfMemoryError: Java heap space
at com.alibaba.fastjson.serializer.SerializeWriter.expandCapacity(SerializeWriter.java:290)
at com.alibaba.fastjson.serializer.SerializeWriter.writeFieldValueStringWithDoubleQuoteCheck(SerializeWriter.java:1487)
at com.alibaba.fastjson.serializer.ASMSerializer_1_BigJsonPO.write(Unknown Source)
at com.alibaba.fastjson.serializer.ListSerializer.write(ListSerializer.java:130)
at com.alibaba.fastjson.serializer.JSONSerializer.write(JSONSerializer.java:278)
at com.alibaba.fastjson.JSON.toJSONString(JSON.java:665)
at com.alibaba.fastjson.JSON.toJSONString(JSON.java:607)
at com.alibaba.fastjson.JSON.toJSONString(JSON.java:572)
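A rough back-of-envelope calculation shows why a ~300 MB payload can exhaust even a 2 GB heap when handled as a single in-memory document. The multipliers below are illustrative assumptions, not measurements: a Java `String` stores UTF-16 chars (roughly 2x the UTF-8 byte size), and the fully materialized object graph costs a few more multiples in headers, fields, and intermediate buffers (note also that `SerializeWriter.expandCapacity` doubles its buffer, transiently holding both the old and new copies).

```java
public class HeapEstimate {
    // Illustrative multipliers (assumptions, not measurements): the raw text
    // held as a UTF-16 String (~2x) plus the materialized object graph
    // (assumed ~3x) gives a rough lower bound on transient heap demand.
    static long roughHeapMb(long payloadMb) {
        long asString = payloadMb * 2;   // raw text as a Java String
        long asObjects = payloadMb * 3;  // parsed/serialized object graph (assumed 3x)
        return asString + asObjects;
    }

    public static void main(String[] args) {
        // The payload in this article was about 300 MB
        System.out.println(roughHeapMb(300) + " MB"); // ~1500 MB before any doubling buffers
    }
}
```

Under these assumptions the 300 MB response already needs on the order of 1.5 GB, so raising `-Xmx` only delays the failure rather than fixing the approach.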
Attempted fixes
Switching the parsing library
Replaced fastjson with hutool's JSON utilities and retried: failed!
Increasing the heap
The project uses a microservice architecture, and the JVM heap's initial and maximum sizes are set at startup: -Xms300m -Xmx2048m. After raising the maximum to 4096m and restarting the service: failed again!
Final solution
Further research showed the real problem was how fastjson was being used: JSONArray.parseArray() materializes the entire document in memory. For very large JSON text, fastjson's streaming API (available since version 1.1.32) should be used instead.
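The key property of streaming is that memory use stays bounded regardless of document size, because tokens are handled as they arrive instead of the whole text being materialized at once. As a library-independent toy illustration of that idea (not a real JSON parser; it ignores braces inside string literals), counting array elements with only a fixed-size buffer:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ChunkCount {
    // Toy illustration of streaming: count the objects inside a wrapped JSON
    // array by tracking brace depth while reading fixed-size chunks, so memory
    // use is independent of document size. NOT a real parser: braces inside
    // string literals would be miscounted.
    public static int countObjects(Reader in) {
        char[] buf = new char[8192];
        int depth = 0, count = 0, n;
        try {
            while ((n = in.read(buf)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (buf[i] == '{' && ++depth == 2) count++; // depth 2 = an element of the wrapped array
                    else if (buf[i] == '}') depth--;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        String json = "{\"data\":[{\"k\":\"v\"},{\"k\":\"v\"},{\"k\":\"v\"}]}";
        System.out.println(countObjects(new StringReader(json))); // prints 3
    }
}
```

fastjson's JSONReader/JSONWriter (used below) apply the same principle, but with a real tokenizer and per-element object binding.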
Add the dependencies
<dependency>
    <groupId>cn.hutool</groupId>
    <artifactId>hutool-all</artifactId>
    <version>5.7.4</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.alibaba/fastjson -->
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.36</version>
</dependency>
Create a PO class
Used to test conversion between the JSON string and the entity class.
@Data
public class BigJsonPO implements Serializable {
    private static final long serialVersionUID = -6352873708987452356L;
    private String key1;
    private String key2;
    private String key3;
    // ... key4 through key99 omitted ...
    private String key100;
}
Generating test data
// Path of the mock data file
private final String filePath = "F:\\test-json\\createByFastJson.json";
@Test
public void testWrite() {
    TimeInterval timeInterval = new TimeInterval();
    int dataCount = 200000;
    File file = new File(filePath);
    // Entity fields to populate (serialVersionUID excluded)
    List<Field> fieldList = Arrays.stream(ReflectUtil.getFields(BigJsonPO.class))
            .filter(f -> !StrUtil.equalsIgnoreCase("serialVersionUID", f.getName()))
            .collect(Collectors.toList());
    try {
        JSONWriter jsonWriter = new JSONWriter(new FileWriter(file));
        // Writes {
        jsonWriter.startObject();
        jsonWriter.writeKey("data");
        // Writes [
        jsonWriter.startArray();
        for (int i = 1; i <= dataCount; i++) {
            // Writes {
            jsonWriter.startObject();
            for (int i1 = 0; i1 < fieldList.size(); i1++) {
                String name = fieldList.get(i1).getName();
                jsonWriter.writeKey(name);
                jsonWriter.writeValue((i1 + 1) + "-" + IdUtil.fastSimpleUUID());
            }
            // Writes }
            jsonWriter.endObject();
        }
        // Writes ]
        jsonWriter.endArray();
        // Writes }
        jsonWriter.endObject();
        jsonWriter.close();
        System.out.println("File size: [" + FileUtil.size(file) / (1024 * 1024) + " MB], records written: [" + dataCount + "], elapsed: [" + timeInterval.intervalMs() + "] ms");
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Reading the data
@Test
public void testParse() {
    TimeInterval timeInterval = new TimeInterval();
    try {
        File file = new File(filePath);
        List<BigJsonPO> list = this.parseBigJsonStr(new FileReader(file), "data", BigJsonPO.class);
        System.out.println("File size: [" + FileUtil.size(file) / (1024 * 1024) + " MB], records parsed: [" + list.size() + "], elapsed: [" + timeInterval.intervalMs() + "] ms");
    } catch (Exception e) {
        e.printStackTrace();
    }
}
/**
 * Parses a large JSON object of the form:
 * {
 *     "jsonArrKey": [ {"key1":"value1", ...}, ... ]
 * }
 * @param reader         source of the JSON text
 * @param jsonArrKey     name of the key holding the array
 * @param jsonArrKeyType element type of the array
 * @param <T>            element type
 * @return the parsed list
 */
private <T> List<T> parseBigJsonStr(Reader reader, String jsonArrKey, Class<T> jsonArrKeyType) {
    List<T> list = new ArrayList<>();
    try {
        JSONReader jsonReader = new JSONReader(reader);
        jsonReader.startObject();
        // Note: assumes the outer object contains only the target array key;
        // values of other keys are not skipped here.
        while (jsonReader.hasNext()) {
            String tempJsonKey = jsonReader.readObject(String.class);
            if (StrUtil.equalsIgnoreCase(jsonArrKey, tempJsonKey)) {
                jsonReader.startArray();
                while (jsonReader.hasNext()) {
                    list.add(jsonReader.readObject(jsonArrKeyType));
                }
                jsonReader.endArray();
            }
        }
        jsonReader.endObject();
        jsonReader.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
    return list;
}
With that, the problem is solved.
Alternative solution
Further searching showed that the Jackson library can solve the same problem, with very similar code.
Add the dependencies
<dependency>
    <groupId>cn.hutool</groupId>
    <artifactId>hutool-all</artifactId>
    <version>5.7.4</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-core -->
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-core</artifactId>
    <version>2.13.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-databind -->
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.13.1</version>
</dependency>
Generating test data
@Test
public void testWriteJson() {
    TimeInterval timeInterval = new TimeInterval();
    int dataCount = 100000;
    this.writeJson(filePath, dataCount, BigJsonPO.class);
    System.out.println("Records written: " + dataCount + ", elapsed: " + timeInterval.intervalMs() + " ms");
}
private <T> void writeJson(String filePath, long dataCount, Class<T> poClass) {
    JsonFactory factory = JsonFactory.builder().build();
    try {
        // Entity fields to populate (serialVersionUID excluded)
        List<Field> fieldList = Arrays.stream(ReflectUtil.getFields(poClass))
                .filter(f -> !StrUtil.equalsIgnoreCase("serialVersionUID", f.getName()))
                .collect(Collectors.toList());
        JsonGenerator jg = factory.createGenerator(new File(filePath), JsonEncoding.UTF8);
        jg.writeStartObject();
        jg.writeFieldName("data");
        jg.writeStartArray();
        for (long i = 0; i < dataCount; i++) {
            jg.writeStartObject();
            for (int j = 0; j < fieldList.size(); j++) {
                String fieldName = fieldList.get(j).getName();
                Object value = (j + 1) + "-" + IdUtil.fastSimpleUUID();
                jg.writeObjectField(fieldName, value);
            }
            jg.writeEndObject();
        }
        jg.writeEndArray();
        jg.writeEndObject();
        jg.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Reading the data
@Test
public void testParseJson() {
    TimeInterval timeInterval = new TimeInterval();
    JsonFactory factory = JsonFactory.builder()
            .build().setCodec(new ObjectMapper());
    List<BigJsonPO> list = new ArrayList<>();
    File file = new File(filePath);
    try {
        JsonParser jp = factory.createParser(file);
        JsonToken outToken = jp.nextToken();
        // Verify the document starts with an object
        if (outToken != JsonToken.START_OBJECT) {
            throw new RuntimeException("Error json string");
        }
        // Name of the current field
        String currentName;
        // Loop until the closing } of the outer object
        while (jp.nextToken() != JsonToken.END_OBJECT) {
            currentName = jp.getCurrentName();
            if (jp.nextToken() == JsonToken.START_ARRAY) {
                if (currentName.equals("data")) {
                    while (jp.nextToken() != JsonToken.END_ARRAY) {
                        BigJsonPO po = jp.readValueAs(BigJsonPO.class);
                        list.add(po);
                    }
                } else {
                    jp.skipChildren();
                }
            } else {
                jp.skipChildren();
            }
        }
        jp.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
    System.out.println("File size: [" + FileUtil.size(file) / (1024 * 1024) + " MB], records parsed: [" + list.size() + "], elapsed: [" + timeInterval.intervalMs() + "] ms");
}
Performance comparison
Test machine: AMD 4700U (8 cores), 16 GB RAM
fastjson vs. jackson (times listed as fastjson / jackson)

Records (×10,000) | Data size (MB) | Write time (ms) | Parse time (ms) |
---|---|---|---|
0.2 | 8.74 | 316 / 198 | 298 / 338 |
2 | 87.4 | 1397 / 1007 | 937 / 848 |
20 | 874 | 12103 / 8224 | 9479 / 9992 |
200 | 8430 | 105898 / 79756 | not yet tested |
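To put the table in perspective, the 874 MB row converts into rough throughput with simple arithmetic (integer MB/s, computed directly from the numbers above):

```java
public class Throughput {
    // MB/s for sizeMb megabytes processed in elapsedMs milliseconds
    static long mbPerSecond(long sizeMb, long elapsedMs) {
        return sizeMb * 1000 / elapsedMs;
    }

    public static void main(String[] args) {
        // Numbers from the 874 MB row of the comparison table
        System.out.println("fastjson parse: " + mbPerSecond(874, 9479) + " MB/s");  // ~92
        System.out.println("jackson parse:  " + mbPerSecond(874, 9992) + " MB/s");  // ~87
        System.out.println("fastjson write: " + mbPerSecond(874, 12103) + " MB/s"); // ~72
        System.out.println("jackson write:  " + mbPerSecond(874, 8224) + " MB/s");  // ~106
    }
}
```

Both libraries stay in the same order of magnitude, so for this workload the decisive factor is using a streaming API at all, not which of the two libraries provides it.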