3 Answers

Contributed 1831 experience points, received 4+ likes
I think you need to flatten it yourself; fortunately, that is not complicated:
s = [[k, i, *j.values()] for k,v in data["reports"].items() for i, j in v.items()]
print(pd.DataFrame(s))
0 1 2 3
0 Google-Pixel 2 XL -MIoCtD9YUF2G9Esfrfz 04 Oct 2020 23:25:17:047 onCreate MainActivity 1601825117067
1 Google-Pixel 2 XL -MIoCtFVOxu8wdEHtm6q 04 Oct 2020 23:25:17:214 onCreate Service 1601825117216
2 Google-Pixel 2 XL -MIoCyBtKMQqQzUHEXsW 04 Oct 2020 23:25:37:682 onStartCommand Service 1601825137685
3 Google-Pixel 2 XL -MIoFWll9r3qwzWNoGMn 04 Oct 2020 23:36:47:687: (1.3212517, 103.860314) 1601825807693
4 Vivo 1820 -MIoF14JUm6JMZrOzDlL 04 Oct 2020 23:34:37:623 onCreate MainActivity 1601825677653
5 Vivo 1820 -MIoF1A9ZZNqTu5W-rQD 04 Oct 2020 23:34:38:016 onCreate Service 1601825678026
6 Vivo 1820 -MIoF2gNDua9FfLBTg6q 04 Oct 2020 23:34:44:235 onCreate MainActivity 1601825684248
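The comprehension above leaves the columns labeled 0 to 3. A minimal self-contained sketch of the same idea with named columns follows. The `data` dict is a hypothetical sample reconstructed from the output shown, and the column names are an assumption: the code relies on every record holding exactly the keys `message` and `timestamp`, in that order.

```python
import pandas as pd

# Hypothetical sample shaped like the question's data:
# reports -> device model -> record id -> {"message", "timestamp"}
data = {
    "reports": {
        "Google-Pixel 2 XL": {
            "-MIoCtD9YUF2G9Esfrfz": {
                "message": "04 Oct 2020 23:25:17:047 onCreate MainActivity",
                "timestamp": 1601825117067,
            },
            "-MIoCtFVOxu8wdEHtm6q": {
                "message": "04 Oct 2020 23:25:17:214 onCreate Service",
                "timestamp": 1601825117216,
            },
        },
        "Vivo 1820": {
            "-MIoF14JUm6JMZrOzDlL": {
                "message": "04 Oct 2020 23:34:37:623 onCreate MainActivity",
                "timestamp": 1601825677653,
            },
        },
    }
}

# One row per record: [model, id, message, timestamp].
# *record.values() depends on the insertion order of each record's keys.
rows = [
    [model, record_id, *record.values()]
    for model, records in data["reports"].items()
    for record_id, record in records.items()
]

# Assumed column names; adjust them if the records carry other fields.
df = pd.DataFrame(rows, columns=["model", "id", "message", "timestamp"])
print(df)
```

If some records have extra or missing keys, building dicts instead of lists (as the later answers do) is the safer choice, because pandas then aligns values by key rather than by position.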

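Contributed 1852 experience points, received 7+ likes
According to the official documentation, pd.json_normalize() is designed for record-oriented input, typically a list of dicts. The original JSON, however, is nothing like a list of dicts, and, most importantly, the key "id" does not exist as a field. I therefore think a hand-written parser is needed.
Code: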
import pandas as pd
import json

file_path = "/mnt/ramdisk/in.json"
with open(file_path) as f:
    dic = json.load(f)

# discard the redundant "reports" layer
dic = dic["reports"]

# produce a flattened list of dicts
ls = []
for k1, v1 in dic.items():
    # k1 = model
    for k2, v2 in v1.items():
        # k2 = the hash-like id
        # note: this adds the two keys to the loaded dict in place
        v2["model"] = k1
        v2["id"] = k2
        ls.append(v2)

df = pd.json_normalize(ls)
Output:
# Trim the message for printing purpose
df2 = df.copy()
df2["message"] = df["message"].apply(lambda s: s[:10])
df2
Out[28]:
message timestamp model id
0 04 Oct 202 1601825117067 Google-Pixel 2 XL -MIoCtD9YUF2G9Esfrfz
1 04 Oct 202 1601825117216 Google-Pixel 2 XL -MIoCtFVOxu8wdEHtm6q
2 04 Oct 202 1601825137685 Google-Pixel 2 XL -MIoCyBtKMQqQzUHEXsW
3 04 Oct 202 1601825807693 Google-Pixel 2 XL -MIoFWll9r3qwzWNoGMn
4 04 Oct 202 1601825677653 Vivo 1820 -MIoF14JUm6JMZrOzDlL
5 04 Oct 202 1601825678026 Vivo 1820 -MIoF1A9ZZNqTu5W-rQD
6 04 Oct 202 1601825684248 Vivo 1820 -MIoF2gNDua9FfLBTg6q
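Note: drilling down to the level where the hash-like ids live appears to be necessary. The ids start out as keys, and they seem to have to be turned into values before pd.json_normalize can interpret them correctly. A quick search online also did not turn up an example that parses this kind of nested structure with simple built-in methods.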
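To illustrate the note, here is a small sketch comparing the two inputs. The `reports` dict is a hypothetical two-record sample with shortened messages, not the original file. Passing the nested dict straight to pd.json_normalize gives a single wide row whose column names embed the model and id, while a list of per-record dicts gives one row per record.

```python
import pandas as pd

# Hypothetical sample: device model -> record id -> fields
reports = {
    "Google-Pixel 2 XL": {
        "-MIoCtD9YUF2G9Esfrfz": {"message": "onCreate MainActivity",
                                 "timestamp": 1601825117067},
    },
    "Vivo 1820": {
        "-MIoF14JUm6JMZrOzDlL": {"message": "onCreate MainActivity",
                                 "timestamp": 1601825677653},
    },
}

# Nested dict as-is: every leaf becomes its own dotted column in ONE row.
wide = pd.json_normalize(reports)
print(wide.shape)
print(list(wide.columns))

# Turn the keys into values first (copying each record instead of mutating it).
records = [
    {**fields, "model": model, "id": record_id}
    for model, inner in reports.items()
    for record_id, fields in inner.items()
]
long = pd.json_normalize(records)
print(long.shape)
print(list(long.columns))
```

The dict unpacking `{**fields, ...}` builds a new dict per record, so the loaded JSON is left unchanged, unlike the in-place assignment in the loop above.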

Contributed 1807 experience points, received 9+ likes
Try this (see my comment above):
import pandas as pd

data = []
for k, v in test['reports'].items():
    model_name = k
    # each item is an (id, record) pair
    for model in v.items():
        _data = {}
        _data['model'] = model_name
        _data['id'] = model[0]
        _data['message'] = model[1]['message']
        _data['timestamp'] = model[1]['timestamp']
        data.append(_data)

df = pd.DataFrame(data)
Here test is your data, so test['reports'] accesses the nested information you want to parse.
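As a possible alternative to the explicit loop, the same table can be sketched with pandas built-ins only: one small frame per model via pd.DataFrame.from_dict, stacked with pd.concat. The `test` dict below is a hypothetical sample with shortened messages, and the level names "model" and "id" are labels chosen here, not something present in the data.

```python
import pandas as pd

# Hypothetical sample shaped like the question's data
test = {
    "reports": {
        "Google-Pixel 2 XL": {
            "-MIoCtD9YUF2G9Esfrfz": {"message": "onCreate MainActivity",
                                     "timestamp": 1601825117067},
            "-MIoCtFVOxu8wdEHtm6q": {"message": "onCreate Service",
                                     "timestamp": 1601825117216},
        },
        "Vivo 1820": {
            "-MIoF14JUm6JMZrOzDlL": {"message": "onCreate MainActivity",
                                     "timestamp": 1601825677653},
        },
    }
}

# One frame per model: record ids become the index, fields become columns.
frames = {
    model: pd.DataFrame.from_dict(records, orient="index")
    for model, records in test["reports"].items()
}

# Concatenating a dict uses its keys as the outer index level;
# names= labels the two levels, and reset_index turns them into columns.
df = pd.concat(frames, names=["model", "id"]).reset_index()
print(df)
```

This trades the loop for intermediate frames, so for very large inputs the explicit loop may still be the lighter option.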