

Parsing a list of JSON strings with a PySpark DataFrame


白板的微信 2023-07-27 10:32:23
I am trying to read a list of JSON strings into a PySpark DataFrame. My input data is shown below; my goal is a DataFrame with two columns, user (string) and ips (Array[String]).

sampleJson = [
    ('{"user":100, "ips" : ["191.168.192.101", "191.168.192.103", "191.168.192.96", "191.168.192.99"]}',),
    ('{"user":101, "ips" : ["191.168.192.102", "191.168.192.105", "191.168.192.103", "191.168.192.107"]}',),
    ('{"user":102, "ips" : ["191.168.192.105", "191.168.192.101", "191.168.192.105", "191.168.192.107"]}',),
    ('{"user":103, "ips" : ["191.168.192.96", "191.168.192.100", "191.168.192.107", "191.168.192.101"]}',),
    ('{"user":104, "ips" : ["191.168.192.99", "191.168.192.99", "191.168.192.102", "191.168.192.99"]}',),
    ('{"user":105, "ips" : ["191.168.192.99", "191.168.192.99", "191.168.192.100", "191.168.192.96"]}',),
]

Thank you for your help.

1 Answer

汪汪一只貓


Use the from_json function with an explicitly defined schema.


Example:


from pyspark.sql.functions import col, from_json
from pyspark.sql.types import ArrayType, StringType, StructField, StructType

# Each row is a one-element tuple holding a JSON string.
sampleJson = [
    ('{"user":100, "ips" : ["191.168.192.101", "191.168.192.103", "191.168.192.96", "191.168.192.99"]}',),
    ('{"user":101, "ips" : ["191.168.192.102", "191.168.192.105", "191.168.192.103", "191.168.192.107"]}',),
    ('{"user":102, "ips" : ["191.168.192.105", "191.168.192.101", "191.168.192.105", "191.168.192.107"]}',),
    ('{"user":103, "ips" : ["191.168.192.96", "191.168.192.100", "191.168.192.107", "191.168.192.101"]}',),
    ('{"user":104, "ips" : ["191.168.192.99", "191.168.192.99", "191.168.192.102", "191.168.192.99"]}',),
    ('{"user":105, "ips" : ["191.168.192.99", "191.168.192.99", "191.168.192.100", "191.168.192.96"]}',),
]

# `spark` is assumed to be an existing SparkSession (for example, the pyspark shell).
# No column names are given, so the single column is named "_1".
df1 = spark.createDataFrame(sampleJson)

# user is read as a string, ips as an array of strings.
sch = StructType([
    StructField('user', StringType(), False),
    StructField('ips', ArrayType(StringType())),
])

# Parse the JSON string into a struct column "n", then expand its fields into columns.
df1.withColumn("n", from_json(col("_1"), sch)).select("n.*").show(10, False)
#+----+--------------------------------------------------------------------+
#|user|ips                                                                 |
#+----+--------------------------------------------------------------------+
#|100 |[191.168.192.101, 191.168.192.103, 191.168.192.96, 191.168.192.99]  |
#|101 |[191.168.192.102, 191.168.192.105, 191.168.192.103, 191.168.192.107]|
#|102 |[191.168.192.105, 191.168.192.101, 191.168.192.105, 191.168.192.107]|
#|103 |[191.168.192.96, 191.168.192.100, 191.168.192.107, 191.168.192.101] |
#|104 |[191.168.192.99, 191.168.192.99, 191.168.192.102, 191.168.192.99]   |
#|105 |[191.168.192.99, 191.168.192.99, 191.168.192.100, 191.168.192.96]   |
#+----+--------------------------------------------------------------------+

# schema
# Note that user is reported as nullable = true even though the field was declared with nullable=False.
df1.withColumn("n", from_json(col("_1"), sch)).select("n.*").printSchema()
#root
# |-- user: string (nullable = true)
# |-- ips: array (nullable = true)
# |    |-- element: string (containsNull = true)
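
To sanity-check the expected result without a Spark session, the per-row transformation that the schema above describes can be approximated in plain Python. This is only a sketch of the shape of the output, not how Spark implements from_json: parse_row is a hypothetical helper, and only the first two rows of sampleJson are reproduced.

```python
import json

# First two rows of the sampleJson list from the question.
sample_json = [
    ('{"user":100, "ips" : ["191.168.192.101", "191.168.192.103", "191.168.192.96", "191.168.192.99"]}',),
    ('{"user":101, "ips" : ["191.168.192.102", "191.168.192.105", "191.168.192.103", "191.168.192.107"]}',),
]


def parse_row(raw):
    """Hypothetical helper: turn one JSON string into a (user, ips) pair."""
    record = json.loads(raw)
    # The schema declares user as a string, so the JSON number is converted.
    user = str(record["user"])
    # ips is declared as an array of strings.
    ips = [str(ip) for ip in record["ips"]]
    return user, ips


# Each input row is a one-element tuple, mirroring the "_1" column in Spark.
rows = [parse_row(raw) for (raw,) in sample_json]

for user, ips in rows:
    print(user, ips)
```

One difference worth keeping in mind: this sketch raises an exception on invalid JSON, whereas from_json is documented to return null for a string it cannot parse.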


Answered 2023-07-27