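I have just started using GraphX and was given a test data set that resembles user follow relationships. It contains 10,000 user IDs, corresponding to 10,000 vertices. The number of follow relationships is unknown. The data format is as follows:

The code that builds the graph is as follows:

    import org.apache.spark.graphx.Edge
    import org.apache.spark.graphx.Graph

    // Raw input files: one vertex or edge per line, comma-separated
    val vertexRdd = sc.textFile("hdfs://ubt1:9820/WBNW/Vertex")
    val edgeRdd = sc.textFile("hdfs://ubt1:9820/WBNW/Edge")

    // (vertexId, attribute) pairs
    val users = vertexRdd.map(line => line.split(",")).map(parts => (parts(0).toLong, parts(1)))
    // Edge(srcId, dstId, attribute)
    val follow_relation = edgeRdd.map(line => line.split(",")).map(parts => new Edge(parts(0).toLong, parts(1).toLong, parts(2).toLong))

    val graph = Graph(users, follow_relation)

    // Counts in the raw RDDs versus counts in the graph
    val v_count = vertexRdd.count
    val e_count = edgeRdd.count
    val gv_count = graph.vertices.count
    val ge_count = graph.edges.count

The output is as follows:

The number of edges is the same in the RDD and in the Graph, but the number of vertices clearly differs. What is the reason? Thanks, everyone.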
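One thing that may be worth checking (a sketch, not a confirmed diagnosis of this data set): `Graph(vertices, edges)` takes its vertex set from the vertex RDD plus every ID that appears as an edge endpoint, and endpoints with no entry in the vertex RDD receive the default vertex attribute. The toy example below reproduces that effect. The names `toyUsers`, `toyEdges`, `toyGraph` and the `"MISSING"` marker are made up for illustration, and the snippet assumes it is run in spark-shell, where `sc` already exists.

```scala
// Run in spark-shell, where `sc` is already defined (as in the code above).
import org.apache.spark.graphx.{Edge, Graph}

// Hypothetical toy data: 3 declared vertices, but one edge points at ID 4,
// which has no entry in the vertex RDD.
val toyUsers = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
val toyEdges = sc.parallelize(Seq(Edge(1L, 2L, 1L), Edge(2L, 4L, 1L)))

// The third argument is the default attribute given to vertices
// that appear only as edge endpoints.
val toyGraph = Graph(toyUsers, toyEdges, "MISSING")

val declared = toyUsers.count()          // vertices listed in the vertex RDD
val inGraph = toyGraph.vertices.count()  // vertices actually present in the graph

// IDs referenced by edges but absent from the vertex RDD
val declaredIds = toyUsers.map(_._1)
val endpointIds = toyEdges.flatMap(e => Seq(e.srcId, e.dstId)).distinct()
val missingIds = endpointIds.subtract(declaredIds).collect()

println(s"declared=$declared inGraph=$inGraph missing=${missingIds.mkString(",")}")
```

Applied to the data in this post, the same check would be `follow_relation.flatMap(e => Seq(e.srcId, e.dstId)).distinct().subtract(users.map(_._1)).count()`. A non-zero result would mean the edge file mentions IDs that are not in the vertex file, which would account for extra vertices in the graph.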
Why does the number of vertices increase when building a graph with GraphX? Any help appreciated!
炎炎設計
2018-08-22 10:09:49