3 Answers

Oh, please don't use runJar — the Java API is very good.
Here is how to start a job from regular code:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// create a configuration
Configuration conf = new Configuration();
// create a new job based on the configuration
// (Job.getInstance replaces the deprecated "new Job(conf)" constructor)
Job job = Job.getInstance(conf);
// here you have to put your mapper class
job.setMapperClass(Mapper.class);
// here you have to put your reducer class
job.setReducerClass(Reducer.class);
// here you have to set the jar which contains your
// map/reduce classes, so you can use the mapper class
job.setJarByClass(Mapper.class);
// key/value types of your reducer output
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
// this sets the format of your input, can also be TextInputFormat
job.setInputFormatClass(SequenceFileInputFormat.class);
// same with the output
job.setOutputFormatClass(TextOutputFormat.class);
// here you can set the path of your input
SequenceFileInputFormat.addInputPath(job, new Path("files/toMap/"));
// this deletes a possibly existing output path to prevent job failures
FileSystem fs = FileSystem.get(conf);
Path out = new Path("files/out/processed/");
fs.delete(out, true);
// finally set the empty out path
TextOutputFormat.setOutputPath(job, out);
// this waits until the job completes and prints debug output to STDOUT
// or whatever has been configured in your log4j properties
job.waitForCompletion(true);
If you are using an external cluster, you have to put the following information into your configuration:
// this should be like defined in your mapred-site.xml
conf.set("mapred.job.tracker", "jobtracker.com:50001");
// like defined in hdfs-site.xml
conf.set("fs.default.name", "hdfs://namenode.com:9000");
This should work fine as long as hadoop-core.jar is on your application container's classpath. But I think you should put some kind of progress indicator on your web page, because completing a Hadoop job can take minutes to hours ;)
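To drive such a progress indicator, one option is to submit the job without blocking and poll its progress. A minimal sketch, assuming a `job` configured as in the snippet above and a reachable cluster (the 5-second interval is an arbitrary choice; this is not runnable standalone):

```java
// Sketch only: "job" is assumed to be a fully configured
// org.apache.hadoop.mapreduce.Job as shown above.
job.submit();                   // returns immediately, unlike waitForCompletion(true)
while (!job.isComplete()) {     // poll until the job finishes
    // expose these percentages to your web page's progress bar
    System.out.printf("map: %.0f%% reduce: %.0f%%%n",
            job.mapProgress() * 100, job.reduceProgress() * 100);
    Thread.sleep(5000);         // poll every 5 seconds
}
System.out.println(job.isSuccessful() ? "done" : "failed");
```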
For YARN (Hadoop 2 and later)
For YARN, the following configuration needs to be set:
// this should be like defined in your yarn-site.xml
conf.set("yarn.resourcemanager.address", "yarn-manager.com:50001");
// framework is now "yarn", should be defined like this in mapred-site.xml
conf.set("mapreduce.framework.name", "yarn");
// like defined in hdfs-site.xml ("fs.defaultFS" replaces the
// deprecated "fs.default.name" key in Hadoop 2)
conf.set("fs.defaultFS", "hdfs://namenode.com:9000");

Because map and reduce run on different machines, all of your referenced classes and jars must be moved between machines.
If you have a packaged jar and run it from your desktop, @ThomasJungblut's answer works. But if you run inside Eclipse (right-click your class and Run), it doesn't work.
Instead of:
job.setJarByClass(Mapper.class);
use:
job.setJar("build/libs/hdfs-javac-1.0.jar");
At the same time, your jar's manifest must include a Main-Class attribute pointing to your main class.
For Gradle users, you can put these lines in build.gradle:
jar {
    manifest {
        attributes("Main-Class": mainClassName)
    }
}
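If you want to verify that your built jar actually carries the Main-Class attribute, you can check it with plain JDK classes. A self-contained sketch that writes and reads back such a manifest (the class name com.example.WordCount is just a placeholder):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class ManifestCheck {
    // Writes a minimal jar whose manifest carries Main-Class, then reads
    // it back the same way `java -jar` would resolve it.
    static String writeAndReadMainClass() throws IOException {
        Manifest mf = new Manifest();
        // MANIFEST_VERSION is mandatory, otherwise nothing is written
        mf.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        mf.getMainAttributes().put(Attributes.Name.MAIN_CLASS, "com.example.WordCount");

        File jarFile = File.createTempFile("demo", ".jar");
        new JarOutputStream(new FileOutputStream(jarFile), mf).close();

        String mainClass;
        try (JarFile jar = new JarFile(jarFile)) {
            mainClass = jar.getManifest().getMainAttributes()
                    .getValue(Attributes.Name.MAIN_CLASS);
        }
        jarFile.delete();
        return mainClass;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Main-Class: " + writeAndReadMainClass());
    }
}
```

For a real jar on disk, the same `JarFile`/`getManifest()` calls against your build output tell you whether the Gradle snippet above took effect.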