Characteristics of HDFS:
Data redundancy for hardware fault tolerance (3 replicas per block)
Streaming data access (write once, read many times; files cannot be modified in place — to change data, write new blocks and delete the old file)
Optimized for large files (many small files overload the NameNode with metadata: a big head on a small body)
Applicability and limitations:
Suited to batch reads and writes, with high throughput
Not suited to interactive applications; low latency is hard to achieve
Suited to write-once, read-many, sequential access
No support for multiple users writing to the same file concurrently
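The write-once model above can be seen from the HDFS shell. A minimal sketch, assuming a configured `hdfs dfs` client and hypothetical paths:

```shell
# Upload a file (the one write); afterwards it is effectively read-only
hdfs dfs -put data.txt /user/demo/data.txt

# Read it back as many times as needed
hdfs dfs -cat /user/demo/data.txt

# There is no in-place edit: to "modify" a file, write a new one
# and remove the old one
hdfs dfs -put data_v2.txt /user/demo/data_v2.txt
hdfs dfs -rm /user/demo/data.txt
```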
HDFS file read flow:
HDFS file write flow:
HDFS data management:
1. Data block replicas: each data block has 3 replicas, placed on 3 nodes across 2 racks (for fault tolerance)
2. Heartbeat detection: each DataNode periodically sends a heartbeat message to the NameNode to show it is still alive
3. Secondary NameNode: the NameNode periodically syncs its metadata image file to the Secondary NameNode; if the NameNode fails, the standby is promoted to take its place
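The heartbeat mechanism can be observed from the command line. A sketch, assuming a running cluster:

```shell
# Lists live and dead DataNodes, as judged from their heartbeats,
# along with per-node capacity and usage
hdfs dfsadmin -report
```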
Files in HDFS are split into blocks for storage. The default HDFS block size is 64 MB, and the block is the logical unit for storing and processing a file.
HDFS has two kinds of nodes: NameNode and DataNode.
The NameNode is the management node and stores the file system metadata, which has two parts:
the mapping from files to data blocks
the mapping from data blocks to data nodes
The NameNode is the single management node and holds a large amount of metadata. A client request first goes to the NameNode to look up the metadata and find which nodes hold the file's blocks; the client then fetches the data blocks from those nodes and assembles them into the requested file.
DataNodes are the worker nodes of HDFS and store the data blocks.
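Both mappings kept by the NameNode can be inspected with `fsck`. A sketch, assuming the file path exists in HDFS:

```shell
# Shows each block of the file, its size, and the DataNodes
# holding its replicas
hdfs fsck /user/demo/data.txt -files -blocks -locations
```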
Running a Java program under Linux:
First create the Java source file,
then compile it,
and finally package it into a jar:
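The three steps can be sketched as shell commands; the jar names and paths below are assumptions, so adjust them to your Hadoop installation:

```shell
# 1. Create the source file
vi WordCount.java

# 2. Compile against the Hadoop jars (classpath depends on your install)
javac -classpath hadoop-core-1.2.1.jar:lib/commons-cli-1.2.jar \
      -d word_count_class/ WordCount.java

# 3. Package the class files into a jar
jar -cvf wordcount.jar -C word_count_class/ .
```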
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    // Mapper: emits (word, 1) for every token in each input line
    public static class WordCountMap extends
            Mapper<LongWritable, Text, Text, IntWritable> {

        private final IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer token = new StringTokenizer(line);
            while (token.hasMoreTokens()) {
                word.set(token.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sums up the counts for each word
    public static class WordCountReduce extends
            Reducer<Text, IntWritable, Text, IntWritable> {

        public void reduce(Text key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf);
        job.setJarByClass(WordCount.class);
        job.setJobName("wordcount");
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(WordCountMap.class);
        job.setReducerClass(WordCountReduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // input dir
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output dir (must not exist yet)
        job.waitForCompletion(true);
    }
}
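A sketch of running the job above from the shell; the input file names and jar name are assumptions:

```shell
# Put the input files into HDFS
hadoop fs -mkdir input
hadoop fs -put file1.txt file2.txt input/

# Run the job; the output directory must not exist yet
hadoop jar wordcount.jar WordCount input output

# Inspect the result
hadoop fs -cat output/part-r-00000
```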
#include <stdio.h>

/* Overwrites the saved return address on the stack so that execution
   skips the "x = x + 1;" instruction in main. The offset 28 is
   platform/compiler dependent. */
void function(int a, int b, int c)
{
    char buffer1[15];
    char buffer2[10];
    int *ret;
    ret = (int *)(buffer1 + 28);  /* guess at the saved return address */
    (*ret) += 5;                  /* jump over "x = x + 1;" */
}

int main(void)
{
    int x;
    x = 0;
    function(1, 2, 3);
    x = x + 1;          /* skipped if the overwrite succeeds */
    printf("%d\n", x);  /* prints 0 instead of 1 */
    return 0;
}
#include <stdio.h>

/* Variant of the return-address overwrite: buffer1 is 5 bytes and the
   saved return address is assumed to sit 12 bytes past it. Adding 8
   skips the "x = 1;" assignment in main. Offsets are platform dependent. */
void function(int a, int b, int c)
{
    char buffer1[5];
    char buffer2[10];
    int *ret;
    ret = (int *)(buffer1 + 12);
    (*ret) += 8;
}

int main(void)
{
    int x;
    x = 0;
    function(1, 2, 3);
    x = 1;              /* skipped if the overwrite succeeds */
    printf("%d\n", x);  /* prints 0 instead of 1 */
    return 0;
}
#include <stdio.h>

/* Format-string variant: the "%hn" conversion writes the number of
   characters printed so far (here 237, from the "%237x" padding) into
   the address passed as the corresponding argument. Offsets are
   platform dependent. */
void function(int a, int b, int c)
{
    char buffer1[5];
    char buffer2[10];
    int *ret;
    ret = (int *)(buffer1 + 28);
    // (*ret) += 8;
    printf("%237x%hn\n", 0, (int *)&ret);
}

int main(void)
{
    int x;
    x = 0;
    function(1, 2, 3);
    x = 1;
    printf("%d\n", x);
    return 0;
}
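These demos rely on a predictable stack layout, so modern compiler and OS protections usually defeat them. A sketch of building with those protections disabled (the source file name is an assumption; flag availability depends on your gcc version):

```shell
# Disable stack canaries and build a 32-bit binary with an executable stack
gcc -m32 -fno-stack-protector -z execstack -o overflow overflow.c

# Optionally disable address-space layout randomization system-wide (as root)
echo 0 > /proc/sys/kernel/randomize_va_space

./overflow
```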
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class Sort {

    // Mapper: parses each line as an integer key; the MapReduce shuffle
    // then sorts the keys within each partition
    public static class Map extends
            Mapper<Object, Text, IntWritable, IntWritable> {

        private static IntWritable data = new IntWritable();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            data.set(Integer.parseInt(line));
            context.write(data, new IntWritable(1));
        }
    }

    // Reducer: emits (line number, value) pairs, numbering the sorted keys
    public static class Reduce extends
            Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {

        private static IntWritable linenum = new IntWritable(1);

        public void reduce(IntWritable key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            for (IntWritable val : values) {
                context.write(linenum, key);
                linenum = new IntWritable(linenum.get() + 1);
            }
        }
    }

    // Partitioner: splits the assumed key range [0, 65223) evenly across
    // reducers so that the reducers' outputs are globally ordered
    public static class Partition extends Partitioner<IntWritable, IntWritable> {
        @Override
        public int getPartition(IntWritable key, IntWritable value,
                int numPartitions) {
            int MaxNumber = 65223;
            int bound = MaxNumber / numPartitions + 1;
            int keynumber = key.get();
            // i runs up to numPartitions so that the top range
            // [bound*(numPartitions-1), bound*numPartitions) is also covered
            for (int i = 1; i <= numPartitions; i++) {
                if (keynumber < bound * i && keynumber >= bound * (i - 1))
                    return i - 1;
            }
            return 0;
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args)
                .getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: Sort <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "Sort");
        job.setJarByClass(Sort.class);
        job.setMapperClass(Map.class);
        job.setPartitionerClass(Partition.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
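A sketch of running the Sort job; file names are assumptions, and the input is expected to be one integer per line, below the bound of 65223 used by the partitioner:

```shell
# Generate some unsorted input, one integer per line
shuf -i 0-65222 -n 1000 > numbers.txt

hadoop fs -mkdir sort_input
hadoop fs -put numbers.txt sort_input/

hadoop jar sort.jar Sort sort_input sort_output
hadoop fs -cat sort_output/part-r-00000
```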
The Hadoop environment variables need to be configured.
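Typical entries, e.g. in /etc/profile or ~/.bashrc; the paths below are assumptions, so adjust them to where the JDK and Hadoop are actually installed:

```shell
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
```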
HDFS:
high cost
low cost
mature ecosystem
Hive
Hadoop
Big data is a good thing.