A Mapper That Groups by Field

Author:  Published: 2016-03-07 20:38:23
Tags: fields
    /**
     * @author YangXin
     * @info Mapper that groups records by a field
     */
    package unitTwelve;

    import java.io.IOException;
    import java.util.regex.Pattern;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class ByKeyMapper extends Mapper<LongWritable, Text, Text, Text> {
        // Input lines are tab-separated.
        private final Pattern splitter = Pattern.compile("\t");
        // Zero-based index of the field emitted as the output value.
        private final int selectedField = 1;
        // Zero-based index of the field emitted as the output key.
        private final int groupByField = 0;

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = splitter.split(value.toString());
            // Skip lines too short to contain both fields, and count them.
            if (fields.length - 1 < selectedField || fields.length - 1 < groupByField) {
                context.getCounter("Map", "LinesWithErrors").increment(1);
                return;
            }
            String oKey = fields[groupByField];
            String oValue = fields[selectedField];
            context.write(new Text(oKey), new Text(oValue));
        }
    }
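
To try the split-and-validate step without a Hadoop cluster, the same logic can be pulled into a plain Java method. This is only a minimal sketch: the class name `ByKeySplitSketch` and the method `extract` are hypothetical and do not appear in the original code, and a `null` return stands in for the mapper's `LinesWithErrors` counter.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch of the mapper's field extraction (no Hadoop needed).
class ByKeySplitSketch {
    private static final Pattern SPLITTER = Pattern.compile("\t");

    // Returns {groupKey, selectedValue}, or null when the line has too few fields.
    // The null case corresponds to the lines the mapper counts as errors.
    static String[] extract(String line, int groupByField, int selectedField) {
        String[] fields = SPLITTER.split(line);
        if (fields.length - 1 < selectedField || fields.length - 1 < groupByField) {
            return null;
        }
        return new String[] { fields[groupByField], fields[selectedField] };
    }

    public static void main(String[] args) {
        String[] lines = { "alice\t10\tx", "bob\t20", "malformed" };
        for (String line : lines) {
            String[] pair = extract(line, 0, 1);
            System.out.println(pair == null
                    ? "skipped: " + line
                    : pair[0] + " -> " + pair[1]);
        }
    }
}
```

One thing worth knowing: `Pattern.split` drops trailing empty strings, so a line such as `"alice\t"` yields a single field and is treated as an error rather than as a record with an empty value.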
    