Nothing deep here; this is just code to copy.

Program description:
1. Analyze an application's access log file and count the number of visits for each user ID. The log format is basically: "2012-10-26 14:41:30,748 userNameId-777 from IP-10.232.25.144 invoked URL-http://xxx/hello.jsonp"
2. Run in standalone mode, directly using the hadoop library that the maven project depends on, so you do not need to install hadoop separately.

<!-- pom.xml -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>1.0.4</version>
</dependency>

//Mapper
public class Coupon11LogMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context) throws java.io.IOException, InterruptedException {
        String line = value.toString();
        String accessRegex = ".*userNameId\\-(\\d+).*";
        Pattern pattern …
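The mapper code above is cut off in this excerpt, so the following is only a rough sketch of the counting idea in plain Java, without any Hadoop classes. It reuses the regular expression shown in the mapper; the class name UserVisitCountSketch, the method names extractUserId and countVisits, and all sample lines except the first are assumptions made for illustration, not the original post's code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class: illustrates the per-user counting without Hadoop.
class UserVisitCountSketch {

    // Same expression as accessRegex in the mapper; group 1 captures the numeric user ID.
    private static final Pattern ACCESS_PATTERN =
            Pattern.compile(".*userNameId\\-(\\d+).*");

    // "Map" step: return the user ID found in a log line, or null if the line does not match.
    public static String extractUserId(String line) {
        Matcher matcher = ACCESS_PATTERN.matcher(line);
        return matcher.matches() ? matcher.group(1) : null;
    }

    // "Reduce" step: add one count per matching line, keyed by user ID.
    public static Map<String, Long> countVisits(List<String> lines) {
        Map<String, Long> counts = new TreeMap<>();
        for (String line : lines) {
            String userId = extractUserId(line);
            if (userId != null) {
                counts.merge(userId, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The first line follows the format quoted in the post; the others are made up.
        List<String> lines = Arrays.asList(
                "2012-10-26 14:41:30,748 userNameId-777 from IP-10.232.25.144 invoked URL-http://xxx/hello.jsonp",
                "2012-10-26 14:41:31,002 userNameId-888 from IP-10.232.25.145 invoked URL-http://xxx/hello.jsonp",
                "2012-10-26 14:41:32,115 userNameId-777 from IP-10.232.25.144 invoked URL-http://xxx/hello.jsonp",
                "a line without a user id");
        System.out.println(countVisits(lines));
    }
}
```

In a real map-reduce job the map step would presumably emit each extracted ID with a count of 1 and a reducer would sum the counts per key; the in-memory TreeMap here only stands in for that shuffle-and-sum behavior.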
Hadoop map-reduce: introductory example code