I thought Hive's reducer count was behaving strangely, and then...
It turns out that Hive carries its own settings, not just Hadoop.
I could confirm them from HiveQL:
hive> set;
javax.jdo.option.ConnectionUserName=APP
hive.exec.reducers.bytes.per.reducer=1000000000
hive.mapred.local.mem=0
hive.metastore.connect.retries=5
datanucleus.autoStartMechanismMode=checked
hive.mapjoin.bucket.cache.size=100
hive.optimize.pruner=true
datanucleus.validateColumns=false
hadoop.config.dir=/usr/hadoop/conf
hive.script.recordwriter=org.apache.hadoop.hive.ql.exec.TextRecordWriter
hive.metastore.rawstore.impl=org.apache.hadoop.hive.metastore.ObjectStore
datanucleus.autoCreateSchema=true
javax.jdo.option.ConnectionPassword=mine
datanucleus.validateConstraints=false
datancucleus.transactionIsolation=read-committed
hive.udtf.auto.progress=false
datanucleus.validateTables=false
hive.map.aggr.hash.min.reduction=0.5
datanucleus.storeManagerType=rdbms
hive.merge.size.per.task=256000000
hive.exec.script.maxerrsize=100000
hive.test.mode.prefix=test_
hive.groupby.skewindata=false
hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat
hive.default.fileformat=TextFile
hive.script.auto.progress=false
hive.groupby.mapaggr.checkinterval=100000
hive.script.serde=org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
hive.hwi.listen.port=9999
datanuclues.cache.level2=true
hive.exec.script.allow.partial.consumption=false
hive.hwi.war.file=lib/hive-hwi-0.5.0.war
hive.merge.mapfiles=true
hive.fileformat.check=true
hive.exec.compress.output=false
hive.optimize.groupby=true
datanuclues.cache.level2.type=SOFT
javax.jdo.option.ConnectionDriverName=org.apache.derby.jdbc.EmbeddedDriver
hive.map.aggr=true
hive.join.emit.interval=1000
hive.metastore.warehouse.dir=/usr/hive/warehouse
hive.mapred.reduce.tasks.speculative.execution=true
javax.jdo.PersistenceManagerFactoryClass=org.datanucleus.jdo.JDOPersistenceManagerFactory
hive.mapred.mode=nonstrict
hive.script.recordreader=org.apache.hadoop.hive.ql.exec.TextRecordReader
hive.script.operator.id.env.var=HIVE_SCRIPT_OPERATOR_ID
mapred.reduce.tasks=-1
hive.exec.scratchdir=/tmp/hive/${user.name}
javax.jdo.option.NonTransactionalRead=true
hive.metastore.local=true
hive.test.mode.samplefreq=32
hive.test.mode=false
hive.exec.parallel=false
javax.jdo.option.ConnectionURL=jdbc:derby:;databaseName=metastore_db;create=true
javax.jdo.option.DetachAllOnCommit=true
hive.heartbeat.interval=1000
hive.map.aggr.hash.percentmemory=0.5
hive.exec.reducers.max=999
hive.join.cache.size=25000
hive.hwi.listen.host=0.0.0.0
hive.mapjoin.cache.numrows=25000
hive.exec.compress.intermediate=false
hive.optimize.cp=true
hadoop.job.ugi=user,user
hive.optimize.ppd=true
hive.session.id=201010262134
hive.mapjoin.maxsize=100000
hive.merge.mapredfiles=false
mapred.reduce.tasks=-1
is what it's set to.
"-1"?
Looking into it, I found:
"By setting this property to -1, Hive will automatically figure out what should be the number of reducers."
Automatically! So that's how it works.
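The set output above hints at how that automatic figure is derived: with mapred.reduce.tasks=-1, Hive appears to size the reducer count from the total input bytes divided by hive.exec.reducers.bytes.per.reducer (1000000000 bytes here), capped at hive.exec.reducers.max (999). A rough sketch of that rule, using a made-up 5 GB input purely for illustration:

hive> set hive.exec.reducers.bytes.per.reducer;
hive.exec.reducers.bytes.per.reducer=1000000000
-- with mapred.reduce.tasks=-1, roughly:
--   reducers = min(hive.exec.reducers.max,
--                  ceil(total input bytes / hive.exec.reducers.bytes.per.reducer))
-- e.g. a hypothetical 5,000,000,000-byte input:
--   ceil(5000000000 / 1000000000) = 5 reducers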
Incidentally, it turns out you can also change a setting with set (property name)=(value);
hive> set mapred.reduce.tasks=5;
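To double-check the change, querying just that one property (instead of dumping everything with set;) should echo the value back; a minimal sketch of what the same session would look like:

hive> set mapred.reduce.tasks;
mapred.reduce.tasks=5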