" -file option is deprecated, please use generic option -files instead." #79

Open
UMDTERPS opened this issue Oct 31, 2013 · 1 comment

@UMDTERPS

Hello! I am trying to run a job for our data team, and we are getting errors when using Dumbo. We are using the latest versions of Dumbo and Cloudera.

Command used to run the job:

[benjamin@arya dedup]$ dumbo start jaccard.py -input products -output products-output13 -hadoop /usr/ -hadooplib /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/

Stacktrace:

13/10/30 13:05:32 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
13/10/30 13:05:32 WARN streaming.StreamJob: -jobconf option is deprecated, please use -D instead.
packageJobJar: [/home/benjamin/mapreduce/jobs/dedup/typedbytes.pyc, /home/benjamin/mapreduce/jobs/dedup/jaccard.py, /home/benjamin/mapreduce/jobs/dedup/dumbo/backends/common.pyc] [] /tmp/streamjob5478521893861821465.jar tmpDir=null
13/10/30 13:05:33 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/10/30 13:05:34 INFO mapred.FileInputFormat: Total input paths to process : 1
13/10/30 13:05:35 INFO mapred.JobClient: Running job: job_201310231818_0015
13/10/30 13:05:36 INFO mapred.JobClient: map 0% reduce 0%
13/10/30 13:05:47 INFO mapred.JobClient: Task Id : attempt_201310231818_0015_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.streaming.AutoInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1649)
at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:620)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:394)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.streaming.AutoInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1617)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1641)

Any help would be greatly appreciated!

a4tunado commented Nov 5, 2013

It seems that hadoop-streaming*.jar is missing on your nodes.
Check whether your environment points to the correct HADOOP_* paths, or try adding hadoop-streaming*.jar with the -libjar option.
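
One way to act on this suggestion is to check whether a streaming jar in a given directory actually contains the class named in the ClassNotFoundException. The following is only a rough sketch: the helper name `find_streaming_jars` is made up for illustration, and it only inspects a directory on the machine where it runs, so it would have to be run on each task node (or against the `-hadooplib` directory from the command above) to say anything about the cluster.

```python
import glob
import os
import zipfile

# Class from the stack trace, written as an entry path inside a jar.
CLASS_ENTRY = "org/apache/hadoop/streaming/AutoInputFormat.class"


def find_streaming_jars(lib_dir, class_entry=CLASS_ENTRY):
    """Return a sorted list of (jar_path, has_class) pairs.

    Hypothetical helper: looks for hadoop-streaming*.jar directly inside
    lib_dir and reports whether each jar contains class_entry.
    """
    results = []
    pattern = os.path.join(lib_dir, "hadoop-streaming*.jar")
    for jar_path in sorted(glob.glob(pattern)):
        try:
            with zipfile.ZipFile(jar_path) as jar:
                has_class = class_entry in jar.namelist()
        except (zipfile.BadZipFile, OSError):
            # Unreadable jar or broken symlink: treat as not providing the class.
            has_class = False
        results.append((jar_path, has_class))
    return results
```

Calling `find_streaming_jars("/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/")` would list any matching jars in that directory. An empty list, or only `False` entries, would be consistent with the missing-jar explanation, though it would not prove it.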
