rdfhdt / hdt-mr
MapReduce-based generation of HDT
License: GNU Lesser General Public License v2.1
The top-level README links to the old Google Code repository for the hdt-java dependency. The link should be updated to the current GitHub repository at https://github.com/rdfhdt/hdt-java.
In addition, the README needs a .md extension so that GitHub's web UI renders it and formats its headers, instead of showing the raw HTML.
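The rename itself is a one-line change. A minimal sketch in a throwaway directory, with a placeholder file standing in for the real README:

```shell
cd "$(mktemp -d)"
printf '<h1>hdt-mr</h1>\n' > README   # placeholder for the actual README contents
mv README README.md                   # the .md extension makes GitHub render it as Markdown
ls
```

In the repository itself this would be done with `git mv README README.md` so the rename is tracked.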
Is hdt-mr only for generating HDT files, or also for querying HDT files?
I got the following exception when I ran the Hadoop MapReduce job:
Sampling started
16/07/14 09:25:29 INFO input.FileInputFormat: Total input paths to process : 0
16/07/14 09:25:29 INFO partition.InputSampler: Using 0 samples
16/07/14 09:25:29 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
16/07/14 09:25:29 INFO compress.CodecPool: Got brand-new compressor [.deflate]
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.hadoop.mapreduce.lib.partition.InputSampler.writePartitionFile(InputSampler.java:340)
at org.rdfhdt.mrbuilder.HDTBuilderDriver.runDictionaryJob(HDTBuilderDriver.java:242)
at org.rdfhdt.mrbuilder.HDTBuilderDriver.main(HDTBuilderDriver.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Here is the code snippet causing the exception:
InputSampler.writePartitionFile(job, new InputSampler.IntervalSampler<Text, Text>(this.conf.getSampleProbability()));
It seems the input files are not found... I created an 'input' directory and put N-Triples '.nt' files in it.
Any idea?
Best,
Gang
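For reference, "Total input paths to process : 0" usually means the path the job was given is empty, or the files were placed on the local filesystem while the job resolves the path against HDFS. A hedged sketch of staging the input, assuming the relative path 'input' resolves to the user's HDFS home directory:

```shell
# Put the N-Triples files into HDFS before running the job.
hdfs dfs -mkdir -p input
hdfs dfs -put *.nt input/
hdfs dfs -ls input   # should now list the .nt files the sampler will read
```

These commands require a running Hadoop cluster; if they list the files and the error persists, the cause lies elsewhere.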
Hello All,
I got a FileNotFoundException when I ran a Hadoop (version 2.8.2) MapReduce program for face recognition, built using Ant. (Everything went fine when I ran a WordCount example.) I have attached the Java code if required. I need help resolving this.
These are the errors I got:
Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://localhost:9000/user/abiodun/lbpcascade_frontalface.xml#lbpcascade_frontalface.xml
at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1440)
at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1433)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1433)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:300)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:179)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:97)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:192)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1359)
at FaceCount.run(FaceCount.java:207)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at FaceCount.main(FaceCount.java:212)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
I would appreciate it if anyone could help fix this problem.
Abi
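For reference, the `#lbpcascade_frontalface.xml` suffix in the failing path is Hadoop's distributed-cache symlink syntax; the exception says the file before the `#` does not exist in HDFS. A sketch of the usual fix, assuming the cascade file sits in the current local directory and the job expects it under /user/abiodun:

```shell
# Upload the cascade file to the HDFS location the job references;
# the #name suffix only controls the symlink name inside task directories.
hdfs dfs -put lbpcascade_frontalface.xml /user/abiodun/
hdfs dfs -ls /user/abiodun/lbpcascade_frontalface.xml
```

These commands assume a running cluster with the namenode at localhost:9000, as in the stack trace.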
$ mvn clean install package
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 16.564 s
[INFO] Finished at: 2017-10-27T12:45:58+02:00
[INFO] Final Memory: 13M/142M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project hdt-mr: Could not resolve dependencies for project org.rdfhdt:hdt-mr:jar:2.0: The following artifacts could not be resolved: org.rdfhdt:hdt-api:jar:2.0, org.rdfhdt:hdt-java-core:jar:2.0, com.hadoop.gplcompression:hadoop-lzo:jar:0.4.20-SNAPSHOT: Could not find artifact org.rdfhdt:hdt-api:jar:2.0 in central (https://repo.maven.apache.org/maven2) -> [Help 1]
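The missing org.rdfhdt artifacts are not published to Maven Central, so they must be installed into the local repository before hdt-mr will build. A sketch, assuming hdt-java builds cleanly from its GitHub repository (a tag or version matching 2.0 may be needed, and the hadoop-lzo 0.4.20-SNAPSHOT dependency similarly has to come from a local build or a third-party repository):

```shell
git clone https://github.com/rdfhdt/hdt-java.git
cd hdt-java
mvn clean install -DskipTests   # installs hdt-api and hdt-java-core into ~/.m2
cd ../hdt-mr
mvn clean install package
```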
Hi,
I used Maven to compile the code; here are the Hadoop dependencies from the POM:

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>2.7.0</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-mapreduce-client-core</artifactId>
  <version>2.6.0</version>
</dependency>
I created an executable jar and put the input N-Triples file in the input folder. Then I ran hadoop:
hadoop jar /home/fug2/hdtrdf/hdt-mr/target/hdt-mr-2.0-jar-with-dependencies.jar -i input
I got a warning message:
WARNING: Only one Reducer. Dictionary creation as a single job is more efficient.
and a FileNotFoundException:
Shared section = dictionary/shared
Exception in thread "main" java.io.FileNotFoundException: File hdfs://mesosdev/user/fug2/dictionary/shared does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:658)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:104)
at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:716)
at org.apache.hadoop.hdfs.DistributedFileSystem$14.doCall(DistributedFileSystem.java:712)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:712)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1485)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1525)
at org.rdfhdt.mrbuilder.HDTBuilderDriver.loadFromDir(HDTBuilderDriver.java:646)
at org.rdfhdt.mrbuilder.HDTBuilderDriver.buildDictionary(HDTBuilderDriver.java:382)
at org.rdfhdt.mrbuilder.HDTBuilderDriver.main(HDTBuilderDriver.java:119)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Does anyone know how to fix this?
Best,
Gang
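For reference, the stack trace shows HDTBuilderDriver.loadFromDir listing dictionary/shared, so the driver apparently expects output from an earlier dictionary job under the user's HDFS home. A sketch of checking what that job actually produced, assuming relative paths resolve to /user/fug2:

```shell
hdfs dfs -ls dictionary          # did the dictionary job write anything at all?
hdfs dfs -ls dictionary/shared   # the exact path the driver is trying to read
```

If the dictionary directory is missing or empty, the earlier job likely failed or wrote its output elsewhere; the single-reducer warning above suggests the job configuration is worth re-checking.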
Any dev on the list? Is this project still active? Or dead?
Hi,
When I built it using Maven, I could not find the org.rdfhdt.hdt.trans package in the git repository, but it is required by two files:
[fug2@cbbdev11 hdt-java]$ grep -r 'org.rdfhdt.hdt.trans' .
./hdt-mr/src/main/java/org/rdfhdt/hdt/dictionary/impl/section/TransientDictionarySection.java:import org.rdfhdt.hdt.trans.TransientElement;
./hdt-mr/src/main/java/org/rdfhdt/mrbuilder/HDTBuilderDriver.java:import org.rdfhdt.hdt.trans.TransientElement;
Any idea about where I can find the package?
Best,
Gang
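For reference, org.rdfhdt.hdt.trans does not appear in the public hdt-java sources, so it may come from a modified hdt-java build. As a last resort, a hypothetical stub like the one below can make the two imports resolve; the name TransientElement is taken from the grep output, but the body is a guess and the real interface may declare methods:

```java
// Hypothetical stub for the missing package; the actual definition may differ.
package org.rdfhdt.hdt.trans;

public interface TransientElement {
    // intentionally empty: the real members are unknown
}
```

If the two importing files call methods on TransientElement, those signatures would have to be added to the stub as well.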