Comments (16)
Can you try to use a full URL path for the model file instead of just metarank_model_movie.model? Something like file:///home/user/metarank.model? It seems like a bug, I will make a fix tomorrow.
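For anyone following along, here is a minimal sketch of building such a file:// URL from a relative path. The helper name to_file_url is made up for illustration and is not part of Metarank. It assumes the parent directory exists and does not percent-encode spaces or other special characters.

```shell
# Hypothetical helper (not part of Metarank): print an absolute file:// URL
# for a path given relative to the current directory.
# The body runs in a subshell, so the `cd` does not affect the caller.
to_file_url() (
  # Resolve the parent directory to an absolute path.
  dir=$(cd "$(dirname -- "$1")" && pwd) || exit 1
  base=$(basename -- "$1")
  # Avoid a doubled slash when the file sits directly under "/".
  [ "$dir" = "/" ] && dir=""
  printf 'file://%s/%s\n' "$dir" "$base"
)

# Example: the model file name used in this thread.
to_file_url metarank_model_movie.model
```

The printed value could then be passed to --model in place of the bare file name. Whether that is enough depends on the Metarank version in use.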
from metarank.
@laxmimerit we're working on the fix for your case
from metarank.
It may sound ironic, but I've again fixed this new issue in https://github.com/metarank/metarank/releases/tag/0.2.5
Please, try again :)
from metarank.
Solved!
Since this file is stored with Git LFS, I had to download events.jsonl.gz separately from here.
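A quick way to tell whether a checkout contains the real file or only a Git LFS pointer is to look at the first bytes: pointer files are small text files that start with a fixed version line. The sketch below assumes that convention; the function name is_lfs_pointer is made up for illustration.

```shell
# Hypothetical helper: succeed if the given file looks like a Git LFS pointer
# rather than the real content. Pointer files begin with the line
# "version https://git-lfs.github.com/spec/v1".
is_lfs_pointer() {
  head -c 200 "$1" 2>/dev/null | grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Example: check the events file mentioned in this thread.
if is_lfs_pointer events.jsonl.gz; then
  echo "events.jsonl.gz is an LFS pointer; fetch the real file (for example with 'git lfs pull')"
fi
```

If the check succeeds, the file on disk is only a placeholder and the actual archive still has to be fetched, either through Git LFS or by downloading it separately as described above.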
from metarank.
Everything worked as described in the tutorial until the inference step. Running the inference step with
java -cp metarank-assembly-0.2.2.jar ai.metarank.mode.inference.Inference --config config.yml --model metarank_model_movie.model --redis-host localhost --format json --savepoint-dir ./output/savepoint
throws the following error:
02:19:54.447 INFO a.m.mode.inference.InferenceCmdline$ - Port: 8080
02:19:54.451 INFO a.m.mode.inference.InferenceCmdline$ - Model path: metarank_model_movie.model
scala.MatchError: config.yml (of class java.lang.String)
at ai.metarank.mode.FileLoader$.load(FileLoader.scala:14)
at ai.metarank.mode.inference.Inference$.$anonfun$run$3(Inference.scala:35)
at apply @ ai.metarank.mode.inference.InferenceCmdline$.$anonfun$parse$4(InferenceCmdline.scala:131)
at map @ ai.metarank.mode.inference.InferenceCmdline$.$anonfun$parse$4(InferenceCmdline.scala:131)
at apply @ ai.metarank.mode.inference.InferenceCmdline$.$anonfun$parse$2(InferenceCmdline.scala:130)
at flatMap @ ai.metarank.mode.inference.InferenceCmdline$.$anonfun$parse$2(InferenceCmdline.scala:130)
at flatMap @ ai.metarank.mode.inference.InferenceCmdline$.parse(InferenceCmdline.scala:125)
at flatMap @ ai.metarank.mode.inference.Inference$.$anonfun$run$2(Inference.scala:34)
at apply @ ai.metarank.mode.inference.Inference$.run(Inference.scala:33)
at flatMap @ ai.metarank.mode.inference.Inference$.run(Inference.scala:33)
What am I missing here?
from metarank.
Hey @laxmimerit, thanks for pointing this out. I've updated the docs a bit to include the full link in the bootstrapping part.
from metarank.
@laxmimerit as for the inference step, can you post your config.yml file, or did you use it directly from our repo?
from metarank.
The config.yml file is as follows:
interactions:
  - name: click
    weight: 1.0
features:
  - name: popularity
    type: number
    scope: item
    source: metadata.popularity
  - name: vote_avg
    type: number
    scope: item
    source: metadata.vote_avg
  - name: vote_cnt
    type: number
    scope: item
    source: metadata.vote_cnt
  - name: budget
    type: number
    scope: item
    source: metadata.budget
  - name: release_date
    type: number
    scope: item
    source: metadata.release_date
  - name: runtime
    type: number
    scope: item
    source: metadata.runtime
  - name: title_length
    type: word_count
    source: metadata.title
    scope: item
  - name: genre
    type: string
    scope: item
    source: metadata.genres
    values:
      - drama
      - comedy
      - thriller
      - action
      - adventure
      - romance
      - crime
      - science fiction
      - fantasy
      - family
      - horror
      - mystery
      - animation
      - history
      - music
  - name: ctr
    type: rate
    top: click
    bottom: impression
    scope: item
    bucket: 24h
    periods: [7,30]
  - name: liked_genre
    type: interacted_with
    interaction: click
    field: metadata.genres
    scope: session
    count: 10
    duration: 24h
  - name: liked_actors
    type: interacted_with
    interaction: click
    field: metadata.actors
    scope: session
    count: 10
    duration: 24h
  - name: liked_tags
    type: interacted_with
    interaction: click
    field: metadata.tags
    scope: session
    count: 10
    duration: 24h
  - name: liked_director
    type: interacted_with
    interaction: click
    field: metadata.director
    scope: session
    count: 10
    duration: 24h
  - name: visitor_click_count
    type: interaction_count
    interaction: click
    scope: session
  - name: global_item_click_count
    type: interaction_count
    interaction: click
    scope: item
  - name: day_item_click_count
    type: window_count
    interaction: click
    scope: item
    bucket: 24h
    periods: [7,30]
from metarank.
Yes, I had already tried giving the full path for all files, but the error is the same. In fact, I also tried running the trained model available in the test resources folder.
from metarank.
Should be fixed in https://github.com/metarank/metarank/releases/tag/0.2.3
Please reopen if you encounter any further issues.
from metarank.
Hi,
After the previous fix, I have started getting this new error at the very first step, Data Bootstrapping:
14:00:05.492 INFO o.a.f.r.t.slot.TaskSlotTableImpl - Activate slot 66896173a0f00d248c21b1915f638c66.
14:00:05.511 INFO o.a.flink.runtime.taskmanager.Task - GroupReduce (reduce(OperatorSubtaskState)) (1/1)#0 (1fc0ed9b1b03ecea41769fe96191f62c) switched from CREATED to DEPLOYING.
14:00:05.512 INFO o.a.flink.runtime.taskmanager.Task - Loading JAR files for task GroupReduce (reduce(OperatorSubtaskState)) (1/1)#0 (1fc0ed9b1b03ecea41769fe96191f62c) [DEPLOYING].
14:00:05.511 INFO o.a.flink.runtime.taskmanager.Task - MapPartition (0af8aae8d1672968360cf5c5b0cfd272) (1/1)#0 (708a532c53f1aa6f5e7064e134333adb) switched from DEPLOYING to INITIALIZING.
14:00:05.512 INFO o.a.flink.runtime.taskmanager.Task - MapPartition (0af8aae8d1672968360cf5c5b0cfd272) (1/1)#0 (708a532c53f1aa6f5e7064e134333adb) switched from INITIALIZING to RUNNING.
14:00:05.513 INFO o.a.f.r.e.ExecutionGraph - MapPartition (0af8aae8d1672968360cf5c5b0cfd272) (1/1) (708a532c53f1aa6f5e7064e134333adb) switched from DEPLOYING to INITIALIZING.
14:00:05.513 INFO o.a.f.r.e.ExecutionGraph - MapPartition (0af8aae8d1672968360cf5c5b0cfd272) (1/1) (708a532c53f1aa6f5e7064e134333adb) switched from INITIALIZING to RUNNING.
14:00:05.521 INFO o.a.f.r.taskexecutor.TaskExecutor - Received task CHAIN Partition -> Map (Key Remover) (1/1)#0 (db09c8277e8799fdd97e29636c52520b), deploy into slot with allocation id 66896173a0f00d248c21b1915f638c66.
14:00:05.521 INFO o.a.flink.runtime.taskmanager.Task - GroupReduce (reduce(OperatorSubtaskState)) (1/1)#0 (1fc0ed9b1b03ecea41769fe96191f62c) switched from DEPLOYING to INITIALIZING.
14:00:05.521 INFO o.a.flink.runtime.taskmanager.Task - GroupReduce (reduce(OperatorSubtaskState)) (1/1)#0 (1fc0ed9b1b03ecea41769fe96191f62c) switched from INITIALIZING to RUNNING.
14:00:05.522 INFO o.a.f.r.e.ExecutionGraph - GroupReduce (reduce(OperatorSubtaskState)) (1/1) (1fc0ed9b1b03ecea41769fe96191f62c) switched from DEPLOYING to INITIALIZING.
14:00:05.525 INFO o.a.f.r.e.ExecutionGraph - GroupReduce (reduce(OperatorSubtaskState)) (1/1) (1fc0ed9b1b03ecea41769fe96191f62c) switched from INITIALIZING to RUNNING.
14:00:05.526 ERROR o.a.f.runtime.operators.BatchTask - Error in task code: MapPartition (6d34697451896c6270f6053830e3820a) (1/1)
java.lang.NullPointerException: null
at org.apache.flink.state.api.output.BoundedStreamTask.cleanUpInternal(BoundedStreamTask.java:120)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:1021)
at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUp(StreamTask.java:925)
at org.apache.flink.state.api.output.BoundedOneInputStreamTaskRunner.mapPartition(BoundedOneInputStreamTaskRunner.java:89)
at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:113)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:519)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:360)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:937)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
at java.base/java.lang.Thread.run(Thread.java:829)
from metarank.
Looks like the e2e test was missing the set -e trick in bash, so when one of the subtasks failed, the test was still marked as successful. Should be fixed (again!) in https://github.com/metarank/metarank/releases/tag/0.2.4, please try the new build.
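To illustrate the set -e point, here is a small self-contained sketch (not the actual Metarank e2e script). It runs the same two-step stand-in pipeline in a fresh shell, once without and once with set -e; `false` plays the role of the failing subtask.

```shell
# Two stand-in pipelines; `false` plays the role of a failing subtask.
without_errexit='false; echo "later steps still ran"'
with_errexit='set -e; false; echo "later steps still ran"'

# Run each pipeline in a fresh shell and record its exit status.
# The `|| status=$?` form keeps this snippet itself from aborting.
status_without=0
sh -c "$without_errexit" || status_without=$?

status_with=0
sh -c "$with_errexit" || status_with=$?

echo "without set -e: exit status $status_without"
echo "with set -e: exit status $status_with"
```

Without set -e the shell carries on after the failure and reports the status of the last command, so the run looks successful. With set -e the shell stops at the first failing command and exits with a non-zero status, which is what a CI job needs in order to mark the test as failed.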
from metarank.
Hi,
Steps 1 and 2 passed successfully, but this time it got stuck at Step 3, Upload:
10:42:32.155 INFO o.a.flink.runtime.blob.BlobServer - Stopped BLOB server at 0.0.0.0:42907
10:42:32.161 INFO o.a.f.r.rpc.akka.AkkaRpcService - Stopped Akka RPC service.
org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find Flink job (de51f989b310d16d5839b75752aa778c)
at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$cancelJob$8(Dispatcher.java:533)
at java.base/java.util.Optional.orElseGet(Optional.java:369)
at org.apache.flink.runtime.dispatcher.Dispatcher.cancelJob(Dispatcher.java:530)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRpcInvocation$1(AkkaRpcActor.java:316)
at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:314)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:217)
at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:78)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:163)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20)
at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at akka.actor.Actor.aroundReceive(Actor.scala:537)
at akka.actor.Actor.aroundReceive$(Actor.scala:535)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
at akka.actor.ActorCell.invoke(ActorCell.scala:548)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
at akka.dispatch.Mailbox.run(Mailbox.scala:231)
at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
at apply @ ai.metarank.mode.AsyncFlinkJob$.$anonfun$execute$3(AsyncFlinkJob.scala:24)
at fromCompletableFuture @ ai.metarank.mode.AsyncFlinkJob$.execute(AsyncFlinkJob.scala:16)
at map @ ai.metarank.mode.AsyncFlinkJob$.$anonfun$execute$3(AsyncFlinkJob.scala:24)
at apply @ ai.metarank.mode.upload.Upload$.isFinished(Upload.scala:59)
at fromCompletableFuture @ ai.metarank.mode.AsyncFlinkJob$.execute(AsyncFlinkJob.scala:16)
at map @ ai.metarank.mode.upload.Upload$.isFinished(Upload.scala:61)
at flatMap @ ai.metarank.mode.upload.Upload$.$anonfun$blockUntilFinished$1(Upload.scala:50)
from metarank.
Hi,
Thanks for the quick fix. I successfully executed all steps from scratch. I will be testing the feedback and rank APIs next week. Thanks.
from metarank.
Hopefully these steps will go much smoother :)
from metarank.