mc-cat-tty / placerank
Final assignment for "Gestione dell'Informazione" ("Search Engines") course @ UniMoRe
An ad-hoc folder has been created for this task: presskit/assets
Add reviews generation to setup
The sentiment analyzer raises an exception when the input text is longer than the maximum allowed length (512 tokens, not characters, judging by the message below). Should we define a light preprocessing step for reviews that removes markup and other characters that don't influence the sentiment?
In detail, the raised exception is:
Token indices sequence length is longer than the specified maximum sequence length for this model (647 > 512). Running this sequence through the model will result in indexing errors
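A minimal sketch of the light preprocessing asked about above, using only the standard library. The function name clean_review and the choice of what to strip (HTML tags, entities, URLs, repeated whitespace) are assumptions for illustration, not code from the repository.

```python
import html
import re

# Hypothetical helper: none of these names come from the placerank codebase.
_TAG_RE = re.compile(r"<[^>]+>")       # HTML/XML tags such as <br/> or <b>
_URL_RE = re.compile(r"https?://\S+")  # bare links carry no sentiment
_WS_RE = re.compile(r"\s+")            # runs of whitespace


def clean_review(text: str) -> str:
    """Strip markup and noise from a review before sentiment analysis."""
    text = html.unescape(text)      # decode entities like &amp; first
    text = _TAG_RE.sub(" ", text)   # drop tags, keep their inner text
    text = _URL_RE.sub(" ", text)   # drop URLs
    return _WS_RE.sub(" ", text).strip()


if __name__ == "__main__":
    print(clean_review("<b>Great</b> place!<br/>Would   stay again."))
```

Cleaning alone does not guarantee that a review fits within 512 tokens, so long reviews would still need to be truncated or split at tokenization time.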
Cache the listings dataset when it is first downloaded: each inverted index build currently downloads the dataset again, which on slow connections can take up to a minute of total processing time.
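One possible shape for such a cache, sketched under assumptions: the function name cached_download, the cache directory, and the injectable fetch parameter are all hypothetical, and the real dataset URL is not given here.

```python
import hashlib
from pathlib import Path
from urllib.request import urlopen


def _fetch(url: str) -> bytes:
    """Default fetcher: download the whole resource into memory."""
    with urlopen(url) as resp:
        return resp.read()


def cached_download(url: str, cache_dir=".cache", fetch=_fetch) -> Path:
    """Return a local path for url, downloading only on the first call.

    Hypothetical helper; the cache file is named after a hash of the URL.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()
    if not target.exists():
        # Write to a temporary file first so an interrupted download
        # never leaves a truncated file that looks like a valid cache entry.
        tmp = target.with_suffix(".tmp")
        tmp.write_bytes(fetch(url))
        tmp.replace(target)
    return target
```

Every index build would then read from the returned path, so only the first build pays the download cost. Invalidating the cache when the remote dataset changes is left out of this sketch.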
Ref.:
Proposals:
To speed things up, we should provide a convenient way to change the corpus analyzer on the fly.
Right now you have to redefine getDefaultAnalyzer, which is referenced in placerank.logic_views.DocumentLogicView, but this makes it impossible to change the default analyzer at runtime.
We could change the code of getDefaultAnalyzer by making its func_code field (__code__ in Python 3) reference the func_code of another ad-hoc function, but I think this is somewhat inelegant and convoluted.
Alternatively, we could write a factory that returns the appropriate analyzer. Another problem then arises: the call to getDefaultAnalyzer, or to this hypothetical factory, happens inside the constructor of a class field, so it is evaluated once when the class is defined. I think we should move the inverted index schema out into a separate class.
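The factory idea could look roughly like the sketch below. It is deliberately library-agnostic: analyzers are modelled as plain callables from text to tokens, and every name (set_default_analyzer, get_default_analyzer, IndexSchema, the "simple" and "raw" analyzers) is an assumption for illustration rather than the project's actual API.

```python
from typing import Callable, Dict, List, Optional

# Stand-in analyzer type: a callable that turns text into tokens.
Analyzer = Callable[[str], List[str]]

# Hypothetical registry of available analyzers.
_ANALYZERS: Dict[str, Analyzer] = {
    "simple": lambda text: text.lower().split(),
    "raw": lambda text: text.split(),
}
_current = "simple"


def set_default_analyzer(name: str) -> None:
    """Switch the default analyzer at runtime."""
    global _current
    if name not in _ANALYZERS:
        raise KeyError(f"unknown analyzer: {name}")
    _current = name


def get_default_analyzer() -> Analyzer:
    """Factory: return whichever analyzer is currently selected."""
    return _ANALYZERS[_current]


class IndexSchema:
    """Schema kept in its own class, so the analyzer is resolved when an
    instance is created rather than once when a class body is evaluated."""

    def __init__(self, analyzer: Optional[Analyzer] = None) -> None:
        self.analyzer = analyzer or get_default_analyzer()

    def tokens(self, text: str) -> List[str]:
        return self.analyzer(text)
```

Because the factory is called in the constructor, a schema built after set_default_analyzer("raw") picks up the new analyzer, while schemas built earlier keep the one they were created with.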