GithubHelp home page GithubHelp logo

analy_z's Introduction

AnalyZ

This is gem for text analyze. Now you can analyze hse-tf-idf value about each words.

Dependency

This gem need 'mecab'

so please install 'mecab'

MacOS

$ brew install mecab

$ brew install mecab-ipadic

What is hse-tf-idf

hse-tf-idf = hse * tf-idf

What is hse

Hse is HTML Semantic Element (valuation). Evaluate HTML tag and express it's value in number.

for example

tag name font-size font-weight valuation
h1 2 1.75 3.5
h2 1.5 1.75 2.625
h3 1.17 1.75 2.0475
h4 1 1.75 1.75

valuation = font-size * font-weight

And I'm looking for another valuation. Please tell me if you find out more good tag or style.

I want to add hse valuation logic below,

  • font size by css
  • font color

Installation

Add this line to your application's Gemfile:

gem 'analy_z'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install analy_z

Usage

require 'analy_z'

# file_path : file path for files you want to analyze
#             for example 'html/*.html'
#             NOTE please add more than 2 files
#             because only 1 file, analy_z can't calucurate idf
# selector  : selector for place you want to analyze
#             for example '#main .content'

a = AnalyZ::HTML.word_val(file_path, selector)

a.tf          # tf
a.idf         # idf
a.tf_idf      # tf-idf
a.hse_tf_idf  # hse-tf-idf
a.words       # words analy_z analyzed
a.texts       # texts analy_z analyzed
a.sentences   # sentences analy_z analyzed

Development

After checking out the repo, run bin/setup to install dependencies. Then, run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release to create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

  1. Fork it ( https://github.com/nao215/analy_z/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.