rtools-ngrams

R code for querying and parsing results from Google n-grams. This code is very much a work in progress and should only be used as a reference point for writing your own.

I've used these to query noun-verb co-occurrence patterns, with the final goal of creating pseudo-sentences with a range of probability values. Here's a brief overview:

  1. run_queries.R: specify the server address or local path for your rotated ngrams database, then run the script. Variables not declared in this script are found in ngrams.RData.

  2. preproc_query.R: adds line breaks to a continuous stream of text.

  3. parse_query_output.R: counts the occurrences of target words in query results and saves these counts in data frames.

  4. count_subj_verb_pairs.R: co-occurrence counts for subject/verb pairs.

  5. calc_freq.R: co-occurrence counts and pointwise mutual information (PMI) for noun-verb pairs.

  6. make_proto_sentences.R: put together sentences with desired
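
As a rough illustration of the PMI step mentioned for calc_freq.R, here is a minimal sketch in R. It is not the code from this repository: the function name `calc_pmi`, the column names `noun`, `verb`, and `count`, and the toy counts are all hypothetical. It also assumes that the marginal probabilities are estimated from the pair table itself, which may differ from how the actual script does it. PMI is taken in its standard form, log2( p(noun, verb) / (p(noun) * p(verb)) ).

```r
# Hypothetical toy co-occurrence counts; not taken from the n-grams data.
pairs <- data.frame(
  noun  = c("dog", "dog", "cat", "cat"),
  verb  = c("barks", "sleeps", "barks", "sleeps"),
  count = c(8, 2, 1, 9),
  stringsAsFactors = FALSE
)

# Hypothetical helper: adds a `pmi` column to a data frame of pair counts.
# Assumption: marginals are computed from the pair counts in `df`.
calc_pmi <- function(df) {
  total <- sum(df$count)

  # Marginal counts for each noun and each verb.
  noun_totals <- tapply(df$count, df$noun, sum)
  verb_totals <- tapply(df$count, df$verb, sum)

  # Joint and marginal probabilities, aligned row by row with df.
  p_joint <- df$count / total
  p_noun  <- as.numeric(noun_totals[as.character(df$noun)]) / total
  p_verb  <- as.numeric(verb_totals[as.character(df$verb)]) / total

  # PMI in bits: positive when a pair co-occurs more often than chance.
  df$pmi <- log2(p_joint / (p_noun * p_verb))
  df
}

result <- calc_pmi(pairs)
print(result)
```

With these toy counts, "dog"/"barks" gets a positive PMI and "cat"/"barks" a negative one, which is the kind of spread that could be used to pick pairs across a range of probability values.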

Contributors

jeffrey-phillips
