markovifyr's Introduction

MarkovifyR

markovifyR: R wrapper for Markovify

Ref: https://github.com/jsvine/markovify

"Markovify is a simple, extensible Markov chain generator. Right now, its main use is for building Markov models of large corpora of text, and generating random sentences from that."

This package requires Python and markovify to be installed.

To install markovify from within R, you can run:

system("pip install markovify")
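Calling pip through system() installs into whichever Python is first on the PATH, which may not be the interpreter R ends up using. A minimal alternative sketch, assuming markovifyR reaches Python through the reticulate package (the initialize_python error reported in the issues further down suggests this, but it is an assumption). ensure_markovify is a hypothetical helper, not part of markovifyR:

```r
# Hypothetical helper; assumes the reticulate package is installed and that
# markovifyR reaches Python through it.
ensure_markovify <- function() {
  # py_module_available() checks whether the module can be imported
  # by the Python that reticulate is bound to.
  if (!reticulate::py_module_available("markovify")) {
    # pip = TRUE asks reticulate to install with pip rather than conda.
    reticulate::py_install("markovify", pip = TRUE)
  }
  # Return (invisibly) whether the module is importable after the attempt.
  invisible(reticulate::py_module_available("markovify"))
}

# Usage, in a fresh R session:
# ensure_markovify()
```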

The following functions are implemented:

  • generate_markovify_model: Generates a Markov model
  • markovify_text: Generates text from a Markov model
  • generate_sentence_starting_with: Generates text, if possible, beginning with your specified start word
  • generate_start_words: Produces a data frame with the starting words for each input sentence

Installation

devtools::install_github("abresler/markovifyR")
options(width=120)

Usage

library(markovifyR)
library(dplyr)

Generate New Peter Linneman "Life Lessons"

This example uses the package to generate new "Life Lessons" in the style of Peter Linneman, my favorite professor from college.

Step 1 -- Build the Corpus

data("linneman_lessons")

lessons <-
  linneman_lessons %>% 
  pull(textLesson)

lessons %>% str()
##  chr [1:101] "You always have time for what is important to you" ...

Step 2 -- Build the Model

markov_model <-
  generate_markovify_model(
    input_text = lessons,
    markov_state_size = 2L,
    max_overlap_total = 25,
    max_overlap_ratio = .85
  )
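To illustrate what markov_state_size controls, here is a toy word-level chain in base R. It is only a sketch of the general technique, not how markovify or this package is implemented; build_chain, generate, and the two-sentence corpus are hypothetical. With a state size of 2, the next word is chosen based on the previous two words.

```r
# Toy word-level Markov chain; illustrative only, not markovify's implementation.
# build_chain() and generate() are hypothetical helpers.
build_chain <- function(sentences, state_size = 2L) {
  chain <- list()
  for (s in sentences) {
    # Pad with begin markers so the first real word has a full-size state.
    words <- c(rep("<BEGIN>", state_size), strsplit(s, "\\s+")[[1]], "<END>")
    for (i in seq_len(length(words) - state_size)) {
      state <- paste(words[i:(i + state_size - 1L)], collapse = " ")
      # Record the word that followed this state in the corpus.
      chain[[state]] <- c(chain[[state]], words[i + state_size])
    }
  }
  chain
}

generate <- function(chain, state_size = 2L, max_words = 50L) {
  state <- rep("<BEGIN>", state_size)
  out <- character(0)
  while (length(out) < max_words) {
    choices <- chain[[paste(state, collapse = " ")]]
    # Pick one observed follower at random (repeats act as weights).
    nxt <- choices[sample.int(length(choices), 1L)]
    if (nxt == "<END>") break
    out <- c(out, nxt)
    # Slide the state window forward by one word.
    state <- c(state[-1L], nxt)
  }
  paste(out, collapse = " ")
}

corpus <- c("Life is not fair", "Life is what you make it")
chain <- build_chain(corpus, state_size = 2L)
chain[["Life is"]]  # words observed after the state "Life is"
generate(chain)
```

A larger state size makes the output follow the corpus more closely, while a smaller one produces looser and often less coherent sentences.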

Step 3 -- Generate the Text

markovify_text(
  markov_model = markov_model,
  maximum_sentence_length = NULL,
  output_column_name = 'textLinnemanBot',
  count = 25,
  tries = 100,
  only_distinct = TRUE,
  return_message = TRUE
)
## textLinnemanBot: “What is the bet?” is the bet?” is the outcome of hard work every day; not a eureka experience

## textLinnemanBot: Your job in life that can’t be fixed, so don’t go crazy when things break

## textLinnemanBot: Putting profits before power will generally result in a life of power

## textLinnemanBot: Members of the people you love

## textLinnemanBot: Generosity is cheap over the long term, as you will never realize it was your fault; then ask how you feel about them

## textLinnemanBot: Life isn’t fair, so don’t go crazy when things are unfair

## textLinnemanBot: “What is the right thing simply because it is what it is what it is the right thing simply because it is the bet?” is the outcome of hard work every day; not a eureka experience

## textLinnemanBot: Just when you are doing something, and do it Then it is learned as you wrinkle

## textLinnemanBot: Let the people you love

## textLinnemanBot: Genius is the critical question

## textLinnemanBot: Genius is the bet?” is the bet?” is the critical question

## textLinnemanBot: Enjoy what you do not respect yourself, why should anyone else respect you

## textLinnemanBot: In the battle between fear and greed, greed wins about 80% of the people you love

## textLinnemanBot: Assume it was your fault; then ask how you feel about them

## textLinnemanBot: Stick to what you do not feel so special

## textLinnemanBot: People are the problem

## textLinnemanBot: To interview well, think of yourself like a stripper; you need to be me and us; most people want it to be mean and they will view it as a very nasty statement

## textLinnemanBot: Why ever do less than the sadness of their life rather than the sadness of their life rather than the best you that you can be

## textLinnemanBot: Take ownership of your death, so do not love yourself, there is no reason others should love you

## textLinnemanBot: Stick to what you do not love yourself, there is no reason others should love you

## textLinnemanBot: There are very few things in life that can’t be fixed, so don’t take it personally if yours are strange and difficult; so don’t take it personally if yours are strange and difficult

## textLinnemanBot: Find out who you are If people don’t like it, that’s their problem

## textLinnemanBot: There are just a humble school teacher”

## textLinnemanBot: Life isn’t fair, so don’t take it personally if yours are strange and difficult

## textLinnemanBot: The machine is rarely the problem; the people you love

## # A tibble: 25 x 2
##    idRow textLinnemanBot                                                                                               
##    <int> <chr>                                                                                                         
##  1     1 "“What is the bet?” is the bet?” is the outcome of hard work every day; not a eureka experience"              
##  2     2 "Your job in life that can’t be fixed, so don’t go crazy when things break"                                   
##  3     3 "Putting profits before power will generally result in a life of power"                                       
##  4     4 "Members of the people you love"                                                                              
##  5     5 "Generosity is cheap over the long term, as you will never realize it was your fault; then ask how you feel a…
##  6     6 "Life isn’t fair, so don’t go crazy when things are unfair"                                                   
##  7     7 "“What is the right thing simply because it is what it is what it is the right thing simply because it is the…
##  8     8 "Just when you are doing something, and do it Then it is learned as you wrinkle"                              
##  9     9 "Let the people you love"                                                                                     
## 10    10 "Genius is the critical question"                                                                             
## # ... with 15 more rows

Step 4 -- Other Features

Generate random sentences that start with specified words.

markovify_text(
  markov_model = markov_model,
  maximum_sentence_length = NULL,
  start_words = c("The", "You", "Life"),
  output_column_name = 'textLinnemanBot',
  count = 25,
  tries = 100,
  only_distinct = TRUE,
  return_message = TRUE
)
## textLinnemanBot: The machine is rarely the problem; the people you love

## textLinnemanBot: The machine is rarely the problem; the people who matter know how you feel about them

## textLinnemanBot: The machine is rarely the problem; the people operating the machine are the ultimate assets

## textLinnemanBot: The machine is rarely the problem; the people who matter know how you could have made it better

## textLinnemanBot: You may never get a second chance to tell people how you could have made it better

## textLinnemanBot: Life isn’t fair, so don’t be upset when things break

## textLinnemanBot: Life isn’t fair, so don’t take it personally if yours are strange and difficult

## textLinnemanBot: Life will end; face it and embrace the joy of their life rather than the best you can?

## textLinnemanBot: Life isn’t fair, so don’t go crazy when things are unfair

## textLinnemanBot: Life isn’t fair, so don’t go crazy when things break

## textLinnemanBot: Life will end; face it and embrace the joy of their life rather than the sadness of their life rather than the best you can?

## textLinnemanBot: Life will end; face it and embrace the joy of their life rather than the sadness of their life rather than the sadness of their life rather than the sadness of their death

## textLinnemanBot: Life isn’t fair, so don’t take it personally if yours are strange and difficult; so don’t go crazy when things break

## textLinnemanBot: Life will end; face it and embrace the joy of their life rather than the best you that you can be

## textLinnemanBot: Life will end; face it and embrace the joy of their life rather than the sadness of their life rather than the sadness of their life rather than the best you that you can be

## # A tibble: 15 x 3
##    idRow wordStart textLinnemanBot                                                                                     
##    <int> <chr>     <chr>                                                                                               
##  1     1 The       "The machine is rarely the problem; the people you love"                                            
##  2     2 The       "The machine is rarely the problem; the people who matter know how you feel about them"             
##  3     3 The       "The machine is rarely the problem; the people operating the machine are the ultimate assets"       
##  4     4 The       "The machine is rarely the problem; the people who matter know how you could have made it better"   
##  5     1 You       "You may never get a second chance to tell people how you could have made it better"                
##  6     1 Life      "Life isn’t fair, so don’t be upset when things break"                                              
##  7     2 Life      "Life isn’t fair, so don’t take it personally if yours are strange and difficult"                   
##  8     3 Life      "Life will end; face it and embrace the joy of their life rather than the best you can?"            
##  9     4 Life      "Life isn’t fair, so don’t go crazy when things are unfair"                                         
## 10     5 Life      "Life isn’t fair, so don’t go crazy when things break"                                              
## 11     6 Life      "Life will end; face it and embrace the joy of their life rather than the sadness of their life rat…
## 12     7 Life      "Life will end; face it and embrace the joy of their life rather than the sadness of their life rat…
## 13     8 Life      "Life isn’t fair, so don’t take it personally if yours are strange and difficult; so don’t go crazy…
## 14     9 Life      "Life will end; face it and embrace the joy of their life rather than the best you that you can be" 
## 15    10 Life      "Life will end; face it and embrace the joy of their life rather than the sadness of their life rat…

Look at corpus start-words

generate_start_words(markov_model = markov_model)
## # A tibble: 63 x 3
##    idSentence wordStart distanceCumulative
##         <int> <chr>                  <int>
##  1          1 There                      3
##  2          2 Travel,                    4
##  3          3 Genius                     5
##  4          4 Take                       9
##  5          5 You                       14
##  6          6 Analyze                   15
##  7          7 Careers                   16
##  8          8 Remember                  17
##  9          9 Be                        21
## 10         10 Life                      23
## # ... with 53 more rows

markovifyr's People

Contributors

abresler, mhenderson

markovifyr's Issues

python issue

I've installed markovify and markovifyR using the installation instructions, but I'm getting an error when I try to run generate_markovify_model().

I'm trying to run:

model <- generate_markovify_model(
  input_text = sayings$text,
  max_overlap_ratio = 0.85,
  max_overlap_total = 25
)

And get the error message

Error in initialize_python(required_module, use_environment) : Your current architecture is 64bit however this version of Python is compiled for 32bit.

Do I need to install 32bit Python or is there a way to use 64bit with markovifyR?
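One possible direction, sketched under the assumption that markovifyR reaches Python through reticulate (initialize_python in the error above appears to come from that package): point reticulate at a 64-bit Python before anything initializes it. use_64bit_python and the example path are hypothetical.

```r
# Hypothetical helper; assumes the reticulate package is installed.
use_64bit_python <- function(python_path) {
  # Must run in a fresh R session, before anything initializes Python.
  reticulate::use_python(python_path, required = TRUE)
  # Report the interpreter reticulate would use, without initializing it.
  cfg <- reticulate::py_discover_config()
  cfg$architecture
}

# Usage (the path is a placeholder for a 64-bit Python install):
# use_64bit_python("C:/path/to/python64/python.exe")
```

If the reported architecture matches that of the R session, generate_markovify_model() can be retried in the same session.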

Readme error: could not find function "future_map_dfr"

Running the readme example I get the error:

Error in future_map_dfr(., function(x) { : 
  could not find function "future_map_dfr"

The function future_map_dfr is called in the package source but doesn't appear to be defined in the package or in any of its dependencies.

I'm guessing markovifyr is just missing a dependency on furrr, which has a function by this name. The readme code works if I install and load furrr.
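The workaround described above can be sketched as a pre-flight check in base R. missing_pkgs is a hypothetical helper, and the package list only reflects what this report mentions:

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Usage: install anything missing, then attach furrr before markovifyR.
# to_install <- missing_pkgs(c("furrr", "markovifyR"))
# if ("furrr" %in% to_install) install.packages("furrr")
# library(furrr)
# library(markovifyR)
```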

markovifyR crashing on some corpuses?

So for example, this:

library(tidyverse)
library(gutenbergr)
library(markovifyR)

# system("pip install markovify")

# Step 1 -- Build the Corpus ----------------------------------------------
texts_emerson <- gutenberg_works(author == "Emerson, Ralph Waldo") %>% pull(gutenberg_id) %>% gutenberg_download(.)

texts <- texts_emerson %>% 
  sample_n(.,5000) %>% 
  pull(text) %>% 
  discard(. == "")

# Step 2 -- Build the Model -----------------------------------------------
markov_model <-
  generate_markovify_model(
    input_text = texts,
    markov_state_size = 2L,
    max_overlap_total = 25,
    max_overlap_ratio = .85
  )

Works fine, but this:

library(tidyverse)
library(gutenbergr)
library(markovifyR)

# system("pip install markovify")

# Step 1 -- Build the Corpus ----------------------------------------------
texts_shake <- gutenberg_works(author == "Shakespeare, William") %>% pull(gutenberg_id) %>% gutenberg_download(.)

texts <-
  texts_shake %>% 
  sample_n(.,5000) %>% 
  pull(text) %>% 
  discard(. == "")


# Step 2 -- Build the Model -----------------------------------------------
markov_model <-
  generate_markovify_model(
    input_text = texts,
    markov_state_size = 2L,
    max_overlap_total = 25,
    max_overlap_ratio = .85
  )

Seems to crash my R session. Does it crash for you all?

My session:

R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets 
[6] methods   base     

loaded via a namespace (and not attached):
[1] compiler_3.5.2 tools_3.5.2    yaml_2.1.19   

Installation instructions

Nice package, thank you!

It might make sense to have the Python installation instruction under the installation header?
