bible's Introduction

grab a book:
  cat NASB.json | jq -c '.[]' | grep '"book":"Jonah"' > jonah.json

count total verses:
  cat jonah.json | wc -l # 48

grab a chapter:
  grep '"chapter":1' jonah.json > jonah_1.json
  grep '"chapter":2' jonah.json > jonah_2.json
  grep '"chapter":3' jonah.json > jonah_3.json
  grep '"chapter":4' jonah.json > jonah_4.json
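The per-chapter greps can be sketched as a loop. Note that a bare '"chapter":1' pattern would also match chapter 10 and up in a longer book; Jonah only has four chapters, so it does not bite here. The sample records and file names below are made up for illustration, and the key order is an assumption about what `jq -c` emits for this data.

```shell
#!/bin/sh
# Hypothetical sample data: one compact JSON object per line,
# in the shape `jq -c '.[]'` is assumed to produce.
workdir=$(mktemp -d)
cat > "$workdir/sample.json" <<'EOF'
{"book":"Sample","chapter":1,"verse":1,"passage":"first verse"}
{"book":"Sample","chapter":1,"verse":2,"passage":"second verse"}
{"book":"Sample","chapter":2,"verse":1,"passage":"third verse"}
{"book":"Sample","chapter":10,"verse":1,"passage":"a verse from chapter ten"}
EOF

# One output file per chapter. The trailing [,}] stops "chapter":1
# from also matching "chapter":10.
for ch in 1 2 10; do
  grep -E "\"chapter\":${ch}[,}]" "$workdir/sample.json" > "$workdir/sample_${ch}.json"
done

wc -l "$workdir"/sample_*.json
```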

count verses by chapter:
  cat jonah_1.json  | wc -l # 17
  cat jonah_2.json  | wc -l # 10
  cat jonah_3.json  | wc -l # 10
  cat jonah_4.json  | wc -l # 11
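
A possible one-pass alternative to counting each chapter file separately: let awk pull out the chapter number and tally it. This is only a sketch; the sample records are hypothetical and it assumes the compact '"chapter":N' form (no space after the colon).

```shell
#!/bin/sh
# Hypothetical sample data, one verse per line.
workdir=$(mktemp -d)
cat > "$workdir/sample.json" <<'EOF'
{"book":"Sample","chapter":1,"verse":1,"passage":"first verse"}
{"book":"Sample","chapter":1,"verse":2,"passage":"second verse"}
{"book":"Sample","chapter":2,"verse":1,"passage":"third verse"}
{"book":"Sample","chapter":10,"verse":1,"passage":"a verse from chapter ten"}
EOF

# match() locates "chapter":N; the key plus colon is 10 characters long,
# so the digits start at RSTART + 10. Tally per chapter, then sort numerically.
counts=$(awk 'match($0, /"chapter":[0-9]+/) {
                ch = substr($0, RSTART + 10, RLENGTH - 10)
                n[ch]++
              }
              END { for (c in n) print c, n[c] }' "$workdir/sample.json" | sort -n)

# Each output line is "<chapter> <verse count>".
printf '%s\n' "$counts"
```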

compare ngrams:

./bible_ngrams.rb jonah_3.json

./bible_ngrams.rb jonah.json
# grep for all ngrams that appear in chapter 3 -- notice the other places they show up too
grep '"chapter": 3' index_3_line_per_verse_nasb.json > index_3_jonah_3_all.json


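# the big one: the whole NASB, one verse per line
# (the optional second argument looks like the ngram size, defaulting to 3, going by the index_N_ output names)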
./bible_ngrams.rb line_per_verse_nasb.json 
grep '"book": "Jonah", "chapter": 3' index_3_line_per_verse_nasb.json > index_3_jonah_3_all.json

./bible_ngrams.rb line_per_verse_nasb.json 5
grep '"book": "Jonah", "chapter": 3' index_5_line_per_verse_nasb.json > index_5_jonah_3_all.json

./bible_ngrams.rb line_per_verse_nasb.json 7
grep '"book": "Jonah", "chapter": 3' index_7_line_per_verse_nasb.json > index_7_jonah_3_all.json

grep "now the word of the lord" index_7_jonah_3_all.json

grep exceedingly index_7_jonah_3_all.json

wc -l index_*_jonah_3_all.json
   233 index_3_jonah_3_all.json
   217 index_5_jonah_3_all.json
   198 index_7_jonah_3_all.json
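
For reference, a word ngram is a run of n consecutive words. The sketch below shows the idea in plain shell; it is not how bible_ngrams.rb works internally (that script is not shown here and may tokenize or handle punctuation differently). The `ngrams` function name and the sample sentence are made up for illustration.

```shell
#!/bin/sh
# Print every run of n consecutive words from each input line, lowercased.
ngrams() {
  n=$1
  tr '[:upper:]' '[:lower:]' | awk -v n="$n" '{
    for (i = 1; i + n - 1 <= NF; i++) {
      gram = $i
      for (j = 1; j < n; j++) gram = gram " " $(i + j)
      print gram
    }
  }'
}

# Seven words give five trigrams.
trigrams=$(printf '%s\n' "Now the word of the LORD came" | ngrams 3)
printf '%s\n' "$trigrams"
```

Raising n makes each ngram more specific, which fits the shrinking line counts above (233, 217, 198 for n = 3, 5, 7).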


extract passage-text from a chapter:
cat jonah_3.json | jq  -c '.|.passage' | awk '{ print substr( $0, 2, length($0)-2 ) }' | tr -d '\\' > jonah_3_passages.json

trigrams:
cd ../word_gram_sentence/
./ngram.rb ../bible/jonah_3_passages.json 3 > ../bible/jonah_3_trigrams.txt

unigrams:
cd ../word_gram_sentence/
./ngram.rb ../bible/jonah_3_passages.json > ../bible/jonah_3_unigrams.txt
wc -l jonah_3_unigrams.txt
     258 jonah_3_unigrams.txt # total words in ch 3


run-length encode the unigrams (ignore ending-punctuation):
./run_length_encode.rb jonah_3_unigrams.txt > jonah_3_rle_unigrams.txt
./run_length_encode.rb jonah_3_unigrams.txt 1 |json_pp > jonah_3_rle_unigrams.json

grep -v '=>' jonah_3_rle_unigrams.txt # unrepeated
grep '=>' jonah_3_rle_unigrams.txt > jonah_3_repeated_rle_unigrams.txt

wc -l jonah_3_rle_unigrams.txt
     132 jonah_3_rle_unigrams.txt # unique words in ch 3


run-length encode the trigrams (ignore ending-punctuation):
./run_length_encode.rb jonah_3_trigrams.txt > jonah_3_rle_trigrams.txt
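
A rough shell equivalent of the run-length step, for illustration only. run_length_encode.rb is not shown here, so this is a guess at its behaviour: the `word => count` output format is an assumption inferred from the '=>' greps, and sorting before counting is an assumption suggested by the 258 total vs 132 unique line counts. The sample words are made up.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Hypothetical unigram file: one word per line.
printf '%s\n' the word of the lord the word > "$workdir/unigrams.txt"

# uniq -c collapses runs of identical adjacent lines and prefixes a count.
# Sorting first makes every repeat adjacent, so each distinct word
# ends up on exactly one line.
sort "$workdir/unigrams.txt" | uniq -c |
  awk '{ if ($1 > 1) print $2 " => " $1; else print $2 }' > "$workdir/rle.txt"

cat "$workdir/rle.txt"
grep -c '=>' "$workdir/rle.txt"    # words that repeat
grep -vc '=>' "$workdir/rle.txt"   # words that occur once
```

Without the `sort`, `uniq -c` would only merge words repeated back to back, which is the stricter meaning of run-length encoding.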

./word_index.rb jonah_1.json > jonah_1_index.json

####

↪ wc -l line_per_verse_nasb.json
   31102 line_per_verse_nasb.json
↪ ./word_index.rb line_per_verse_nasb.json > nasb_index.json
starting w/ file "line_per_verse_nasb.json"...
found 16897 words in the index


grab a book:
  cat NASB.json | jq -c '.[]' | grep '"book":"Philippians"' > philippians.json

count total verses:
  cat philippians.json  | wc -l # 104

grab a chapter:
  grep '"chapter":1' philippians.json > philippians_1.json
  grep '"chapter":2' philippians.json > philippians_2.json
  grep '"chapter":3' philippians.json > philippians_3.json
  grep '"chapter":4' philippians.json > philippians_4.json

count verses by chapter:
  cat philippians_1.json  | wc -l # 30
  cat philippians_2.json  | wc -l # 30
  cat philippians_3.json  | wc -l # 21
  cat philippians_4.json  | wc -l # 23

extract unique words for analysis:
↪ ./word_index.rb philippians.json > philippians_index.json
starting w/ file "philippians.json"...
found 592 words in the index

total unique words:
  cat philippians_index.json  | jq -c '.[]' | wc -l # 592
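
As a sanity check on a unique-word count, something like the following could be run against plain passage text. It is a sketch, not a reimplementation of word_index.rb: that script's tokenizing rules are not shown here, so the totals may differ (for example, this version splits on apostrophes and hyphens). The sample sentence is made up.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Hypothetical passage text, not taken from the NASB.
cat > "$workdir/passages.txt" <<'EOF'
The word came. The word was heard, and the people heard the word.
EOF

# Lowercase, turn every run of non-letters into a newline,
# drop empty lines, then keep one copy of each word.
tr '[:upper:]' '[:lower:]' < "$workdir/passages.txt" |
  tr -cs '[:alpha:]' '\n' |
  sed '/^$/d' |
  sort -u > "$workdir/words.txt"

unique=$(grep -c '' "$workdir/words.txt")
echo "found $unique words"
```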

↪ ./word_index.rb philippians_1.json > philippians_1_index.json
starting w/ file "philippians_1.json"...
found 243 words in the index

↪ ./word_index.rb philippians_2.json > philippians_2_index.json
starting w/ file "philippians_2.json"...
found 251 words in the index

↪ ./word_index.rb philippians_3.json > philippians_3_index.json
starting w/ file "philippians_3.json"...
found 207 words in the index

↪ ./word_index.rb philippians_4.json > philippians_4_index.json
starting w/ file "philippians_4.json"...
found 217 words in the index

Extract just the passage (including JSON key, though):
cat philippians_2.json | jq  -c '.|{passage}' # ....; {"passage":"holding fast the word of lif...}; ....

Just the passage values (quoted):
cat philippians_2.json | jq  -c '.|.passage' # ....; "holding fast the word of life..."; ....

remove starting and ending quotes with awk:
cat philippians_2.json | jq  -c '.|.passage' | awk '{ print substr( $0, 2, length($0)-2 ) }'
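
What the awk step does, shown on a single made-up string: `substr($0, 2, length($0)-2)` keeps everything between the first and last character, so the surrounding quotes go away but JSON escapes such as \" stay. Deleting the backslashes afterwards with tr turns \" back into a plain quote.

```shell
#!/bin/sh
# A quoted JSON string, in the shape `jq -c '.passage'` prints (hypothetical sample).
quoted='"say \"peace\" to them"'

# Drop the first and last character (the outer quotes).
stripped=$(printf '%s\n' "$quoted" | awk '{ print substr($0, 2, length($0) - 2) }')
printf '%s\n' "$stripped"

# Remove the remaining backslashes so \" becomes ".
clean=$(printf '%s\n' "$stripped" | tr -d '\\')
printf '%s\n' "$clean"
```

jq can also do this itself: its -r (--raw-output) flag prints strings without the outer quotes and with escapes decoded, so `jq -r '.passage'` may be able to stand in for the awk and tr steps.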
