Thank you for taking the time to review my Ibotta Dev Project. I appreciate your consideration.
Clone the repository:
git clone https://github.com/PlanetEfficacy/anagram.git
The project is to build an API that allows fast searches for anagrams.
rake db:create
rake db:migrate
The migration produces the following schema:
create_table "words", force: :cascade do |t|
t.string "value"
t.string "alphabetize"
t.datetime "created_at", null: false
t.datetime "updated_at", null: false
end
As per the project instructions:
Ingesting the file doesn’t need to be fast, and you can store as much data in memory as you like.
Running the seed file takes a while. The time comes from loading each word as well as its alphabetized form into the database. A word's alphabetized form is derived by:
- splitting the word by character
- sorting the characters alphabetically
- joining the split word back together
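The three steps above can be sketched in plain Ruby. The method name `alphabetize` is borrowed from the column name for illustration and is not necessarily how the model in this repository implements it.

```ruby
# Illustrative helper (hypothetical name), mirroring the three steps above.
def alphabetize(word)
  word.chars # split the word by character
      .sort  # sort the characters alphabetically
      .join  # join the characters back together
end

alphabetize("read") # => "ader"
alphabetize("dear") # => "ader"
```

Because every anagram of a word sorts to the same string, that string can serve as a shared lookup key.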
Storing both the word's value (text) and its alphabetized form in each word record makes anagram queries faster. The expensive part of the anagram algorithm is done up front, once when the database is seeded and again whenever a new word is created, so a query only has to match on the stored alphabetized form.
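A small in-memory sketch of the idea follows. It is an assumption-laden stand-in for the database: `WordRow`, `build_index`, and `anagrams_for` are hypothetical names, and a Ruby hash plays the role that the words table plays in the real project.

```ruby
# In-memory stand-in for a row of the words table (hypothetical).
WordRow = Struct.new(:value, :alphabetize)

# Done once, at "seed" time: compute each key and group rows by it.
def build_index(values)
  values.map { |v| WordRow.new(v, v.chars.sort.join) }
        .group_by(&:alphabetize)
end

# Done per query: compute one key, then do a single lookup.
def anagrams_for(index, word)
  key = word.chars.sort.join
  index.fetch(key, []).map(&:value) - [word]
end

index = build_index(%w[read dear dare bread])
anagrams_for(index, "read") # => ["dear", "dare"]
```

The per-query work is limited to sorting the characters of a single word; the corpus itself is never re-sorted.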
Seed your database with the following command:
rake db:seed
This project has 100% test coverage according to the SimpleCov gem. I followed a strict TDD methodology when writing this code, and my approach was to test rigorously at the unit level. The request specs check for proper routing but use mocks and stubs, because I trust my unit tests and this makes the overall suite faster. The exception is spec/requests/words_spec.rb, which creates database records. The entire suite of 40 examples consistently runs in under 1.5 seconds on my machine. To run the test suite, type:
rspec
A secondary test file, provided in the instructions for this project, can be run from the root of the project with the command below. The project must be running on localhost:3000 before you run it.
Caution: one of these tests calls the endpoint that deletes the entire data store. Do not run this command against a seeded database unless you first comment out the delete-all test.
ruby anagram_test.rb
rails server
Takes a JSON array of English-language words and adds them to the corpus (data store)
The following curl command will add words to the corpus
curl --request POST \
--url http://localhost:3000/words.json \
--header 'cache-control: no-cache' \
--header 'content-type: application/json' \
--data '{ "words": ["read", "dear", "dare"] }'
and returns:
[
{
"id": 236538,
"value": "read",
"alphabetize": "ader",
"created_at": "2017-02-16T21:44:46.919Z",
"updated_at": "2017-02-16T21:44:46.919Z"
},
{
"id": 236539,
"value": "dear",
"alphabetize": "ader",
"created_at": "2017-02-16T21:44:46.924Z",
"updated_at": "2017-02-16T21:44:46.924Z"
},
{
"id": 236540,
"value": "dare",
"alphabetize": "ader",
"created_at": "2017-02-16T21:44:46.929Z",
"updated_at": "2017-02-16T21:44:46.929Z"
}
]
Returns a JSON array of English-language words that are anagrams of the word passed in the URL.
The following curl command will get all of the anagrams of a given word:
curl --request GET \
--url http://localhost:3000/anagrams/read.json \
--header 'cache-control: no-cache'
and returns:
{
"anagrams": [
"dear",
"dare"
]
}
The endpoint supports an optional query param that indicates the maximum number of results to return. The following command will limit the query to 1 anagram.
curl --request GET \
--url 'http://localhost:3000/anagrams/read.json?limit=1' \
--header 'cache-control: no-cache'
The above request returns:
{
"anagrams": [
"dear"
]
}
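One way such an optional limit could be applied is sketched below. The helper name `apply_limit` is hypothetical, and the choice to raise on a non-numeric value is an assumption rather than documented behavior of this API.

```ruby
# Hypothetical helper: apply an optional "limit" query parameter.
# limit_param arrives as a string (or nil when the parameter is absent).
def apply_limit(anagrams, limit_param)
  return anagrams if limit_param.nil? || limit_param.to_s.empty?

  # Integer() raises ArgumentError on non-numeric input (assumed behavior).
  anagrams.first(Integer(limit_param))
end

apply_limit(%w[dear dare], "1") # => ["dear"]
apply_limit(%w[dear dare], nil) # => ["dear", "dare"]
```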
Deletes a single word from the data store.
curl -i -X DELETE http://localhost:3000/words/read.json
and returns:
HTTP/1.1 200 OK
Deletes all contents of the data store.
curl -i -X DELETE http://localhost:3000/words.json
and returns:
HTTP/1.1 204 No Content
In addition to the above base functionality the following features have also been implemented.
Returns a count of words in the corpus and min/max/median/average word length.
curl --request GET --url 'http://localhost:3000/word-count'
and returns:
{
"count": 3,
"min": 4,
"max": 4,
"median": 4.0,
"average": 4.0
}
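For reference, these statistics can be computed from a list of words as in the sketch below. `word_stats` is a hypothetical name, and returning nil values for an empty corpus is an assumption; the real endpoint may handle that case differently.

```ruby
# Hypothetical helper: count plus min/max/median/average word length.
def word_stats(words)
  lengths = words.map(&:length).sort
  if lengths.empty?
    return { count: 0, min: nil, max: nil, median: nil, average: nil }
  end

  mid = lengths.size / 2
  # Odd count: middle element. Even count: mean of the two middle elements.
  median = if lengths.size.odd?
             lengths[mid].to_f
           else
             (lengths[mid - 1] + lengths[mid]) / 2.0
           end

  {
    count: lengths.size,
    min: lengths.first,
    max: lengths.last,
    median: median,
    average: lengths.sum.to_f / lengths.size
  }
end

stats = word_stats(%w[read dear dare])
# count 3, min 4, max 4, median 4.0, average 4.0
```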
All of the following GET requests can be scoped to exclude proper nouns from the corpus.
curl --request GET --url 'http://localhost:3000/anagrams/:word.json?proper=false'
curl --request GET --url 'http://localhost:3000/word-count?proper=false'
curl --request GET --url 'http://localhost:3000/most-anagrams?proper=false'
curl --request GET --url 'http://localhost:3000/group-anagrams?proper=false'
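The README does not state how a proper noun is detected. The sketch below assumes, purely for illustration, that a proper noun is any word starting with an uppercase ASCII letter; `proper_noun?` and `scope_words` are hypothetical names.

```ruby
# ASSUMPTION: a proper noun is a word that begins with an uppercase letter.
def proper_noun?(word)
  word.match?(/\A[A-Z]/)
end

# proper_param is the raw query-string value, e.g. "false" or nil.
def scope_words(words, proper_param)
  return words unless proper_param == "false"

  words.reject { |w| proper_noun?(w) }
end

scope_words(%w[Ada read Zeus dear], "false") # => ["read", "dear"]
scope_words(%w[Ada read Zeus dear], nil)     # => ["Ada", "read", "Zeus", "dear"]
```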
Returns the words with the most anagrams.
curl --request GET --url 'http://localhost:3000/most-anagrams'
and returns:
{
"anagrams": ["read","dear","dare"]
}
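Conceptually, this amounts to grouping the corpus by alphabetized form and picking the largest group. The in-memory sketch below uses a hypothetical `most_anagrams` method; how ties between equally large groups are resolved is not specified here.

```ruby
# Hypothetical helper: the largest set of mutual anagrams in a word list.
def most_anagrams(words)
  groups = words.group_by { |w| w.chars.sort.join }.values
  groups.max_by(&:size) || [] # empty corpus yields an empty list
end

most_anagrams(%w[read dear dare cat act]) # => ["read", "dear", "dare"]
```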
Reports whether the words in the provided pair are anagrams of each other.
curl --request GET --url 'http://localhost:3000/check-anagrams?anagrams=read,dare'
and returns:
{
"words": ["read","dare"],
"anagram": true
}
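The check itself reduces to comparing alphabetized forms. In the sketch below, `check_anagrams` is a hypothetical name, and treating a single word as "not an anagram" is an assumption.

```ruby
# Hypothetical helper: param is the comma-separated "anagrams" query value.
def check_anagrams(param)
  words = param.split(",")
  keys = words.map { |w| w.chars.sort.join }
  # All words are mutual anagrams when they share one alphabetized form.
  { words: words, anagram: words.size > 1 && keys.uniq.size == 1 }
end

check_anagrams("read,dare")[:anagram] # => true
check_anagrams("read,deer")[:anagram] # => false
```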
Returns all anagram groups of size >= x, where x is the value of the size query parameter.
curl --request GET --url 'http://localhost:3000/group-anagrams?size=2'
and returns:
{
"anagram_groups": [["read","dear","dare"]]
}
Note that each nested array in the above example denotes one set of words that are anagrams of each other. If the corpus contained multiple such sets, the outer array would contain one child array per set.
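The shape of that response can be illustrated with an in-memory grouping. `anagram_groups` is a hypothetical method, not the code behind the endpoint.

```ruby
# Hypothetical helper: all anagram sets containing at least `size` words.
def anagram_groups(words, size)
  words.group_by { |w| w.chars.sort.join }
       .values
       .select { |group| group.size >= size }
end

anagram_groups(%w[read dear dare cat act dog], 2)
# => [["read", "dear", "dare"], ["cat", "act"]]
```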
Deletes the given word as well as all of its anagrams.
curl --request DELETE --url 'http://localhost:3000/anagrams/read.json'
and returns:
HTTP/1.1 204 No Content
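In terms of the stored data, this removes every word that shares the given word's alphabetized form. A minimal in-memory sketch, with the hypothetical name `delete_with_anagrams`:

```ruby
# Hypothetical helper: returns the corpus without `word` and its anagrams.
def delete_with_anagrams(words, word)
  key = word.chars.sort.join
  words.reject { |w| w.chars.sort.join == key }
end

delete_with_anagrams(%w[read dear dare cat], "read") # => ["cat"]
```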
If I were to continue developing this project, I might consider:
- Using raw SQL to populate the alphabetize attribute instead of the after_create method I have implemented in the word.rb model.
- Adding word-length limits (for example, a maximum) to each query, such as finding all anagrams of words with length 10.
Thank you very much for your time and consideration.