Read these instructions carefully. Understand exactly what is expected before starting this Sprint Challenge.
This challenge allows you to practice the concepts and techniques learned over the past sprint and apply them in a concrete project. This sprint explored hash tables. During this sprint, you studied hash functions, collision resolution, complexity analysis of hash tables, load factor, resizing, and various use cases for hash tables. In your challenge this week, you will demonstrate mastery of these skills by solving five problems related to hash tables.
The sprint challenge is an individual assessment. All work must be your own. Your challenge score is a measure of your ability to work independently using the material covered through this sprint. You need to demonstrate proficiency in the concepts and objectives introduced and practiced in preceding days.
You are not allowed to collaborate during the sprint challenge. However, you are encouraged to follow the twenty-minute rule and seek support from your TL if you need direction.
You have three hours to complete this challenge. Plan your time accordingly.
This challenge requires you to solve algorithm problems that can be solved efficiently with a hash table.
Commit your code regularly and meaningfully. This practice helps both you (in case you ever need to return to old code for any number of reasons) and your Team Lead as they evaluate your solution.
Be prepared to demonstrate your understanding of this week's concepts by answering questions on the following topics. You might prepare by writing down your answers beforehand.
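- Hashing functions
  A hash function takes data of any size as input and returns a number. The same input always produces the same number, and the hash table reduces that number to a slot index, typically by taking it modulo the number of slots.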
- Collision resolution
  When two items hash to the same slot, we need a systematic method for placing the second item in the hash table. One option is to let each slot hold a reference to a collection (or chain) of items, so that colliding items share the slot.
- Performance of basic hash table operations
  Once we take collisions into account, the worst case is linear time, O(n), which happens when many keys land in the same slot. The average case is still constant time, O(1), provided we handle collisions well and use a good hash function.
- Load factor
  The load factor measures how full the hash table is: the number of items stored in the hash table divided by the number of slots. When it exceeds a chosen threshold, the table's capacity is automatically increased.
- Automatic resizing
  When the load factor exceeds the threshold (for example, greater than 0.7), a new, larger table is allocated; doubling the size of the hash table is common. Each entry is removed from the old table and inserted into the new one. Items must be re-inserted (re-hashed) rather than copied slot for slot, because an item's slot index depends on the table's capacity.
- Various use cases for hash tables
  Hash tables are among the most important data structures. They are used to implement objects and dictionaries in many languages, as well as key/value stores spread over distributed computer networks. They provide key/value storage with average-case constant time complexity for insertion, deletion, and search.
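The ideas in the list above (hash function, chaining, load factor, resizing) can be tied together in one small sketch. This is an illustration only, not a reference solution for the challenge: the class name `ChainedHashTable`, its method names, the djb2-style hash, and the starting capacity of 8 are all assumptions chosen for the example. The 0.7 threshold and the doubling strategy are taken from the answers above.

```python
class ChainedHashTable:
    """Illustrative hash table using separate chaining (hypothetical names)."""

    MAX_LOAD = 0.7  # resize threshold mentioned in the text above

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.count = 0
        # Each slot holds a chain: a list of [key, value] pairs.
        self.slots = [[] for _ in range(capacity)]

    def _hash(self, key):
        # djb2-style hash over the key's string form: data in, number out.
        h = 5381
        for byte in str(key).encode("utf-8"):
            h = (h * 33 + byte) & 0xFFFFFFFF
        return h

    def _index(self, key):
        # Reduce the hash to a slot index for the current capacity.
        return self._hash(key) % self.capacity

    def load_factor(self):
        # Number of stored items divided by number of slots.
        return self.count / self.capacity

    def put(self, key, value):
        chain = self.slots[self._index(key)]
        for pair in chain:
            if pair[0] == key:  # key already present: overwrite
                pair[1] = value
                return
        chain.append([key, value])  # new key (or collision): extend the chain
        self.count += 1
        if self.load_factor() > self.MAX_LOAD:
            self._resize(self.capacity * 2)

    def get(self, key, default=None):
        # Average case O(1); worst case O(n) if every key shares one chain.
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        return default

    def delete(self, key):
        chain = self.slots[self._index(key)]
        for i, pair in enumerate(chain):
            if pair[0] == key:
                del chain[i]
                self.count -= 1
                return True
        return False

    def _resize(self, new_capacity):
        old_slots = self.slots
        self.capacity = new_capacity
        self.slots = [[] for _ in range(new_capacity)]
        self.count = 0
        # Re-insert every item: slot indexes depend on the capacity,
        # so old positions cannot simply be copied across.
        for chain in old_slots:
            for k, v in chain:
                self.put(k, v)
```

Under these assumptions, a table created with `ChainedHashTable(capacity=4)` doubles to 8 slots on its third insert (3/4 = 0.75 exceeds 0.7) and to 16 on its sixth (6/8 = 0.75).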
We expect you to be able to answer questions in these areas. Your responses contribute to your Sprint Challenge grade.
- Create a forked copy of this project
- Add your team lead as a collaborator on GitHub
- Clone your OWN version of the repository (Not Lambda's by mistake!)
- Create a new branch: git checkout -b <firstName-lastName>
- Implement the project on your newly created <firstName-lastName> branch, committing changes regularly
- Push commits: git push origin <firstName-lastName>
Your finished project must include all of the following requirements:
- Solve any three of the five problems
For each problem that you choose to solve, complete the following:
- Navigate into each exercise's directory
- Read the instructions for the exercise in the README
- Implement your solution in the .py skeleton file
- Make sure your code passes the tests by running the test script with make tests
Note: For these exercises, we expect you to use Python's built-in dict as a hashtable. That said, if you wish, you can attempt to solve using your own hashtable implementation, as well. All solutions should utilize a dict or hashtable. You should not use Sets. (Though you can make a dict behave like a set if you wish.)
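As a rough illustration of the note above, here is one way a dict can stand in for a set, along with the common counting pattern. The function names `unique_in_order` and `count_occurrences` are made up for this example and are not part of the challenge's skeleton files. The first function relies on dicts preserving insertion order, which Python guarantees from version 3.7 onward.

```python
def unique_in_order(items):
    """Return the items with duplicates removed, keeping first-seen order."""
    seen = {}  # keys act as the set members; the values are placeholders
    for item in items:
        seen[item] = True  # re-assigning an existing key keeps its position
    return list(seen)


def count_occurrences(items):
    """Map each item to the number of times it appears."""
    counts = {}
    for item in items:
        # dict.get returns the default (0) when the key is not present yet.
        counts[item] = counts.get(item, 0) + 1
    return counts
```

Membership checks such as `item in seen` run in constant time on average, which is usually the reason to reach for a dict in these exercises.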
After finishing your required elements, you can push your work further. These goals may or may not be things you have learned in this module, but they build on the material you just studied. Time allowing, stretch your limits, and see if you can deliver on the following optional goals:
- Solve any four of the five problems
- Solve all five problems
Follow these steps to complete your project.
- Submit a pull request to merge your <firstName-lastName> branch into master (in your own repo). Please don't merge your own pull request
- Add your team lead as a reviewer on the pull-request
- Your team lead will count the project as complete after receiving your pull-request