ekinoguz / sprout-database
Database project of a graduate class.
We can do this in the CLI, but I am not sure if there can be a private test case in which they test this?
If 100,000 tuples are added as in @8fcc6a6b0f537534f9eeb8f2e8dd132ba36795b1,
then we get an error. I'll run valgrind and report back any findings.
There is something weird about it every time I think about it. We are splitting nodes while going down the tree. We might split a node at level 1, and then the lower-level node we descend into at level 2 might not need splitting; this would leave a page pointer in the level-1 node with nothing to point at. Is that right?
It just needs to call insertKey if applicable.
it is the extra test case 2.
What is it supposed to do? More importantly, does it pass?
I notice that a scan of a single key returns
488 488
462 462
436 436
410 410
384 384
358 358
332 332
306 306
280 280
254 254
228 228
202 202
176 176
150 150
124 124
98 98
72 72
46 46
20 20
Is this correct?
This should be done in the cli layer.
How are we going to keep track of the root node of the B-Tree (i.e., which page contains the root node)? I don't think we can assume it's always page zero, because if we split the root and have to create a new root, its page will be appended to the end of the file.
Because of the cache (most probably), in drop table.
When the cache is enabled, I get a seg fault in test case 3, at cache.cc line 72;
when the cache is disabled, I get a seg fault after the last test case, at cache.cc line 40.
This is an issue in their test case 7 and our 4.
Here's what I did in FindEntryPage and InsertEntry. Nothing is tested and splitting is not implemented yet.
This function returns the pageNum (it must be a leaf node) in the index file where the entry should be inserted. It always performs splitting on the way down. We need to write another search for IX_IndexScan that does not do splitting, unless we can merge the two functions, which are in two different classes.
The function starts with page 0, which is the root, and goes all the way down. At every level of the tree it parses the keys in the node and locates the lower-level pageNum to traverse next. Once it hits a leaf, it stops.
Splitting is not yet implemented and I have some doubt about it, check issue #13.
I used memcmp to compare the keys. I'm not sure if its usage is correct here, but it should be.
This function calls FindEntryPage first to locate the leaf-node page where the key should be inserted. Then it reads the page and scans through all the keys on it to locate where the new key should go (note that all keys should be sorted). The key is written with a 4-byte length prefix if needed, together with rid.pageNum and rid.slotNum.
While I was trying to add 2 records using file import in the cli, I got a "VarChar is larger than maximum" error. However, the size of the data is 12 and the max size of the varchar in the database is 100. I don't know if the problem is in the cli or in rm. If you run clitest on the ixcli branch, you will see the errors. (I commented out the other test cases.)
Seg fault with cache on,
double free with cache off.
We should run it through valgrind sometime later.
For the curious, here is the stack trace:
#0 0x00007fff887d4d46 in __kill ()
#1 0x00007fff8520adf0 in abort ()
#2 0x00007fff851e6905 in szone_error ()
#3 0x00007fff851e743e in small_free_list_remove_ptr ()
#4 0x00007fff851e510f in szone_free_definite_size ()
#5 0x00007fff851de8f8 in free ()
#6 0x000000010001baeb in RM_ScanFormattedIterator::~RM_ScanFormattedIterator (this=0x7fff5fbfee70) at rm.h:73
#7 0x000000010001500d in RM::getAttributesFromCatalog () at rm.cc:411
#8 0x0000000100015c9f in RM::insertTuple (this=0x10003ba20, tableName={static npos = , _M_dataplus = {<allocator> = {<__gnu_cxx::new_allocator> = {}, }, _M_p = 0x100304438 "emptestO5"}}, data=0x115f307b0, rid=@0x7fff5fbff050, useRid=false) at rm.cc:556
#9 0x0000000100001269 in insertTuples (tablename={static npos = , _M_dataplus = {<allocator> = {<__gnu_cxx::new_allocator> = {}, }, _M_p = 0x100304438 "emptestO5"}}, rids=@0x7fff5fbff4e0, tuples=@0x7fff5fbff4c0, number=600000, uniq=true, dups=4) at ixtest.cc:127
#10 0x00000001000073f7 in testCase_O5 () at ixtest.cc:1851
#11 0x000000010000a63d in ourTests () at ixtest.cc:2366
#12 0x000000010000a6ee in main () at ixtest.cc:2388
This should include large amounts of data, and all different types of scans (also scan and delete).
There are two errors outside of _Supposed to fail line_
It would just have to do a drop and then an add...
Thoughts?
Here's the scenario, which is happening in private test case 7:
The test case adds 50 pages to the file. Since the cache size is only 10 frames, pages 0 to 9 are stored in the cache (in frames 0 to 9, respectively). At this point every page in the cache has been used once (i.e., accessed once for either read or write). Now, when page 10 is added, the LFU policy evicts frame 0, and since that page is dirty it is written to disk. Page 10 is then stored in frame 0 of the cache with a usage count of 1. The problem occurs when page 11 is added to the cache: LFU again evicts frame 0 (because all frames have a usage count of 1), and page 10 (which is in frame 0) is written to disk. An error occurs because the file on disk has only page 0.
An easy and simple solution, which I don't like because it contradicts the requirements of project 1, is to remove the check that rejects a write when the page number being written is larger than the number of pages in the file. This would allow empty pages between page 0 and page 10; however, since we know those pages are in the cache, they will eventually be written to disk at some point.
Another solution is, when writing pages to the file, to consider the number of pages in the file and in the cache together. This also allows empty pages between 0 and 10, but makes sure those pages are in the cache.
EDIT: For reference here is the piazza question https://piazza.com/class#spring2013/cs222cs122c/62
Currently, GetNumberOfPages reads the number of pages directly from disk. This is incorrect if the file has dirty pages stored in the cache (specifically, new dirty pages).
I'm not sure what the best way to handle it is; maybe check whether there are any new dirty pages before returning the final page count?
This is low priority, but the error given when print can't find a table is confusing.
When there is a call made to insertTuple the tuple needs to be correctly translated using the catalog and then passed to insertFormattedTuple.
It looks like the drop attribute in clitest is returning a failure code.