Comments (20)
Where is the dataset?
from ethereum_future.
I used a dataset that includes all the features Raval was talking about, but my statistics are as follows:
Precision: 0.5376884422110553
Recall: 0.5783783783783784
F1 score: 0.5572916666666665
Mean Squared Error: 0.217757766929
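For anyone wanting to reproduce statistics like these on their own up/down predictions, a minimal sketch (the toy labels below are illustrative, not the commenter's data):

```python
import numpy as np

# Toy binary labels/predictions (illustrative only, not the commenter's data)
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 1, 1, 0, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
mse = np.mean((y_true - y_pred) ** 2)       # mean squared error

print(precision, recall, f1, mse)  # -> 0.75 0.75 0.75 0.25
```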
@llSourcell Could you please share the link to data source used to train this model ?
Also tried with a Kaggle dataset; I am getting:
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-18-c25ebc6ecaf6> in <module>()
----> 1 X_train, Y_train, X_test, Y_test, Y_daybefore, unnormalized_bases, window_size = load_data("Bitcoin Data.csv", 50)
2 print (X_train.shape)
3 print (Y_train.shape)
4 print (X_test.shape)
5 print (Y_test.shape)
<ipython-input-8-c56ba76693f8> in load_data(filename, sequence_length)
36 #Normalizing data by going through each window
37 #Every value in the window is divided by the first value in the window, and then 1 is subtracted
---> 38 d0 = np.array(result)
39 dr = np.zeros_like(d0)
40 dr[:,1:,:] = d0[:,1:,:] / d0[:,0:1,:] - 1
MemoryError:
Please post the URL of the data source where you got the CSV from.
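The frame the traceback stops in is the window normalization (every value divided by its window's first value, minus 1). Building np.array(result) plus the zeros_like copy materializes the whole dataset several times over in float64, which is one plausible cause of the MemoryError. A sketch of the same arithmetic in float32, with assumed shapes rather than the original load_data internals:

```python
import numpy as np

# Assumed shapes (windows x window_size x features), not the original file's
d0 = np.random.rand(2000, 50, 35).astype(np.float32)

# Same math as the notebook: each value relative to its window's first row,
# minus 1, but in float32 (half the memory of NumPy's default float64)
dr = d0 / d0[:, 0:1, :] - 1
dr[:, 0, :] = 0  # first row is 0 by construction; kept explicit like zeros_like

print(dr.dtype, dr.shape)  # -> float32 (2000, 50, 35)
```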
Same thing all the time with the code links in the video - he says 'it's easy to find any data'. But there are tons of CSV files on Kaggle, and most of them don't work (no doubt for a reason). Siraj needs to specify a little more than 'check out this code, it's easy to use'. FYI, he should also have mentioned it's for Python 2, not Python 3.
All of the mentioned fields can be retrieved from (or computed using data from) a bevy of free data sources:
- https://www.quandl.com/data/BCHAIN-Blockchain
- https://www.kaggle.com/mczielinski/bitcoin-historical-data
- https://blockchain.info/stats
We'll have to do a bit more legwork to get the data formatted correctly, but perhaps to fully understand how the network configuration / preprocessing works it can be valuable to reconfigure the existing code for a custom dataset.
I can't find any CSV online either that 100% matches the specified schema, but hey, sometimes building/cleaning your own dataset can be half the fun.
I'd like to see the data too. Maybe we can collaborate to build that dataset, unless the author can provide the code to do so. I think most of the value in this approach is in the dataset rather than the modeling techniques, although RNNs are powerful for time-series prediction. But right now I am much more interested in the data.
Please post a valid CSV example, because it doesn't work... :( I will create my own dataset, but what is the correct schema?
Please post the dataset used for training the model, or the correct data schema for it.
The tutorial dataset schema is specified in the Step 1 notebook cell. The columns of data and their definitions are as follows:
- Annual Hash Growth: Growth in the total network computations over the past 365 days
- Block Height: The total number of blocks in the blockchain
- Block Interval: Average amount of time between blocks
- Block Size: The storage size of each block (i.e. megabytes)
- BlockChain Size: The storage size of the blockchain (i.e. gigabytes)
- Daily Blocks: Number of blocks found each day
- Chain Value Density: The value of bitcoin's blockchain, in terms of dollars per megabyte
- Daily Transactions: The number of transactions included in the blockchain per day
- Difficulty: The minimum proof-of-work threshold required for a bitcoin miner to mine a block
- Fee Percentage: Average fee paid as a percentage of transaction volume
- Fee Rate: Average fee paid per transaction
- Two-Week Hash Growth: Growth in the total network computations over the past 14 days
- Hash Rate: The number of block solutions computed per second by all miners
- Market Capitalization: The market value of all bitcoin in circulation
- Metcalfe's Law - TX: A variant of Metcalfe's Law in which price is divided by n log n number of daily transactions
- Metcalfe's Law - UTXO: A variant of Metcalfe's Law in which price is divided by n log n number of unspent transaction outputs
- Miner Revenue Value: The amount of dollars earned by the mining network
- Miner Revenue: The amount of bitcoin earned by the mining network, in the form of block rewards and transaction fees
- Money Supply: The amount of bitcoin in circulation
- Output Value: The dollar value of all outputs sent over the network
- Output Volume: The amount of Bitcoin sent over the network
- Bitcoin Price: The amount of dollars a single bitcoin is worth
- Quarterly Hash Growth: Growth in the total network computations in the past 90 days
- Total Transactions: The running total number of transactions processed by the Bitcoin network
- Transaction Amount: The average amount of bitcoin moved per transaction
- Fees Value: The dollar value of mining fees
- Transaction Fees: The amount of bitcoin paid to miners in fees
- Transaction Size: The average data size of a transaction
- Transaction Value: The average dollar value moved in each transaction
- Transactions per Block: The number of transactions in each block
- Average UTXO Amount: The average amount of bitcoin contained in each unspent transaction output
- UTXO Growth: The net number of unspent transaction outputs created
- UTXO Set Size: The total number of unspent transaction outputs
- Average UTXO Value: The average dollar value of each unspent transaction output
- Velocity - Daily: The proportion of the money supply transacted each day
- Velocity - Quarterly: The proportion of the money supply transacted each day, computed on a rolling-quarter basis
- Velocity of Money: How many times the money supply changes hands in a given year
I imagine that the model can still be trained effectively on different schemas too - but you may have to adjust the shape of the tensor depending on the number of features.
Check the code near this comment for reference -
#Convert the data to a 3D array (a x b x c)
#Where a is the number of days, b is the window size, and c is the number of features in the data file
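A minimal sketch of that windowing step, with made-up sizes rather than the tutorial's schema:

```python
import numpy as np

# Made-up sizes: 100 rows (days) of 5 features each
raw = np.random.rand(100, 5)
window_size = 10

# Slide a window over the rows to get the 3D array the comment describes:
# a = number of windows, b = window size, c = number of features
windows = np.array([raw[i:i + window_size]
                    for i in range(len(raw) - window_size + 1)])

print(windows.shape)  # -> (91, 10, 5)
```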
URL of the dataset, please.
please provide the datasets, thanks
Yeah, as above... can you please provide the dataset? Thank you.
Note you can use almost any dataset in his code; it is mostly generic. The main thing you'd have to be aware of is the index of the BTC prices. It appears to be 20 in his code - look where he gets the labels for y_train. If you change that index to match the index in your own data, the rest of the code should work with whatever dataset you want.
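A toy sketch of that label-index point (PRICE_COL and all the sizes here are assumptions for illustration, not the tutorial's actual values):

```python
import numpy as np

PRICE_COL = 2  # assumed price-column index; the tutorial's appears to be 20

# Tiny stand-in for the windowed 3D tensor: (days, window_size, features)
days, window, features = 6, 3, 4
data = np.arange(days * window * features, dtype=float).reshape(
    days, window, features)

X = data[:, :-1, :]         # every feature for all but the last time step
y = data[:, -1, PRICE_COL]  # the label: price column at the last time step

print(X.shape, y.shape)  # -> (6, 2, 4) (6,)
```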
TypeError Traceback (most recent call last)
in ()
----> 1 model = initialize_model(window_size, 0.2, 'linear', 'mse', 'adam')
2 print(model.summary())
in initialize_model(window_size, dropout_value, activation_function, loss_function, optimizer)
18
19 #First recurrent layer with dropout
---> 20 model.add(Bidirectional(LSTM(window_size, return_sequences=True), input_shape=(window_size, X_train.shape[-1]),))
21 model.add(Dropout(dropout_value))
22
/usr/local/lib/python3.5/dist-packages/keras/models.py in add(self, layer)
278 else:
279 input_dtype = None
--> 280 layer.create_input_layer(batch_input_shape, input_dtype)
281
282 if len(layer.inbound_nodes) != 1:
/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py in create_input_layer(self, batch_input_shape, input_dtype, name)
368 # and create the node connecting the current layer
369 # to the input layer we just created.
--> 370 self(x)
371
372 def assert_input_compatibility(self, input):
/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py in call(self, x, mask)
485 'layer.build(batch_input_shape)')
486 if len(input_shapes) == 1:
--> 487 self.build(input_shapes[0])
488 else:
489 self.build(input_shapes)
/usr/local/lib/python3.5/dist-packages/keras/layers/wrappers.py in build(self, input_shape)
228
229 def build(self, input_shape):
--> 230 self.forward_layer.build(input_shape)
231 self.backward_layer.build(input_shape)
232
/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py in build(self, input_shape)
708 self.W_o, self.U_o, self.b_o]
709
--> 710 self.W = K.concatenate([self.W_i, self.W_f, self.W_c, self.W_o])
711 self.U = K.concatenate([self.U_i, self.U_f, self.U_c, self.U_o])
712 self.b = K.concatenate([self.b_i, self.b_f, self.b_c, self.b_o])
/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py in concatenate(tensors, axis)
716 return tf.sparse_concat(axis, tensors)
717 else:
--> 718 return tf.concat(axis, [to_dense(x) for x in tensors])
719
720
/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/array_ops.py in concat(values, axis, name)
1045 ops.convert_to_tensor(axis,
1046 name="concat_dim",
-> 1047 dtype=dtypes.int32).get_shape(
1048 ).assert_is_compatible_with(tensor_shape.scalar())
1049 return identity(values[0], name=scope)
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py in convert_to_tensor(value, dtype, name, preferred_dtype)
649 name=name,
650 preferred_dtype=preferred_dtype,
--> 651 as_ref=False)
652
653
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype)
714
715 if ret is None:
--> 716 ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
717
718 if ret is NotImplemented:
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/constant_op.py in _constant_tensor_conversion_function(v, dtype, name, as_ref)
174 as_ref=False):
175 _ = as_ref
--> 176 return constant(v, dtype=dtype, name=name)
177
178
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/constant_op.py in constant(value, dtype, shape, name, verify_shape)
163 tensor_value = attr_value_pb2.AttrValue()
164 tensor_value.tensor.CopyFrom(
--> 165 tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
166 dtype_value = attr_value_pb2.AttrValue(type=tensor_value.tensor.dtype)
167 const_tensor = g.create_op(
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_util.py in make_tensor_proto(values, dtype, shape, verify_shape)
365 nparray = np.empty(shape, dtype=np_dt)
366 else:
--> 367 _AssertCompatible(values, dtype)
368 nparray = np.array(values, dtype=np_dt)
369 # check to them.
/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_util.py in _AssertCompatible(values, dtype)
300 else:
301 raise TypeError("Expected %s, got %s of type '%s' instead." %
--> 302 (dtype.name, repr(mismatch), type(mismatch).name))
303
304
TypeError: Expected int32, got <tensorflow.python.ops.variables.Variable object at 0x7fdee083b518> of type 'Variable' instead
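This looks like the known Keras 1.x / TensorFlow 1.x incompatibility: TF 1.0 changed tf.concat's signature from concat(axis, values) to concat(values, axis), and the Keras frame above (tensorflow_backend.py line 718) still passes axis first, so the LSTM weight Variables land where TF now expects an int32 axis. Upgrading Keras to a release that matches your TensorFlow (or pinning TensorFlow below 1.0) is the usual fix. The post-1.0 argument order, illustrated with NumPy's equivalent call:

```python
import numpy as np

# values first, then axis - the same order tf.concat uses from TF 1.0 onward
parts = [np.ones((2, 3)), np.zeros((2, 3))]
joined = np.concatenate(parts, axis=1)

print(joined.shape)  # -> (2, 6)
```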
@simonhughes22 I agree with you. I tested the code with a 4-feature dataset from Kaggle, and it works. However, apart from some code questions, I am wondering how many days ahead the code can predict, and how you can see those predictions. I would appreciate your take on this.
hi @TingALin, which dataset did you use?
this https://www.kaggle.com/mczielinski/bitcoin-historical-data/data ?
@calvinchankf yes, but only with open, close, high, low as the features for testing
@calvinchankf btw, do you know how to print out the predicted price? Because I don't see the predicted price in the code.
@TingALin no, I gave up on this sample because I think using future features (bi-directional) is kind of unrealistic for predicting future prices, so I ended up studying other samples.
Related Issues (12)
- unnormalized_bases error in load_data function HOT 1
- Codes in general
- Data model and shuffle? Why? HOT 1
- This is junk code HOT 2
- Where to find "Bitcoin Data.csv"? HOT 1
- Getting the data...
- Better than simple Gaussian Approximation? HOT 1
- License? HOT 1
- what about the comment below you video? HOT 9
- Normalization
- It's only natural someone would do this eventually HOT 4