sjvasquez / web-traffic-forecasting

Kaggle | Web Traffic Forecasting 📈

Python 100.00%
time-series forecasting convolutional-neural-networks tensorflow

web-traffic-forecasting's Issues

Decode Features

In the decode features, why are we passing the one hot encoded values of the categorical variables?

        self.decode_features = tf.concat([
            tf.one_hot(decode_idx, self.num_decode_steps),
            tf.tile(tf.reshape(self.log_x_encode_mean, (-1, 1, 1)), (1, self.num_decode_steps, 1)),
            tf.tile(tf.expand_dims(tf.one_hot(self.project, 9), 1), (1, self.num_decode_steps, 1)),
            tf.tile(tf.expand_dims(tf.one_hot(self.access, 3), 1), (1, self.num_decode_steps, 1)),
            tf.tile(tf.expand_dims(tf.one_hot(self.agent, 2), 1), (1, self.num_decode_steps, 1)),
        ], axis=2)
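
To make the shape of this tensor concrete, here is a NumPy sketch of the same concatenation. It is not the repository's code: `one_hot` and `build_decode_features` are hypothetical helpers, and it assumes `decode_idx` holds the step indices 0..num_decode_steps-1 for every series in the batch.

```python
import numpy as np

def one_hot(idx, depth):
    """Return one-hot rows for integer indices (NumPy stand-in for tf.one_hot)."""
    return np.eye(depth, dtype=np.float32)[np.asarray(idx)]

def build_decode_features(log_mean, project, access, agent, num_decode_steps):
    """Concatenate per-step and per-series features along the last axis."""
    batch = len(project)

    # Assumption: decode_idx is 0..num_decode_steps-1, repeated for each series.
    step = np.tile(one_hot(np.arange(num_decode_steps), num_decode_steps)[None],
                   (batch, 1, 1))

    def per_step(x):
        # (batch, d) -> (batch, num_decode_steps, d): repeat static features per step.
        return np.tile(x[:, None, :], (1, num_decode_steps, 1))

    return np.concatenate([
        step,
        per_step(np.asarray(log_mean, dtype=np.float32).reshape(-1, 1)),
        per_step(one_hot(project, 9)),
        per_step(one_hot(access, 3)),
        per_step(one_hot(agent, 2)),
    ], axis=2)

# Two series, four decode steps (illustrative values only).
features = build_decode_features([0.5, 1.0], [0, 8], [1, 2], [0, 1], num_decode_steps=4)
```

In this sketch the last axis has 4 + 1 + 9 + 3 + 2 = 19 entries, and the categorical one-hot vectors are identical at every decode step for a given series.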

Two errors occurred while running cnn.py

With Anaconda Python 3.6:

1.
File "D:\Anaconda\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 32, in __init__
self._value = int(value)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'Tensor'

2.
File "D:\Anaconda\lib\site-packages\tensorflow\python\framework\tensor_util.py", line 302, in _AssertCompatible
(dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got 1.0 of type 'float' instead.

What is the hierarchy of the codes/files in this repo?

Hi,
Can anybody help me figure out the order in which to run the code in this repo? I cannot work out the hierarchy of the files, so I do not know how to run them step by step to reproduce the results.
Thanks

Data folder is empty

The data folder does not contain the train and test datasets or a processed folder, and the training data from Kaggle comes as train_1 and train_2. How can we use these?

sequence smape loss function

zero_loss = 2.0*tf.ones_like(smape)
nonzero_loss = smape
smape = tf.where(tf.logical_or(tf.equal(y, 0.0), tf.equal(y_hat, 0.0)), zero_loss, nonzero_loss)

The condition uses 'or'. What if y != 0.0 and y_hat == 0.0? The sequence SMAPE will still return zero_loss.

I think it should be an 'and' condition.
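
To check the cases raised above numerically, here is a small NumPy sketch. It is not the repository's TensorFlow code, and `raw_smape` and `masked_smape` are hypothetical helper names. It evaluates the element-wise term 2|y - y_hat| / (|y| + |y_hat|) and then applies either mask.

```python
import numpy as np

def raw_smape(y, y_hat):
    """Element-wise 2|y - y_hat| / (|y| + |y_hat|); NaN where both are zero."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    with np.errstate(invalid="ignore", divide="ignore"):
        return 2.0 * np.abs(y - y_hat) / (np.abs(y) + np.abs(y_hat))

def masked_smape(y, y_hat, mode="or"):
    """Replace the raw term with 2.0 wherever the chosen zero-mask is true."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    combine = np.logical_or if mode == "or" else np.logical_and
    mask = combine(y == 0.0, y_hat == 0.0)
    return np.where(mask, 2.0, raw_smape(y, y_hat))

# Cases: only y_hat zero, only y zero, both zero, neither zero.
y = [5.0, 0.0, 0.0, 5.0]
y_hat = [0.0, 5.0, 0.0, 5.0]
raw = raw_smape(y, y_hat)
with_or = masked_smape(y, y_hat, mode="or")
with_and = masked_smape(y, y_hat, mode="and")
```

In this sketch the unmasked term already equals 2.0 whenever exactly one of the two values is zero, and it is undefined (0/0) only when both are zero, so the two masks give the same forward values. How gradients flow through `tf.where` is not covered here.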

Code not running -Tensorflow gather_nd bounds problem

I was trying to get this code running on my local system and I am getting this error:

Traceback (most recent call last):
File "cnn.py", line 414, in <module>
nn.fit()
File "/Users/srikanthjammy/Documents/midterm/tf_base_model.py", line 142, in fit
feed_dict=val_feed_dict
File "/Users/srikanthjammy/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/Users/srikanthjammy/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/Users/srikanthjammy/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/Users/srikanthjammy/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: flat indices[15493, :] = [121, -1] does not index into param (shape: [128,486,32]).
[[Node: GatherNd_23 = GatherNd[Tindices=DT_INT32, Tparams=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](add_24, stack_23)]]

Caused by op u'GatherNd_23', defined at:
File "cnn.py", line 412, in <module>
num_decode_steps=64,
File "cnn.py", line 121, in __init__
super(cnn, self).__init__(**kwargs)
File "/Users/srikanthjammy/Documents/midterm/tf_base_model.py", line 99, in __init__
self.graph = self.build_graph()
File "/Users/srikanthjammy/Documents/midterm/tf_base_model.py", line 344, in build_graph
self.loss = self.calculate_loss()
File "cnn.py", line 366, in calculate_loss
y_hat_decode = self.decode(y_hat_encode, conv_inputs, features=self.decode_features)
File "cnn.py", line 265, in decode
slices = tf.reshape(tf.gather_nd(conv_input, idx), (batch_size, dilation, shape(conv_input, 2)))
File "/Users/srikanthjammy/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1338, in gather_nd
name=name)
File "/Users/srikanthjammy/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)

From searching online, this looks like a TensorFlow bug: tensorflow/tensorflow#12608
Did you face this issue?
I'm running on CPU, by the way.
My versions of numpy, pandas, scikit and tensorflow are the ones you mentioned.
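
The error message reports an index row containing -1. TensorFlow's `tf.gather_nd` documentation notes that an out-of-bounds index raises an error on CPU, whereas on GPU a 0 is stored in the output instead, which may explain why the problem only shows up on some machines. As a debugging aid, the following NumPy sketch locates such rows before they reach `gather_nd`. `find_out_of_bounds` is a hypothetical helper, not part of the repository.

```python
import numpy as np

def find_out_of_bounds(indices, param_shape):
    """Return the flat row numbers of `indices` that do not index into `param_shape`.

    `indices` has shape (..., k) and addresses the first k dimensions of the
    parameter tensor, as in gather_nd.
    """
    indices = np.asarray(indices)
    k = indices.shape[-1]
    limits = np.asarray(param_shape[:k])
    flat = indices.reshape(-1, k)
    bad = np.any((flat < 0) | (flat >= limits), axis=1)
    return np.flatnonzero(bad)

# Shape taken from the error message above; the index rows are illustrative.
bad_rows = find_out_of_bounds([[0, 0], [121, -1], [127, 485]], (128, 486, 32))
```

Evaluating the index tensor in a session and passing it through a check like this shows which batch rows and time offsets go negative.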

line 342 in cnn.py

(next_finished, emit_output, state_queues) = loop_fn(time, initial_input, state_queues)
This code calls loop_fn with initial_input, so I think the input is never updated inside the loop. Can you explain this?

cnn.py line 260: queue_begin_time = self.encode_len - dilation - 1

I think this line should be self.encode_len - dilation. For example, with the sequence [0,1,2,3,4,5,6,7,8,9] and dilation=4, idx = 10 - 4 = 6, so
slices = tf.reshape(tf.gather_nd(conv_input, idx), (batch_size, dilation, shape(conv_input, 2)))

should be [6,7,8,9] (the last `dilation` elements of the sequence). Otherwise you lose the last day's value.
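
The example above can be reproduced with a short NumPy sketch. It assumes the slice consists of `dilation` consecutive time steps starting at `queue_begin_time`; `queue_slice` and its `offset` argument are hypothetical names used only for this illustration.

```python
import numpy as np

def queue_slice(seq, dilation, offset):
    """Take `dilation` consecutive items starting at len(seq) - dilation - offset."""
    begin = len(seq) - dilation - offset
    return seq[begin + np.arange(dilation)]

seq = np.arange(10)
with_minus_one = queue_slice(seq, dilation=4, offset=1)     # begin = 5
without_minus_one = queue_slice(seq, dilation=4, offset=0)  # begin = 6
```

Under that assumption, the `- 1` variant selects [5, 6, 7, 8] and the variant without it selects [6, 7, 8, 9]. Whether the final element is supplied to the decoder by another route is not visible from this snippet alone.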

How do separate parameters help with accumulating noise?

WaveNet was trained using next step prediction, so errors can accumulate as the model generates long sequences in the absence of conditioning information. To remedy this, we trained the model to minimize the loss when unraveled for 64 steps. We adopt a sequence to sequence approach where the encoder and decoder do not share parameters. This allows the decoder to handle the accumulating noise when generating long sequences.

The passage above says that with separate parameters the accumulating noise is less of an issue, but the encoder still accumulates noise and then passes it on to the decoder. I may be missing something needed to understand the full picture. Can you please tell us more about it?

padding seems wrong

In the function temporal_convolution_layer:
shift = (kernel_size // 2) + (int(dilation_rate - 1) // 2)

In Keras and some other implementations, the equation is:
shift = dilation_rate * (kernel_size - 1)

If the shift here is wrong, the model may be using future information.
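
To see how far the two formulas diverge, the sketch below tabulates both for a few dilation rates. The function names are hypothetical, and kernel_size = 2 is an assumption chosen for illustration.

```python
def shift_repo(kernel_size, dilation_rate):
    """Shift as written in temporal_convolution_layer (quoted above)."""
    return (kernel_size // 2) + (int(dilation_rate - 1) // 2)

def shift_keras_style(kernel_size, dilation_rate):
    """Shift as given in the second formula quoted above."""
    return dilation_rate * (kernel_size - 1)

# (dilation_rate, repo shift, second-formula shift) for kernel_size = 2
table = [(d, shift_repo(2, d), shift_keras_style(2, d)) for d in (1, 2, 4, 8)]
```

For kernel_size = 2 the first formula gives 1, 1, 2, 4 and the second gives 1, 2, 4, 8. This only compares the arithmetic; whether the smaller value leaks future information depends on the convolution's padding mode and any later slicing, which are not shown in this issue.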

Should shift be increased by 1?

    if causal:
        shift = int((convolution_width / 2) + (int(dilation_rate[0] - 1) / 2))
        pad = tf.zeros([tf.shape(inputs)[0], shift, inputs.shape.as_list()[2]])
        inputs = tf.concat([pad, inputs], axis=1)

Perhaps shift should be increased by 1.
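
One thing worth checking with this expression is that `/` behaves differently across Python versions: with integer operands it floors in Python 2 but is true division in Python 3. The sketch below evaluates both readings. It uses a scalar `dilation_rate` instead of `dilation_rate[0]`, and both function names are hypothetical.

```python
def shift_floor(convolution_width, dilation_rate):
    """The expression with floor division (Python 2 behaviour for ints)."""
    return int((convolution_width // 2) + (int(dilation_rate - 1) // 2))

def shift_true(convolution_width, dilation_rate):
    """The expression with true division (Python 3 behaviour)."""
    return int((convolution_width / 2) + (int(dilation_rate - 1) / 2))

# (width, dilation) pairs chosen for illustration.
cases = [(2, 1), (2, 2), (2, 4), (3, 2), (3, 4)]
floor_shifts = [shift_floor(w, d) for w, d in cases]
true_shifts = [shift_true(w, d) for w, d in cases]
```

For a width of 2 the two readings agree on these cases, but for a width of 3 the true-division version is one larger, so the padding can depend on the interpreter used.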

Always uses initial_input for loop_fn

Hi,

Thanks so much for sharing your excellent work. I am confused by the decode part:

def body(time, elements_finished, emit_ta, *state_queues):
    (next_finished, emit_output, state_queues) = loop_fn(time, initial_input, state_queues)
    emit = tf.where(elements_finished, tf.zeros_like(emit_output), emit_output)
    emit_ta = emit_ta.write(time, emit)
    elements_finished = tf.logical_or(elements_finished, next_finished)
    return [time + 1, elements_finished, emit_ta] + list(state_queues)

On line 343, loop_fn always receives initial_input as its current_input parameter.

Why don't we pass the previous prediction to loop_fn instead? Something like:

def body(time, elements_finished, emit_ta, *state_queues):
    # Use the initial input at step 0, otherwise the previous step's output.
    current_input = tf.cond(
        tf.equal(time, 0),
        lambda: initial_input,
        lambda: emit_ta.read(time - 1))
    (next_finished, emit_output, state_queues) = loop_fn(time, current_input, state_queues)
    ...
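
The difference between the two loop bodies can be illustrated without TensorFlow. This is a plain-Python sketch in which a hypothetical `step_fn` stands in for `loop_fn`; it ignores the state queues and the finished flags, so it only shows what the step function receives at each step.

```python
def decode(step_fn, initial_input, num_steps, feed_back):
    """Run step_fn for num_steps; optionally feed each output back as the next input."""
    outputs = []
    current = initial_input
    for t in range(num_steps):
        out = step_fn(t, current)
        outputs.append(out)
        if feed_back:
            current = out  # the previous prediction becomes the next input
    return outputs

# Toy step function: the "prediction" is simply the input plus one.
def step_fn(t, x):
    return x + 1

fixed_input = decode(step_fn, 0, num_steps=3, feed_back=False)
fed_back = decode(step_fn, 0, num_steps=3, feed_back=True)
```

With the toy step function, the fixed-input loop yields the same output at every step, while the feedback loop yields a sequence that depends on earlier predictions. In the real model the state queues also carry information between steps, so the fixed-input version is not necessarily constant there.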
