google-bigquery
Automatically exported from code.google.com/p/google-bigquery
http://stackoverflow.com/questions/17751944/google-bigquery-incomplete-query-replies-on-odd-attempts
We would like some Python code demonstrating pagination of a query reply. We
found the documentation page: developers.google.com/bigquery/docs/data#paging,
but it didn't have any code samples. It's not clear from the documentation how
this would be done.
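In the absence of an official sample, a minimal pagination sketch with google-api-python-client (the authorized `service` object and an already-finished query job are assumptions, as is the helper name `fetch_all_rows`):

```python
# Pagination sketch for the BigQuery v2 API via google-api-python-client.
# Assumptions: `service` is an authorized bigquery service object and the
# query job identified by job_id has already completed.

def fetch_all_rows(service, project_id, job_id, page_size=1000):
    """Collect every result row of a finished query job, page by page."""
    rows = []
    page_token = None
    while True:
        resp = service.jobs().getQueryResults(
            projectId=project_id,
            jobId=job_id,
            maxResults=page_size,
            pageToken=page_token,
        ).execute()
        rows.extend(resp.get('rows', []))
        page_token = resp.get('pageToken')
        if not page_token:  # no token in the reply means this was the last page
            break
    return rows
```

The loop simply re-issues getQueryResults with the pageToken from the previous reply until no token is returned.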
Original issue reported on code.google.com by [email protected]
on 31 Jul 2013 at 7:55
What steps will reproduce the problem?
1. Log into the BigQuery Console.
2. Under my project named "API Project", click the dropdown and select "Create
new dataset".
3. Note that the dataset appears fine in that you can interact with it (create
tables, upload data, etc.)
4. Refresh the page and the dataset is gone.
5. Try to create a dataset with the same name and it says it exists even though
it is not displayed.
What is the expected output? What do you see instead?
I expect the dataset to still be listed under my project.
What version of the product are you using? On what operating system?
Tried with Chrome and IE.
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 22 Oct 2013 at 6:51
What steps will reproduce the problem?
1. A file is stored at
gs://nested_logs_tagtootrack/log2bq-20130704-1580687957719EDE6238F-output
2. When I load the file with the bq command-line tool, everything works:
bq load --source_format=NEWLINE_DELIMITED_JSON --max_bad_records=0
logs_tagtootrack.nested_logs_20130704
gs://nested_logs_tagtootrack/log2bq-20130704-1580687957719EDE6238F-output
tagtootrack.new.schema
The tagtootrack.new.schema file looks like:
[{"type": "string", "name": "session"}, {"type": "string", "name": "slot"},
{"type": "string", "name": "target"}, {"type": "string", "name": "vars",
"mode": "repeated"}, {"type": "string", "name": "title"}, {"type": "string",
"name": "ip"}, {"type": "float", "name": "start_time"}, {"type": "string",
"name": "publisher"}, {"type": "string", "name": "ext"}, {"type": "string",
"name": "host"}, {"type": "string", "name": "tag"}, {"type": "string", "name":
"features", "mode": "repeated"}, {"type": "string", "name": "user_agent"},
{"type": "string", "name": "version"}, {"fields": [{"type": "string", "name":
"var"}, {"type": "string", "name": "advertiser"}, {"type": "string", "name":
"campaign"}], "type": "record", "name": "items", "mode": "repeated"}, {"type":
"string", "name": "referral"}, {"type": "string", "name": "type"}, {"type":
"string", "name": "page"}, {"type": "string", "name": "user"}]
3. When I load the same file through the API, it fails and returns:
{u'status': {u'state': u'DONE', u'errors': [{u'reason': u'internalError',
u'message': u'Unexpected. Please try again.'}, {u'reason': u'invalid',
u'message': u'Too many errors encountered. Limit is: 0.'}], u'errorResult':
{u'reason': u'invalid', u'message': u'Too many errors encountered. Limit is:
0.'}}, u'kind': u'bigquery#job', u'statistics': {u'load': {u'outputRows':
u'214468', u'inputFiles': u'1', u'inputFileBytes': u'10260708075',
u'outputBytes': u'0'}, u'endTime': u'1372990828961', u'startTime':
u'1372990544203'}, u'jobReference': {u'projectId': u'tagtoosql', u'jobId':
u'job_417754af2f234f61b917ab2583e6a3df'}, u'etag':
u'"hGuPPOkA35eFd5045aqCjqFeM_s/_YV1JYm-8zv5rNmA7GSBkZYt-3M"', u'configuration':
{u'load': {u'encoding': u'UTF-8', u'sourceFormat': u'NEWLINE_DELIMITED_JSON',
u'destinationTable': {u'projectId': u'tagtoosql', u'tableId':
u'nested_logs_20130704', u'datasetId': u'logs_tagtootrack'},
u'writeDisposition': u'WRITE_TRUNCATE', u'sourceUris':
[u'gs://nested_logs_tagtootrack/log2bq-20130704-1580687957719EDE6238F-output'],
u'createDisposition': u'CREATE_IF_NEEDED', u'schema': {u'fields': [{u'type':
u'STRING', u'name': u'session'}, {u'type': u'STRING', u'name': u'slot'},
{u'type': u'STRING', u'name': u'target'}, {u'type': u'STRING', u'name':
u'title'}, {u'type': u'STRING', u'name': u'ip'}, {u'type': u'FLOAT', u'name':
u'start_time'}, {u'type': u'STRING', u'name': u'publisher'}, {u'type':
u'STRING', u'name': u'ext'}, {u'type': u'STRING', u'name': u'creative'},
{u'type': u'STRING', u'name': u'host'}, {u'type': u'STRING', u'name': u'tag'},
{u'type': u'STRING', u'name': u'features', u'mode': u'REPEATED'}, {u'type':
u'STRING', u'name': u'user_agent'}, {u'type': u'STRING', u'name': u'version'},
{u'fields': [{u'type': u'STRING', u'name': u'advertiser'}, {u'type': u'STRING',
u'name': u'campaign'}], u'type': u'RECORD', u'name': u'items', u'mode':
u'REPEATED'}, {u'type': u'STRING', u'name': u'referral'}, {u'type': u'STRING',
u'name': u'type'}, {u'type': u'STRING', u'name': u'page'}, {u'type': u'STRING',
u'name': u'user'}]}}}, u'id':
u'tagtoosql:job_417754af2f234f61b917ab2583e6a3df', u'selfLink':
u'https://www.googleapis.com/bigquery/v2/projects/tagtoosql/jobs/job_417754af2f2
34f61b917ab2583e6a3df'}
The two methods should return the same results; please also provide more
information to help users debug the API.
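One way to rule out schema drift between the two paths (a sketch, not the reporter's code; the `service` object from google-api-python-client and the helper name are assumptions) is to build the API load request from the very schema file that `bq load` consumed:

```python
# Build the API load-job body from the same schema file that `bq load`
# consumed, so the two load paths cannot drift apart. The function name
# load_job_body_from_schema_file is illustrative, not part of any API.
import json

def load_job_body_from_schema_file(project_id, dataset_id, table_id,
                                   source_uri, schema_path):
    with open(schema_path) as f:
        fields = json.load(f)  # the same JSON array the bq CLI reads
    return {'configuration': {'load': {
        'sourceFormat': 'NEWLINE_DELIMITED_JSON',
        'maxBadRecords': 0,
        'schema': {'fields': fields},
        'sourceUris': [source_uri],
        'destinationTable': {'projectId': project_id,
                             'datasetId': dataset_id,
                             'tableId': table_id},
    }}}
```

The resulting body would then be passed to service.jobs().insert(projectId=..., body=...).execute().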
Original issue reported on code.google.com by [email protected]
on 5 Jul 2013 at 7:39
Currently we can either:
- request a specific job by specifying its id with a jobs().get(id) call,
- or request all jobs (with a few filtering params such as job state) with a
jobs().list() call.
It would be really useful to be able to specify a list of ids to filter the
job listing. (E.g. a task launching several jobs could check for their
completion more easily, instead of calling jobs().get(id) for each job and
checking its state.)
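Until such a filter exists, the per-job polling loop this request describes might look like the following (a sketch; `service` is assumed to be an authorized google-api-python-client object, and the helper name is illustrative):

```python
# Poll jobs().get(id) for each job until all reach state DONE -- the
# workaround a jobs().list(ids=...) filter would make unnecessary.
import time

def wait_for_jobs(service, project_id, job_ids, poll_seconds=5):
    """Block until every job in job_ids reports state 'DONE'."""
    pending = set(job_ids)
    while pending:
        for job_id in list(pending):
            job = service.jobs().get(
                projectId=project_id, jobId=job_id).execute()
            if job['status']['state'] == 'DONE':
                pending.discard(job_id)
        if pending:
            time.sleep(poll_seconds)  # back off before the next round
    return job_ids
```

Note that each round costs one API call per still-pending job, which is exactly the overhead the requested list filter would remove.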
Original issue reported on code.google.com by [email protected]
on 27 Mar 2013 at 3:54
setProjectId(), setQuery() and other BigQuery functions have stopped working as
of 18 June 2013.
What is the expected output? What do you see instead?
On using these functions, the variables remain undefined when they should
ideally be populated.
A small Apps Script sample that reproduces the issue:
var newJobReference = BigQuery.newJobReference().setProjectId(yourProjectID);
var jobConfig =
BigQuery.newJobConfiguration().setQuery(yourJobQueryConfiguration);
What version of the product are you using? On what operating system?
v2, OS X 10.8.3
Original issue reported on code.google.com by [email protected]
on 25 Jun 2013 at 12:38
We have been having import issues since 8:15 am PST, June 1st. It looks like
import is stuck or timing out. I am still seeing the problem at 11:00 am PST,
June 1st.
Are there any known issues at this time?
Original issue reported on code.google.com by [email protected]
on 1 Jun 2013 at 5:51
At present, BigQuery does not have any function that can provide time zone
conversions.
This issue is discussed in more detail here:
http://stackoverflow.com/questions/12482637/bigquery-converting-to-a-different-timezone/12482905#comment20030832_12482905
As a workaround one can use a static conversion, something like:
UTC_USEC_TO_DAY(timestamp * 1000000 - (5*60*60*1000*1000000)), but this
doesn't work for daylight saving time.
It would be great if this functionality were provided, since dates could be
stored in GMT while users may wish to see reports in any given time zone.
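Since BigQuery itself cannot do a DST-aware conversion, one option is to select raw epoch timestamps and convert on the client. A minimal sketch using Python's standard-library zoneinfo module (3.9+; the helper name and the choice of zone are illustrative):

```python
# Client-side, DST-aware conversion of epoch timestamps returned by a
# query, using the standard-library zoneinfo module (Python 3.9+).
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_local_date(epoch_seconds, tz_name="America/New_York"):
    """Map a UTC epoch timestamp to a local calendar date, honoring DST."""
    utc = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return utc.astimezone(ZoneInfo(tz_name)).date()

# In July New York is UTC-4 (EDT); in January it is UTC-5 (EST), so the
# fixed-offset arithmetic above would give the wrong day near midnight.
```

Unlike the static UTC_USEC_TO_DAY offset, the IANA zone database applied here switches offsets automatically at the DST boundaries.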
Original issue reported on code.google.com by [email protected]
on 21 Feb 2013 at 3:40
Hi,
Our imports are failing with 502. Here is the message from the log.
We have now been seeing this issue for over an hour. The Web Browser tool for
BQ is also down.
However, when I visit the Google API console, it shows no issues. Is anyone
looking into the problem?
How do I report service issues? Is this the right forum?
INFO:root:--response-end--
INFO:root:--response-start--
INFO:root:date: Wed, 01 May 2013 02:15:44 GMT
INFO:root:status: 502
INFO:root:content-length: 983
INFO:root:content-type: text/html; charset=UTF-8
INFO:root:server: GFE/2.0
INFO:root:<!DOCTYPE html>
<html lang=en>
<meta charset=utf-8>
<meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
<title>Error 502 (Server Error)!!1</title>
<style>
*{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}
</style>
<a href=//www.google.com/><img src=//www.google.com/images/errors/logo_sm.gif alt=Google></a>
<p><b>502.</b> <ins>That’s an error.</ins>
<p>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds. <ins>That’s all we know.</ins>
INFO:root:--response-end--
Original issue reported on code.google.com by [email protected]
on 1 May 2013 at 2:29
Currently join predicates must be qualified field names.
Expressions or constants are not allowed.
E.g. this query does not work:
SELECT *
FROM
(SELECT year,
COUNT(*) AS cnt
FROM publicdata:samples.natality
GROUP BY year) cur
LEFT OUTER JOIN (SELECT year,
COUNT(*) AS cnt
FROM publicdata:samples.natality
GROUP BY year) prev
ON cur.year - 1 = prev.year
and fails with: "Error: ON clause must be AND of = comparisons of one field
name from each table, with all field names prefixed with table name."
Using "ON prev.year = 2000" also fails as constants are not allowed.
The workaround is to use nested subqueries with the expression/constant as a
column, which can then be joined on in the outer scope. However, this
increases query complexity and slows down ad hoc query development.
Please allow expressions and constants in join predicates.
Original issue reported on code.google.com by [email protected]
on 4 Oct 2013 at 7:02
I have some "complicated" extractions (using regex functions), and I would
like some of them to be predefined, so that everyone who needs such an
extraction uses the same logic. I would prefer to avoid precalculating them
and storing them in a new table, as that has its own maintenance overhead.
I am looking for a way to define virtual tables or virtual fields. The
concept is similar to views, or to calculated columns in RDBMS systems such
as MS SQL Server.
I would like a way to define such virtual objects so that they are computed
only at run time.
(http://stackoverflow.com/questions/19376414/virtual-objects-on-bigquery)
Original issue reported on code.google.com by [email protected]
on 16 Oct 2013 at 5:14
What steps will reproduce the problem?
Run a BigQuery query using the Java API library. It will intermittently fail
with a timeout.
What is the expected output? What do you see instead?
Response returned within 10 sec.
What version of the product are you using? On what operating system?
Java AppEngine
Please provide any additional information below.
The stack trace for failure is the following:
2013-06-07 17:05:31.006 /_ah/queue/__deferred__ 500 39253ms 0kb
AppEngine-Google; (+http://code.google.com/appengine)
...
Caused by: com.electionear.fw.v2.shared.exception.DataAccessException: Can not
obtain request result (Timeout while fetching URL:
https://www.googleapis.com/bigquery/v2/projects/920037298476/queries)
at [...].service.dataaccess.QueryBQVoterTableCommand.query(QueryBQVoterTableCommand.java:...)
at [...].service.dataaccess.QueryBQVoterTableCommand.execute(QueryBQVoterTableCommand.java:...)
at [...].service.DataAccessService.queryBQVotersTable(DataAccessService.java:...)
at [...].service.DataAccessService$$FastClassByCGLIB$$3ac0c5b6.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.Cglib2AopProxy$CglibMethodInvocation.invokeJoinpoint(Cglib2AopProxy.java:688)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:80)
at com.electionear.fw.v2.server.util.profiler.ProfilerAspect.log(ProfilerAspect.java:28)
at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:45)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:621)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:610)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:65)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:80)
at com.electionear.fw.v2.server.util.logger.LoggerAspect.log(LoggerAspect.java:55)
at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:45)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:621)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:610)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:65)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:90)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:621)
at com.electionear.fw.v2.server.service.DataAccessService$$EnhancerByCGLIB$$252cd130.queryBQVotersTable(<generated>)
at com.electionear.fw.v2.server.tasks.bigquery.counters.RequestCountersTask.run(RequestCountersTask.java:40)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
at com.googlecode.objectify.cache.AsyncCacheFilter.doFilter(AsyncCacheFilter.java:59)
at com.googlecode.objectify.ObjectifyFilter.doFilter(ObjectifyFilter.java:49)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
Original issue reported on code.google.com by [email protected]
on 10 Jun 2013 at 5:16
We are loading datastore backups into big query with the big query v2 api. We
are specifying this JSON configuration:
{'configuration': {
'load': {
'sourceFormat' : 'DATASTORE_BACKUP',
'writeDisposition' : 'WRITE_TRUNCATE',
'sourceUris' : sourceUris,
'destinationTable' : {
'projectId': settings.PROJECT_ID,
'datasetId': datasetId,
'tableId' : entityKind
}
}
}
}
We have already loaded this entity into BigQuery once and are now expecting
further loads to replace the existing table with the new data. Instead, the
insert job request returns an error:
u'status': {
u'state': u'DONE',
u'errors': [
{
u'reason': u'invalid',
u'message': u'Cannot import a datastore backup to a table that already has a schema.'
}
],
u'errorResult': {
u'reason': u'invalid',
u'message': u'Cannot import a datastore backup to a table that already has a schema.'
}
},
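A workaround that follows from the error message (a sketch, not documented behavior: it assumes deleting the destination table first is acceptable, and that `service` is an authorized google-api-python-client object):

```python
# Drop the destination table, then load with CREATE_IF_NEEDED instead of
# WRITE_TRUNCATE, so the datastore-backup import never sees an existing
# schema. Function names here are illustrative.

def datastore_load_body(project_id, dataset_id, table_id, source_uris):
    """Build the load-job body; WRITE_TRUNCATE is dropped because the
    backend rejects importing a backup over a table that has a schema."""
    return {'configuration': {'load': {
        'sourceFormat': 'DATASTORE_BACKUP',
        'createDisposition': 'CREATE_IF_NEEDED',
        'sourceUris': source_uris,
        'destinationTable': {'projectId': project_id,
                             'datasetId': dataset_id,
                             'tableId': table_id},
    }}}

def reload_datastore_backup(service, project_id, dataset_id, table_id, uris):
    # Deleting first makes the subsequent import start from a clean slate.
    try:
        service.tables().delete(projectId=project_id,
                                datasetId=dataset_id,
                                tableId=table_id).execute()
    except Exception:  # table may not exist yet; that is fine
        pass
    body = datastore_load_body(project_id, dataset_id, table_id, uris)
    return service.jobs().insert(projectId=project_id, body=body).execute()
```

The trade-off is a brief window in which the table does not exist, which WRITE_TRUNCATE would otherwise avoid.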
Original issue reported on code.google.com by [email protected]
on 11 Feb 2013 at 6:31
What version of the product are you using? On what operating system?
Using BigQuery Web tool using Chrome browser in Ubuntu.
Please provide any additional information below.
My query is pretty simple:
SELECT user_id,path from happyLatte.highnoon5_path
# Rows in table: 15837
# Columns in table: 2
# Size of data: 18.6MB
The query takes 6.2 seconds to complete. Is that expected? I would think
something like this would take under 1 second. I've attached a screenshot.
Please advise.
Thanks,
Navneet
Original issue reported on code.google.com by [email protected]
on 18 Oct 2013 at 11:03
What steps will reproduce the problem?
1. Delete a dataset
2. Create dataset
3. Try to load csv into dataset
What is the expected output? What do you see instead?
Should process the job but as of last night I now see:
Errors:
Not Found: Dataset blar
Job ID: job_952e6f613c7749278503fc59a207ab2c
Start Time: 2:10pm, 27 Jun 2013
End Time: 2:10pm, 27 Jun 2013
Destination Table:blar
Source URI: gs://blar
Source Format: CSV
Max Bad Records: 100
Schema:
f_id: INTEGER
f_screen_name: STRING
u_id: INTEGER
created: TIMESTAMP
What version of the product are you using? On what operating system?
Web interface and command line tool
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 27 Jun 2013 at 1:16
Hi,
It was reported to me by our analysts that they have recently seen a
"Resources exceeded" error in the BQ Web Interface. Please see the job ID
below.
Query Failed
Error: Resources exceeded during query execution.
Job ID: job_826aaad1c0b645ae8b616785679975db
Let me know if you need anything from me.
Original issue reported on code.google.com by [email protected]
on 7 Nov 2013 at 5:02
What steps will reproduce the problem?
1. I have an Apps Script up and running that used BigQuery to pull in data -
which worked perfectly
2. As of yesterday, as soon as I would run the script, I get the following
error: "ReferenceError: "BigQuery" is not defined."
3. I then tried to run the tutorial
(https://developers.google.com/apps-script/service_bigquery), but am seeing the
same error. All of this used to work.
What is the expected output? What do you see instead?
Expected output would be to populate the cells in the Google Spreadsheet.
Instead I get: "ReferenceError: "BigQuery" is not defined."
What version of the product are you using? On what operating system?
Apps Script using Google Spreadsheets. Mac OS X
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 6 Feb 2013 at 10:34
Today, according to [1], the only files which can be imported into BigQuery are
[compressed] CSV or JSON files.
Protocol Buffers [2, 3] is an efficient method for serializing sets of data. It
would be really useful if BigQuery knew natively how to ingest such files.
[1] https://developers.google.com/bigquery/articles/ingestioncookbook
[2] https://developers.google.com/protocol-buffers/
[3] http://code.google.com/p/protobuf/
Original issue reported on code.google.com by [email protected]
on 31 Jan 2013 at 3:28
What version of the product are you using? On what operating system?
We are using BigQuery through Apps Scripts.
Please provide any additional information below.
All of a sudden BigQuery is not returning data or an error message. It
doesn't matter whether it is accessed through Apps Script or
bigquery.cloud.google.com. The same issue occurs for all of our tables as
well as the public example tables (no query results returned).
Original issue reported on code.google.com by [email protected]
on 14 May 2013 at 5:57
Apparently there are some legal & tax issues for companies operating outside of
the US when using services hosted in the US. I would like to be able to
configure BigQuery location to be in EU (Similar to some other Google
services).
http://stackoverflow.com/questions/19488515/bigquery-data-location-setting/19503411?noredirect=1#19503411
Original issue reported on code.google.com by [email protected]
on 22 Oct 2013 at 11:01
See title:
"Select * from [data] where [conditions]"
returns the correct results, but with the columns in lexicographical order.
It would make more sense for the columns to be in the order of the original
schema.
Not sure if this is a bug or a feature.
Original issue reported on code.google.com by [email protected]
on 2 Jul 2013 at 10:15
What steps will reproduce the problem?
1. Create a schema with a boolean field type
2. Import json data where the boolean value is a string ("false" instead of
false)
What is the expected output? What do you see instead?
It used to be that this would work without errors (though I'm not sure
whether you would end up with a string or a boolean).
I would expect a specific error to be raised, something like: "Expected
boolean, got string".
At the moment an InternalError is raised with the message "Unexpected. Please
try again.".
What version of the product are you using? On what operating system?
I'm using JSON (Newline Delimited) format, it's easy to reproduce using the web
interface or REST API.
thanks,
Jasper Op de Coul
Original issue reported on code.google.com by [email protected]
on 19 Jul 2013 at 10:22
Add a function to BigQuery's URL functions that provides URL decoding.
Example:
URL_DECODE(http%3A%2F%2Fwww.example.com%2Fhello%3Fv%3D12345)
returns:
http://www.example.com/hello?v=12345
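Until such a function exists, decoding can be done client-side after the query; Python's standard library already covers it:

```python
# Client-side equivalent of the requested URL_DECODE function.
from urllib.parse import unquote  # urllib.unquote on Python 2

encoded = "http%3A%2F%2Fwww.example.com%2Fhello%3Fv%3D12345"
print(unquote(encoded))  # http://www.example.com/hello?v=12345
```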
Original issue reported on code.google.com by [email protected]
on 24 Dec 2012 at 7:35
How to reproduce:
I'm using the Google APIs Client Library for Python.
Create a table that has a name that starts with a number (in my case that was a
zero)
Stream records into it (service.tabledata().insertAll ... )
What is the expected output? What do you see instead?
I've run the test twice: once creating a table with a name that starts with a
number, and once with a name that starts with a letter. Both creations
succeed, but I can only stream into the table that starts with a letter. I
get the following error message (which doesn't really help a lot):
Error: {
"error": {
"errors": [
{
"domain": "global",
"reason": "internalError",
"message": "Unexpected. Please try again."
}
],
"code": 500,
"message": "Unexpected. Please try again."
}
}
What version of the product are you using? On what operating system?
I'm running python 2.7 on a debian compute engine instance release:
3.3.8-gcg-201308121035
Api client version:
>>> import apiclient
>>> apiclient.__version__
'1.0'
I've also tested whether updating the apiclient library made any
difference:
sudo pip install -U apiclient
Downloading/unpacking apiclient
Downloading apiclient-1.0.2.tar.gz
Running setup.py egg_info for package apiclient
Installing collected packages: apiclient
Running setup.py install for apiclient
Successfully installed apiclient
Cleaning up...
>>> import pkg_resources
>>> pkg_resources.get_distribution("apiclient").version
'1.0.2'
Please provide any additional information below.
I've looked through the documentation
(https://developers.google.com/bigquery/docs/tables) and didn't find anything
about table names. I would understand it if table names weren't allowed to
start with a number, but in that case I shouldn't be able to create the table
in the first place. Furthermore, I would like a more descriptive error
message; I only found out what the issue was by trial and error.
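A defensive client-side check suggested by this report (the leading-digit rule is an inference from the failure described above, not documented behavior, and the helper name is illustrative):

```python
# Validate a table id before streaming into it, so the caller gets a
# clear error instead of the opaque 500 above. Assumption: ids made of
# letters, digits and underscores that do not start with a digit stream
# correctly, per the report's trial-and-error findings.
import re

_TABLE_ID = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*$')

def check_table_id(table_id):
    if not _TABLE_ID.match(table_id):
        raise ValueError(
            "table id %r may not stream correctly; "
            "avoid a leading digit" % table_id)
    return table_id
```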
Original issue reported on code.google.com by [email protected]
on 9 Oct 2013 at 7:05
What steps will reproduce the problem?
1. At 3 o'clock I see $0.30
2. At 3:30 I see $0.25
3. At 4 o'clock I see $0.20
What is the expected output? What do you see instead?
The expected output is $0.30 at 4 o'clock, but the figure was reduced.
What version of the product are you using? On what operating system?
I use online bigquery browser.
Please provide any additional information below.
I think the problem is that BigQuery gathers information from various
sources and sometimes overwrites the data.
Original issue reported on code.google.com by [email protected]
on 5 Nov 2013 at 1:32
What steps will reproduce the problem?
1. Programmatically initiate a job load with a single destination table using a
handful of sourceUris that contain gzipped entries.
2. About 2/3 of the time this works correctly; the other times it results in
the traceback shown below.
What is the expected output? What do you see instead?
Expected output is a successful response containing a Job ID or at least a
valid HTTP-compliant response. The following traceback shows up instead:
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1027, in getresponse
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 407, in begin
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 371, in _read_status
BadStatusLine: ''
What version of the product are you using? On what operating system?
Mac OS X 10.8.3, it also occurs on Ubuntu. Using python v.2.7.2 with apiclient
v.1.1.
Please provide any additional information below.
Another user has independently experienced the same problem:
http://stackoverflow.com/questions/16326222/unable-to-load-data-into-bigquery-badstatusline
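A common mitigation for this class of failure (a sketch; the retry policy and helper name are assumptions, not an official recommendation): BadStatusLine is raised when the server drops the connection before sending a status line, so wrapping the insert call and retrying with exponential backoff usually recovers.

```python
# Retry wrapper for flaky job inserts that die with BadStatusLine.
import time
from http.client import BadStatusLine  # httplib.BadStatusLine on Python 2

def with_retries(fn, attempts=5, base_delay=1.0):
    """Call fn(), retrying on BadStatusLine with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except BadStatusLine:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(base_delay * 2 ** attempt)
```

Usage would look like with_retries(lambda: service.jobs().insert(projectId=..., body=...).execute()).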
Original issue reported on code.google.com by [email protected]
on 30 May 2013 at 12:35
Hi,
I have described the issue in detail on Stack Overflow:
http://stackoverflow.com/questions/16571635/join-each-not-returning-result
Let me know if I am doing anything wrong.
Thanks,
Original issue reported on code.google.com by [email protected]
on 31 May 2013 at 5:50
I have Cloud Storage logs stored on Cloud Storage and it would be very
convenient to be able to import them by prefix into Big Query directly via the
web interface.
That is, when you click on a dataset and choose "Create and Import", in the
"Select data" step I can specify a Google Cloud Storage file. I would like to
be able to specify only a prefix with "*" at the end, requesting the import
of every file that starts with what I specified.
Original issue reported on code.google.com by [email protected]
on 11 Jul 2013 at 8:27
I would like a better documentation for working with the APIs.
In particular, maybe some best practices, or code fragments, for some of the
most common scenarios.
I don't think I'm the only one on the Internet who needs to automatically
upload a file to Google Cloud Storage and then build a program that
automatically loads that file into BigQuery with all the right table fields
and types.
It's not as easy as I expected: browsing through 4 or 5 different guides and
tutorials (some of them old, and each different when it comes to the
authentication phase) and then putting all the pieces together into a working
prototype takes time I would rather invest in something more productive.
Original issue reported on code.google.com by [email protected]
on 6 Apr 2013 at 4:22
What steps will reproduce the problem?
1. Have two BigQuery accounts: one with my workplace ([email protected]) and
the other personal ([email protected]). Log in to
[email protected]
2. Create dataSet for minke-data project and upload table
3. Log out -> Log back in
What is the expected output? What do you see instead?
I expected to see the datasets previously created, with tables loaded.
However, all datasets have disappeared. If I create another dataset with the
same name, I get an error suggesting the dataset already exists. This only
happens with my personal account; the [email protected] account works fine.
What version of the product are you using? On what operating system?
Using Latest BigQuery. On Windows 7.
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 10 Sep 2013 at 1:17
What steps will reproduce the problem?
1. Run a heavy query that takes a long time to return, using the connector
for Excel.
What is the expected output? What do you see instead?
I was expecting to get the results after a few minutes; instead I get an
error message:
Request failed: Error. Unable to execute query. Timeout while fetching URL:
https://www.googleapis.com/bigquery/v2/projects/{my-project}/queries.
What version of the product are you using? On what operating system?
Excel 2013 on windows 7 64bit
Please provide any additional information below.
http://stackoverflow.com/questions/19684618/bigquery-connector-for-excel-request-failed-error-unable-to-execute-query-t
Original issue reported on code.google.com by [email protected]
on 31 Oct 2013 at 7:35
In the last few days I have been trying to upload files to my BigQuery table,
but it keeps failing with: "Errors encountered during job execution.
Unexpected. Please try again." Example job IDs:
job_8bf7e7d257884d3bab2e04ac1208fedb, job_4354aa20427a4453ab14f9f18365d216,
job_8b287b73b19d4a8a9dacc5f002e265a9
1. I have checked that there are no lines greater than 64K.
2. There is nothing wrong with the data syntax. I divided the file into 4
pieces and each piece completed the upload job. The original gzipped file was
328M and it failed; even half the size (177M) of the gzipped file failed,
while each of the divided parts (>90M) completed the job. However, this
workaround is not reliable either: it sometimes fails for even smaller file
sizes.
Original issue reported on code.google.com by [email protected]
on 4 Feb 2013 at 3:51
What steps will reproduce the problem?
1. Run this query:
SELECT nat.year
FROM publicdata:samples.natality nat
LIMIT 10
What is the expected output? What do you see instead?
Expected output: 10 rows of "year" values from publicdata:samples.natality
Actual output:
Query Failed
Error: Field 'nat.year' not found in table 'publicdata:samples.natality'.
What version of the product are you using? On what operating system?
Browser Tool
Please provide any additional information below.
Running this does work:
SELECT nat.year
FROM publicdata:samples.natality nat
LEFT OUTER JOIN (SELECT 2000 AS YEAR) tab
ON nat.year = tab.year
LIMIT 10
Specifying a table alias without a join does work, but you cannot then
reference the alias in the fields in the SELECT. This gives the user an
inconsistent signal as to whether table aliases are supported in such a
query.
E.g. the following does work:
SELECT year
FROM publicdata:samples.natality nat
LIMIT 10
BigQuery is only "SQL-like"; even so, this seems like a very artificial
restriction and will trip up first-time users.
Original issue reported on code.google.com by [email protected]
on 8 Oct 2013 at 5:42
https://developers.google.com/bigquery/docs/query-reference
SELECT expr1 [[AS] alias1] [, expr2 [[AS] alias2], ...]
[agg_function(expr3) WITHIN expr4]
[FROM [(FLATTEN(table_name1|(subselect1)] [, table_name2|(subselect2), ...)]
[([INNER]|LEFT OUTER) JOIN table_2|(subselect2) [[AS] tablealias2]
ON join_condition_1 [... AND join_condition_N ...]]
[WHERE condition]
[HAVING condition]
[GROUP BY field1|alias1 [, field2|alias2, ...]]
[ORDER BY field1|alias1 [DESC|ASC] [, field2|alias2 [DESC|ASC], ...]]
[LIMIT n]
;
The order of [HAVING condition] and [GROUP BY ...] appears to be wrong;
please confirm.
Original issue reported on code.google.com by [email protected]
on 29 May 2013 at 6:16
We are currently working on a pandas plugin for BigQuery
(https://github.com/pydata/pandas/pull/4140) and would like the ability to unit
test uploading/downloading API calls without requiring billing info. We would
only need to process the public dataset information for testing, and small
uploads. Are there any existing solutions for our problem?
Original issue reported on code.google.com by [email protected]
on 31 Jul 2013 at 7:54
I had a similar ticket before on Stack Overflow.
We just noticed that our imports into BigQuery are failing with the following
messages:
INFO:root:{
"error": {
"errors": [
{
"domain": "global",
"reason": "internalError",
"message": "Unexpected. Please try again."
}
],
"code": 500,
"message": "Unexpected. Please try again."
}
}
INFO:root:{
"error": {
"errors": [
{
"domain": "global",
"reason": "backendError",
"message": "Backend Error"
}
],
"code": 503,
"message": "Backend Error"
}
}
This has now been happening for over 5 hours. It started at 6:15 am PST on
April 1st, 2013. The console is reporting no known issues.
Do we know when the service will be back up?
Here are some of the job example: job_b4c87ef9931b4a75b1869b6e7157725b
job_369c14f202ca46f3a2a6b931b97ed99d
Original issue reported on code.google.com by [email protected]
on 1 Apr 2013 at 7:13
Hi
Over the last couple of days, we have been experiencing slow response times
from BigQuery - selects that used to take a couple of seconds now take more
than a minute.
On top of that, we started receiving 503 responses from the service with the
description "Backend Error" - at first it was sporadic, but the rate of these
errors keeps building up.
Is there a known issue? Is there anything we can do to mitigate these problems?
Thanks,
Nir
Original issue reported on code.google.com by [email protected]
on 19 Sep 2013 at 10:20
What steps will reproduce the problem?
1. Create data in the app engine datastore which has a repeated nested model
and includes some null values in the repeating data:
e.g.
Session ID: 2343243
Start Time: 11:32
Events [
{Event Type: A
Event Time: 1
Event Error: },
{Event Type: B
Event Time: 7
Event Error: null pointer},
{Event Type: A
Event Time: 12
Event Error: }
]
2. Upload to BQ via datastore backup.
What is the expected output? What do you see instead?
I expect to be able to establish that the null pointer error is associated with event B.
The data uploads as multiple repeated fields, with no null value placeholders in the repeating value list:
Session ID: 2343243
Start Time: 11:32
Event Type: [A, B, A]
Event Time: [1, 7, 12]
Event Error: [null pointer]
It is therefore not possible to usefully query the nested fields with null values.
What version of the product are you using? On what operating system?
Tested on GAE and BQ versions live on Thursday, July 25th, 2013 (NZST).
Please provide any additional information below.
This issue was addressed on SO in June 2013, but the issue of null data was not
raised. The workaround provided works in the absence of null data, so the
issue may have received little priority.
http://stackoverflow.com/questions/17228281
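The loss of alignment described above can be sketched in a few lines of Python (field names are illustrative):

```python
# Three events become parallel repeated columns, and empty values are
# simply dropped, so the association between type and error is lost.
events = [
    {'type': 'A', 'time': 1,  'error': None},
    {'type': 'B', 'time': 7,  'error': 'null pointer'},
    {'type': 'A', 'time': 12, 'error': None},
]
event_type = [e['type'] for e in events]
event_time = [e['time'] for e in events]
event_error = [e['error'] for e in events if e['error'] is not None]
print(event_type)   # ['A', 'B', 'A']
print(event_error)  # ['null pointer'] - which event failed is unrecoverable
```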
Original issue reported on code.google.com by [email protected]
on 26 Jul 2013 at 5:11
What steps will reproduce the problem?
1. Use tar -z to create a compressed file instead of usual gzip
2. Try to load data in bigquery
3. BigQuery does not like tar-compressed files
What is the expected output? What do you see instead?
Allow compressed or non-compressed tar files.
What version of the product are you using? On what operating system?
Linux CentOS
Please provide any additional information below.
I need to uncompress the tar files and re-compress them with gzip, which takes
a lot of time. Please add support for tar files at your end.
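The workaround described above can be scripted; a sketch (assuming .tar.gz inputs, using only the Python standard library) that unpacks each archive member and re-compresses it as a standalone .gz file:

```python
import gzip
import os
import shutil
import tarfile

def tar_to_gzip_members(tar_path, out_dir):
    """Unpack a .tar.gz archive and write each regular file back out as
    an individual .gz file (the format BigQuery accepts)."""
    os.makedirs(out_dir, exist_ok=True)
    with tarfile.open(tar_path, 'r:gz') as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            src = tar.extractfile(member)
            dst_path = os.path.join(out_dir,
                                    os.path.basename(member.name) + '.gz')
            with gzip.open(dst_path, 'wb') as dst:
                shutil.copyfileobj(src, dst)
    return sorted(os.listdir(out_dir))
```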
Original issue reported on code.google.com by shantanu.oak
on 30 Jan 2013 at 3:34
What steps will reproduce the problem?
1. Create a very large dataset
2. Perform a SELECT * (which is accidental)
3. Abandon query
What is the expected output? What do you see instead?
I expect to abandon the query, going back to the query tool. Instead a warning
message to the effect of "this is still running on the server, you'll still be
billed for XX GB of transfer" appears quickly but disappears.
Then I get the message "Query Failed Error: Response too large to return." in
the results pane and I'm not sure how to check if / how much I'll be billed for
my mistake.
What version of the product are you using? On what operating system?
Latest on Mac OS / Safari.
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 24 Oct 2013 at 5:47
What steps will reproduce the problem?
1. Attempt to load a JSON file via the BQ tool.
2. For example:
3. bq --headless --nosync load --source_format NEWLINE_DELIMITED_JSON
hgs.hgs_20131030_00 monwork/monwork-worker01.tmp
What is the expected output? What do you see instead?
I expect to see a message that the load succeeded. Instead I receive either a
502 or a 503. Examples of results:
[2013-10-29 18:57:44] <worker01> < Command result 1: 'BigQuery error in load
operation: Could not connect with BigQuery server.\nHttp response status:
502\nHttp response content:\n<!DOCTYPE html>\n<html lang=en>\n<meta
charset=utf-8>\n<meta name=viewport content="initial-scale=1,
minimum-scale=1,\nwidth=device-width">\n<title>Error 502 (Server
Error)!!1</title>\n<style>\n*{margin:0;padding:0}html,code{font:15px/22px\narial
,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7%\nauto
0;max-width:390px;min-height:180px;padding:30px 0 15px}*
>\nbody{background:url(//www.google.com/images/errors/robot.png) 100%
5px\nno-repeat;padding-right:205px}p{margin:11px
0\n22px;overflow:hidden}ins{color:#777;text-decoration:none}a
img{border:0}@media\nscreen and
(max-width:772px){body{background:none;margin-top:0;max-width:none;pa\ndding-rig
ht:0}}\n</style>\n<a href=//www.google.com/><img
src=//www.google.com/images/errors/logo_sm.gif\nalt=Google></a>\n<p><b>502.</b>
<ins>That\\ufffd\\ufffd\\ufffds an error.</ins>\n<p>The server encountered a
temporary error and could not complete your\nrequest.<p>Please try again in 30
seconds. <ins>That\\ufffd\\ufffd\\ufffds all we\nknow.</ins>'
[2013-10-29 19:00:44] <worker09> < Command result 1: 'BigQuery error in load
operation: Could not connect with BigQuery server.\nHttp response status:
503\nHttp response content:'
Additionally, the very few jobs that are successfully submitted generally fail.
Examples:
bqjob_r3a054f09f7c6b47_0000014205e35217_1 - failed with "Connection error.
Please try again."
bqjob_r105ecaaa4269450c_0000014205caa77a_1 - failed with "Unexpected. Please
try again."
What version of the product are you using? On what operating system?
Tried the BigQuery CLI versions 2.0.15 and 2.0.17 with identical results on
CentOS 5.
Please provide any additional information below.
Previously this seemed to be working fine. Then, over the past couple of days,
I noted some 502/503 failures that would occur sporadically, and some instances
where they occurred consistently for 30-90 minutes. All load requests seem to
have failed today.
According to the documentation any sort of quota issues should cause a 4xx
error not 5xx, so I don't believe the problem could be that. Additionally, we
should be submitting load requests considerably below the threshold for
throttling. We use one BQ table per hour, and roughly 2 loads per minute (when
BQ is actually handling requests). This is 120 loads per table/day (limit is
1,000) and 2,880 loads per day (limit is 10,000).
I am attaching logs from the tool we use to submit queries. Hopefully the
format will be self-explanatory. The log will show the exact command executed
and the exact output from the BQ tool.
I will also attach a small excerpt of the files we are submitting.
We just deployed the project that depended on this data for production, so any
help would be appreciated!
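Until the service recovers, transient 502/503 responses like those above are normally handled client-side by retrying with exponential backoff; a generic sketch (the function names, predicate, and limits are assumptions, not part of the BigQuery API):

```python
import time

def retry_with_backoff(call, is_transient, max_tries=5, base_delay=1.0):
    """Retry `call` on transient errors, doubling the wait each attempt.
    `is_transient` decides whether an exception is worth retrying."""
    for attempt in range(max_tries):
        try:
            return call()
        except Exception as err:
            if attempt == max_tries - 1 or not is_transient(err):
                raise
            time.sleep(base_delay * (2 ** attempt))
```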
Original issue reported on code.google.com by [email protected]
on 30 Oct 2013 at 1:21
Attachments:
I would like the wizard used for importing new data ("Choose job
template", "Choose destination", ...) to validate the entered data step by
step, instead of producing one big failure at the end of the entire process.
For example, I want to import a CSV containing 40 fields.
I have to enter a name for the table, select the file, and then enter the
field names.
If I select a non-existent file (on Cloud Storage), I have to start again
from the beginning.
If I enter the wrong number of fields (I have forty of them, there is not much
space in the text box, and I can misspell one of them, forget a comma, or give
a wrong type), I also have to start again.
These are very annoying problems.
Original issue reported on code.google.com by [email protected]
on 6 Apr 2013 at 4:15
I know I can do this with regular expressions, but like CONTAINS, it is sooo
much easier to write
WHERE path STARTSWITH '/some/url'
Seems like it would be a piece of cake to provide this. May not be traditional
SQL, but sure would be convenient!
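In the meantime, an anchored REGEXP_MATCH in legacy BigQuery SQL expresses the same predicate (the table and column names below are made up):

```python
import re

# Hypothetical workaround: no STARTSWITH, but an anchored regex is
# equivalent. The SQL here is just a string for illustration.
query = ("SELECT path FROM [mydataset.mytable] "
         "WHERE REGEXP_MATCH(path, r'^/some/url')")

# The anchored pattern behaves as a starts-with test:
pattern = re.compile(r'^/some/url')
print(bool(pattern.match('/some/url/page')))   # True
print(bool(pattern.match('/other/some/url')))  # False
```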
Original issue reported on code.google.com by [email protected]
on 20 Sep 2013 at 8:11
We are using the BigQuery browser tool and our queries are not returning
results. Is there any known issue right now?
We noticed the issue around 10:00 am PST on May 14, 2013, and we are still
facing it.
Original issue reported on code.google.com by [email protected]
on 14 May 2013 at 5:41
https://developers.google.com/bigquery/docs/queries#asyncqueries
The Python code example for async queries looks strange: it mixes
runAsyncQuery and checkQueryResults together. The code block under "# Get
query results. Results will be available for about 24 hours." should not be
inside the function runAsyncQuery.
The Java async query example looks fine.
import pprint

from apiclient.errors import HttpError

def runAsyncQuery(service, projectId):
  try:
    jobCollection = service.jobs()
    queryString = 'SELECT corpus FROM publicdata:samples.shakespeare GROUP BY corpus;'
    jobData = {
      'configuration': {
        'query': {
          'query': queryString,
        }
      }
    }
    insertResponse = jobCollection.insert(projectId=projectId,
                                          body=jobData).execute()
    # Get query results. Results will be available for about 24 hours.
    currentRow = 0
    queryReply = jobCollection.getQueryResults(
        projectId=projectId,
        jobId=insertResponse['jobReference']['jobId'],
        startIndex=currentRow).execute()
    while ('rows' in queryReply) and currentRow < queryReply['totalRows']:
      printTableData(queryReply, currentRow)
      currentRow += len(queryReply['rows'])
      queryReply = jobCollection.getQueryResults(
          projectId=projectId,
          jobId=queryReply['jobReference']['jobId'],
          startIndex=currentRow).execute()
  except HttpError as err:
    print 'Error in runAsyncQuery:', pprint.pprint(err.resp)
  except Exception as err:
    print 'Undefined error: %s' % err
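The paging loop the reporter wants factored out of runAsyncQuery can be sketched generically; the function name and the fake page source below are hypothetical stand-ins for jobCollection.getQueryResults:

```python
def fetch_all_rows(get_page):
    """Drain a paged reply source. `get_page(start_index)` returns a dict
    shaped like a BigQuery getQueryResults reply: {'totalRows': ...,
    'rows': [...]}. Keeps requesting from the next startIndex until all
    totalRows rows have arrived."""
    rows = []
    reply = get_page(0)
    while 'rows' in reply and len(rows) < int(reply['totalRows']):
        rows.extend(reply['rows'])
        if len(rows) < int(reply['totalRows']):
            reply = get_page(len(rows))
    return rows

# Usage with a fake page source standing in for the API:
data = [{'f': [{'v': i}]} for i in range(5)]

def fake_page(start, page_size=2):
    return {'totalRows': '5', 'rows': data[start:start + page_size]}

print(len(fetch_all_rows(fake_page)))  # 5
```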
Original issue reported on code.google.com by [email protected]
on 9 Jun 2013 at 2:59
What steps will reproduce the problem?
1. It is intermittent, but occurs when pulling from a table with more than
100,000 returned rows
What is the expected output? What do you see instead?
Usually, I get a return from my query that pages by 100,000, and this works
just fine; that is what I expect to see.
What I have seen recently instead is:
Loading data...
current length: 100000
current length: 100512
current length: 101024
<then suddenly>
current length: 201024
current length: 301014
So I'm seeing a 512-byte row/buffer size returned at times for no reason,
or, even more insidious:
Loading data...
Job not yet complete...
Undefined error: <python traceback to None return>
What version of the product are you using? On what operating system?
Running on Linux and Windows simultaneously, exclusively in the Python
environment.
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 8 Mar 2013 at 7:36
What steps will reproduce the problem?
1. Do an OAuth 2 grant
2. Log in to Google and look at application security
3. See that the listing says "New Service"
What is the expected output? What do you see instead?
listing for bigquery should say "Google BigQuery"
What version of the product are you using? On what operating system? OAuth 2.0
Please provide any additional information below.
In Google's "Authorized Access to your Google Account" page, the grant is
listed as "New Service". This should be fixed to say "Google BigQuery"
Original issue reported on code.google.com by [email protected]
on 23 Jan 2013 at 8:45
http://stackoverflow.com/questions/10977969/using-bigquery-with-r-for-analyzing-
data
Unfortunately, the BigQuery R client shown there is for BigQuery version 1,
which has been turned down. There was some work on a BigQuery V2 client, but it
was never checked into CRAN. I'll investigate the status and get back to you.
– Jordan Tigani Jun 11 '12 at 17:43
I'm the person Jordan asked -- and unfortunately, there's still no ETA on the
V2 client. "Soon" is the best I can offer right now. – Craig Citro Jun 12 '12
at 6:5
Original issue reported on code.google.com by [email protected]
on 14 Mar 2013 at 4:55
In most database gui-based query tools, like SQL Server Management Studio,
pgAdmin3, Teradata SQL Assistant etc., when part of the query text is
highlighted and the query is run, only the highlighted portion of the script is
executed.
It would be very handy if the BigQuery Browser Tool also had this feature,
particularly when BigQuery SQL queries often have multiple levels of nested
subqueries.
Original issue reported on code.google.com by [email protected]
on 2 Oct 2013 at 4:56
What steps will reproduce the problem?
1. Create a new table with schema prefix:STRING, and give it some data.
2. Query `SELECT prefix FROM [table]`
What is the expected output? What do you see instead?
The expected output is the query success, showing all data.
Instead, I got
Error: Encountered " "SELECT" "SELECT "" at line 1, column 1. Was expecting:
<EOF>
Seems that it doesn't even correctly parse the query.
What version of the product are you using? On what operating system?
OS is Ubuntu 12.04.
Please provide any additional information below.
I tried using Query API and online query viewer, both failed with exactly the
same error message.
Original issue reported on code.google.com by [email protected]
on 6 Aug 2013 at 3:02
Shifts a UNIX timestamp in microseconds to the beginning of the quarter it
occurs in.
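A sketch of the requested behaviour in Python (the helper name mirrors the reporter's wish and is not an existing BigQuery function):

```python
import datetime

def utc_usec_to_quarter(usec):
    """Truncate a UNIX timestamp in microseconds to the first instant of
    the calendar quarter it occurs in, returned in microseconds."""
    epoch = datetime.datetime(1970, 1, 1)
    dt = epoch + datetime.timedelta(microseconds=usec)
    quarter_month = dt.month - (dt.month - 1) % 3  # 1, 4, 7 or 10
    start = datetime.datetime(dt.year, quarter_month, 1)
    return int((start - epoch).total_seconds()) * 10**6

# 2013-09-21 (the report date) falls in Q3, which begins 2013-07-01:
print(utc_usec_to_quarter(1379721600 * 10**6) == 1372636800 * 10**6)  # True
```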
Original issue reported on code.google.com by [email protected]
on 21 Sep 2013 at 1:54