google-bigquery
Automatically exported from code.google.com/p/google-bigquery
http://stackoverflow.com/questions/17751944/google-bigquery-incomplete-query-replies-on-odd-attempts
We would like some Python code demonstrating pagination of a query reply. We
found the documentation page: developers.google.com/bigquery/docs/data#paging,
but it didn't have any code samples. It's not clear from the documentation how
this would be done.
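In the absence of an official sample, a minimal pagination sketch with google-api-python-client (the authorized `service` object and an already-finished query job are assumptions, as is the helper name `fetch_all_rows`):

```python
# Pagination sketch for the BigQuery v2 API via google-api-python-client.
# Assumptions: `service` is an authorized bigquery service object and the
# query job identified by job_id has already completed.

def fetch_all_rows(service, project_id, job_id, page_size=1000):
    """Collect every result row of a finished query job, page by page."""
    rows = []
    page_token = None
    while True:
        resp = service.jobs().getQueryResults(
            projectId=project_id,
            jobId=job_id,
            maxResults=page_size,
            pageToken=page_token,
        ).execute()
        rows.extend(resp.get('rows', []))
        page_token = resp.get('pageToken')
        if not page_token:  # no token in the reply means this was the last page
            break
    return rows
```

The loop simply re-issues getQueryResults with the pageToken from the previous reply until no token is returned.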
Original issue reported on code.google.com by [email protected]
on 31 Jul 2013 at 7:55
What steps will reproduce the problem?
1. Log into the BigQuery Console.
2. Under my project named "API Project", click the dropdown and select "Create
new dataset".
3. Note that the dataset appears fine in that you can interact with it (create
tables, upload data, etc.)
4. Refresh the page and the dataset is gone.
5. Try to create a dataset with the same name and it says it exists even though
it is not displayed.
What is the expected output? What do you see instead?
I expect the dataset to still be listed under my project.
What version of the product are you using? On what operating system?
Tried with Chrome and IE.
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 22 Oct 2013 at 6:51
What steps will reproduce the problem?
1. A file is stored at
gs://nested_logs_tagtootrack/log2bq-20130704-1580687957719EDE6238F-output
2. When I load the file with the bq command-line tool, everything works:
bq load --source_format=NEWLINE_DELIMITED_JSON --max_bad_records=0
logs_tagtootrack.nested_logs_20130704
gs://nested_logs_tagtootrack/log2bq-20130704-1580687957719EDE6238F-output
tagtootrack.new.schema
The tagtootrack.new.schema file looks like:
[{"type": "string", "name": "session"}, {"type": "string", "name": "slot"},
{"type": "string", "name": "target"}, {"type": "string", "name": "vars",
"mode": "repeated"}, {"type": "string", "name": "title"}, {"type": "string",
"name": "ip"}, {"type": "float", "name": "start_time"}, {"type": "string",
"name": "publisher"}, {"type": "string", "name": "ext"}, {"type": "string",
"name": "host"}, {"type": "string", "name": "tag"}, {"type": "string", "name":
"features", "mode": "repeated"}, {"type": "string", "name": "user_agent"},
{"type": "string", "name": "version"}, {"fields": [{"type": "string", "name":
"var"}, {"type": "string", "name": "advertiser"}, {"type": "string", "name":
"campaign"}], "type": "record", "name": "items", "mode": "repeated"}, {"type":
"string", "name": "referral"}, {"type": "string", "name": "type"}, {"type":
"string", "name": "page"}, {"type": "string", "name": "user"}]
3. When I load the same file through the API, it fails and returns:
{u'status': {u'state': u'DONE', u'errors': [{u'reason': u'internalError',
u'message': u'Unexpected. Please try again.'}, {u'reason': u'invalid',
u'message': u'Too many errors encountered. Limit is: 0.'}], u'errorResult':
{u'reason': u'invalid', u'message': u'Too many errors encountered. Limit is:
0.'}}, u'kind': u'bigquery#job', u'statistics': {u'load': {u'outputRows':
u'214468', u'inputFiles': u'1', u'inputFileBytes': u'10260708075',
u'outputBytes': u'0'}, u'endTime': u'1372990828961', u'startTime':
u'1372990544203'}, u'jobReference': {u'projectId': u'tagtoosql', u'jobId':
u'job_417754af2f234f61b917ab2583e6a3df'}, u'etag':
u'"hGuPPOkA35eFd5045aqCjqFeM_s/_YV1JYm-8zv5rNmA7GSBkZYt-3M"', u'configuration':
{u'load': {u'encoding': u'UTF-8', u'sourceFormat': u'NEWLINE_DELIMITED_JSON',
u'destinationTable': {u'projectId': u'tagtoosql', u'tableId':
u'nested_logs_20130704', u'datasetId': u'logs_tagtootrack'},
u'writeDisposition': u'WRITE_TRUNCATE', u'sourceUris':
[u'gs://nested_logs_tagtootrack/log2bq-20130704-1580687957719EDE6238F-output'],
u'createDisposition': u'CREATE_IF_NEEDED', u'schema': {u'fields': [{u'type':
u'STRING', u'name': u'session'}, {u'type': u'STRING', u'name': u'slot'},
{u'type': u'STRING', u'name': u'target'}, {u'type': u'STRING', u'name':
u'title'}, {u'type': u'STRING', u'name': u'ip'}, {u'type': u'FLOAT', u'name':
u'start_time'}, {u'type': u'STRING', u'name': u'publisher'}, {u'type':
u'STRING', u'name': u'ext'}, {u'type': u'STRING', u'name': u'creative'},
{u'type': u'STRING', u'name': u'host'}, {u'type': u'STRING', u'name': u'tag'},
{u'type': u'STRING', u'name': u'features', u'mode': u'REPEATED'}, {u'type':
u'STRING', u'name': u'user_agent'}, {u'type': u'STRING', u'name': u'version'},
{u'fields': [{u'type': u'STRING', u'name': u'advertiser'}, {u'type': u'STRING',
u'name': u'campaign'}], u'type': u'RECORD', u'name': u'items', u'mode':
u'REPEATED'}, {u'type': u'STRING', u'name': u'referral'}, {u'type': u'STRING',
u'name': u'type'}, {u'type': u'STRING', u'name': u'page'}, {u'type': u'STRING',
u'name': u'user'}]}}}, u'id':
u'tagtoosql:job_417754af2f234f61b917ab2583e6a3df', u'selfLink':
u'https://www.googleapis.com/bigquery/v2/projects/tagtoosql/jobs/job_417754af2f2
34f61b917ab2583e6a3df'}
The two methods should return the same results; please also provide more
information to help users debug the API.
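One way to rule out schema drift between the two paths (a sketch, not the reporter's code; the `service` object from google-api-python-client and the helper name are assumptions) is to build the API load request from the very schema file that `bq load` consumed:

```python
# Build the API load-job body from the same schema file that `bq load`
# consumed, so the two load paths cannot drift apart. The function name
# load_job_body_from_schema_file is illustrative, not part of any API.
import json

def load_job_body_from_schema_file(project_id, dataset_id, table_id,
                                   source_uri, schema_path):
    with open(schema_path) as f:
        fields = json.load(f)  # the same JSON array the bq CLI reads
    return {'configuration': {'load': {
        'sourceFormat': 'NEWLINE_DELIMITED_JSON',
        'maxBadRecords': 0,
        'schema': {'fields': fields},
        'sourceUris': [source_uri],
        'destinationTable': {'projectId': project_id,
                             'datasetId': dataset_id,
                             'tableId': table_id},
    }}}
```

The resulting body would then be passed to service.jobs().insert(projectId=..., body=...).execute().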
Original issue reported on code.google.com by [email protected]
on 5 Jul 2013 at 7:39
Currently we can either:
- request a specific job by specifying its id with a jobs().get(id) call,
- or request all jobs (with a few filtering params such as job state) with a
jobs().list() call.
It would be really useful to be able to specify a list of ids to filter the
job listing. (E.g. a task launching several jobs could check for their
completion more easily, instead of calling jobs().get(id) for each job and
checking its state.)
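Until such a filter exists, the per-job polling loop this request describes might look like the following (a sketch; `service` is assumed to be an authorized google-api-python-client object, and the helper name is illustrative):

```python
# Poll jobs().get(id) for each job until all reach state DONE -- the
# workaround a jobs().list(ids=...) filter would make unnecessary.
import time

def wait_for_jobs(service, project_id, job_ids, poll_seconds=5):
    """Block until every job in job_ids reports state 'DONE'."""
    pending = set(job_ids)
    while pending:
        for job_id in list(pending):
            job = service.jobs().get(
                projectId=project_id, jobId=job_id).execute()
            if job['status']['state'] == 'DONE':
                pending.discard(job_id)
        if pending:
            time.sleep(poll_seconds)  # back off before the next round
    return job_ids
```

Note that each round costs one API call per still-pending job, which is exactly the overhead the requested list filter would remove.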
Original issue reported on code.google.com by [email protected]
on 27 Mar 2013 at 3:54
setProjectId(), setQuery() and other BigQuery functions have stopped working as
of 18 June 2013.
What is the expected output? What do you see instead?
On using these functions, the variables remain undefined when they should
ideally be populated.
A small Apps Script sample that reproduces the issue:
var newJobReference = BigQuery.newJobReference().setProjectId(yourProjectID);
var jobConfig =
BigQuery.newJobConfiguration().setQuery(yourJobQueryConfiguration);
What version of the product are you using? On what operating system?
v2, OS X 10.8.3
Original issue reported on code.google.com by [email protected]
on 25 Jun 2013 at 12:38
We have been having import issues since 8:15 am PST, June 1st. It looks like
import is stuck or timing out. I am still seeing the problem at 11:00 am PST,
June 1st.
Are there any known issues at this time?
Original issue reported on code.google.com by [email protected]
on 1 Jun 2013 at 5:51
At present, BigQuery does not have any function that can provide time zone
conversions.
This issue is discussed in more detail here:
http://stackoverflow.com/questions/12482637/bigquery-converting-to-a-different-timezone/12482905#comment20030832_12482905
As a workaround one can use a static conversion, something like:
UTC_USEC_TO_DAY(timestamp * 1000000 - (5*60*60*1000*1000000)), but this
doesn't work for daylight saving time.
It would be great if this functionality were provided, since dates could be
stored in GMT while users may wish to see reports in any given time zone.
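Since BigQuery itself cannot do a DST-aware conversion, one option is to select raw epoch timestamps and convert on the client. A minimal sketch using Python's standard-library zoneinfo module (3.9+; the helper name and the choice of zone are illustrative):

```python
# Client-side, DST-aware conversion of epoch timestamps returned by a
# query, using the standard-library zoneinfo module (Python 3.9+).
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_local_date(epoch_seconds, tz_name="America/New_York"):
    """Map a UTC epoch timestamp to a local calendar date, honoring DST."""
    utc = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return utc.astimezone(ZoneInfo(tz_name)).date()

# In July New York is UTC-4 (EDT); in January it is UTC-5 (EST), so the
# fixed-offset arithmetic above would give the wrong day near midnight.
```

Unlike the static UTC_USEC_TO_DAY offset, the IANA zone database applied here switches offsets automatically at the DST boundaries.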
Original issue reported on code.google.com by [email protected]
on 21 Feb 2013 at 3:40
Hi,
Our imports are failing with 502. Here is the message from the log.
We have now been seeing this issue for over an hour. The Web Browser tool for
BQ is also down.
However, when I visit the Google API console, it shows no issues. Is anyone
looking into the problem?
How do I report service issues? Is this the right forum?
INFO:root:--response-end--
INFO:root:--response-start--
INFO:root:date: Wed, 01 May 2013 02:15:44 GMT
INFO:root:status: 502
INFO:root:content-length: 983
INFO:root:content-type: text/html; charset=UTF-8
INFO:root:server: GFE/2.0
INFO:root:<!DOCTYPE html>
<html lang=en>
<meta charset=utf-8>
<meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
<title>Error 502 (Server Error)!!1</title>
<style>
*{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}
</style>
<a href=//www.google.com/><img src=//www.google.com/images/errors/logo_sm.gif alt=Google></a>
<p><b>502.</b> <ins>That’s an error.</ins>
<p>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds. <ins>That’s all we know.</ins>
INFO:root:--response-end--
Original issue reported on code.google.com by [email protected]
on 1 May 2013 at 2:29
Currently join predicates must be qualified field names.
Expressions or constants are not allowed.
E.g. this query does not work:
SELECT *
FROM
(SELECT year,
COUNT(*) AS cnt
FROM publicdata:samples.natality
GROUP BY year) cur
LEFT OUTER JOIN (SELECT year,
COUNT(*) AS cnt
FROM publicdata:samples.natality
GROUP BY year) prev
ON cur.year - 1 = prev.year
and fails with: "Error: ON clause must be AND of = comparisons of one field
name from each table, with all field names prefixed with table name."
Using "ON prev.year = 2000" also fails as constants are not allowed.
The workaround is to use nested subqueries with the expression/constant as a
column, which can then be joined on in the outer scope. However, this
increases query complexity and slows down ad hoc query development.
Please allow expressions and constants in join predicates.
Original issue reported on code.google.com by [email protected]
on 4 Oct 2013 at 7:02
I have some "complicated" extractions (using regex functions), and I would
like some of them to be predefined, so that everyone who needs such an
extraction uses the same logic. I would prefer to avoid precalculating them
and storing them in a new table, as that has its own maintenance overhead.
I am looking for a way to define virtual tables or virtual fields. The
concept is similar to views, or to calculated columns in RDBMS systems such
as MS SQL Server.
I would like a way to define such virtual objects so that they are computed
only at run time.
(http://stackoverflow.com/questions/19376414/virtual-objects-on-bigquery)
Original issue reported on code.google.com by [email protected]
on 16 Oct 2013 at 5:14
What steps will reproduce the problem?
Run a BigQuery query using the Java API library. It will intermittently fail
with a timeout.
What is the expected output? What do you see instead?
Response returned within 10 sec.
What version of the product are you using? On what operating system?
Java AppEngine
Please provide any additional information below.
The stack trace for failure is the following:
2013-06-07 17:05:31.006 /_ah/queue/__deferred__ 500 39253ms 0kb
AppEngine-Google; (+http://code.google.com/appengine)
...
Caused by: com.electionear.fw.v2.shared.exception.DataAccessException: Can not
obtain request result (Timeout while fetching URL:
https://www.googleapis.com/bigquery/v2/projects/920037298476/queries)
at [...].service.dataaccess.QueryBQVoterTableCommand.query(QueryBQVoterTableCommand.java:...)
at [...].service.dataaccess.QueryBQVoterTableCommand.execute(QueryBQVoterTableCommand.java:...)
at [...].service.DataAccessService.queryBQVotersTable(DataAccessService.java:...)
at [...].service.DataAccessService$$FastClassByCGLIB$$3ac0c5b6.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.Cglib2AopProxy$CglibMethodInvocation.invokeJoinpoint(Cglib2AopProxy.java:688)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:80)
at com.electionear.fw.v2.server.util.profiler.ProfilerAspect.log(ProfilerAspect.java:28)
at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:45)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:621)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:610)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:65)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:80)
at com.electionear.fw.v2.server.util.logger.LoggerAspect.log(LoggerAspect.java:55)
at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:45)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:621)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:610)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:65)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:90)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:621)
at com.electionear.fw.v2.server.service.DataAccessService$$EnhancerByCGLIB$$252cd130.queryBQVotersTable(<generated>)
at com.electionear.fw.v2.server.tasks.bigquery.counters.RequestCountersTask.run(RequestCountersTask.java:40)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
at com.googlecode.objectify.cache.AsyncCacheFilter.doFilter(AsyncCacheFilter.java:59)
at com.googlecode.objectify.ObjectifyFilter.doFilter(ObjectifyFilter.java:49)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
Original issue reported on code.google.com by [email protected]
on 10 Jun 2013 at 5:16
We are loading datastore backups into big query with the big query v2 api. We
are specifying this JSON configuration:
{'configuration': {
'load': {
'sourceFormat' : 'DATASTORE_BACKUP',
'writeDisposition' : 'WRITE_TRUNCATE',
'sourceUris' : sourceUris,
'destinationTable' : {
'projectId': settings.PROJECT_ID,
'datasetId': datasetId,
'tableId' : entityKind
}
}
}
}
We have already loaded this entity into BigQuery once and are now expecting
further loads to replace the existing table with the new data. Instead, the
insert job request returns an error:
u'status': {
u'state': u'DONE',
u'errors': [
{
u'reason': u'invalid',
u'message': u'Cannot import a datastore backup to a table that already has a schema.'
}
],
u'errorResult': {
u'reason': u'invalid',
u'message': u'Cannot import a datastore backup to a table that already has a schema.'
}
},
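A workaround that follows from the error message (a sketch, not documented behavior: it assumes deleting the destination table first is acceptable, and that `service` is an authorized google-api-python-client object):

```python
# Drop the destination table, then load with CREATE_IF_NEEDED instead of
# WRITE_TRUNCATE, so the datastore-backup import never sees an existing
# schema. Function names here are illustrative.

def datastore_load_body(project_id, dataset_id, table_id, source_uris):
    """Build the load-job body; WRITE_TRUNCATE is dropped because the
    backend rejects importing a backup over a table that has a schema."""
    return {'configuration': {'load': {
        'sourceFormat': 'DATASTORE_BACKUP',
        'createDisposition': 'CREATE_IF_NEEDED',
        'sourceUris': source_uris,
        'destinationTable': {'projectId': project_id,
                             'datasetId': dataset_id,
                             'tableId': table_id},
    }}}

def reload_datastore_backup(service, project_id, dataset_id, table_id, uris):
    # Deleting first makes the subsequent import start from a clean slate.
    try:
        service.tables().delete(projectId=project_id,
                                datasetId=dataset_id,
                                tableId=table_id).execute()
    except Exception:  # table may not exist yet; that is fine
        pass
    body = datastore_load_body(project_id, dataset_id, table_id, uris)
    return service.jobs().insert(projectId=project_id, body=body).execute()
```

The trade-off is a brief window in which the table does not exist, which WRITE_TRUNCATE would otherwise avoid.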
Original issue reported on code.google.com by [email protected]
on 11 Feb 2013 at 6:31
What version of the product are you using? On what operating system?
Using BigQuery Web tool using Chrome browser in Ubuntu.
Please provide any additional information below.
My query is pretty simple:
SELECT user_id,path from happyLatte.highnoon5_path
# Rows in table: 15837
# Columns in table: 2
# Size of data: 18.6MB
The query takes 6.2 seconds to complete. Is that expected? I would think
something like this would take under 1 second. I've attached a screenshot.
Please advise.
Thanks,
Navneet
Original issue reported on code.google.com by [email protected]
on 18 Oct 2013 at 11:03
What steps will reproduce the problem?
1. Delete a dataset
2. Create dataset
3. Try to load csv into dataset
What is the expected output? What do you see instead?
Should process the job but as of last night I now see:
Errors:
Not Found: Dataset blar
Job ID: job_952e6f613c7749278503fc59a207ab2c
Start Time: 2:10pm, 27 Jun 2013
End Time: 2:10pm, 27 Jun 2013
Destination Table:blar
Source URI: gs://blar
Source Format: CSV
Max Bad Records: 100
Schema:
f_id: INTEGER
f_screen_name: STRING
u_id: INTEGER
created: TIMESTAMP
What version of the product are you using? On what operating system?
Web interface and command line tool
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 27 Jun 2013 at 1:16
Hi,
It was reported to me by our analysts that they have recently seen a
"Resources exceeded" error in the BQ Web Interface. Please see the job ID
below.
Query Failed
Error: Resources exceeded during query execution.
Job ID: job_826aaad1c0b645ae8b616785679975db
Let me know if you need anything from me.
Original issue reported on code.google.com by [email protected]
on 7 Nov 2013 at 5:02
What steps will reproduce the problem?
1. I have an Apps Script up and running that used BigQuery to pull in data -
which worked perfectly
2. As of yesterday, as soon as I would run the script, I get the following
error: "ReferenceError: "BigQuery" is not defined."
3. I then tried to run the tutorial
(https://developers.google.com/apps-script/service_bigquery), but am seeing the
same error. All of this used to work.
What is the expected output? What do you see instead?
Expected output would be to populate the cells in the Google Spreadsheet.
Instead I get: "ReferenceError: "BigQuery" is not defined."
What version of the product are you using? On what operating system?
Apps Script using Google Spreadsheets. Mac OS X
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 6 Feb 2013 at 10:34
Today, according to [1], the only files which can be imported into BigQuery are
[compressed] CSV or JSON files.
Protocol Buffers [2, 3] is an efficient method for serializing sets of data. It
would be really useful if BigQuery knew natively how to ingest such files.
[1] https://developers.google.com/bigquery/articles/ingestioncookbook
[2] https://developers.google.com/protocol-buffers/
[3] http://code.google.com/p/protobuf/
Original issue reported on code.google.com by [email protected]
on 31 Jan 2013 at 3:28
What version of the product are you using? On what operating system?
We are using BigQuery through Apps Scripts.
Please provide any additional information below.
All of a sudden BigQuery is not returning data or an error message. It
doesn't matter whether it is accessed through Apps Script or
bigquery.cloud.google.com. The same issue occurs for all of our tables as
well as the public example tables (no query results returned).
Original issue reported on code.google.com by [email protected]
on 14 May 2013 at 5:57
Apparently there are some legal & tax issues for companies operating outside of
the US when using services hosted in the US. I would like to be able to
configure BigQuery location to be in EU (Similar to some other Google
services).
http://stackoverflow.com/questions/19488515/bigquery-data-location-setting/19503411?noredirect=1#19503411
Original issue reported on code.google.com by [email protected]
on 22 Oct 2013 at 11:01
See title:
"Select * from [data] where [conditions]"
returns the correct results, but with the columns in lexicographical order.
It would make more sense for the columns to be in the order of the original
schema.
Not sure if this is a bug or a feature.
Original issue reported on code.google.com by [email protected]
on 2 Jul 2013 at 10:15
What steps will reproduce the problem?
1. Create a schema with a boolean field type
2. Import json data where the boolean value is a string ("false" instead of
false)
What is the expected output? What do you see instead?
It used to be that this would work without errors (though I'm not sure
whether you would end up with a string or a boolean).
I would expect a specific error to be raised, something like: "Expected
boolean, got string".
At the moment an InternalError is raised with the message "Unexpected. Please
try again.".
What version of the product are you using? On what operating system?
I'm using JSON (Newline Delimited) format, it's easy to reproduce using the web
interface or REST API.
thanks,
Jasper Op de Coul
Original issue reported on code.google.com by [email protected]
on 19 Jul 2013 at 10:22
Add a function to BigQuery's URL functions that provides URL decoding.
Example:
URL_DECODE(http%3A%2F%2Fwww.example.com%2Fhello%3Fv%3D12345)
returns:
http://www.example.com/hello?v=12345
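Until such a function exists, decoding can be done client-side after the query; Python's standard library already covers it:

```python
# Client-side equivalent of the requested URL_DECODE function.
from urllib.parse import unquote  # urllib.unquote on Python 2

encoded = "http%3A%2F%2Fwww.example.com%2Fhello%3Fv%3D12345"
print(unquote(encoded))  # http://www.example.com/hello?v=12345
```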
Original issue reported on code.google.com by [email protected]
on 24 Dec 2012 at 7:35
How to reproduce:
I'm using the Google APIs Client Library for Python.
Create a table that has a name that starts with a number (in my case that was a
zero)
Stream records into it (service.tabledata().insertAll ... )
What is the expected output? What do you see instead?
I've run the test twice: once creating a table with a name that starts with a
number, and once with a name that starts with a letter. Both creations
succeed, but I can only stream into the table that starts with a letter. I
get the following error message (which doesn't really help a lot):
Error: {
"error": {
"errors": [
{
"domain": "global",
"reason": "internalError",
"message": "Unexpected. Please try again."
}
],
"code": 500,
"message": "Unexpected. Please try again."
}
}
What version of the product are you using? On what operating system?
I'm running python 2.7 on a debian compute engine instance release:
3.3.8-gcg-201308121035
Api client version:
>>> import apiclient
>>> apiclient.__version__
'1.0'
I've also tested whether updating the apiclient library made any
difference:
sudo pip install -U apiclient
Downloading/unpacking apiclient
Downloading apiclient-1.0.2.tar.gz
Running setup.py egg_info for package apiclient
Installing collected packages: apiclient
Running setup.py install for apiclient
Successfully installed apiclient
Cleaning up...
>>> import pkg_resources
>>> pkg_resources.get_distribution("apiclient").version
'1.0.2'
Please provide any additional information below.
I've looked through the documentation
(https://developers.google.com/bigquery/docs/tables) and didn't find anything
about table names. I would understand it if table names weren't allowed to
start with a number, but in that case I shouldn't be able to create the table
in the first place. Furthermore, I would like a more descriptive error
message; I only found out what the issue was by trial and error.
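A defensive client-side check suggested by this report (the leading-digit rule is an inference from the failure described above, not documented behavior, and the helper name is illustrative):

```python
# Validate a table id before streaming into it, so the caller gets a
# clear error instead of the opaque 500 above. Assumption: ids made of
# letters, digits and underscores that do not start with a digit stream
# correctly, per the report's trial-and-error findings.
import re

_TABLE_ID = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*$')

def check_table_id(table_id):
    if not _TABLE_ID.match(table_id):
        raise ValueError(
            "table id %r may not stream correctly; "
            "avoid a leading digit" % table_id)
    return table_id
```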
Original issue reported on code.google.com by [email protected]
on 9 Oct 2013 at 7:05
What steps will reproduce the problem?
1. At 3 o'clock I see $0.30
2. At 3:30 I see $0.25
3. At 4 o'clock I see $0.20
What is the expected output? What do you see instead?
The expected output is $0.30 at 4 o'clock, but the figure was reduced.
What version of the product are you using? On what operating system?
I use online bigquery browser.
Please provide any additional information below.
I think the problem is that BigQuery gathers information from various
sources and sometimes overwrites the data.
Original issue reported on code.google.com by [email protected]
on 5 Nov 2013 at 1:32
What steps will reproduce the problem?
1. Programmatically initiate a job load with a single destination table using a
handful of sourceUris that contain gzipped entries.
2. About 2/3 of the time this works correctly; the other times it results in
the traceback shown below.
What is the expected output? What do you see instead?
Expected output is a successful response containing a Job ID or at least a
valid HTTP-compliant response. The following traceback shows up instead:
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1027, in getresponse
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 407, in begin
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 371, in _read_status
BadStatusLine: ''
What version of the product are you using? On what operating system?
Mac OS X 10.8.3, it also occurs on Ubuntu. Using python v.2.7.2 with apiclient
v.1.1.
Please provide any additional information below.
Another user has independently experienced the same problem:
http://stackoverflow.com/questions/16326222/unable-to-load-data-into-bigquery-badstatusline
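A common mitigation for this class of failure (a sketch; the retry policy and helper name are assumptions, not an official recommendation): BadStatusLine is raised when the server drops the connection before sending a status line, so wrapping the insert call and retrying with exponential backoff usually recovers.

```python
# Retry wrapper for flaky job inserts that die with BadStatusLine.
import time
from http.client import BadStatusLine  # httplib.BadStatusLine on Python 2

def with_retries(fn, attempts=5, base_delay=1.0):
    """Call fn(), retrying on BadStatusLine with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except BadStatusLine:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(base_delay * 2 ** attempt)
```

Usage would look like with_retries(lambda: service.jobs().insert(projectId=..., body=...).execute()).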
Original issue reported on code.google.com by [email protected]
on 30 May 2013 at 12:35
Hi,
I have described the issue in detail on Stack Overflow:
http://stackoverflow.com/questions/16571635/join-each-not-returning-result
Let me know if I am doing anything wrong.
Thanks,
Original issue reported on code.google.com by [email protected]
on 31 May 2013 at 5:50
I have Cloud Storage logs stored on Cloud Storage and it would be very
convenient to be able to import them by prefix into Big Query directly via the
web interface.
That is, when you click on a dataset and choose "Create and Import", in the
"Select data" step I can specify a Google Cloud Storage file. I would like to
be able to specify only a prefix with "*" at the end, requesting the import
of every file that starts with what I specified.
Original issue reported on code.google.com by [email protected]
on 11 Jul 2013 at 8:27
I would like a better documentation for working with the APIs.
In particular, maybe some best practices, or code fragments, for some of the
most common scenarios.
I don't think I'm the only one on the Internet who needs to automatically
upload a file to Google Cloud Storage and then build a program that
automatically loads that file into BigQuery with all the right table fields
and types.
It's not as easy as I expected: browsing through 4 or 5 different guides and
tutorials (some of them old, and each different when it comes to the
authentication phase) and then putting all the pieces together into a working
prototype takes time I would rather invest in something more productive.
Original issue reported on code.google.com by [email protected]
on 6 Apr 2013 at 4:22
What steps will reproduce the problem?
1. Have two BigQuery accounts: one with my workplace ([email protected]) and
the other personal ([email protected]). Log in to
[email protected]
2. Create dataSet for minke-data project and upload table
3. Log out -> Log back in
What is the expected output? What do you see instead?
I expected to see the datasets previously created, with tables loaded.
However, all datasets have disappeared. If I create another dataset with the
same name, I get an error suggesting the dataset already exists. This only
happens with my personal account; the [email protected] account works fine.
What version of the product are you using? On what operating system?
Using Latest BigQuery. On Windows 7.
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 10 Sep 2013 at 1:17
What steps will reproduce the problem?
1. Run a heavy query that takes a long time to return, using the connector
for Excel.
What is the expected output? What do you see instead?
I was expecting to get the results after a few minutes; instead I get an
error message:
Request failed: Error. Unable to execute query. Timeout while fetching URL:
https://www.googleapis.com/bigquery/v2/projects/{my-project}/queries.
What version of the product are you using? On what operating system?
Excel 2013 on windows 7 64bit
Please provide any additional information below.
http://stackoverflow.com/questions/19684618/bigquery-connector-for-excel-request-failed-error-unable-to-execute-query-t
Original issue reported on code.google.com by [email protected]
on 31 Oct 2013 at 7:35
In the last few days I have been trying to upload files to my BigQuery table,
but it keeps failing with: "Errors encountered during job execution.
Unexpected. Please try again." Example job IDs:
job_8bf7e7d257884d3bab2e04ac1208fedb, job_4354aa20427a4453ab14f9f18365d216,
job_8b287b73b19d4a8a9dacc5f002e265a9
1. I have checked that there are no lines greater than 64K.
2. There is nothing wrong with the data syntax. I divided the file into 4
pieces and each piece completed the upload job. The original gzipped file was
328M and it failed; even half the size (177M) of the gzipped file failed,
while each of the divided parts (>90M) completed the job. However, this
workaround is not reliable either: it sometimes fails for even smaller file
sizes.
Original issue reported on code.google.com by [email protected]
on 4 Feb 2013 at 3:51
What steps will reproduce the problem?
1. Run this query:
SELECT nat.year
FROM publicdata:samples.natality nat
LIMIT 10
What is the expected output? What do you see instead?
Expected output: 10 rows of "year" values from publicdata:samples.natality
Actual output:
Query Failed
Error: Field 'nat.year' not found in table 'publicdata:samples.natality'.
What version of the product are you using? On what operating system?
Browser Tool
Please provide any additional information below.
Running this does work:
SELECT nat.year
FROM publicdata:samples.natality nat
LEFT OUTER JOIN (SELECT 2000 AS YEAR) tab
ON nat.year = tab.year
LIMIT 10
Specifying a table alias without a join does work, but you cannot then
reference the alias in the fields in the SELECT. This gives the user an
inconsistent signal as to whether table aliases are supported in such a
query.
E.g. the following does work:
SELECT year
FROM publicdata:samples.natality nat
LIMIT 10
BigQuery is only "SQL-like"; even so, this seems like a very artificial
restriction and will trip up first-time users.
Original issue reported on code.google.com by [email protected]
on 8 Oct 2013 at 5:42
https://developers.google.com/bigquery/docs/query-reference
SELECT expr1 [[AS] alias1] [, expr2 [[AS] alias2], ...]
[agg_function(expr3) WITHIN expr4]
[FROM [(FLATTEN(table_name1|(subselect1)] [, table_name2|(subselect2), ...)]
[([INNER]|LEFT OUTER) JOIN table_2|(subselect2) [[AS] tablealias2]
ON join_condition_1 [... AND join_condition_N ...]]
[WHERE condition]
[HAVING condition]
[GROUP BY field1|alias1 [, field2|alias2, ...]]
[ORDER BY field1|alias1 [DESC|ASC] [, field2|alias2 [DESC|ASC], ...]]
[LIMIT n]
;
The order of [HAVING condition] and [GROUP BY ...] appears to be wrong;
please confirm.
Original issue reported on code.google.com by [email protected]
on 29 May 2013 at 6:16
We are currently working on a pandas plugin for BigQuery
(https://github.com/pydata/pandas/pull/4140) and would like the ability to unit
test uploading/downloading API calls without requiring billing info. We would
only need to process the public dataset information for testing, and small
uploads. Are there any existing solutions for our problem?
Original issue reported on code.google.com by [email protected]
on 31 Jul 2013 at 7:54
I had a similar ticket before on Stack Overflow.
We just noticed that our imports into BigQuery are failing with the following
messages:
INFO:root:{
"error": {
"errors": [
{
"domain": "global",
"reason": "internalError",
"message": "Unexpected. Please try again."
}
],
"code": 500,
"message": "Unexpected. Please try again."
}
}
INFO:root:{
"error": {
"errors": [
{
"domain": "global",
"reason": "backendError",
"message": "Backend Error"
}
],
"code": 503,
"message": "Backend Error"
}
}
This has now been happening for over 5 hours. It started at 6:15 am PST on
April 1st, 2013. The console is reporting no known issues.
Do we know when the service will be back up?
Here are some of the job example: job_b4c87ef9931b4a75b1869b6e7157725b
job_369c14f202ca46f3a2a6b931b97ed99d
Original issue reported on code.google.com by [email protected]
on 1 Apr 2013 at 7:13
Hi
Over the last couple of days, we have been experiencing slow response times
from BigQuery - selects that used to take a couple of seconds now take more
than a minute.
On top of that, we started receiving 503 responses from the service with the
description "Backend Error" - at first it was sporadic, but the rate of these
errors keeps building up.
Is there a known issue? Is there anything we can do to mitigate these problems?
Thanks,
Nir
Original issue reported on code.google.com by [email protected]
on 19 Sep 2013 at 10:20
What steps will reproduce the problem?
1. Create data in the app engine datastore which has a repeated nested model
and includes some null values in the repeating data:
e.g.
Session ID: 2343243
Start Time: 11:32
Events [
{Event Type: A
Event Time: 1
Event Error: },
{Event Type: B
Event Time: 7
Event Error: null pointer},
{Event Type: A
Event Time: 12
Event Error: }
]
2. Upload to BQ via datastore backup.
What is the expected output? What do you see instead?
I expect to be able to establish that the null pointer error is associated with event B.
The data uploads as multiple repeated fields, with no null value placeholders in the repeating value list:
Session ID: 2343243
Start Time: 11:32
Event Type: [A, B, A]
Event Time: [1, 7, 12]
Event Error: [null pointer]
It is therefore not possible to usefully query the nested fields with null values.
What version of the product are you using? On what operating system?
Tested on GAE and BQ versions live on Thursday, July 25th, 2013 (NZST).
Please provide any additional information below.
This issue was addressed on SO in June 2013, but the issue of null data was not
raised. The workaround provided works in the absence of null data, so the
issue may have received little priority.
http://stackoverflow.com/questions/17228281
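The loss of alignment described above can be sketched in a few lines of Python (field names are illustrative):

```python
# Three events become parallel repeated columns, and empty values are
# simply dropped, so the association between type and error is lost.
events = [
    {'type': 'A', 'time': 1,  'error': None},
    {'type': 'B', 'time': 7,  'error': 'null pointer'},
    {'type': 'A', 'time': 12, 'error': None},
]
event_type = [e['type'] for e in events]
event_time = [e['time'] for e in events]
event_error = [e['error'] for e in events if e['error'] is not None]
print(event_type)   # ['A', 'B', 'A']
print(event_error)  # ['null pointer'] - which event failed is unrecoverable
```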
Original issue reported on code.google.com by [email protected]
on 26 Jul 2013 at 5:11
What steps will reproduce the problem?
1. Use tar -z to create a compressed file instead of usual gzip
2. Try to load data in bigquery
3. BigQuery does not like tar-compressed files
What is the expected output? What do you see instead?
Allow compressed or non-compressed tar files.
What version of the product are you using? On what operating system?
Linux CentOS
Please provide any additional information below.
I need to uncompress the tar files and re-compress them with gzip, which takes
a lot of time. Please add support for tar files at your end.
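The workaround described above can be scripted; a sketch (assuming .tar.gz inputs, using only the Python standard library) that unpacks each archive member and re-compresses it as a standalone .gz file:

```python
import gzip
import os
import shutil
import tarfile

def tar_to_gzip_members(tar_path, out_dir):
    """Unpack a .tar.gz archive and write each regular file back out as
    an individual .gz file (the format BigQuery accepts)."""
    os.makedirs(out_dir, exist_ok=True)
    with tarfile.open(tar_path, 'r:gz') as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            src = tar.extractfile(member)
            dst_path = os.path.join(out_dir,
                                    os.path.basename(member.name) + '.gz')
            with gzip.open(dst_path, 'wb') as dst:
                shutil.copyfileobj(src, dst)
    return sorted(os.listdir(out_dir))
```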
Original issue reported on code.google.com by shantanu.oak
on 30 Jan 2013 at 3:34
What steps will reproduce the problem?
1. Create a very large dataset
2. Perform a SELECT * (which is accidental)
3. Abandon query
What is the expected output? What do you see instead?
I expect to abandon the query, going back to the query tool. Instead a warning
message to the effect of "this is still running on the server, you'll still be
billed for XX GB of transfer" appears quickly but disappears.
Then I get the message "Query Failed Error: Response too large to return." in
the results pane and I'm not sure how to check if / how much I'll be billed for
my mistake.
What version of the product are you using? On what operating system?
Latest on Mac OS / Safari.
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 24 Oct 2013 at 5:47
What steps will reproduce the problem?
1. Attempt to load a JSON file via the BQ tool.
2. For example:
3. bq --headless --nosync load --source_format NEWLINE_DELIMITED_JSON
hgs.hgs_20131030_00 monwork/monwork-worker01.tmp
What is the expected output? What do you see instead?
I expect to see a message that the load succeeded. Instead I receive either a
502 or a 503. Examples of results:
[2013-10-29 18:57:44] <worker01> < Command result 1: 'BigQuery error in load
operation: Could not connect with BigQuery server.\nHttp response status:
502\nHttp response content:\n<!DOCTYPE html>\n<html lang=en>\n<meta
charset=utf-8>\n<meta name=viewport content="initial-scale=1,
minimum-scale=1,\nwidth=device-width">\n<title>Error 502 (Server
Error)!!1</title>\n<style>\n*{margin:0;padding:0}html,code{font:15px/22px\narial
,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7%\nauto
0;max-width:390px;min-height:180px;padding:30px 0 15px}*
>\nbody{background:url(//www.google.com/images/errors/robot.png) 100%
5px\nno-repeat;padding-right:205px}p{margin:11px
0\n22px;overflow:hidden}ins{color:#777;text-decoration:none}a
img{border:0}@media\nscreen and
(max-width:772px){body{background:none;margin-top:0;max-width:none;pa\ndding-rig
ht:0}}\n</style>\n<a href=//www.google.com/><img
src=//www.google.com/images/errors/logo_sm.gif\nalt=Google></a>\n<p><b>502.</b>
<ins>That\\ufffd\\ufffd\\ufffds an error.</ins>\n<p>The server encountered a
temporary error and could not complete your\nrequest.<p>Please try again in 30
seconds. <ins>That\\ufffd\\ufffd\\ufffds all we\nknow.</ins>'
[2013-10-29 19:00:44] <worker09> < Command result 1: 'BigQuery error in load
operation: Could not connect with BigQuery server.\nHttp response status:
503\nHttp response content:'
Additionally, the very few jobs that are successfully submitted generally fail.
Examples:
bqjob_r3a054f09f7c6b47_0000014205e35217_1 - failed with "Connection error.
Please try again."
bqjob_r105ecaaa4269450c_0000014205caa77a_1 - failed with "Unexpected. Please
try again."
What version of the product are you using? On what operating system?
Tried the BigQuery CLI versions 2.0.15 and 2.0.17 with identical results on
CentOS 5.
Please provide any additional information below.
Previously this seemed to be working fine. Then, over the past couple of days,
I noted some 502/503 failures that would occur sporadically, and some instances
where they occurred consistently for 30-90 minutes. All load requests seem to
have failed today.
According to the documentation any sort of quota issues should cause a 4xx
error not 5xx, so I don't believe the problem could be that. Additionally, we
should be submitting load requests considerably below the threshold for
throttling. We use one BQ table per hour, and roughly 2 loads per minute (when
BQ is actually handling requests). This is 120 loads per table/day (limit is
1,000) and 2,880 loads per day (limit is 10,000).
I am attaching logs from the tool we use to submit queries. Hopefully the
format will be self-explanatory. The log will show the exact command executed
and the exact output from the BQ tool.
I will also attach a small excerpt of the files we are submitting.
We just deployed the project that depended on this data for production, so any
help would be appreciated!
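Until the service recovers, transient 502/503 responses like those above are normally handled client-side by retrying with exponential backoff; a generic sketch (the function names, predicate, and limits are assumptions, not part of the BigQuery API):

```python
import time

def retry_with_backoff(call, is_transient, max_tries=5, base_delay=1.0):
    """Retry `call` on transient errors, doubling the wait each attempt.
    `is_transient` decides whether an exception is worth retrying."""
    for attempt in range(max_tries):
        try:
            return call()
        except Exception as err:
            if attempt == max_tries - 1 or not is_transient(err):
                raise
            time.sleep(base_delay * (2 ** attempt))
```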
Original issue reported on code.google.com by [email protected]
on 30 Oct 2013 at 1:21
Attachments:
I would like the wizard used for importing new data ("Choose job
template", "Choose destination", ...) to validate the entered data step by
step, instead of producing one big failure at the end of the entire process.
For example, I want to import a CSV containing 40 fields.
I have to enter a name for the table, select the file, and then enter the
field names.
If I select a non-existent file (on Cloud Storage), I have to start again
from the beginning.
If I enter the wrong number of fields (I have forty of them, there is not much
space in the text box, and I can misspell one of them, forget a comma, or give
a wrong type), I also have to start again.
These are very annoying problems.
Original issue reported on code.google.com by [email protected]
on 6 Apr 2013 at 4:15
I know I can do this with regular expressions, but like CONTAINS, it is sooo
much easier to write
WHERE path STARTSWITH '/some/url'
Seems like it would be a piece of cake to provide this. May not be traditional
SQL, but sure would be convenient!
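In the meantime, an anchored REGEXP_MATCH in legacy BigQuery SQL expresses the same predicate (the table and column names below are made up):

```python
import re

# Hypothetical workaround: no STARTSWITH, but an anchored regex is
# equivalent. The SQL here is just a string for illustration.
query = ("SELECT path FROM [mydataset.mytable] "
         "WHERE REGEXP_MATCH(path, r'^/some/url')")

# The anchored pattern behaves as a starts-with test:
pattern = re.compile(r'^/some/url')
print(bool(pattern.match('/some/url/page')))   # True
print(bool(pattern.match('/other/some/url')))  # False
```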
Original issue reported on code.google.com by [email protected]
on 20 Sep 2013 at 8:11
We are using the BigQuery browser tool and our queries are not returning
results. Is there any known issue right now?
We noticed the issue around 10:00 am PST on May 14, 2013, and we are still
facing it.
Original issue reported on code.google.com by [email protected]
on 14 May 2013 at 5:41
https://developers.google.com/bigquery/docs/queries#asyncqueries
The Python code example for async queries looks strange: it mixes
runAsyncQuery and checkQueryResults together. The code block under "# Get
query results. Results will be available for about 24 hours." should not be
inside the function runAsyncQuery.
The Java async query example looks fine.
import pprint

from apiclient.errors import HttpError

def runAsyncQuery(service, projectId):
  try:
    jobCollection = service.jobs()
    queryString = 'SELECT corpus FROM publicdata:samples.shakespeare GROUP BY corpus;'
    jobData = {
      'configuration': {
        'query': {
          'query': queryString,
        }
      }
    }
    insertResponse = jobCollection.insert(projectId=projectId,
                                          body=jobData).execute()
    # Get query results. Results will be available for about 24 hours.
    currentRow = 0
    queryReply = jobCollection.getQueryResults(
        projectId=projectId,
        jobId=insertResponse['jobReference']['jobId'],
        startIndex=currentRow).execute()
    while ('rows' in queryReply) and currentRow < queryReply['totalRows']:
      printTableData(queryReply, currentRow)
      currentRow += len(queryReply['rows'])
      queryReply = jobCollection.getQueryResults(
          projectId=projectId,
          jobId=queryReply['jobReference']['jobId'],
          startIndex=currentRow).execute()
  except HttpError as err:
    print 'Error in runAsyncQuery:', pprint.pprint(err.resp)
  except Exception as err:
    print 'Undefined error: %s' % err
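The paging loop the reporter wants factored out of runAsyncQuery can be sketched generically; the function name and the fake page source below are hypothetical stand-ins for jobCollection.getQueryResults:

```python
def fetch_all_rows(get_page):
    """Drain a paged reply source. `get_page(start_index)` returns a dict
    shaped like a BigQuery getQueryResults reply: {'totalRows': ...,
    'rows': [...]}. Keeps requesting from the next startIndex until all
    totalRows rows have arrived."""
    rows = []
    reply = get_page(0)
    while 'rows' in reply and len(rows) < int(reply['totalRows']):
        rows.extend(reply['rows'])
        if len(rows) < int(reply['totalRows']):
            reply = get_page(len(rows))
    return rows

# Usage with a fake page source standing in for the API:
data = [{'f': [{'v': i}]} for i in range(5)]

def fake_page(start, page_size=2):
    return {'totalRows': '5', 'rows': data[start:start + page_size]}

print(len(fetch_all_rows(fake_page)))  # 5
```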
Original issue reported on code.google.com by [email protected]
on 9 Jun 2013 at 2:59
What steps will reproduce the problem?
1. It is intermittent, but occurs when pulling from a table with more than
100,000 returned rows
What is the expected output? What do you see instead?
Usually, I get a return from my query that pages by 100,000, and this works
just fine; that is what I expect to see.
What I have seen recently instead is:
Loading data...
current length: 100000
current length: 100512
current length: 101024
<then suddenly>
current length: 201024
current length: 301014
So I'm seeing a 512-byte row/buffer size returned at times for no reason,
or, even more insidious:
Loading data...
Job not yet complete...
Undefined error: <python traceback to None return>
What version of the product are you using? On what operating system?
Running on Linux and Windows simultaneously, exclusively in the Python
environment.
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 8 Mar 2013 at 7:36
What steps will reproduce the problem?
1. Do an OAuth 2 grant
2. Log in to Google and look at application security
3. See that the listing says "New Service"
What is the expected output? What do you see instead?
listing for bigquery should say "Google BigQuery"
What version of the product are you using? On what operating system? OAuth 2.0
Please provide any additional information below.
In Google's "Authorized Access to your Google Account" page, the grant is
listed as "New Service". This should be fixed to say "Google BigQuery"
Original issue reported on code.google.com by [email protected]
on 23 Jan 2013 at 8:45
http://stackoverflow.com/questions/10977969/using-bigquery-with-r-for-analyzing-
data
Unfortunately, the BigQuery R client shown there is for BigQuery version 1,
which has been turned down. There was some work on a BigQuery V2 client, but it
was never checked into CRAN. I'll investigate the status and get back to you.
– Jordan Tigani Jun 11 '12 at 17:43
I'm the person Jordan asked -- and unfortunately, there's still no ETA on the
V2 client. "Soon" is the best I can offer right now. – Craig Citro Jun 12 '12
at 6:5
Original issue reported on code.google.com by [email protected]
on 14 Mar 2013 at 4:55
In most database gui-based query tools, like SQL Server Management Studio,
pgAdmin3, Teradata SQL Assistant etc., when part of the query text is
highlighted and the query is run, only the highlighted portion of the script is
executed.
It would be very handy if the BigQuery Browser Tool also had this feature,
particularly when BigQuery SQL queries often have multiple levels of nested
subqueries.
Original issue reported on code.google.com by [email protected]
on 2 Oct 2013 at 4:56
What steps will reproduce the problem?
1. Create a new table with schema prefix:STRING, and give it some data.
2. Query `SELECT prefix FROM [table]`
What is the expected output? What do you see instead?
The expected output is the query success, showing all data.
Instead, I got
Error: Encountered " "SELECT" "SELECT "" at line 1, column 1. Was expecting:
<EOF>
Seems that it doesn't even correctly parse the query.
What version of the product are you using? On what operating system?
OS is Ubuntu 12.04.
Please provide any additional information below.
I tried using Query API and online query viewer, both failed with exactly the
same error message.
Original issue reported on code.google.com by [email protected]
on 6 Aug 2013 at 3:02
Shifts a UNIX timestamp in microseconds to the beginning of the quarter it
occurs in.
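A sketch of the requested behaviour in Python (the helper name mirrors the reporter's wish and is not an existing BigQuery function):

```python
import datetime

def utc_usec_to_quarter(usec):
    """Truncate a UNIX timestamp in microseconds to the first instant of
    the calendar quarter it occurs in, returned in microseconds."""
    epoch = datetime.datetime(1970, 1, 1)
    dt = epoch + datetime.timedelta(microseconds=usec)
    quarter_month = dt.month - (dt.month - 1) % 3  # 1, 4, 7 or 10
    start = datetime.datetime(dt.year, quarter_month, 1)
    return int((start - epoch).total_seconds()) * 10**6

# 2013-09-21 (the report date) falls in Q3, which begins 2013-07-01:
print(utc_usec_to_quarter(1379721600 * 10**6) == 1372636800 * 10**6)  # True
```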
Original issue reported on code.google.com by [email protected]
on 21 Sep 2013 at 1:54