Comments (10)
Your backup has a different schema than the BigQuery table.
Have you modified the entity schema?
If so, delete the BigQuery table and create it again.
Original comment by [email protected]
on 24 Feb 2014 at 12:50
from google-bigquery.
I think we're seeing the same thing. Occasionally we will see twice as many
rows in a datasource than we're expecting when we use `bq load` to update the
datasource, despite passing the `--replace` flag as an argument. I suspect
this occurs when we've made a modification to the datasource schema.
If we drop the datasource prior to uploading, there will be a period of several
minutes where any dashboards and other tools that are expecting the datasource
will break. If we do not drop the datasource and there is a schema change, it
looks like there will be duplicate rows until a follow-on update in which the
schema has not changed.
Is the behavior described in this issue considered a bug? If not, is there a
way to avoid this difficulty that we haven't thought of?
Original comment by [email protected]
on 10 Apr 2014 at 6:37
I can now add that we are seeing this behavior independent of a schema change.
Occasionally when we create a load job, even when there has been no schema
change, the `--replace` flag in `bq` appears to be ignored, and we see twice
as many rows as expected.
This is a pretty serious error, and it is making us reconsider using BQ for our
dashboards. Has anyone in Google seen this?
Original comment by [email protected]
on 24 Apr 2014 at 8:04
We'll look into it. Can you send us any relevant table IDs or job IDs? Feel
free to email [email protected] if you prefer (though I may hand off to
someone else, since I'm going to be out next week).
If this is indeed an issue with loading from a datastore backup, you could try
working around the problem by loading into a fresh table and then using a table
copy (with WRITE_TRUNCATE) to atomically replace the old table with the new
data.
Original comment by [email protected]
on 25 Apr 2014 at 6:41
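The staging-table workaround suggested above can be sketched as two job configurations for the BigQuery `jobs.insert` REST method: a load into a fresh table, followed by a copy with `WRITE_TRUNCATE`. This is a minimal sketch rather than a tested recipe. The project, dataset, table names, and the `gs://` URI are hypothetical placeholders, and the dictionaries are only built here, not submitted.

```python
import json
from typing import Any, Dict, Tuple


def table_ref(project: str, dataset: str, table: str) -> Dict[str, str]:
    """Build a table reference in the shape used by BigQuery job configurations."""
    return {"projectId": project, "datasetId": dataset, "tableId": table}


def staged_replace_jobs(
    project: str, dataset: str, target: str, staging: str, backup_uri: str
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (load_job, copy_job) bodies for the load-then-copy workaround."""
    # Step 1: load the Datastore backup into a fresh staging table,
    # so the load never touches a table that already has a schema.
    load_job = {
        "configuration": {
            "load": {
                "sourceUris": [backup_uri],
                "sourceFormat": "DATASTORE_BACKUP",
                "destinationTable": table_ref(project, dataset, staging),
            }
        }
    }
    # Step 2: copy the staging table over the target table,
    # replacing its contents in a single copy job.
    copy_job = {
        "configuration": {
            "copy": {
                "sourceTable": table_ref(project, dataset, staging),
                "destinationTable": table_ref(project, dataset, target),
                "writeDisposition": "WRITE_TRUNCATE",
            }
        }
    }
    return load_job, copy_job


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    load_job, copy_job = staged_replace_jobs(
        "my-project", "my_dataset", "events", "events_staging",
        "gs://example-bucket/backup.backup_info",
    )
    print(json.dumps(load_job, indent=2))
    print(json.dumps(copy_job, indent=2))
```

The copy job should only be submitted after the load job has finished without an error, and the staging table can be deleted once the copy succeeds.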
I can confirm the incorrect behavior reported by ewalker@ with WRITE_TRUNCATE,
and I believe we have fixed it, as of late yesterday PDT. Sometimes, data that
was on a table before WRITE_TRUNCATE would persist after the truncate,
alongside the data that was written. We'll be getting more detail on the exact
incidence of the problem.
bbassingthwaite's original content of the issue was a different thing, if I
understand correctly: WRITE_TRUNCATE does not allow you to import a Datastore
backup, even though it seems like it logically could? I'll look at that also.
Original comment by [email protected]
on 23 May 2014 at 10:19
Spring cleaning, issue resolved per #5.
Original comment by [email protected]
on 22 Aug 2014 at 5:27
- Changed state: Fixed
The original comment, that you can't replace a table created from a datastore
backup with WRITE_TRUNCATE enabled, is still true:
"status": {
  "errorResult": {
    "message": "Cannot import a datastore backup to a table that already has a schema.",
    "reason": "invalid"
  },
  "errors": [
    {
      "message": "Cannot import a datastore backup to a table that already has a schema.",
      "reason": "invalid"
    }
  ],
  "state": "DONE"
}
Original comment by [email protected]
on 9 Oct 2014 at 5:21
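As the status above shows, a job can report `"state": "DONE"` and still have failed, so a script that reloads a table needs to look at `errorResult` rather than the state alone. A minimal sketch of that check, assuming the job status has already been fetched and parsed into a dictionary (the helper name `job_error_message` is hypothetical):

```python
from typing import Any, Dict, Optional


def job_error_message(status: Dict[str, Any]) -> Optional[str]:
    """Return the errorResult message from a job status, or None if there is none."""
    error = status.get("errorResult")
    if not error:
        return None
    return error.get("message")


if __name__ == "__main__":
    # Status copied from the failed load job quoted in this thread.
    status = {
        "errorResult": {
            "message": "Cannot import a datastore backup to a table that already has a schema.",
            "reason": "invalid",
        },
        "errors": [
            {
                "message": "Cannot import a datastore backup to a table that already has a schema.",
                "reason": "invalid",
            }
        ],
        "state": "DONE",
    }
    message = job_error_message(status)
    if status["state"] == "DONE" and message is not None:
        print("load failed:", message)
```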
I have just hit the same issue. I initially loaded a Datastore backup file
from GCS into BQ using the `bq load` CLI. When I tried to repeat the load, it
told me that a schema already exists, and BQ hinted that I should use the
WRITE_TRUNCATE write disposition. After I included the disposition, I got the
following error asking for an encoding type for Datastore. I tried both UTF-8
and ISO-8859-1, and it still won't let me load.
bucket_name> load --source_format=DATASTORE_BACKUP --allow_jagged_rows=false
--encoding=UTF-8 --write_disposition=WRITE_TRUNCATE sample_red.t1estchallenge_1
gs://test.appspot.com/folder/ahFzfnZpcmdpbi1yZWQtdGVzdHJBCxIcX0FFX0RhdGFzdG9yZUFkbWluX09wZXJhdGlvbhiBwLgCDAsSFl9BRV9CYWNrdXBfSW5mb3JtYXRpb24YAQw.enittykind.backup_info
Error parsing command: flag --encoding=None: value should be one of <UTF-8|ISO-8859-1>
How should I approach this issue?
Original comment by [email protected]
on 1 Apr 2015 at 5:10
I'm not sure if that's the correct error message, but try using --replace
instead of --write_disposition=WRITE_TRUNCATE.
Original comment by [email protected]
on 1 Apr 2015 at 4:50
All,
As Jeremy pointed out, the --replace flag works well if you want a full
refresh load. The WRITE_TRUNCATE flag asks for an encoding for Datastore,
which may not be possible to supply.
Original comment by [email protected]
on 2 Apr 2015 at 6:05
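For reference, the `--replace` form of the load suggested in the last two comments can be sketched as below. This only assembles the command line and does not run it. The table name and `gs://` URI are hypothetical placeholders, and the flags are limited to the ones that appear in this thread (`--replace` and `--source_format=DATASTORE_BACKUP`).

```python
import shlex
from typing import List


def bq_replace_load_argv(table: str, backup_uri: str) -> List[str]:
    """Build a `bq load` command that replaces the table from a Datastore backup."""
    return [
        "bq",
        "load",
        "--replace",  # full refresh, used instead of --write_disposition=WRITE_TRUNCATE
        "--source_format=DATASTORE_BACKUP",
        table,
        backup_uri,
    ]


if __name__ == "__main__":
    # Hypothetical table and backup location.
    argv = bq_replace_load_argv(
        "my_dataset.my_table", "gs://example-bucket/backup.backup_info"
    )
    # Print a shell-safe command line for review; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```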