
Comments (10)

GoogleCodeExporter commented on May 21, 2024
Your backup has a different schema than the BigQuery table.
Have you modified the entity schema?
If so, delete the BigQuery table and create it again.

Original comment by [email protected] on 24 Feb 2014 at 12:50

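For illustration, a minimal sketch of that suggestion using the bq CLI (the dataset, table, and backup path below are hypothetical):

  # Drop the stale table, then reload the Datastore backup so BigQuery
  # recreates the table with the schema inferred from the backup.
  bq rm -f -t mydataset.mytable
  bq load --source_format=DATASTORE_BACKUP \
      mydataset.mytable \
      gs://mybucket/backups/mykind.backup_info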

GoogleCodeExporter commented on May 21, 2024
I think we're seeing the same thing. Occasionally we will see twice as many 
rows in a datasource as we're expecting when we use `bq load` to update the 
datasource, despite passing the `--replace` flag as an argument. I suspect 
this occurs when we've made a modification to the datasource schema.

If we drop the datasource prior to uploading, there will be a period of several 
minutes where any dashboards and other tools that are expecting the datasource 
will break.  If we do not drop the datasource and there is a schema change, it 
looks like there will be duplicate rows until a follow-on update in which the 
schema has not changed.

Is the behavior described in this issue considered a bug?  If not, is there a 
way to avoid this difficulty that we haven't thought of?

Original comment by [email protected] on 10 Apr 2014 at 6:37


GoogleCodeExporter commented on May 21, 2024
I can now add that we are seeing this behavior independent of a schema change. 
Occasionally when we create a load job, even when there has been no schema 
change, the `--replace` flag in `bq` appears to be ignored, and we see twice 
as many results as expected.

This is a pretty serious error, and it is making us reconsider using BQ for our 
dashboards.  Has anyone in Google seen this?

Original comment by [email protected] on 24 Apr 2014 at 8:04


GoogleCodeExporter commented on May 21, 2024
We'll look into it. Can you send us any relevant table IDs or job IDs? Feel 
free to email [email protected] if you prefer (though I may hand off to 
someone else, since I'm going to be out next week).

If this is indeed an issue with loading from a datastore backup, you could try 
working around the problem by loading into a fresh table and then using a table 
copy (with WRITE_TRUNCATE) to atomically replace the old table with the new 
data.

Original comment by [email protected] on 25 Apr 2014 at 6:41

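For concreteness, a sketch of that workaround with the bq CLI (the table names and backup path are hypothetical; the copy step is the CLI analogue of a copy job with WRITE_TRUNCATE):

  # 1. Load the Datastore backup into a fresh staging table.
  bq load --source_format=DATASTORE_BACKUP \
      mydataset.mytable_staging \
      gs://mybucket/backups/mykind.backup_info

  # 2. Atomically replace the live table; -f overwrites the destination
  #    without prompting.
  bq cp -f mydataset.mytable_staging mydataset.mytable

  # 3. Clean up the staging table.
  bq rm -f -t mydataset.mytable_staging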

GoogleCodeExporter commented on May 21, 2024
I can confirm the incorrect behavior reported by ewalker@ with WRITE_TRUNCATE, 
and I believe we have fixed it as of late yesterday PDT. Sometimes, data that 
was in a table before a WRITE_TRUNCATE load would persist after the truncate, 
alongside the data that was written. We'll be getting more detail on the exact 
incidence of the problem.

bbassingthwaite's original issue was a different thing, if I understand 
correctly: WRITE_TRUNCATE does not allow you to import a Datastore backup, 
even though it seems like it logically could. I'll look at that also.

Original comment by [email protected] on 23 May 2014 at 10:19


GoogleCodeExporter commented on May 21, 2024
Spring cleaning, issue resolved per #5.

Original comment by [email protected] on 22 Aug 2014 at 5:27

  • Changed state: Fixed


GoogleCodeExporter commented on May 21, 2024
The original comment, that you can't replace a table created from a datastore 
backup with WRITE_TRUNCATE enabled, is still true:


  "status": {
    "errorResult": {
      "message": "Cannot import a datastore backup to a table that already has a schema.",
      "reason": "invalid"
    },
    "errors": [
      {
        "message": "Cannot import a datastore backup to a table that already has a schema.",
        "reason": "invalid"
      }
    ],
    "state": "DONE"
  }

Original comment by [email protected] on 9 Oct 2014 at 5:21


GoogleCodeExporter commented on May 21, 2024
I have just hit the same issue. I initially tried to load a Datastore backup 
file from GCS into BigQuery using the bq load CLI. When I tried to repeat the 
load, it told me the table already has a schema and hinted that I should use 
the WRITE_TRUNCATE write disposition. After I included the disposition, I got 
the following error asking me to include an encoding type for the Datastore 
backup. I tried both UTF-8 and ISO-8859-1, but it still won't let me load.

bucket_name> load --source_format=DATASTORE_BACKUP --allow_jagged_rows=false --encoding=UTF-8 --write_disposition=WRITE_TRUNCATE sample_red.t1estchallenge_1 gs://test.appspot.com/folder/ahFzfnZpcmdpbi1yZWQtdGVzdHJBCxIcX0FFX0RhdGFzdG9yZUFkbWluX09wZXJhdGlvbhiBwLgCDAsSFl9BRV9CYWNrdXBfSW5mb3JtYXRpb24YAQw.enittykind.backup_info

Error parsing command: flag --encoding=None: value should be one of 
<UTF-8|ISO-8859-1>

How should I approach this issue?

Original comment by [email protected] on 1 Apr 2015 at 5:10


GoogleCodeExporter commented on May 21, 2024
I'm not sure if that's the correct error message, but try using --replace 
instead of --write_disposition=WRITE_TRUNCATE.

Original comment by [email protected] on 1 Apr 2015 at 4:50

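For concreteness, a sketch of the earlier command rewritten per that suggestion, run as a standalone bq invocation rather than from the interactive prompt (the table name and backup path are taken from asvicare's comment above; the other flags from that invocation are omitted here):

  bq load --source_format=DATASTORE_BACKUP --replace \
      sample_red.t1estchallenge_1 \
      gs://test.appspot.com/folder/ahFzfnZpcmdpbi1yZWQtdGVzdHJBCxIcX0FFX0RhdGFzdG9yZUFkbWluX09wZXJhdGlvbhiBwLgCDAsSFl9BRV9CYWNrdXBfSW5mb3JtYXRpb24YAQw.enittykind.backup_info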

GoogleCodeExporter commented on May 21, 2024
All,

As Jeremy pointed out, the --replace flag works well if you prefer a full 
refresh load. The WRITE_TRUNCATE write disposition asks for an encoding for 
the Datastore backup, which may not be possible to supply.

Original comment by [email protected] on 2 Apr 2015 at 6:05
