
Comments (7)

commented on September 12, 2024

I really appreciate your help.
It works now: I replaced all the double quotes in the field and then it worked.

The weird thing is that it still shows the same error log, even though the data has been inserted into BigQuery successfully.


joker1007 commented on September 12, 2024

You can probably use the record_transformer plugin.

For example, record_transformer can convert the entire record to {"payload": <jsonized record>}, and fluent-plugin-bigquery then inserts it into a table that has only a payload:STRING column:

<filter sample.*>
  @type record_transformer
  enable_ruby
  <record>
    payload ${record.to_json}
  </record>
</filter>

<match sample.*>
  @type bigquery_insert

  auth_method private_key   # default
  email xxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxx@developer.gserviceaccount.com
  private_key_path /home/username/.keys/00000000000000000000000000000000-privatekey.p12
  # private_key_passphrase notasecret # default

  project yourproject_id
  dataset yourdataset_id
  table   tablename

  schema [
    {"name": "payload", "type": "STRING"},
  ]
</match>

(This configuration is just a sample.)


commented on September 12, 2024

Thanks for the quick response. I tried this and it shows the error below:

insert errors project_id="xxxx" dataset="xxx" table="edge_server" insert_errors="[#<Google::Apis::BigqueryV2::InsertAllTableDataResponse::InsertError:0x000055814a749778 @errors=[#<Google::Apis::BigqueryV2::ErrorProto:0x000055814a748940 @debug_info="", @location="topic", @message="no such field.", @reason="invalid">], @index=0>,

My JSON object is nested, so it seems the output plugin just parses the nested JSON and then tries to load each field into BigQuery.

My config looks like this:

  <source>
    @type kafka_group
    brokers "kafka-1:9092,kafka-2:9092,kafka-3:9092"
    consumer_group "fluentd"
    topics "edge"
    format "text"
  </source>
  <filter edge>
    @type record_transformer
    enable_ruby
    renew_record true
    keep_keys payload
    <record>
      payload ${record.to_json}
    </record>
  </filter>
  <match edge>
    @type bigquery_insert
    auth_method compute_engine
    project "xxxx"
    dataset "xxxx"
    table "edge_server_aaa"
    fetch_schema true
    <buffer>
      @type "file"
      path "/data/bigquery.edge.buffer"
      flush_interval 0.1
      queue_length_limit 4096
      flush_thread_count 16
      chunk_records_limit 1000
    </buffer>
  </match>

My JSON data looks like this (truncated):

{ {\\\"@timestamp\\\":\\\"2020-02-25T13:04:49.853Z\\\",\\\"@metadata\\\":{\\\"beat\\\":\\\"filebeat\\\",\\\"type\\\":\\\"doc\\\",\\\"version\\\":\\\"6.7.2\\\",\\\"topic\\\":\\\"edge-server\\\"}

The JSON schema keeps changing, which is why I want to load the entire JSON as a string into one column.


commented on September 12, 2024

I'm wondering if this is related to how the plugin generates the rows with multi_json.


joker1007 commented on September 12, 2024

Does edge_server_aaa have a payload column?
It looks to me as if it does not.

Alternatively, extra fields may be causing this error.
How about using keep_keys or remove_keys in the record_transformer configuration? A sketch follows below.
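For what it's worth, a minimal sketch of the remove_keys variant. The field names topic and @metadata are hypothetical, taken from the error log and sample data above; record_transformer should expand ${record.to_json} against the original record before stripping the listed keys, so payload would still contain the full record.

<filter edge>
  @type record_transformer
  enable_ruby
  # drop the original top-level keys after payload has been built;
  # topic and @metadata are hypothetical examples of extra fields
  remove_keys topic,@metadata
  <record>
    payload ${record.to_json}
  </record>
</filter>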


commented on September 12, 2024

Yes, we use the record_transformer configuration to add the payload column. I mean it may be due to how BigQuery loads the nested JSON: it is trying to load ['payload']['xxx'] into BigQuery, which results in the error. Here is the configuration:

  <filter edge>
    @type record_transformer
    enable_ruby
    renew_record true
    keep_keys payload
    <record>
      payload ${record.to_json}
    </record>
  </filter>

I renew the entire record as one field, payload, which contains the entire JSON object. I much appreciate your help. It always shows the error 'no such field', which means it tries to load each field into BigQuery instead of loading payload as a single column.


joker1007 commented on September 12, 2024

Hmm...
I'm very sorry, but I can't guess what is causing the error.
If the payload column becomes a string value, the bigquery plugin should be able to send it properly.
I suggest adding a stdout filter in order to see the actual records the bigquery plugin sends; see the sketch below.
It's my last resort.
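A minimal sketch, assuming the same edge tag: a stdout filter placed between the record_transformer filter and the bigquery_insert match prints each record to the Fluentd log as it passes through.

<filter edge>
  # built-in filter_stdout plugin: logs every record without modifying it
  @type stdout
</filter>

This makes it easy to check whether payload really reaches the output as a single string field.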

