influxdata / influxdb Goto Github PK

View Code? Open in Web Editor NEW

27.7K 741.0 3.5K 191.47 MB

Scalable datastore for metrics, events, and real-time analytics

Home Page: https://influxdata.com

License: Apache License 2.0

Shell 0.22% Rust 97.29% Dockerfile 0.35% Python 0.58% HTML 1.56%

influxdb monitoring database time-series metrics go react rust

influxdb's Introduction

InfluxDB Edge

Note

On 2023-09-21 this repo changed the default branch from master to main. At the same time, we moved all InfluxDB 2.x development into the main-2.x branch. If you relied on the 2.x codebase in the former master branch, update your tooling to point to main-2.x, which is the new home for any future InfluxDB 2.x development. This branch (main) is now the default branch for this repo and is for development of InfluxDB 3.x.

For now, this means that InfluxDB 3.0 and its upstream dependencies are the focus of our open source efforts. We continue to support both the 1.x and 2.x versions of InfluxDB for our customers, but our new development efforts are now focused on 3.x. The remainder of this readme has more details on 3.0 and what you can expect.

InfluxDB is an open source time series database written in Rust, using Apache Arrow, Apache Parquet, and Apache DataFusion as its foundational building blocks. This latest version (3.x) of InfluxDB focuses on providing a real-time buffer for observational data of all kinds (metrics, events, logs, traces, etc.) that is queryable via SQL or InfluxQL, and persisted in bulk to object storage as Parquet files, which other third-party systems can then use. It is able to run either with a write ahead log or completely off object storage if the write ahead log is disabled (in this mode of operation there is a potential window of data loss for any data buffered that has not yet been persisted to object store).

The open source project runs as a standalone system in a single process. If you're looking for a clustered, distributed time series database with a bunch of enterprise security features, we have a commercial offering available as a managed hosted service or as on-premise software designed to run inside Kubernetes. The distributed version also includes functionality to reorganize the files in object storage for optimal query performance. In the future, we intend to have a commercial version of the single server software that adds fine-grained security, federated query capabilities, file reorganization for query optimization and deletes, and integration with other systems.

Project Status

Currently this project is under active prototype development without documentation or official builds. This README will be updated with getting started details and links to docs when the time comes.

Roadmap

The scope of this open source InfluxDB 3.0 is different from either InfluxDB 1.x or 2.x. This may change over time, but for now here are the basics of what we have planned:

InfluxDB 1.x and 2.x HTTP write API (supporting Line Protocol)
InfluxDB 1.x HTTP query API (InfluxQL)
Flight SQL (query API using SQL)
InfluxQL over Flight
Data migration tooling for InfluxDB 1.x & 2.x to 3.0
InfluxDB 3.0 HTTP write API (a new way to write data with a more expressive data model than 1.x or 2.x)
InfluxDB 3.0 HTTP query API (send InfluxQL or SQL queries as an HTTP GET and get back JSON lines, CSV, or pretty print response)
Persist event stream (subscribe to the Parquet file persist events, useful for downstream clients to pick up files from object store)
Embedded VM (either Python, Javascript, WASM, or some combination thereof)
- Individual queries
- Triggers on write
- On persist (run whatever is being persisted through script)
- On schedule
Bearer token authentication (all or nothing, token is set at startup through env variable, more fine-grained security is outside the scope of the open source effort)

What this means is that InfluxDB 3.0 can be pointed to as though it is an InfluxDB 1.x server with most of the functionality present. For InfluxDB 2.x users that primarily interact with the database through the InfluxQL query capability, they will also be able to use this database in a similar way. Version 3.0 will not be implementing the rest of the 2.x API natively, although there could be separate processes that could be added on at some later date that would provide that functionality.

Flux

Flux is the custom scripting and query language we developed as part of our effort on InfluxDB 2.0. While we will continue to support Flux for our customers, it is noticeably absent from the description of InfluxDB 3.0. Written in Go, we built Flux hoping it would get broad adoption and empower users to do things with the database that were previously impossible. While we delivered a powerful new way to work with time series data, many users found Flux to be an adoption blocker for the database.

We spent years of developer effort on Flux starting in 2018 with a small team of developers. However, the size of the effort, including creating a new language, VM, query planner, parser, optimizer and execution engine, was significant. We ultimately weren’t able to devote the kind of attention we would have liked to more language features, tooling, and overall usability and developer experience. We worked constantly on performance, but because we were building everything from scratch, all the effort was solely on the shoulders of our small team. We think this ultimately kept us from working on the kinds of usability improvements that would have helped Flux get broader adoption.

For InfluxDB 3.0 we adopted Apache Arrow DataFusion, an existing query parser, planner, and executor as our core engine. That was in mid-2020, and over the course of the last three years, there have been significant contributions from an active and growing community. While we remain major contributors to the project, it is continuously getting feature enhancements and performance improvements from a worldwide pool of developers. Our efforts on the Flux implementation would simply not be able to keep pace with the much larger group of DataFusion developers.

With InfluxDB 3.0 being a ground-up rewrite of the database in a new language (from Go to Rust), we weren’t able to bring the Flux implementation along. For InfluxQL we were able to support it natively by writing a language parser in Rust and then converting InfluxQL queries into logical plans that our new native query engine, Apache Arrow DataFusion, can understand and process. We also had to add new capabilities to the query engine to support some of the time series queries that InfluxQL enables. This is an effort that took a little over a year and is still ongoing. This approach means that the contributions to DataFusion become improvements to InfluxQL as well given it is the underlying engine.

Initially, our plan to support Flux in 3.0 was to do so through a lower level API that the database would provide. In our Cloud2 product, Flux processes connect to the InfluxDB 1 & 2 TSM storage engine through a gRPC API. We built support for this in InfluxDB 3.0 and started testing with mirrored production workloads. We quickly found that this interface performed poorly and had unforeseen bugs, eliminating it as a viable option for Flux users to bring their scripts over to 3.0. This is due to the API being designed around the TSM storage engine’s very specific format, which the 3.0 engine is unable to serve up as quickly.

We’ll continue to support Flux for our users and customers. But given Flux is a scripting language in addition to being a query language, planner, optimizer, and execution engine, a Rust-native version of it is likely out of reach. And because the surface area of the language is so large, such an effort would be unlikely to yield a version that is compatible enough to run existing Flux queries without modification or rewrites, which would eliminate the point of the effort to begin with.

For Flux to have a path forward, we believe the best plan is to update the core engine so that it can use Flight SQL to talk to InfluxDB 3.0. This would make an architecture where independent processes that serve the InfluxDB 2.x query API (i.e. Flux) would be able to convert whatever portion of a Flux script that is a query into a SQL query that gets sent to the InfluxDB 3.0 process with the result being post-processed by the Flux engine.

This is likely not a small effort as the Flux engine is built around InfluxDB 2.0's TSM storage engine and the representation of all data as individual time series. InfluxDB 3.0 doesn't keep a concept of series so the SQL query would either have to do a bunch of work to return individual series, or the Flux engine would do work with the resulting query response to construct the series. For the moment, we’re focused on improvements to the core SQL and (and by extension InfluxQL) query engine and experience both in InfluxDB 3.0 and DataFusion.

We may come back to this effort in the future, but we don’t want to stop the community from self-organizing an effort to bring Flux forward. The Flux runtime and language exists as permissively licensed open source here. We've also created a community fork of Flux where the community can self-organize and move development forward without requiring our code review process. There are already a few community members working on this potential path forward. If you're interested in helping with this effort, please speak up on this tracked issue.

We realize that Flux still has an enthusiastic, if small, user base and we’d like to figure out the best path forward for these users. For now, with our limited resources, we think focusing our efforts on improvements to Apache Arrow DataFusion and InfluxDB 3.0’s usage of it is the best way to serve our users that are willing to convert to either InfluxQL or SQL. In the meantime, we’ll continue to maintain Flux with security and critical fixes for our users and customers.

influxdb's People

Contributors

Stargazers

Watchers

Forkers

sebastianschuler wojons chjohnst inderjeet26 rudrapranay halentest sipunjaya ps2 auxesis tinco phosphore nvdnkpr xuanhan863 rudyli pborreli jondot nairboon smn crossbreeze timusg alemic pcn olorin arcktip zorkian liujianping oboodo shangyou bobbyzhu iamima fangli ieasydevops dieterbe lusis rn2dy charsyam srehlig xuzhe35 eugine2l lispython johann8384 dmitri123 torkelo yiwang rramos elcct lijinhui sploit anarcher vdumitrescu nealjc uikit0 sunjaec bloglovin qq101 imcom msrdic didip drohr bbinet qz267 freeformz dajiaji siqueries akloeber cffyh majst01 scr512 richthegeek mei-rune ravindranathakila rayleyva is00hcw peekeri agilexs sonots mistobaan chobie venkatraju bargeanton j0ni 9cat yekeqiang vividcortex rabdureh blacklightops pmenglund kiichi vdelia sirsepp cosim ranman codeslinger dgnorton minimedj hyc nivertech dhammika mark-rushakoff fossteam

influxdb's Issues

Admin interface - bad parsing of returned response data

I think the server now returns an application/json content-type which didn't exist before.

This causes jQuery to be smart and parse the response. Then a redundant call to JSON.parse over a valid Javascript object (instead of a stringified JSON) kills the parsing.

Implement subqueries

Queries should support an order clause and limit clause

Default config dir & data-dir

I don't think it's a great idea to use /tmp as a datadir. I personally used a VM to try influxdb and could not understand why my data was going away after each reboot. Then I found the configuration file (which was not inside /etc/influxdb/ as usual in Debian/Ubuntu) and understood my mistake -__-'.

  "DataDir":        "/tmp/influxdb/development/db",
  "RaftDir":        "/tmp/influxdb/development/raft"

"free(): invalid pointer" in RHEL 5.9

$ sudo /etc/init.d/influxdb start
Starting the process influxdb [ OK ]
Anomalous agent started [ OK ]

$ *** glibc detected *** /usr/bin/influxdb: free(): invalid pointer: 0x0000000000e30fa0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3a4be7164f]
/lib64/libc.so.6(cfree+0x4b)[0x3a4be7587b]
/usr/lib64/libstdc++.so.6(_ZNSs7reserveEm+0x9e)[0x3a5e89cb9e]
/usr/lib64/libstdc++.so.6(_ZNSs6appendEmc+0x60)[0x3a5e89cc90]
/usr/bin/influxdb(_ZN7leveldb10WriteBatchC1Ev+0x1d)[0x42f20d]
/usr/bin/influxdb(_ZN7leveldb6DBImplC2ERKNS_7OptionsERKSs+0x1bf)[0x41b8bf]
/usr/bin/influxdb(_ZN7leveldb2DB4OpenERKNS_7OptionsERKSsPPS0_+0x3f)[0x41fa6f]
/usr/bin/influxdb(leveldb_open+0x32)[0x416fb2]
/usr/bin/influxdb(_cgo_d9589830f294_Cfunc_leveldb_open+0x14)[0x4169e4]
/usr/bin/influxdb[0x46b9c4]
======= Memory map: ========
00400000-00c11000 r-xp 00000000 68:03 7144311                            /opt/influxdb/versions/0.1.0/influxdb
00e10000-00e11000 r--p 00810000 68:03 7144311                            /opt/influxdb/versions/0.1.0/influxdb
00e11000-00e31000 rw-p 00811000 68:03 7144311                            /opt/influxdb/versions/0.1.0/influxdb
00e31000-00e52000 rw-p 00e31000 00:00 0
0ac13000-0ac34000 rw-p 0ac13000 00:00 0                                  [heap]
3a4ba00000-3a4ba1c000 r-xp 00000000 68:03 13664258                       /lib64/ld-2.5.so
3a4bc1c000-3a4bc1d000 r--p 0001c000 68:03 13664258                       /lib64/ld-2.5.so
3a4bc1d000-3a4bc1e000 rw-p 0001d000 68:03 13664258                       /lib64/ld-2.5.so
3a4be00000-3a4bf4f000 r-xp 00000000 68:03 13664260                       /lib64/libc-2.5.so
3a4bf4f000-3a4c14e000 ---p 0014f000 68:03 13664260                       /lib64/libc-2.5.so
3a4c152000-3a4c153000 rw-p 00152000 68:03 13664260                       /lib64/libc-2.5.so
3a4c153000-3a4c158000 rw-p 3a4c153000 00:00 0
3a4c600000-3a4c616000 r-xp 00000000 68:03 13664273                       /lib64/libpthread-2.5.so
3a4c616000-3a4c816000 ---p 00016000 68:03 13664273                       /lib64/libpthread-2.5.so
3a4c816000-3a4c817000 r--p 00016000 68:03 13664273                       /lib64/libpthread-2.5.so
3a4c817000-3a4c818000 rw-p 00017000 68:03 13664273                       /lib64/libpthread-2.5.so
3a4c818000-3a4c81c000 rw-p 3a4c818000 00:00 0
3a4ca00000-3a4ca82000 r-xp 00000000 68:03 13664291                       /lib64/libm-2.5.so
3a4ca82000-3a4cc81000 ---p 00082000 68:03 13664291                       /lib64/libm-2.5.so
3a4cc81000-3a4cc82000 r--p 00081000 68:03 13664291                       /lib64/libm-2.5.so
3a4cc82000-3a4cc83000 rw-p 00082000 68:03 13664291                       /lib64/libm-2.5.so
3a5b400000-3a5b40d000 r-xp 00000000 68:03 13664610                       /lib64/libgcc_s-4.1.2-20080825.so.1
3a5b40d000-3a5b60d000 ---p 0000d000 68:03 13664610                       /lib64/libgcc_s-4.1.2-20080825.so.1
3a5b60d000-3a5b60e000 rw-p 0000d000 68:03 13664610                       /lib64/libgcc_s-4.1.2-20080825.so.1
3a5e800000-3a5e8e6000 r-xp 00000000 68:03 3272237                        /usr/lib64/libstdc++.so.6.0.8
3a5e8e6000-3a5eae5000 ---p 000e6000 68:03 3272237                        /usr/lib64/libstdc++.so.6.0.8
3a5eae5000-3a5eaeb000 r--p 000e5000 68:03 3272237                        /usr/lib64/libstdc++.so.6.0.8
3a5eaeb000-3a5eaee000 rw-p 000eb000 68:03 3272237                        /usr/lib64/libstdc++.so.6.0.8
3a5eaee000-3a5eb00000 rw-p 3a5eaee000 00:00 0
c000000000-c000001000 rw-p c000000000 00:00 0
c20fff0000-c210100000 rw-p c20fff0000 00:00 0
2b85d0efa000-2b85d0efc000 rw-p 2b85d0efa000 00:00 0
2b85d0f15000-2b85d10e8000 rw-p 2b85d0f15000 00:00 0
2b85d10e8000-2b85d10e9000 ---p 2b85d10e8000 00:00 0
2b85d10e9000-2b85d1bea000 rw-p 2b85d10e9000 00:00 0
2b85d1bea000-2b85d1beb000 ---p 2b85d1bea000 00:00 0
2b85d1beb000-2b85d25eb000 rw-p 2b85d1beb000 00:00 0
2b85d25eb000-2b85d25ec000 ---p 2b85d25eb000 00:00 0
2b85d25ec000-2b85d2fec000 rw-p 2b85d25ec000 00:00 0
2b85d2fec000-2b85d2fed000 ---p 2b85d2fec000 00:00 0
2b85d2fed000-2b85d39ed000 rw-p 2b85d2fed000 00:00 0
2b85d39ed000-2b85d39ee000 ---p 2b85d39ed000 00:00 0
2b85d39ee000-2b85d43ee000 rw-p 2b85d39ee000 00:00 0
2b85d43ee000-2b85d43ef000 ---p 2b85d43ee000 00:00 0
2b85d43ef000-2b85d4e0f000 rw-p 2b85d43ef000 00:00 0
7fff294ec000-7fff29501000 rw-p 7ffffffe9000 00:00 0                      [stack]
7fff2955f000-7fff29562000 r-xp 7fff2955f000 00:00 0                      [vdso]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0                  [vsyscall]

$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.9 (Tikanga)

[Confiugration] use toml for configuration instead json

Add official Homebrew formula

Current one is pretty outdated (v0.0.7).

Calculate quantiles on rolled up data.

If the data is rolled up to 10 min intervals, we should be able to give some probabilistic answer to the median of the original time series in say hours intervals.

Here are some pointers to algorithms that can be used based on a conversation with cespare on irc:

http://www.cs.rutgers.edu/~muthu/bquant.pdf
http://www.cs.ucsb.edu/research/tech_reports/reports/2005-23.pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.6.6513&rep=rep1&type=pdf
http://www.cs.ucsb.edu/~suri/psdir/ency.pdf
http://metamarkets.com/2013/histograms/

Refactor aggregation so we can register new functions easily without changing the engine

rpm problems

The RHEL x86_64 RPM seems to have some bad paths built-in:

# rpm -Uvh influxdb-latest-1.x86_64.rpm
Preparing...                ########################################### [100%]
   1:influxdb               ########################################### [100%]
influxdb process is not running [ FAILED ]
Starting the process influxdb [ OK ]
sudo: /usr/bin/influxdb-daemon: command not found
Anomalous agent started [ OK ]

rpm -ql shows everything is installed in /opt/influxdb/versions/0.0.9 (which is a weird choice, too).

Finally the rpm also has a .bak file (post_install.sh.bak) which I am guessing wasn't intentional.

The regex operator should be =~ not ~=

(request) more information on users from API

if i request a user list for a db I think it would be nice if the returned JSON also contained additional data like write and readPermission

ale@cu:~$ curl 'http://y:8086/db/site_dev/users?u=root&p=root'
[{"username":"al"},{"username":"public"}]

Proper Content-Type

curl "localhost:8086/db/foo/series?u=root&p=root&q=SELECT%20*%20FROM%20foobar%3B" -v

* About to connect() to localhost port 8086 (#0)
*   Trying ::1...
* connected
* Connected to localhost (::1) port 8086 (#0)
> GET /db/foo/series?u=root&p=root&q=SELECT%20*%20FROM%20foobar%3B HTTP/1.1
> User-Agent: curl/7.25.0 (x86_64-suse-linux-gnu) libcurl/7.25.0 OpenSSL/1.0.1e zlib/1.2.7 libidn/1.25 libssh2/1.4.0
> Host: localhost:8086
> Accept: */*
> 
< HTTP/1.1 200 OK
< Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept
< Access-Control-Allow-Methods: GET, POST, PUT, DELETE
< Access-Control-Allow-Origin: *
< Access-Control-Max-Age: 2592000
< Date: Tue, 12 Nov 2013 19:01:36 GMT
< Content-Length: 2
< Content-Type: text/plain; charset=utf-8
< 
* Connection #0 to host localhost left intact
[]* Closing connection #0

I think Content-Type should be application/json. It does follow a pattern though (which makes total sense to me), if request is not 2xx, the text (which is text/plain) is the error message

Support the negation of the regex matcher !~

Implement first and last aggregates

Queries with median cause the server to panic and throw BUG errors in the log

Running select median(cpu) from test_medians group by host; cause the following error to be thrown (test median has three points 1 second apart for two hosts hosta and hostb):

Error: runtime error: invalid memory address or nil pointer dereference. Stacktrace: goroutine 22 [running]:
engine.func·004()
    /home/jvshahid/codez/influxdb/src/engine/engine.go:27 +0xfd
runtime.panic(0x7b0be0, 0xe181c8)
    /home/jvshahid/bin/go/src/pkg/runtime/panic.c:248 +0x106
engine.(*QueryEngine).executeCountQueryWithGroupBy(0xc210037b50, 0x7f46884b77b0, 0xc2100cc070, 0xc210059726, 0x3, ...)
    /home/jvshahid/codez/influxdb/src/engine/engine.go:321 +0x133c
engine.(*QueryEngine).RunQuery(0xc210037b50, 0x7f46884b77b0, 0xc2100cc070, 0xc210059726, 0x3, ...)
    /home/jvshahid/codez/influxdb/src/engine/engine.go:39 +0x117
api/http.func·001(0x7f46884b77b0, 0xc2100cc070, 0x5e7c68, 0xc21012da30, 0xc21012d9b0)
    /home/jvshahid/codez/influxdb/src/api/http/api.go:198 +0x2e2
api/http.yieldUser(0x7f46884b77b0, 0xc2100cc070, 0x7f46840c3c18, 0xc21005972c, 0x4, ...)
    /home/jvshahid/codez/influxdb/src/api/http/api.go:465 +0x3b
api/http.(*HttpServer).tryAsDbUser(0xc2100593c0, 0x7f46884b74a0, 0xc2100ce500, 0xc21011e4e0, 0x7f46840c3c18, ...)
    /home/jvshahid/codez/influxdb/src/api/http/api.go:619 +0x291
api/http.

Admin website missing in repository?

Building from scratch it appears as though the admin website is missing. The contents of src/admin/site/ is a single index.html with the traditional "It Works!" content.

Distinct is implemented but not documented

Nice!

Influx> select distinct(serverId) from info;
┌───────────────┬─────────────────┬──────────────────────────┐
│ time          │ sequence_number │ distinct                 │
├───────────────┼─────────────────┼──────────────────────────┤
│ 1383993229423 │ 1               │ 514883646ff643576b000035 │
│ 1383993229423 │ 1               │ 526d71e23202e35309000001 │
└───────────────┴─────────────────┴──────────────────────────┘

However aliasing would be great:

Influx> select distinct(serverId) as serverId from info;
Influx> ✘ Error: Error at 0:671379484. syntax error, unexpected SIMPLE_NAME, expecting FROM

Column indexes/names getting off somehow

To reproduce, say you have a database foobar:

curl -X POST 'http://localhost:8086/db?u=root&p=root' -d '{"name": "foobar"}'
curl -vv -X POST 'http://localhost:8086/db/foobar/users?u=root&p=root' -d '{"username":"paul", "password":"password"}'

And write data into a series:

curl -X POST 'http://localhost:8086/db/foobar/series?u=paul&p=password&time_precision=s' -d '[{"name": "asdf", "columns": ["time", "val1", "val2"], "points":[[1384118307, "v1", 2]]}]'

And now pull it out:

curl 'http://localhost:8086/db/foobar/series?u=paul&p=password&q=select+%2A+from+asdf+where+time+%3E+now%28%29+-+20d%3B'

And the output is off:

{"name":"asdf","columns":["time","sequence_number","val1","val2"],"points":[[1384118307000,2,2,"v1"]]}]

Notice that val1 and val2 are mixed up.

getting started docs incomplete, not able to run on ubuntu, had to copy config.json to config.json.sample

http://influxdb.org/docs/ says

"Check the download page for details on how to install.
...
If you’ve installed locally, InfluxDB should be running on two ports. The API will be on port 8086 by default and the administrative interface will be running on 8083. Go to http://localhost:8083 to login and create a database. The default login and password are root and root (don’t worry you can change them later)."

The downloads page has (for ubuntu) a download and dpkg instruction, which installs but does not start influxdb.

On ubuntu:

/opt/influxdb/ver4sions/0.0.6/influxdb is the executable

It fails to start:
/opt/influxdb/versions/0.0.6$ ./influxdb
2013/11/06 17:39:01 Loading Config from config.json.sample
2013/11/06 17:39:01 Couldn't load configuration file: config.json.sample
panic: open config.json.sample: no such file or directory

goroutine 1 [running]:
runtime.panic(0x8321740, 0x18a43800)
/home/jvshahid/bin/go/src/pkg/runtime/panic.c:266 +0x9a
configuration.LoadConfiguration(0x83961a8, 0x12, 0x0)
/home/jvshahid/codez/influxdb/src/configuration/configuration.go:33 +0x2f4
main.main()
/home/jvshahid/codez/influxdb/src/server/server.go:87 +0x231

goroutine 3 [syscall]:
os/signal.loop()
/home/jvshahid/bin/go/src/pkg/os/signal/signal_unix.go:21 +0x1e
created by os/signal.init·1
/home/jvshahid/bin/go/src/pkg/os/signal/signal_unix.go:27 +0x31

goroutine 4 [chan receive]:
code.google.com/p/log4go.ConsoleLogWriter.run(0x18a4e0b0, 0xb7445808, 0x18a000a0)
/home/jvshahid/codez/influxdb/src/code.google.com/p/log4go/termlog.go:27 +0x62
created by code.google.com/p/log4go.NewConsoleLogWriter
/home/jvshahid/codez/influxdb/src/code.google.com/p/log4go/termlog.go:19 +0x62

There is a config.json, not config.json.sample
efm@gwen:/opt/influxdb/versions/0.0.6$ more config.json
{
"AdminHttpPort": 8083,
"AdminAssetsDir": "./src/admin/site/",
"ApiHttpPort": 8086,
"RaftServerPort": 8090,
"SeedServers": [],
"DataDir": "/tmp/influxdb/development/db",
"RaftDir": "/tmp/influxdb/development/raft"
}

That looks reasonable, so I fixed it by copying config.json to config.json.sample

Intermittent failures are happening in the raft test suite

Why password is posted as URL query?

The query is posted to port 9061 not encrypted and password is plain text.

Ascending order always return null for columns that have a null value

Select column_one, column_two from foo;

returns values for column_one and column_two while the following

select column_one, column_two from foo order asc;

returns null values for column_one.

Build breaks if home directory has a go directory

These lines:
https://github.com/influxdb/influxdb/blob/master/exports.sh#L7-L9

Cause OS X users that have a go directory in their home to not be able to build. I think it's because that is usually their GOPATH, but not their GOROOT, which is elsewhere. In any case, it causes them to not be able to build with failures about not being able to find all the things in stdlib

The query engine should support mean, median, and mode

ubuntu dpkg -i to upgrade appears to nuke existing databases

In issue #45 I updated by downloading via the .deb file
dpkg -i newfile

It appears to reset my databases and the data is missing (it's ok since it's just test stuff but..) What is correct way of doing an upgrade here? Or is this a bug? a Doc-bug OR a nickg-bug (me)?

thanks!
nickg

package the admin interface from influxdb-admin instead of influxdb-js

The admin site is broken in the debian package

'limit' does not seem to work correctly

Hi,
when i use limit with where, limit is applied before where filter.

For reproduce you can use this snippet: https://gist.github.com/william-p/c200a8b122bc7954790b

Test1:
 + Query: 'select * from hosta'
 + Nb points: 40

Test2:
 + Query: 'select * from hosta where source == 'all''
 + Nb points: 10

Test3:
 + Query: 'select * from hosta where source == 'all' limit 5'
 + Nb points: 2

Thanks

[Configuration] Limit the memory usage of influxdb

Hi guys!

I don't see any configuration option to limit the memory usage of influxdb, how can I achieve that?

If it's currently not configurable, it's definitely my first concern right now (in order to use it as a shadow-slave in production).

Submitting some bug stack traces

I cannot correlate with the query that was run unfortunately.
Project was built from source yesterday.

In any case here are the traces:

********************************BUG********************************
Error: runtime error: invalid memory address or nil pointer dereference. Stacktrace: goroutine 4191 [running]:
engine.func·004()
    /home/jondot/experiments/influxdb/src/engine/engine.go:27 +0xf2
protocol.(*FieldValue).GetValue(0x0, 0x1007f7b1932d948, 0x4)
    /home/jondot/experiments/influxdb/src/protocol/protocol_extensions.go:26 +0x1c
protocol.(*Point).GetFieldValue(0xc20043b9c0, 0x1, 0x4eb0d9ee8c600, 0xc2005f26c0)
    /home/jondot/experiments/influxdb/src/protocol/protocol_extensions.go:47 +0x44
engine.func·009(0xc20043b9c0, 0xc2005f26c0, 0x6ceba0)
    /home/jondot/experiments/influxdb/src/engine/engine.go:135 +0x63
engine.func·017(0xc2003edaf0, 0xc2003edaf0, 0xc2003edaf0)
    /home/jondot/experiments/influxdb/src/engine/engine.go:253 +0x1ca
datastore.(*LevelDbDatastore).executeQueryForSeries(0xc200135390, 0xc20018fd16, 0x7, 0xc20021e730, 0x6, ...)
    /home/jondot/experiments/influxdb/src/datastore/leveldb_datastore.go:355 +0x180c
datastore.(*LevelDbDatastore).ExecuteQuery(0xc200135390, 0xc2001741e0, 0xc2001bf070, 0xc20018fd16, 0x7, ...)
    /home/jondot/experiments/influxdb/src/datastore/leveldb_datasto

and

********************************BUG********************************
Error: runtime error: index out of range. Stacktrace: goroutine 4191 [running]:
engine.func·004()
    /home/jondot/experiments/influxdb/src/engine/engine.go:27 +0xf2
protocol.(*Point).GetFieldValue(0xc200249a00, 0xffffffffffffffff, 0x4eb0dd0fa2800, 0xb)
    /home/jondot/experiments/influxdb/src/protocol/protocol_extensions.go:47 +0x62
engine.func·013(0xc200249a00, 0xc2004e8500, 0x0)
    /home/jondot/experiments/influxdb/src/engine/engine.go:169 +0x7a
engine.func·017(0xc2004e8500, 0xc2004e8500, 0xc2004e8500)
    /home/jondot/experiments/influxdb/src/engine/engine.go:253 +0x1ca
datastore.(*LevelDbDatastore).executeQueryForSeries(0xc200135390, 0xc20075b326, 0x7, 0xc2001900a8, 0x6, ...)
    /home/jondot/experiments/influxdb/src/datastore/leveldb_datastore.go:355 +0x180c
datastore.(*LevelDbDatastore).ExecuteQuery(0xc200135390, 0xc2001741e0, 0xc2001bf070, 0xc20075b326, 0x7, ...)
    /home/jondot/experiments/influxdb/src/datastore/leveldb_datastore.go:137 +0x3bd
coordinator.(*CoordinatorImpl).DistributeQuery(0xc2000f8a40, 0xc2001741e0, 0xc2001bf070, 0xc20075b326, 0x7, ...)
    /home/jon

Hope it helps.

Syntax error in query.yacc

I am running into

query.yacc:56.14-21: syntax error, unexpected identifier, expecting string

when building with OSX 10.9 with CC=clang and go version go1.2rc3 darwin/amd64.

Any ideas?

admin API issues (solved)

Http admin API docs use the "name" parameter, but the correct paramter to pass is "username"

i am having a few issues using the admin api. for example:

(influx is running on host y)

i add a db called testing
ale@cu:$ curl -X POST 'http://y:8086/db?u=root&p=root' -d '{"name": "testing"}'
the following line should add a db admin according to docs, but returns 404.
ale@cu:$ curl -X POST 'http://y:8086/db/testing/admins?u=root&p=root' -d '{"username": "admin", "password": "whatever"}'
404 page not found

another issue i have is that i cannot seem to add more than one user to a db:

ale@cu:~$ curl -X POST 'http://y:8086/db/testing/users?u=root&p=root' -d '{"name":"fulluser","password":"fulluser", "readPermissions":[{"matcher":"."}],"writePermissions":[{"matcher":"."}]}'

first added user is ok, but if i add another one:

ale@cu:~$ curl -X POST 'http://y:8086/db/testing/users?u=root&p=root' -d '{"name":"rouser","password":"rouser", "readPermissions":[{"matcher":".*"}],"writePermissions":[{"matcher":"null"}]}'
User already exists

it is telling me the user already exists even though names are different. and in the end if i do:

get a list of users for db testing

ale@cu:~$ curl 'http://y:8086/db/testing/users?u=root&p=root'
[{"username":""}]

...always an empty list, unless i use db site_dev, for which i get:

ale@cu:~$ curl 'http://y:8086/db/site_dev/users?u=root&p=daje'
[{"username":"al"},{"username":"public"}]

Shouldn't require a semi-colon on the end of a query

rpm packages git -v doesn't have the git sha

Support table aliases

The user should be able to run this query

select * from foo as f where f.column_one < 50

Note: the alias is used in the rest of the query. Also, the same time series should be mergeable and joinable if two different aliases were used.

Debs aren't installing correctly on Debian wheezy

postinst REPLACE_VERSION is not replaced.

reuse Write and ReadOptions

levigo.WriteOptions and levigo.ReadOptions are immutable config options that you can make once for each config that you need, and reuse. There's a few places in LevelDbDatastore that needlessly create them on every method call.

Queries should support multiple aggregators

group-by on time returns data out of order

I'm using ubuntu package, downloaded today.

I have something putting stats in every 10s.

select * from idle limit 100'
1384345429000 51715
1384345419000 51609
1384345409000 51502
1384345399000 51397
1384345389000 51291
1384345379000 51185
1384345369000 51079
1384345359000 50968
1384345349000 50866
1384345339000 50761
1384345329000 50654
1384345319000 50549
1384345309000 50443
1384345299000 50337
1384345289000 50230
etc

everything nice, time descending, sequence descending

when I use group by, the timestamps come "mostly decreasing"

for example using the following query:

select count(value) from idle group by time(10s) limit 50

we get this. first col is time, second is difference from previous (should always be positive), third is count. Make no difference if I group by different time values.

1384345510000 185252.398926 1
1384345480000 30000 1
1384345450000 30000 1
1384345340000 110000 1
1384345300000 40000 1
1384345250000 50000 1
1384345160000 90000 1
1384345640000 -480000 1   <-- ??
1384345610000 30000 1
1384345270000 340000 1
1384345210000 60000 1
1384345530000 -320000 1 <-- ??
1384345260000 270000 1
1384345660000 -400000 1 <-- ??
1384345570000 90000 1

what's going on here?

The query engine should support percentiles aggregation

Don't call WriteHeader more than once

This message is all over the logs:

http: multiple response.WriteHeader calls

We shouldn't be calling WriteHeader more than once.

Writing null values via HTTP API fails

According to the HTTP API docs, it is possible to write null values into columns:

As you can see you can write to multiple time series names in a single POST. You can also write multiple points to each series. The values in points must have the same index as their respective column names in columns. However, not all points need to have values for each column, nulls are ok.

I have observed that when you do this, you get an Unknown type <nil> response from the HTTP API.

Here is some test code to reproduce:

set -e

echo "Delete database"
curl -X DELETE 'http://localhost:8086/db/no_nulls?u=root&p=root' \
  -w "%{http_code}\n"

echo "Create database"
curl -X POST 'http://localhost:8086/db?u=root&p=root' \
  -d '{"name": "no_nulls"}' \
  -w "%{http_code}\n"

echo "Create user"
curl -X POST 'http://localhost:8086/db/no_nulls/users?u=root&p=root' \
  -d '{"username": "foo", "password": "bar"}' \
  -w "%{http_code}\n"

echo "Set privileges"
curl -X POST 'http://localhost:8086/db/no_nulls/users/foo?u=root&p=root' \
  -d '{"admin": true}' \
  -w "%{http_code}\n"

echo "Write data"
curl -X POST 'http://localhost:8086/db/no_nulls/series?u=foo&p=bar&time_precision=s' \
  -d '[
  {
    "name": "response_times",
    "columns": ["time", "value"],
    "points": [
      [1382819388, 234.3],
      [1382819389, 120.1],
      [1382819380, 340.9]
    ]
  }
]' \
 -w "%{http_code}\n"

echo "Write null data"
curl -X POST 'http://localhost:8086/db/no_nulls/series?u=foo&p=bar&time_precision=s' -d \
'[
  {
    "name": "response_times",
    "columns": ["time", "value"],
    "points": [
      [1382819388, 234.3],
      [1382819389, null],
      [1382819380, 340.9]
    ]
  }
]' \
-w "%{http_code}\n"

And this is the output I currently see when running that script:

Delete database
204
Create database
201
Create user
200
Set privileges
200
Write data
200
Write null data
Unknown type <nil>400

Is this intended behaviour, or should I actually be able to write null values?

Group By does not seem to work

Influx> select instantaneous_ops_per_sec from info group by serverId limit 10;
┌───────────────┬─────────────────┬───────────────────────────┬──────────────────────────┐
│ time          │ sequence_number │ instantaneous_ops_per_sec │ serverId                 │
├───────────────┼─────────────────┼───────────────────────────┼──────────────────────────┤
│ 1383993817575 │ 242             │ 1                         │ 514883646ff643576b000035 │
│ 1383993817564 │ 241             │ 1                         │ 526d71e23202e35309000001 │
│ 1383993811560 │ 240             │ 1                         │ 514883646ff643576b000035 │
│ 1383993811555 │ 239             │ 1                         │ 526d71e23202e35309000001 │
│ 1383993805567 │ 238             │ 1                         │ 514883646ff643576b000035 │
│ 1383993805555 │ 237             │ 1                         │ 526d71e23202e35309000001 │
│ 1383993799557 │ 236             │ 1                         │ 514883646ff643576b000035 │
│ 1383993799547 │ 235             │ 1                         │ 526d71e23202e35309000001 │
│ 1383993793564 │ 234             │ 1                         │ 514883646ff643576b000035 │
│ 1383993793558 │ 233             │ 1                         │ 526d71e23202e35309000001 │

Influx> select count(instantaneous_ops_per_sec) from info group by serverId limit 10;
Influx> # nothing was returned

Mac OS install instructions for bazaar

On OS X 10.8 my first attempt at installing bazaar using homebrew left the influxdb build complaining it couldn't import the bzrlib Python module. I fixed that with sudo pip install bzr. I'm not sure if there's a way to set the PYTHONPATH to point at the homebrew installed location which could be easier to handle from the influxdb build.

Typo in homebrew formula creates wrong directory

infludxb should be influxdb

Table/Points not deleted when we drop database

Hi guys,
i have a issue when i use Python driver (https://github.com/influxdb/influxdb-python) but i think it's core issue.

How to reproduce:

create database and add correct rights
add points
query points (it's okay)
delete database
query points (no point, okay)
recreate database
query point, server return old points 😕

Interactive Python

Python 2.7.4 (default, Apr 19 2013, 18:28:01) 
[GCC 4.7.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from influxdb import InfluxDBClient
>>> 
>>> client = InfluxDBClient("localhost", 8086, "root", "root", "dbtest")
>>> 
>>> client.create_database("dbtest")
True
>>> client.add_database_user("root", "root")
True
>>> 
>>> json_body = [{
...     "points": [
...         ['value1']
...     ],
...     "name": "id1",
...     "columns": ["key1"]
...  }]
>>> 
>>> client.write_points(json_body)
True
>>> 
>>> print client.query("select * from id1")
[{"name":"id1","columns":["time","sequence_number","key1"],"points":[[1384282753,1,"value1"]]}]
>>> 
>>> client.delete_database("dbtest")
True
>>> 
>>> client.delete_database("dbtest")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/william/gits/checkmy.ws/opt/venv/local/lib/python2.7/site-packages/influxdb/client.py", line 243, in delete_database
    "{0}: {1}".format(response.status_code, response.content))
Exception: 400: Database dbtest doesn't exist
>>> 
>>> print client.query("select * from id1")
[]
>>> 
>>> client.create_database("dbtest")
True
>>> client.add_database_user("root", "root")
True
>>> 
>>> print client.query("select * from id1")
[{"name":"id1","columns":["time","sequence_number","key1"],"points":[[1384282753,1,"value1"]]}]

I have missing something ?