alexkohler / prealloc

prealloc is a Go static analysis tool to find slice declarations that could potentially be preallocated.

License: MIT License

Go 100.00%
golang go static-code-analysis static-analyzer static-analysis prealloc-suggestions slice

prealloc's Introduction

prealloc

prealloc is a Go static analysis tool to find slice declarations that could potentially be preallocated.

Installation

go install github.com/alexkohler/prealloc@latest

Usage

Similar to other Go static analysis tools (such as golint and go vet), prealloc can be invoked with one or more filenames, directories, or packages named by their import paths. prealloc also supports the ... wildcard.

prealloc [flags] files/directories/packages

Flags

  • -simple (default true) - Report preallocation suggestions only on simple loops that have no returns/breaks/continues/gotos in them. Setting this to false may increase false positives.
  • -rangeloops (default true) - Report preallocation suggestions on range loops.
  • -forloops (default false) - Report preallocation suggestions on for loops. This is false by default because for loops tend to contain less regular logic than range loops (at least from what I've observed in the standard library).
  • -set_exit_status (default false) - Set exit status to 1 if any issues are found.

Purpose

While Go does attempt to avoid reallocation by growing the capacity in advance, this sometimes isn't enough for longer slices. If the size of a slice is known at the time of its creation, it should be specified.
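
To make the growth behaviour concrete, here is a minimal sketch that counts how often append has to move to a new backing array. The helper name countGrowths is hypothetical and not part of prealloc, and the exact count for the non-preallocated case depends on the Go version.

```go
package main

import "fmt"

// countGrowths appends n elements to s and reports how many times the
// capacity changed, i.e. how often append allocated a new backing array.
// (Hypothetical helper for illustration only.)
func countGrowths(s []int, n int) int {
	growths := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			growths++
		}
	}
	return growths
}

func main() {
	var lazy []int                // zero value: no capacity reserved
	eager := make([]int, 0, 1000) // capacity reserved up front

	fmt.Println("without preallocation:", countGrowths(lazy, 1000))
	fmt.Println("with preallocation:", countGrowths(eager, 1000))
}
```

With the capacity reserved up front, the loop never reallocates; without it, append reallocates several times as the slice grows.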

Consider the following benchmark: (this can be found in prealloc_test.go in this repo)

import "testing"

func BenchmarkNoPreallocate(b *testing.B) {
	existing := make([]int64, 10, 10)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Don't preallocate our initial slice
		var init []int64
		for _, element := range existing {
			init = append(init, element)
		}
	}
}

func BenchmarkPreallocate(b *testing.B) {
	existing := make([]int64, 10, 10)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Preallocate our initial slice
		init := make([]int64, 0, len(existing))
		for _, element := range existing {
			init = append(init, element)
		}
	}
}

$ go test -bench=. -benchmem
goos: linux
goarch: amd64
BenchmarkNoPreallocate-4   	 3000000	       510 ns/op	     248 B/op	       5 allocs/op
BenchmarkPreallocate-4     	20000000	       111 ns/op	      80 B/op	       1 allocs/op

As you can see, not preallocating can cause a performance hit, primarily due to Go having to reallocate the underlying array. The pattern benchmarked above is common in Go: declare a slice, then write some sort of range or for loop that appends or indexes into it. The purpose of this tool is to flag slice/loop declarations like the one in BenchmarkNoPreallocate.

Example

Some examples from the Go 1.9.2 source:

$ prealloc go/src/...
archive/tar/reader_test.go:854 Consider preallocating ss
archive/zip/zip_test.go:201 Consider preallocating all
cmd/api/goapi.go:301 Consider preallocating missing
cmd/api/goapi.go:476 Consider preallocating files
cmd/asm/internal/asm/endtoend_test.go:345 Consider preallocating extra
cmd/cgo/main.go:60 Consider preallocating ks
cmd/cgo/ast.go:149 Consider preallocating pieces
cmd/compile/internal/ssa/flagalloc.go:64 Consider preallocating oldSched
cmd/compile/internal/ssa/regalloc.go:719 Consider preallocating phis
cmd/compile/internal/ssa/regalloc.go:718 Consider preallocating oldSched
cmd/compile/internal/ssa/regalloc.go:1674 Consider preallocating oldSched
cmd/compile/internal/ssa/gen/rulegen.go:145 Consider preallocating ops
cmd/compile/internal/ssa/gen/rulegen.go:145 Consider preallocating ops
cmd/dist/build.go:893 Consider preallocating all
cmd/dist/build.go:1246 Consider preallocating plats
cmd/dist/build.go:1264 Consider preallocating results
cmd/dist/buildgo.go:59 Consider preallocating list
cmd/doc/pkg.go:363 Consider preallocating names
cmd/fix/typecheck.go:219 Consider preallocating b
cmd/go/internal/base/path.go:34 Consider preallocating out
cmd/go/internal/get/get.go:175 Consider preallocating out
cmd/go/internal/load/pkg.go:1894 Consider preallocating dirent
cmd/go/internal/work/build.go:2402 Consider preallocating absOfiles
cmd/go/internal/work/build.go:2731 Consider preallocating absOfiles
cmd/internal/objfile/pe.go:48 Consider preallocating syms
cmd/internal/objfile/pe.go:38 Consider preallocating addrs
cmd/internal/objfile/goobj.go:43 Consider preallocating syms
cmd/internal/objfile/elf.go:35 Consider preallocating syms
cmd/link/internal/ld/lib.go:1070 Consider preallocating argv
cmd/vet/all/main.go:91 Consider preallocating pp
database/sql/sql.go:66 Consider preallocating list
debug/macho/file.go:506 Consider preallocating all
internal/trace/order.go:55 Consider preallocating batches
mime/quotedprintable/reader_test.go:191 Consider preallocating outcomes
net/dnsclient_unix_test.go:954 Consider preallocating confLines
net/interface_solaris.go:85 Consider preallocating ifat
net/interface_linux_test.go:91 Consider preallocating ifmat4
net/interface_linux_test.go:100 Consider preallocating ifmat6
net/internal/socktest/switch.go:34 Consider preallocating st
os/os_windows_test.go:766 Consider preallocating args
runtime/pprof/internal/profile/filter.go:77 Consider preallocating lines
runtime/pprof/internal/profile/profile.go:554 Consider preallocating names
text/template/parse/node.go:189 Consider preallocating decl

// cmd/api/goapi.go:301
var missing []string
for feature := range optionalSet {
	missing = append(missing, feature)
}

// cmd/fix/typecheck.go:219
var b []ast.Expr
for _, x := range a {
	b = append(b, x)
}

// net/internal/socktest/switch.go:34
var st []Stat
sw.smu.RLock()
for _, s := range sw.stats {
	ns := *s
	st = append(st, ns)
}
sw.smu.RUnlock()

Even if the size the slice is being preallocated to is small, there's still a performance gain to be had in explicitly specifying the capacity rather than leaving it up to append to discover that it needs to reallocate. Of course, preallocation doesn't need to be done everywhere. This tool's job is just to help suggest places where one should consider preallocating.

How do I fix prealloc's suggestions?

When declaring your slice, rather than using the zero value of the slice with var, initialize it with Go's built-in make function, passing the appropriate type, a length of 0, and a capacity. This capacity will generally be the length of whatever you are ranging over. Fixing the examples from above would look like so:

// cmd/api/goapi.go:301
missing := make([]string, 0, len(optionalSet))
for feature := range optionalSet {
	missing = append(missing, feature)
}

// cmd/fix/typecheck.go:219
b := make([]ast.Expr, 0, len(a))
for _, x := range a {
	b = append(b, x)
}

// net/internal/socktest/switch.go:34
st := make([]Stat, 0, len(sw.stats))
sw.smu.RLock()
for _, s := range sw.stats {
	ns := *s
	st = append(st, ns)
}
sw.smu.RUnlock()
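
The same fix as a self-contained, runnable sketch. The function name keys and the sample map are made up for illustration; the pattern is the make(type, 0, len(source)) form shown above.

```go
package main

import (
	"fmt"
	"sort"
)

// keys collects the keys of m into a slice whose capacity is reserved
// up front, so the appends in the loop never reallocate.
func keys(m map[string]bool) []string {
	out := make([]string, 0, len(m)) // length 0, capacity len(m)
	for k := range m {
		out = append(out, k)
	}
	sort.Strings(out) // map iteration order is random; sort for stable output
	return out
}

func main() {
	optionalSet := map[string]bool{"a": true, "b": true, "c": true}
	missing := keys(optionalSet)
	fmt.Println(missing, len(missing), cap(missing))
}
```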

Note: If performance is absolutely critical, it may be more efficient to use copy instead of append for larger slices. For reference, see the following benchmark:

func BenchmarkSize200PreallocateCopy(b *testing.B) {
	existing := make([]int64, 200, 200)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Preallocate our initial slice
		init := make([]int64, len(existing))
		copy(init, existing)
	}
}

$ go test -bench=. -benchmem
goos: linux
goarch: amd64
BenchmarkSize200NoPreallocate-4     	  500000	      3080 ns/op	    4088 B/op	       9 allocs/op
BenchmarkSize200Preallocate-4       	 1000000	      1163 ns/op	    1792 B/op	       1 allocs/op
BenchmarkSize200PreallocateCopy-4   	 2000000	       807 ns/op	    1792 B/op	       1 allocs/op
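
One detail worth keeping in mind when switching from append to copy: copy transfers min(len(dst), len(src)) elements, so the destination must be created with a length, not just a capacity. A small sketch, with hypothetical helper names:

```go
package main

import "fmt"

// cloneWithCopy duplicates src using make with a full length, then copy.
func cloneWithCopy(src []int64) []int64 {
	dst := make([]int64, len(src)) // length len(src), as in the benchmark above
	copy(dst, src)
	return dst
}

// copyIntoEmpty shows the pitfall: the destination has capacity but
// length 0, so copy has nowhere to write and copies nothing.
func copyIntoEmpty(src []int64) int {
	dst := make([]int64, 0, len(src))
	return copy(dst, src) // number of elements copied
}

func main() {
	src := []int64{1, 2, 3}
	fmt.Println(cloneWithCopy(src)) // prints [1 2 3]
	fmt.Println(copyIntoEmpty(src)) // prints 0
}
```

In other words, the append form wants make(T, 0, n) while the copy form wants make(T, n).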

TODO

  • Configuration on whether or not to run on test files
  • Support for embedded ifs (currently, prealloc will only find breaks/returns/continues/gotos if they are in a single if block; I'd like to expand this to support multiple if blocks in the future).
  • Globbing support (e.g. prealloc *.go)

Contributing

Pull requests welcome!

Other static analysis tools

If you've enjoyed prealloc, take a look at my other static analysis tools!

prealloc's People

Contributors

alexkohler, ankit-arista, dreamer-nitj, estensen, jackwilsdon, knweiss, polyfloyd, rittneje, rliebz, scop

prealloc's Issues

Use copy() instead of append() for large arrays

If the slice is quite large, it's more efficient to use copy() instead of append().

So, for example, if len(optionalSet) > 100 and optionalSet is a slice,

var missing []string
for _, feature := range optionalSet {
	missing = append(missing, feature)
}

should become

missing := make([]string, len(optionalSet))
copy(missing, optionalSet)

instead of

missing := make([]string, 0, len(optionalSet))
for _, feature := range optionalSet {
	missing = append(missing, feature)
}

Benchmark:

For example, when len(existing) = 500, with this new test:

func BenchmarkPreallocateCopy(b *testing.B) {
	existing := make([]int64, 200, 200)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Preallocate our initial slice
		init := make([]int64, len(existing))
		copy(init, existing)
	}
}

the results clearly show that copy is faster:

BenchmarkNoPreallocate-2     	  300000	      4039 ns/op
BenchmarkPreallocate-2       	 1000000	      1664 ns/op
BenchmarkPreallocateCopy-2   	 1000000	      1303 ns/op

Maybe it should be mentioned somewhere in the README?

Duplicate output lines for repeat append calls in a loop

Given this sample program:

package main

import "fmt"

func main() {
	var buf []byte

	strings := []string{"foo", "bar", "baz"}
	for _, s := range strings {
		buf = append(buf, s...)
		buf = append(buf, '\n')
	}

	fmt.Println(buf)
}

prealloc will output:

main.go:6 Consider preallocating buf
main.go:6 Consider preallocating buf

It appears to be outputting a line for each append in the loop, i.e. if I add one more buf = append(buf, x) there will be one more "main.go:6 Consider preallocating buf" line output.

There should be at most one warning regardless of multiple append calls.
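
As an aside on the sample program itself: when each iteration appends a variable number of bytes, the capacity can still be computed with a first pass over the input. This is a sketch of one way to act on the warning, not something prealloc does for you; joinLines is a hypothetical name.

```go
package main

import "fmt"

// joinLines appends every string followed by a newline, after reserving
// exactly the number of bytes the loop will write.
func joinLines(lines []string) []byte {
	size := 0
	for _, s := range lines {
		size += len(s) + 1 // the string plus its trailing '\n'
	}

	buf := make([]byte, 0, size)
	for _, s := range lines {
		buf = append(buf, s...)
		buf = append(buf, '\n')
	}
	return buf
}

func main() {
	buf := joinLines([]string{"foo", "bar", "baz"})
	fmt.Printf("%q len=%d cap=%d\n", buf, len(buf), cap(buf))
}
```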

Faulty prealloc

I currently work in a team and we had a code issue a few times that made it beyond code review.
The mistake is preallocating a slice with a length instead of a capacity and then using append on it. Example:

ar := make([]int, 10)
for i := 0; i < 10; i++ {
	ar = append(ar, i)
}

The correct pre-alloc would be ar := make([]int, 0, 10). Because the 0 length argument is missing, the faulty code produces a slice of length 20 whose first 10 elements are zero values.

Would it be possible to add a check for this into your linter? I could not find any linter that has this check.
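
A runnable sketch of the difference described in this issue; the function names faulty and correct are only for illustration:

```go
package main

import "fmt"

// faulty preallocates with a length of 10, so append adds ten more
// elements after the ten zero values that already exist.
func faulty() []int {
	ar := make([]int, 10)
	for i := 0; i < 10; i++ {
		ar = append(ar, i)
	}
	return ar
}

// correct preallocates with length 0 and capacity 10, so append fills
// the reserved space without reallocating.
func correct() []int {
	ar := make([]int, 0, 10)
	for i := 0; i < 10; i++ {
		ar = append(ar, i)
	}
	return ar
}

func main() {
	fmt.Println(len(faulty()))  // prints 20
	fmt.Println(len(correct())) // prints 10
}
```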

Add unit tests

Even if they're more integration-y tests, there should be some sort of testing to pick up regressions and test new functionality.

Avoid false positive when slices are pre-allocated

If a pre-allocated slice is declared with a var statement and an explicit type, it is incorrectly reported as a candidate for pre-allocation.

For instance, the following code snippet would be reported.

package main

func main() {
	var s []int = make([]int, 0, 100)
	for i := range "Hello" {
		s = append(s, i)
	}
}

// Output:
// .\testdata\sample.go:4 Consider preallocating s

prealloc confused by ranging over a channel

var jobs []job
for job := range jobc {
    jobs = append(jobs, job)
}

In the code above, jobc is an unbuffered channel. I don't know the number of jobs that will be pulled from it. So I can't preallocate it.

line of sight loops trick prealloc

When ranging over a list that contains conditional break/continue directives and a slice = append(slice, element) at the end, prealloc suggests preallocation.

No lint triggered

var a []int
for i := range []struct{}{{}, {}, {}} {
	if i < 1 {
		a = append(a, i)
	}
}

Lint triggered

var a []int
for i := range []struct{}{{}, {}, {}} {
	if i < 1 {
		continue
	}
	a = append(a, i)
}

Does preallocating a slice with 0 length still give a performance benefit?

Hi, I have a case where I need to define a slice that receives results from a database. Previously I did this:

var models []Model
result := db.Where("namespace = ?", namespace).Find(&models)

where the result length is arbitrary. When I ran prealloc, it suggested preallocating the slice, so I revised the code like this:

models := make([]Model, 0) // I put 0 since I don't know the length yet
result := db.Where("namespace = ?", namespace).Find(&models)

So, am I doing this correctly? Does preallocating the slice with zero length, because the length can be arbitrary, still give a performance benefit compared to declaring it with var?

Thanks!
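
For reference, a small sketch of what the two declarations in this question actually produce. The Model type below is a stand-in, since the original one is not shown. Neither form reserves any capacity; the only difference is that one slice is nil and the other is empty but non-nil.

```go
package main

import "fmt"

// Model is a placeholder type for illustration.
type Model struct{ Name string }

// nilSlice returns the zero value, as in "var models []Model".
func nilSlice() []Model {
	var models []Model
	return models
}

// emptySlice returns a zero-length slice created with make.
func emptySlice() []Model {
	return make([]Model, 0)
}

func main() {
	a, b := nilSlice(), emptySlice()
	fmt.Println(cap(a), cap(b))     // prints 0 0
	fmt.Println(a == nil, b == nil) // prints true false
}
```

Since both start with capacity 0, make([]Model, 0) gives append no head start over the var form.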

Support for custom slice types or aliases

This is not detected at the moment:

type MySlice []string

var missing MySlice
for feature := range optionalSet {
	missing = append(missing, feature)
}

And this as well:

type MySlice = []string

var missing MySlice
for feature := range optionalSet {
	missing = append(missing, feature)
}
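
For completeness, the manual fix works the same way for a named slice type, because make accepts any slice type. A minimal sketch, with optionalSet assumed to be a map[string]bool and collect a hypothetical helper:

```go
package main

import "fmt"

// MySlice is a named slice type, as in the issue above.
type MySlice []string

// collect gathers the keys of optionalSet into a preallocated MySlice.
func collect(optionalSet map[string]bool) MySlice {
	missing := make(MySlice, 0, len(optionalSet))
	for feature := range optionalSet {
		missing = append(missing, feature)
	}
	return missing
}

func main() {
	missing := collect(map[string]bool{"x": true, "y": true})
	fmt.Println(len(missing), cap(missing)) // prints 2 2
}
```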

Prealloc confused by variable redeclaration

In the following function, prealloc is confused by buf being redeclared:

func copySort(w io.Writer, r io.Reader, sep rune, keys []int) error {
	// Copy the header line.
	var buf [1]byte // THATS LINE 275 WHERE THE ERROR MESSAGES POINT TO
	for {
		n, err := r.Read(buf[:])
		if n != 0 {
			if _, err := w.Write(buf[:]); err != nil {
				return err
			}
		}
		if err != nil {
			if err == io.EOF {
				return nil
			}
			return err
		}
		if buf[0] == '\n' {
			break
		}
	}
	...
	for i, k := range keys {
		buf := make([]byte, 0, 16)
		buf = append(buf, "-k"...)
		n := int64(1 + k)
		buf = strconv.AppendInt(buf, n, 10)
		buf = append(buf, ',')
		buf = strconv.AppendInt(buf, n, 10)
		args[2+i] = alloc.DirectString(buf)
	}
	...
}

prealloc reports these 2 error messages:

...:275 Consider preallocating buf
...:275 Consider preallocating buf

If I rename the first buf declaration with any other variable name (e.g., cell) the error messages disappear.

It seems a variable redeclaration is not always managed correctly by prealloc.

Set exit status when suggestions are found

Setting the exit status of prealloc to something other than 0 would make it possible to use in CI builds as a means of verification.

If compatibility is a concern, a flag could be used to enable this behavior. In that case, the flag would ideally be -set_exit_status to be consistent with golint.

Map preallocation

Preallocation of maps can be useful, as shown by PR #23. It would be nice to have prealloc suggest that too.

Just wondering how smart it can be made as things are not as straightforward as with slices because there's key uniqueness at play. Cases where the optimal preallocated capacity can be easily determined beforehand are more rare with maps, and in that sense suggesting it could have a worse signal to noise ratio. Maybe it should be made optional if it can't be made "smart enough" 🤔
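
To illustrate the uniqueness caveat, here is a sketch of map preallocation using the size hint that make accepts for maps. The function countWords is a made-up example; the hint len(words) is only an upper bound on the number of distinct keys.

```go
package main

import "fmt"

// countWords tallies word frequencies. The size hint reserves room for
// len(words) entries, which overestimates when words repeat.
func countWords(words []string) map[string]int {
	counts := make(map[string]int, len(words))
	for _, w := range words {
		counts[w]++
	}
	return counts
}

func main() {
	counts := countWords([]string{"a", "b", "a"})
	fmt.Println(len(counts), counts["a"]) // prints 2 2
}
```

Here the hint asks for 3 entries but only 2 are used, which is the kind of over-reservation a map check would have to tolerate.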

False positive when scanning a file and incorrect line number for suggested preallocation

Example setup to reproduce the issue.

foo.txt

some dummy text

/* some comment */
more dummy text

// some comment
even more dummy text

main.go

type FooLine struct {
	Parts     []string
	LineFound int
}

type Foo struct {
	Words     []string
	LineFound int
}

func main() {
	file, err := os.Open("foo.txt")
	if err != nil {
		panic(err)
	}
	defer file.Close()

	var (
		lines      = bufio.NewScanner(file)
		lineNumber int
		foos       []*Foo
	)
	for lines.Scan() {
		lineNumber++

		line := lines.Text()
		if line == "" || strings.HasPrefix(line, "//") || strings.HasPrefix(line, "/*") {
			continue
		}

		foos = append(foos, &Foo{
			Words:     strings.Split(line, " "),
			LineFound: lineNumber,
		})
	}

	if err := lines.Err(); err != nil {
		panic(err)
	}

	println(foos)
}

Running prealloc -forloops main.go outputs:
main.go:21 Consider preallocating foos

Two issues:

  1. Because I'm using a scanner, foos can't be preallocated. It's not possible to know information about the number of lines or data lines in the file.
  2. Minor issue: prealloc reports the problem on line 21, but that is the opening line of the var block, var (. foos is actually declared on line 24. Not sure if this is possible to remedy, and it is not a huge issue since prealloc names the slice to consider preallocating in the output.
