Comments (26)

myzhan commented on June 2, 2024

Only for the Mach-O engine: I will try to use the DWARF parser to locate the "__go_buildinfo" section and search for the magic string in it, just as the Go standard library does.
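
A minimal sketch of that lookup, written in Go rather than in kcov's C++ and using only the standard debug/macho package. The function names (containsBuildInfoMagic, isGoMachO) are made up for illustration, and the sketch assumes a thin (non-universal) Mach-O file:

```go
package main

import (
	"bytes"
	"debug/macho"
	"fmt"
	"os"
)

// buildInfoMagic is the marker string quoted later in this thread.
var buildInfoMagic = []byte("\xff Go buildinf:")

// containsBuildInfoMagic reports whether section data holds the marker.
func containsBuildInfoMagic(data []byte) bool {
	return bytes.Contains(data, buildInfoMagic)
}

// isGoMachO opens a Mach-O file and inspects only its "__go_buildinfo"
// section instead of scanning the whole file.
func isGoMachO(path string) (bool, error) {
	f, err := macho.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()

	sect := f.Section("__go_buildinfo")
	if sect == nil {
		return false, nil // no such section: probably not a Go binary
	}
	data, err := sect.Data()
	if err != nil {
		return false, err
	}
	return containsBuildInfoMagic(data), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: isgo <mach-o executable>")
		return
	}
	isGo, err := isGoMachO(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("Go binary:", isGo)
}
```

Restricting the search to one named section keeps the check cheap and avoids false positives from files that merely contain the marker bytes somewhere else.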

from kcov.

SimonKagstrom commented on June 2, 2024

If there are no debug symbols, it will not work. Normally you pass -g to the compiler, but I'm not sure how Go does it.

The other issue seems to be with Xcode. Perhaps it would be better to build it with make or ninja, e.g.,

cmake -G make -DOPENSS...

Does that work?

myzhan commented on June 2, 2024

For the build issue, I switched to make and it works. I think the version of dsymutil that I installed via brew does not support Go executables. Do we need a new parser for Go executables?

SimonKagstrom commented on June 2, 2024

Good, I'll update the build instructions for OSX.

If you comment out the "system("dsymutil...");" stuff in macho-parser.cc, does it work with Go then?

myzhan commented on June 2, 2024

No, it doesn't. Can I reopen the previous issue?

myzhan commented on June 2, 2024

OK, I managed to make it work with Go, and the new engine is as fast as ptrace on Linux. The changes were:

  1. remove dsymutil and let it parse the executable directly
  2. remove the check for hdr->filetype != MH_DSYM
  3. remove is_end_seq
  4. read and write the addr without adding m_imageBase

myzhan commented on June 2, 2024

I wrote a tool in Go to help me test the parsing procedure, and found that is_end_seq makes us miss the last address of each compile unit.

package main

import (
	"debug/dwarf"
	"debug/macho"
	"fmt"
	"io"
	"os"
)

func panicIfErr(err error) {
	if err != nil {
		panic(err)
	}
}

func main() {
	// Path to the Mach-O executable; an optional first argument overrides it.
	executable := "/Users/zhanqp/tmp/hello"
	if len(os.Args) > 1 {
		executable = os.Args[1]
	}
	exe, err := macho.Open(executable)
	panicIfErr(err)
	defer exe.Close()
	dw, err := exe.DWARF()
	panicIfErr(err)
	reader := dw.Reader()
	for {
		entry, err := reader.Next()
		panicIfErr(err)
		if entry == nil {
			break // end of the DWARF info section
		}
		if entry.Tag != dwarf.TagCompileUnit {
			continue
		}
		// Each compile unit has its own line table.
		lrd, err := dw.LineReader(entry)
		panicIfErr(err)
		if lrd == nil {
			continue // compile unit without a line table
		}
		for {
			var e dwarf.LineEntry
			err := lrd.Next(&e)
			if err == io.EOF {
				break
			}
			panicIfErr(err)
			// Print every row, including end-of-sequence rows.
			fmt.Printf("%s:%d %x\n", e.File.Name, e.Line, e.Address)
		}
	}
}

SimonKagstrom commented on June 2, 2024

Good work!

is_end_seq is actually needed for C code, since clang leaves some data for use in functions at the end of the function. Without is_end_seq, kcov sees that as a line of code (since it looks like that in the DWARF info) and tries to set a breakpoint there, thereby corrupting the data.

What is the filetype for the Go binaries? If one can identify them in some way, it would be fairly easy to skip is_end_seq, dsymutil, etc. for that case.
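
To make the is_end_seq point concrete, here is a small Go sketch that filters line-table rows the way described above. It uses the real EndSequence field of debug/dwarf's LineEntry; the helper names (breakpointAddrs, sampleRows) and the hand-written rows are hypothetical and do not come from kcov:

```go
package main

import (
	"debug/dwarf"
	"fmt"
)

// breakpointAddrs returns the addresses that would receive a breakpoint.
// With skipEndSeq set, rows flagged EndSequence are dropped: such a row marks
// the first byte after a run of instructions, not an instruction itself.
func breakpointAddrs(rows []dwarf.LineEntry, skipEndSeq bool) []uint64 {
	var addrs []uint64
	for _, row := range rows {
		if skipEndSeq && row.EndSequence {
			continue
		}
		addrs = append(addrs, row.Address)
	}
	return addrs
}

// sampleRows is hand-written data standing in for one line-table sequence.
func sampleRows() []dwarf.LineEntry {
	return []dwarf.LineEntry{
		{Address: 0x1000, Line: 3},
		{Address: 0x1008, Line: 4},
		{Address: 0x1010, Line: 4, EndSequence: true},
	}
}

func main() {
	fmt.Printf("skipping end_sequence: %#x\n", breakpointAddrs(sampleRows(), true))
	fmt.Printf("keeping end_sequence:  %#x\n", breakpointAddrs(sampleRows(), false))
}
```

Whether the end-of-sequence row should be kept is exactly the open question in this thread, so the sketch leaves it as a flag rather than picking one behaviour.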

myzhan commented on June 2, 2024

BTW, I reverted my changes and tested a simple C binary; it failed to parse the debug symbols.

$ brew install dwarfutils
$ cat hello.c
#include <stdio.h>

int main() {
	printf("hello\n");
	return 0;
}
$ clang --version
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: x86_64-apple-darwin22.1.0
Thread model: posix
$ clang -g -O0 hello.c -o hc
$ dsymutil ./hc
warning: (x86_64) /var/folders/d4/8t_88ltn76l44lv65wnx3q9c0000gp/T/hello-19fac8.o unable to open object file: No such file or directory
warning: no debug symbols in executable (-arch x86_64)

myzhan commented on June 2, 2024

Quoting SimonKagstrom: "What is the filetype for the Go binaries?"

The Go linker stores build info in the executable. Can we search for this magic string?

https://github.com/golang/go/blob/e68c027204d410ebca5bf1a9660605f2bd737748/src/debug/buildinfo/buildinfo.go#L49C2-L49C16

SimonKagstrom commented on June 2, 2024

If I remember correctly, clang uses a special case when building in one step, i.e., without separate compile and link steps. In that case I think it might store the debug symbols in the binary directly; otherwise dsymutil gathers everything from the object files.

Perhaps that's how Go does it as well?

Then I think there should be some special-case to catch this. Not quite sure how to determine it though.

SimonKagstrom commented on June 2, 2024

Can I look at your stashed changes? I tried to follow your steps (with a hello world compiled directly to an executable), and I just get a segfault when parsing it.

myzhan commented on June 2, 2024

See this commit
myzhan@89ddc61

SimonKagstrom commented on June 2, 2024

Thanks! I made basically the same changes yesterday, but still got a crash during parsing. However, it looks to me as if the hello world binary still doesn't have the DWARF info embedded, so maybe clang stores it some other way than Go does.

So, basically: is your hello world example, compiled in the same way, parseable with kcov with your commit?

myzhan commented on June 2, 2024

SimonKagstrom commented on June 2, 2024

Yes, but I meant your hello world in C.

myzhan commented on June 2, 2024

Ah, yes, I get the same segmentation fault when parsing a C executable.

myzhan commented on June 2, 2024

The crash is in dwarf_errmsg(). The error is DW_DLV_NO_ENTRY.

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x00000001002b927f libdwarf.0.dylib`dwarf_errmsg + 9
    frame #1: 0x0000000100047a31 kcov`(anonymous namespace)::MachoParser::parse(this=0x00000001000c5278) at macho-parser.cc:135:13 [opt]
    frame #2: 0x00000001000063ff kcov`Collector::run(this=0x00000001004063d0, filename=Summary Unavailable) at collector.cc:46:16 [opt]
    frame #3: 0x000000010002db0a kcov`runKcov(runningMode=MODE_COLLECT_AND_REPORT) at main.cc:353:19 [opt]
    frame #4: 0x000000010002ae60 kcov`main(argc=3, argv=<unavailable>) at main.cc:611:9 [opt]
    frame #5: 0x00007ff812573310 dyld`start + 2432

myzhan commented on June 2, 2024

After some digging, I realized that clang stores the DWARF info in the .dSYM instead of the executable; that's why we can't directly parse the executable generated by clang. Apparently, the Go team doesn't follow that design.

See also:

  1. https://wiki.dwarfstd.org/Apple%27s_%22Lazy%22_DWARF_Scheme.md
  2. https://stackoverflow.com/questions/10044697/where-how-does-apples-gcc-store-dwarf-inside-an-executable/12827463#12827463
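
One way to tell the two layouts apart is to look for DWARF sections in the executable itself. The Go sketch below does that with the standard debug/macho package; the helper names are invented for illustration, and it assumes DWARF sections are named with a "__debug_" prefix, or "__zdebug_" when compressed:

```go
package main

import (
	"debug/macho"
	"fmt"
	"os"
	"strings"
)

// isDWARFSection reports whether a Mach-O section name looks like a DWARF
// section, in plain ("__debug_") or compressed ("__zdebug_") form.
func isDWARFSection(name string) bool {
	return strings.HasPrefix(name, "__debug_") || strings.HasPrefix(name, "__zdebug_")
}

// hasEmbeddedDWARF reports whether any section of f carries DWARF data.
func hasEmbeddedDWARF(f *macho.File) bool {
	for _, s := range f.Sections {
		if isDWARFSection(s.Name) {
			return true
		}
	}
	return false
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: hasdwarf <mach-o file>")
		return
	}
	f, err := macho.Open(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer f.Close()
	fmt.Println("embedded DWARF:", hasEmbeddedDWARF(f))
}
```

If this prints false for an executable, the debug info most likely lives in object files or a .dSYM bundle, which is the situation dsymutil is meant to handle.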

SimonKagstrom commented on June 2, 2024

Yes, I saw the same backtrace myself.

The dSYM stuff is the reason for the dsymutil invocation. The Go scheme is more like what, e.g., Linux does with ELF, but I'm not sure how the trivial compile-direct-to-executable case with clang actually works. lldb is able to debug it, so there is some way, but I'm not sure what...

myzhan commented on June 2, 2024

So far, I can use https://pkg.go.dev/golang.org/x/tools/cmd/splitdwarf to generate a dSYM from the Go executable before running kcov. I think the Mach-O parser should check whether a valid dSYM already exists before running dsymutil; otherwise dsymutil reports "no debug symbols in executable" and overwrites the file previously generated by splitdwarf.

Go also has a proposal to have the linker generate the dSYM directly: golang/go#62577.
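
The "check before running dsymutil" idea could look roughly like the Go sketch below. It assumes the conventional bundle layout <exe>.dSYM/Contents/Resources/DWARF/<basename> and treats any non-empty file there as good enough, which is a much weaker test than real validation. The function names are hypothetical:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// dsymDWARFPath returns the conventional location of the DWARF companion
// file inside a dSYM bundle for the given executable.
func dsymDWARFPath(exe string) string {
	return filepath.Join(exe+".dSYM", "Contents", "Resources", "DWARF", filepath.Base(exe))
}

// shouldRunDsymutil reports whether dsymutil still needs to run, i.e. the
// bundle does not already contain a non-empty DWARF file.
func shouldRunDsymutil(exe string) bool {
	info, err := os.Stat(dsymDWARFPath(exe))
	if err != nil {
		return true // missing or unreadable bundle: let dsymutil try
	}
	return info.IsDir() || info.Size() == 0
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: checkdsym <executable>")
		return
	}
	exe := os.Args[1]
	fmt.Println("expected dSYM file:", dsymDWARFPath(exe))
	fmt.Println("run dsymutil:", shouldRunDsymutil(exe))
}
```

With a guard like this, a dSYM produced beforehand would be left alone instead of being overwritten.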

myzhan commented on June 2, 2024

BTW, I still need to comment out "m_imageBase" when running a Go executable.

SimonKagstrom commented on June 2, 2024

OK, thanks for the progress!

I haven't looked at the Mach-O stuff for a while, so I don't know how to check the validity of the dSYM info. I guess there should be some checksum to match against the binary, but I'm not sure.
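
One candidate for such a match is the LC_UUID load command (value 0x1b in <mach-o/loader.h>), which a binary and its dSYM normally share. The Go sketch below compares the two; it assumes both files actually carry an LC_UUID, which may not hold for every toolchain or Go version, and the helper names are made up:

```go
package main

import (
	"bytes"
	"debug/macho"
	"fmt"
	"os"
)

// lcUUID is the Mach-O LC_UUID load command number.
const lcUUID = 0x1b

// uuidFromLoadCmd returns the 16 UUID bytes of a raw load command whose
// already-decoded command number is cmd, or false if it is not LC_UUID.
// Layout: cmd (4 bytes), cmdsize (4 bytes), uuid (16 bytes).
func uuidFromLoadCmd(cmd uint32, raw []byte) ([]byte, bool) {
	if cmd != lcUUID || len(raw) < 24 {
		return nil, false
	}
	return raw[8:24], true
}

// machoUUID returns the LC_UUID of the Mach-O file at path, if present.
func machoUUID(path string) ([]byte, bool, error) {
	f, err := macho.Open(path)
	if err != nil {
		return nil, false, err
	}
	defer f.Close()
	for _, l := range f.Loads {
		raw := l.Raw()
		if len(raw) < 8 {
			continue
		}
		if id, ok := uuidFromLoadCmd(f.ByteOrder.Uint32(raw[0:4]), raw); ok {
			return id, true, nil
		}
	}
	return nil, false, nil
}

func main() {
	if len(os.Args) < 3 {
		fmt.Println("usage: uuidmatch <executable> <dSYM DWARF file>")
		return
	}
	exeID, okExe, errExe := machoUUID(os.Args[1])
	symID, okSym, errSym := machoUUID(os.Args[2])
	if errExe != nil || errSym != nil || !okExe || !okSym {
		fmt.Println("cannot compare: unreadable file or missing LC_UUID")
		return
	}
	fmt.Println("dSYM matches executable:", bytes.Equal(exeID, symID))
}
```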

I guess we can check for the buildinfo string to identify go binaries, and then special case them. Alternatively, perhaps it's easier to just add a command-line option which does it for now?

E.g., with --parse-go-binary (or finding it directly) it would

  • Skip dsymutil invocation
  • ignore is_end_seq
  • set m_imageBase to 0

I guess that would work both for the splitdwarf variant and the produced-by-linker case?
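
The three switches listed above can be pictured as one small options value chosen from the detection result. This is only an illustration in Go: kcov is C++ and has no such struct, and the names parseOptions and optionsFor are invented here:

```go
package main

import "fmt"

// parseOptions is a hypothetical bundle of the three switches from the list.
type parseOptions struct {
	RunDsymutil bool   // invoke dsymutil before parsing
	HonorEndSeq bool   // skip line-table rows flagged end_sequence
	ImageBase   uint64 // offset added to addresses read from DWARF
}

// optionsFor picks the settings for a Go binary (detected or requested via a
// flag such as the proposed --parse-go-binary) or for the default path.
func optionsFor(isGoBinary bool, imageBase uint64) parseOptions {
	if isGoBinary {
		return parseOptions{RunDsymutil: false, HonorEndSeq: false, ImageBase: 0}
	}
	return parseOptions{RunDsymutil: true, HonorEndSeq: true, ImageBase: imageBase}
}

func main() {
	fmt.Printf("Go binary: %+v\n", optionsFor(true, 0x100000000))
	fmt.Printf("default:   %+v\n", optionsFor(false, 0x100000000))
}
```

Keeping the decision in one place means the detection method (command-line option or magic-string search) can change without touching the parser logic.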

myzhan commented on June 2, 2024

I prefer to detect it directly by searching the binary for the buildinfo magic string, which is "\xff Go buildinf:". If you are OK with that, I can submit a PR later.
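
In its simplest form, that search is a byte scan over the file contents. The Go sketch below shows the idea; looksLikeGoBinary is a hypothetical name, and reading the whole file into memory is a simplification:

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// goBuildInfoMagic is the marker string quoted above.
var goBuildInfoMagic = []byte("\xff Go buildinf:")

// looksLikeGoBinary reports whether the raw file contents contain the marker.
func looksLikeGoBinary(contents []byte) bool {
	return bytes.Contains(contents, goBuildInfoMagic)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: findmagic <file>")
		return
	}
	contents, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("Go binary:", looksLikeGoBinary(contents))
}
```

A plain scan also matches any file that happens to embed the marker as data, for example a tool that carries the string in order to search for it, so a positive result is a strong hint rather than proof.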

SimonKagstrom commented on June 2, 2024

Yes, absolutely! Must the entire binary be parsed, or is it placed in a known place?

myzhan commented on June 2, 2024

I have submitted a PR. I can't reproduce the problem with is_end_seq, so I kept it.
