
sailro / dexer


Dexer is an open source framework, written in C#, that reads and writes .DEX files (Dalvik Executable Format) used by the Android Open Source Project.

License: MIT License

C# 100.00%
android dalvik dex

dexer's People

Contributors

sailro

dexer's Issues

Null descriptor strings

It appears possible to have null strings in the dex format; I am unsure whether this is the result of a malformed dex.

2E 00 1C 41 77 61 6B 65 54 69 6D 65 53 69 6E 63 65 42 6F 6F 74 43 6C 6F 63 6B 2E 6A 61 76 61 00 00 00 01 42 00 13 42 41 43 4B 47 52 4F 55 4E 44 5F 4C 4F 43 41 54 49 4F 4E 00

Note the '00 00' at offset 32.
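
For reference, the dump can be decoded by hand. The following is a minimal sketch, assuming the standard string_data_item layout (a ULEB128 utf16_size, the MUTF-8 bytes, then a 0x00 terminator) and assuming that the leading `2E 00` is the tail of a preceding item, so decoding starts at offset 2. The `StringDataDump` class and its methods are hypothetical names for illustration, not part of Dexer, and the decoder only handles ASCII content.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class StringDataDump
{
    // Reads an unsigned LEB128 value and advances pos past it.
    public static uint ReadUleb128(byte[] data, ref int pos)
    {
        uint result = 0;
        int shift = 0;
        byte b;
        do
        {
            b = data[pos++];
            result |= (uint)(b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    // Decodes consecutive string_data_item entries (ASCII only) starting at pos.
    public static List<string> Decode(byte[] data, int pos)
    {
        var strings = new List<string>();
        while (pos < data.Length)
        {
            int length = (int)ReadUleb128(data, ref pos);
            strings.Add(Encoding.ASCII.GetString(data, pos, length));
            pos += length + 1; // skip the content and the 0x00 terminator
        }
        return strings;
    }

    // Converts a space-separated hex dump into bytes.
    public static byte[] ParseHex(string hex)
    {
        string[] parts = hex.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var bytes = new byte[parts.Length];
        for (int i = 0; i < parts.Length; i++)
            bytes[i] = Convert.ToByte(parts[i], 16);
        return bytes;
    }

    public static void Main()
    {
        byte[] dump = ParseHex(
            "2E 00 1C 41 77 61 6B 65 54 69 6D 65 53 69 6E 63 65 42 6F 6F 74 43 6C 6F 63 6B " +
            "2E 6A 61 76 61 00 00 00 01 42 00 13 42 41 43 4B 47 52 4F 55 4E 44 5F 4C 4F 43 " +
            "41 54 49 4F 4E 00");

        // Offset 2 is an assumption: the first two bytes look like the end of a previous item.
        foreach (string s in Decode(dump, 2))
            Console.WriteLine("\"" + s + "\" (length " + s.Length + ")");
    }
}
```

Under these assumptions the second decoded entry is a zero-length string, which corresponds to the '00 00' pair (length 0 followed by the terminator).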

Subsequently, when DexReader:PrefetchTypeReferences is called, the function TypeDescriptor.Allocate is used to consume this null string, which in turn returns a null type descriptor, which is then added to the TypeReferences list.

This becomes problematic later on, when https://github.com/sailro/Dexer/blob/master/Dexer/IO/DexReader.cs#L201 is called, as it means Dex.TypeReferences[classIndex] can actually be null.

To combat this, we can improve the null checks within DexReader (line 201). This way, indexing into TypeReferences stays valid:

    if (reference == null)
    {
        var reference2 = Dex.TypeReferences[classIndex] as ClassReference;
        // The type reference itself may be null (null descriptor string);
        // keep a null entry so that indexing into TypeReferences stays valid.
        ClassDefinition cdef = reference2 == null ? null : new ClassDefinition(reference2);

        Dex.TypeReferences[classIndex] = cdef;
        Dex.Classes.Add(cdef);
    }

Unfortunately, this wreaks havoc later on, when TypeDescriptor.Fill

internal static void Fill(string tdString, TypeReference item, Dex context)

throws a null reference exception in context.Import, because the TypeReference (item) can be null.

I am not 100% sure how to progress this one, but will keep this issue to track the problem.

What is the proper way to add a NEW method?

Hello!

I am trying to create a new DEX file from scratch, but I am failing to add a class and a method implementation.
What is the proper way to do it?
I am getting a KeyNotFoundException in either the TypeLookup or the MethodLookup collection.
The proper(?) key exists in each collection.

Thanks!

InstructionReader crashes on some obfuscated dex files

Hi!
InstructionReader crashes on some obfuscated dex files (see the attachment).
The crash occurs when the payloads of pseudo-instructions are injected between normal instructions in a method body.
Your implementation does not handle this case, but the real Dalvik VM processes this code normally.
Thank you!
Attachment: classes3_mod.zip

InstructionReader.ExtractSparseSwitch Fails to extract switch.

The following exception is given:

System.Collections.Generic.KeyNotFoundException: The given key was not present in the dictionary.
   at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
   at Dexer.IO.InstructionReader.<>c__DisplayClass49_1.<ExtractSparseSwitch>b__0() in C:\Users\david\.paket\git\db\Dexer\Dexer\IO\InstructionReader.cs:line 594

The following Java was used to generate this:

    public void sparseSwitch() {
        int success = 511;

        switch (success) {
            case 500:
                System.out.println("No result");
                break;
            case 300:
                System.out.println("Final result: Fail");
                break;
            case 1:
                System.out.println("Final result: Success");
                break;
            case 63335:
                System.out.println("crap");
                break;
            default:
                System.out.println("Unknown result");
                break;
        }
    }

OpCodes.FilledNewArrayRange Overflow

The current code at https://github.com/sailro/Dexer/blob/master/Dexer/IO/InstructionReader.cs#L334 will cause an incorrect registerCount.

E.g., if Upper[idx] == 2 (as an int), the left shift by 16 (to get 8 bits?) yields 131072, which causes the subsequent for loop that reads register values to exceed the bounds of the register array.

The fix should be to bound the registerCount variable by 0xFFFF using a modulo (which results in registerCount being 2), but this is my guess based on https://source.android.com/devices/tech/dalvik/dalvik-bytecode.html:

When reasonably possible, instructions allow references to up to the first 256 registers. In addition, some instructions have variants that allow for much larger register counts, including a pair of catch-all move instructions that can address registers in the range v0 – v65535.

This will require someone to sanity check before it can be fixed, as I am not sure it is a robust fix.

registerCount = (Upper[_ip++] << 16) % 0xFFFF;
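
The arithmetic behind this guess can be checked in isolation. Because 65536 = 65535 + 1, (n << 16) % 0xFFFF equals n % 0xFFFF, so for any 8-bit n the expression above gives the same result as not shifting at all. The following is a minimal sketch of that check; `RegisterCountCheck` and its methods are hypothetical names for illustration and are not Dexer code.

```csharp
using System;

public static class RegisterCountCheck
{
    // The expression proposed in this issue, applied to a raw count value.
    public static int Proposed(int upper) => (upper << 16) % 0xFFFF;

    // The behavior described above: shift left by 16 with no bound.
    public static int Shifted(int upper) => upper << 16;

    public static void Main()
    {
        Console.WriteLine(Shifted(2));   // 131072
        Console.WriteLine(Proposed(2));  // 2

        // For every 8-bit value the proposed expression returns the value itself.
        for (int n = 0; n <= 255; n++)
        {
            if (Proposed(n) != n)
                throw new InvalidOperationException("mismatch at " + n);
        }
    }
}
```

If this observation carries over to the real code path, simply dropping the shift might be an equivalent and simpler change, but that still needs the sanity check against the instruction format that the issue asks for.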

Fix InstructionWriter.WriteTo if extra data is present

Hi!
Thanks for your wonderful framework.
I found a small bug in the InstructionWriter.WriteTo(BinaryWriter writer) method.

public void WriteTo(BinaryWriter writer)

If extra data is present, its offset must be aligned to 4 bytes.
But if the value of stats.CodeUnits is odd, the Codes array may be too small to hold the last element (fill-array-data processing).
My fix:

    int cusz = stats.CodeUnits;
    if (stats.ExtraCodeUnits > 0)
    {
        if (cusz % 2 != 0) cusz++;
    }
    _extraOffset = cusz;
    Codes = new ushort[cusz + stats.ExtraCodeUnits]; // fix array size
    // ..... skip more code ...
    if (_extraOffset != cusz + stats.ExtraCodeUnits)
        throw new MalformedException("Data pointer out of range");
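
The padding rule can be illustrated on its own. Code units are 16 bits wide, so a 4-byte-aligned payload has to start at an even code-unit offset. The following is a minimal sketch that assumes only that rule; `PayloadLayout` and its methods are hypothetical helper names and are not part of Dexer.

```csharp
using System;

public static class PayloadLayout
{
    // Returns the code-unit offset at which extra data (payloads) should start.
    // Padding is only needed when there is extra data to place.
    public static int ExtraOffset(int codeUnits, int extraCodeUnits)
    {
        if (extraCodeUnits > 0 && codeUnits % 2 != 0)
            return codeUnits + 1; // one 16-bit padding unit restores 4-byte alignment
        return codeUnits;
    }

    // Total number of 16-bit units the output array must hold.
    public static int TotalUnits(int codeUnits, int extraCodeUnits)
    {
        return ExtraOffset(codeUnits, extraCodeUnits) + extraCodeUnits;
    }

    public static void Main()
    {
        Console.WriteLine(TotalUnits(5, 8)); // 14: 5 code units + 1 padding + 8 payload
        Console.WriteLine(TotalUnits(6, 8)); // 14: already aligned, no padding
        Console.WriteLine(TotalUnits(5, 0)); // 5: no payload, no padding
    }
}
```

Sizing the array from the padded offset rather than from the raw code-unit count is what leaves room for the last payload element.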

Please optimize this code

Google's Dalvik Exec Format Page clearly states that string_ids are sorted by lexicographical order.

string identifiers list. These are identifiers for all the strings used by this file, either for internal naming (e.g., type descriptors) or as constant objects referred to by code. This list must be sorted by string contents, using UTF-16 code point values (not in a locale-sensitive manner), and it must not contain any duplicate entries.

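The type_ids, field_ids, and method_ids lists have sorting requirements as well (class_defs is instead ordered so that a class's superclass and interfaces appear before it).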

However, I found some very ugly code in Dexer that is used to find existing ClassDefinition and TypeReference instances.

https://github.com/sailro/Dexer/blob/master/Dexer/Core/Dex.cs#L133-L162

internal ClassDefinition GetClass(string fullname, List<ClassDefinition> container)
{
    foreach (var item in container)
    {
        if (fullname.Equals(item.Fullname))
            return item;

        var inner = GetClass(fullname, item.InnerClasses);
        if (inner != null)
            return inner;
    }
    return null;
}

internal TypeReference Import(TypeReference tref, bool add)
{
    foreach (var item in TypeReferences)
    {
        if (tref.Equals(item))
        {
            return item;
        }
    }
    if (add)
    {
        // if !add see TypeDescriptor comment 
        TypeReferences.Add(tref);
    }
    return tref;
}

These places can obviously be optimized by using binary search.
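
As one possible shape for that optimization, the sketch below looks up a descriptor in a sorted list with List<T>.BinarySearch. It assumes the list is kept sorted with StringComparer.Ordinal, which compares UTF-16 code units and may therefore differ from the code-point order in the dex specification for strings that contain surrogate pairs. Whether Dexer keeps its in-memory lists sorted is not stated in this issue, and `SortedDescriptorIndex` is a hypothetical helper, not Dexer's API.

```csharp
using System;
using System.Collections.Generic;

public sealed class SortedDescriptorIndex
{
    private readonly List<string> _descriptors = new List<string>();

    // Returns the index of the descriptor, inserting it at its sorted position if absent.
    public int GetOrAdd(string descriptor)
    {
        int index = _descriptors.BinarySearch(descriptor, StringComparer.Ordinal);
        if (index >= 0)
            return index;

        index = ~index; // the bitwise complement is the insertion point
        _descriptors.Insert(index, descriptor);
        return index;
    }

    // Returns the index of the descriptor, or -1 when it is not present.
    public int Find(string descriptor)
    {
        int index = _descriptors.BinarySearch(descriptor, StringComparer.Ordinal);
        return index >= 0 ? index : -1;
    }

    public static void Main()
    {
        var index = new SortedDescriptorIndex();
        index.GetOrAdd("Ljava/lang/String;");
        index.GetOrAdd("Ljava/lang/Object;");
        index.GetOrAdd("I");

        Console.WriteLine(index.Find("Ljava/lang/Object;")); // 1
        Console.WriteLine(index.Find("Lmissing;"));          // -1
    }
}
```

Inserting into a sorted list is still O(n) and shifts the indices of later entries, so a Dictionary keyed by descriptor may be a better fit while a file is being built.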

Found a bug in ReadAnnotationSetRefList

The annotation_set_ref_item format says:

offset from the start of the file to the referenced annotation set or 0 if there are no annotations for this element.

https://github.com/sailro/Dexer/blob/master/Dexer/IO/DexReader.cs#L285-L286

    var size = reader.ReadUInt32();
    for (uint i = 0; i < size; i++)
    {
        var offset = reader.ReadUInt32();
        result.Add(ReadAnnotationSet(reader, offset));
    }

It should be:

    var size = reader.ReadUInt32();
    for (uint i = 0; i < size; i++)
    {
        var offset = reader.ReadUInt32();
        if (offset == 0)
            result.Add(new List<Annotation>(0)); // no annotations for this element
        else
            result.Add(ReadAnnotationSet(reader, offset));
    }
