sepinf-inc / iped

IPED Digital Forensic Tool. It is an open source software that can be used to process and analyze digital evidence, often seized at crime scenes by law enforcement or in a corporate investigation by private examiners.

License: Other

Java 90.63% JavaScript 5.04% Python 0.70% HTML 3.09% CSS 0.43% XSLT 0.03% Cypher 0.07%
forensic recovery digital-forensics

iped's Introduction

IPED Digital Forensic Tool

IPED is an open source software that can be used to process and analyze digital evidence, often seized at crime scenes by law enforcement or in a corporate investigation by private examiners.

History

IPED - Digital Evidence Processor and Indexer (translated from Portuguese) is a tool implemented in Java. It was created by digital forensic experts from the Brazilian Federal Police, who have developed it since 2012. Although it was always open source, its code was only officially published in 2019.

Since the beginning, the goal of the tool was efficient data processing and stability. Some key characteristics of the tool are:

  • Command line data processing for batch case creation
  • Multiplatform support, tested on Windows and Linux systems
  • Portable cases without installation, you can run them from removable drives
  • Integrated and intuitive analysis interface
  • High multithread performance and support for large cases: up to 400GB/h processing speed using modern hardware and 135 million items in a (multi) case as of 12/12/2019

Currently IPED uses the Sleuthkit Library only to decode disk images and file systems, so the same image formats are supported: RAW/DD, E01, ISO9660, AFF, VHD, VMDK. There is also support for EX01, VHDX, UDF(ISO), AD1 (AccessData) and UFDR (Cellebrite) formats.

If you are new to the tool, please refer to the Beginner's Start Guide.

Building

To build from source, you need Git, Maven and a Java 11 JDK with JavaFX (e.g. Liberica OpenJDK 11 Full JDK) installed. Set the JAVA_HOME environment variable to your Java 11 installation folder, then run:

git clone https://github.com/sepinf-inc/IPED.git
cd IPED
mvn clean install

It will generate a snapshot version of IPED in the target/release folder.

Attention: the default master branch is the development one and is unstable. If you want to build a stable version, check out one of the release tags after the clone step.

On Linux you must also build The Sleuthkit and additional dependencies. Please refer to the Linux Section.

Contributions are very welcome! Before contributing, please refer to Contributing.

Features

Some of IPED's many features are listed below:

  • Supported hashes: md5, sha-1, sha-256, sha-512 and edonkey. PhotoDNA is also available for law enforcement (please contact iped at pf dot gov dot br)
  • Supported hash sets: NIST NSRL, NIST CAID, ProjectVIC, Interpol ICSE, standard CSV format
  • Fast hash deduplication
  • Signature analysis
  • Categorization by file type and properties
  • Recursive container expansion of dozens of file formats
  • Embedded forensic/virtual disk expansion: supports split or single-segment DD, E01, EX01, VHD, VHDX, VMDK (differential VMDKs are also supported)
  • Image and video gallery for hundreds of formats
  • Georeferencing of GPS data, using Google Maps, Bing or OpenStreetMaps
  • Regex searches with optional script validation for credit cards, emails, urls, ip & mac addresses, money values, bitcoin, ethereum, monero, ripple wallets and more...
  • Embedded hex, unicode text, metadata and native viewers
  • File content and metadata indexing and fast searching, including unknown files and unallocated space
  • Efficient data carving engine (takes < 10% of processing time) that scans much more than unallocated space, with support for 40+ file formats, including videos, extensible by scripting
  • Optical Character Recognition powered by tesseract 5
  • Encryption detection for known formats and using entropy test
  • Processing profiles: forensic, pedo (csam), triage, fastmode (preview) and blind (for automatic data extraction)
  • Language detection for 70+ languages
  • Named Entity Recognition (needs Stanford CoreNLP models to be downloaded)
  • Customizable filters based on any file metadata
  • Similar document search with configurable threshold
  • Similar image search, using internal or external image
  • Similar face recognition, optimized to run without GPU, with configurable threshold
  • Unified table timeline view and event filtering for timeline analysis
  • Powerful file grouping (clustering) based on ANY metadata
  • Support for multicases up to 135 million items
  • Extensible with javascript and python (including cpython extensions) scripts
  • External command line tools integration for file decoding
  • Browser history for IE, Edge, Firefox, Chrome and Safari
  • Custom parsers for Emule, Shareaza, Ares, WhatsApp, Skype, Telegram, Bittorrent, ActivitiesCache, and more...
  • Fast nudity detection for images and videos using random forests algorithm (thanks to its author @tc-wleite)
  • Nudity detection using Yahoo open-nsfw deeplearning model (needs keras and tensorflow)
  • Audio Transcription, local and remote implementations with Azure and Google Cloud services
  • Graph analysis for communications (calls, emails, instant messages...)
  • Stable processing with out-of-process file system decoding and file parsing
  • Resuming or restarting stopped or aborted processing (--continue/--restart options)
  • Web API for searching remote cases, getting file metadata, raw content, decoded text and thumbnails, and posting bookmarks
  • Creation of bookmarks/tags for interesting data
  • HTML, CSV reports and portable cases with tagged data

Screenshots

Processing: image

Analysis: image

Data Carving & Video Thumbnails: image

Regex Results: image

Map: image

Communication links: image

Face search: image

Audio Transcription: image

Timeline: image

Time chart: image

iped's People

Contributors

aberenguel, arisjr, atilaromero, dependabot[bot], dhoelz, felipecampanini, felipecostadesousa, felipefcosta, filipesimoes, flates, fmpfeifer, fsicoli, gfd2020, hauck-jvsh, hugohmk, kraftdenker, leosol, lfcnassif, marcus6n, mbichara, mobab-th, patrickdalla, rodac5, ruisantana, streeg, thalespr, wladimirleite

iped's Issues

Release of the source code for contributions?

Dear Sir,

First of all, congratulations on the work. I would like to know whether there is interest in releasing the code so that other people can contribute.
There are some visualizations and analyses that could be added, for example applying graph theory to the contacts made, the messages found, etc. In addition, with the code it would be possible to integrate libraries that perform word vectorization to broaden the scope of the regex search. In short, there are several possible contributions. I am asking whether there is interest in opening the code so that I can then make these contributions.
Thank you very much,
Danilo

Skype v12 support

The initial implementation by Patrick Bernardina was pushed to the SkypeV12 branch. The implementation actually seems to support versions v13 and v14 as well.

Monitor timeout for all processing modules

Currently only problematic modules (parsing, image and video thumbs, OCR) have timeout control. But even simple modules (like signature) could hang, e.g. after dependency updates (this has happened in the past) or when dealing with specially crafted files. Buggy user modules could hang too.

So a general timeout control is desired.
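
A minimal sketch of what a general timeout wrapper could look like, assuming each module's work on one item can be expressed as a `Callable`. The class `ModuleTimeoutRunner` and its methods are hypothetical names for illustration, not IPED's API; only the standard `java.util.concurrent` package is used.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs any module task with an upper bound on its run time.
final class ModuleTimeoutRunner {
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "module-worker");
        t.setDaemon(true); // a hung task must not keep the JVM alive
        return t;
    });

    // Returns the task result, or Optional.empty() if the task timed out
    // (a task that returns null also yields Optional.empty()).
    <T> Optional<T> run(Callable<T> task, long timeoutMillis)
            throws ExecutionException, InterruptedException {
        Future<T> future = pool.submit(task);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread
            return Optional.empty();
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

Note that `cancel(true)` only interrupts the worker thread. Code stuck in a non-interruptible loop or native call keeps running, which is one reason out-of-process execution is used for the most fragile modules.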

Dependency on the image

Good morning.
After indexing the case, will I always need the image (for example, the E01) in order to view the files? In which configuration file can I set the path to the image, in case I move it to another location?

Incremental processing

Make it possible to run any processing module (hash, signature, container expansion, carving, OCR, indexing...) after the initial processing. This will also allow resuming a processing run that ended with errors.

The index will probably be split in two (metadata and text indexes), so the metadata index with processing flags can be updated easily and efficiently. This will break backward compatibility, so it is scheduled for v4.0.

Add an easier way to change the number of thumbnail columns in the gallery.

Add an alternative (easier) way to change the number of columns displayed in the thumbnail gallery.
Currently this option is available as an item in the application menu.
My suggestion is to use a nice feature called “Action” available in the DockingFrames library.
It allows associating controls (e.g. buttons) to a “dockable” element, in this case the “Gallery”. These controls are shown in its title bar, close to the already existing window controls.
This feature (adding buttons to the title bar) is used by several applications (like Eclipse).
Whenever the gallery is active, it would be possible to change the number of thumbnail columns directly.
In my tests, this feature works better with small icon buttons (without text), so it would use tooltips.
If this option works fine, in the future, the current way of changing the number of thumbnail columns could be removed from the menu, as it contains a large number of items already.
An example from the library manual: image (dockable-actions)

*.iped file

The manual, in the processing topic, states: "-d: miscellaneous data (can be used several times): folder, DD image, 001, E01, AFF (Linux only), ISO, physical disk, or *.iped file (containing a selection of items to export and reindex)". What is the format of a *.iped file? What kinds of items can this file contain? File names? Folder names? Thank you.

Implement a view and export of all findings

We need to be able to visualize and export all findings from all the files.

Simple use case would be to check if all findings are the same.
It is easy to find more complicated scenarios.

User interface to configure options and start processing

A lot of third parties have developed user interfaces to configure and start processing. We have heard about 7 of them, at least. So this is a needed feature, very important for non tech users. It will be needed when additional/post processing is implemented.

Parsers for phone artifacts integrating ALeapp/iLeapp

Currently we only have parsers for WhatsApp and Skype (edited: and Telegram). To decrease the dependency on other tools (UFDR reports), it is important to have parsers for calls, contacts, calendar, SMS/MMS, notes, locations, other instant messaging apps (Facebook, Telegram, Instagram, Twitter, Snapchat...) and custom email containers. Android and iOS will need different parsers. This ticket could be broken into smaller ones for each artifact.

Contributions are very welcome :)

Problem importing NSRL

Good afternoon lfcnassif,

I am having trouble importing the NSRL into IPED. In my case the import does not even start. This error appears:
"ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console."
Could you help me, please?
Thank you!

Problem importing NSRL

Good afternoon,

How do I use the NSRL database? I downloaded the files from their site, but they do not include .db files, as the configuration file suggests. I tried to import what is there, but it fails with an error. What is the correct procedure?

Optimize bookmarking of duplicated items

Currently an index search is done to find hash duplicates of items being bookmarked to include those duplicates together in the bookmark. It could be a lot faster, using the same approach used by the fast DynamicDuplicateFilter (Lucene DocValues) that filters duplicates on the fly.

Use classic rectangular tabs instead of curved ones, to save space.

Currently the dockable tabs waste a lot of horizontal space because of their "curved" aspect.
@lfcnassif , do you have a personal preference or any other reason to keep this particular feature of the current look (Docking Frames Eclipse theme)?!

In smaller screens, this can be annoying as the divider between the left and the central controls can't be moved too far to the left without hiding one of the tab titles (e.g. Categories / Evidences).

My suggestion is to change it to use classic rectangular tabs with reduced insets, as can be seen in the lower part of the picture: image (CurvedTabs)

Allow image rotation in the viewer

Sometimes it is useful to rotate images that contain rotated text (e.g. scanned documents with wrong orientation).
My suggestion is to add a small vertical toolbar to the image viewer, which would only be visible when the mouse is inside the viewer area, to avoid wasting space.
The rotation would be available with two buttons: rotate right, rotate left.
As there will be a toolbar, I suggest also adding two basic functions:

  1. Zooming buttons (in, out, and fit-to-window), as an alternative way to control zoom (other than existing mouse scroll button).
  2. A slider to control image brightness (useful for dark images).

Store all email headers as metadata

Currently only subject, date, from, to, cc and bcc are stored. It would be useful to index all header fields as individual metadata, so users could filter or sort by those fields. The EML, PST and OST parsers should be updated.
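
A minimal sketch of turning a raw header block into one metadata entry per header field. The class name `EmailHeaderMetadata` and the `email:header:` key prefix are assumptions made for illustration, not names taken from IPED; folded (continuation) lines are joined and repeated headers such as Received keep all their values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps every header field to a metadata key.
final class EmailHeaderMetadata {
    static final String PREFIX = "email:header:"; // assumed metadata namespace

    static Map<String, List<String>> parse(String rawHeaders) {
        Map<String, List<String>> meta = new LinkedHashMap<>();
        String name = null;
        StringBuilder value = new StringBuilder();
        for (String line : rawHeaders.split("\r?\n")) {
            if (line.isEmpty()) {
                break; // a blank line ends the header block
            }
            boolean folded = line.charAt(0) == ' ' || line.charAt(0) == '\t';
            if (folded && name != null) {
                value.append(' ').append(line.trim()); // continuation of the previous field
                continue;
            }
            flush(meta, name, value);
            int colon = line.indexOf(':');
            if (colon <= 0) {
                name = null; // malformed line, skip it
                continue;
            }
            name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            value.append(line.substring(colon + 1).trim());
        }
        flush(meta, name, value);
        return meta;
    }

    // Stores the pending field (if any) and resets the value buffer.
    private static void flush(Map<String, List<String>> meta, String name, StringBuilder value) {
        if (name != null) {
            meta.computeIfAbsent(PREFIX + name, k -> new ArrayList<>()).add(value.toString());
        }
        value.setLength(0);
    }
}
```

Header names are lower-cased so that `Subject` and `SUBJECT` end up in the same sortable column.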

Temporary sqlite file leak

Temporary SQLite files are usually left behind in the temporary folder. Sometimes they are opened in write mode and use WAL logs, so closing the database file handles does not remove the WAL logs and mmap files.
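
A minimal cleanup sketch, assuming the temporary database is no longer open anywhere. SQLite names its side files by appending `-wal`, `-shm` and `-journal` to the database file name; the class `TempSqliteCleaner` is a hypothetical helper, not part of IPED.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: removes a temporary SQLite database and its side files.
final class TempSqliteCleaner {
    // Suffixes SQLite appends to the database file name ("" is the database itself).
    private static final String[] SUFFIXES = { "", "-wal", "-shm", "-journal" };

    // Returns how many files were actually deleted.
    static int deleteWithSidecars(Path db) throws IOException {
        int deleted = 0;
        for (String suffix : SUFFIXES) {
            Path candidate = db.resolveSibling(db.getFileName() + suffix);
            if (Files.deleteIfExists(candidate)) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

On Windows the delete can still fail while a memory-mapped `-shm` file is held open, so this would have to run after every connection to the database is closed.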

Export to CSV sometimes generates malformed CSV

This seems to have started after automatic column management was turned on by default. Since export to CSV includes all visible columns, some custom columns, like image:comments, occasionally contain invalid characters, and that causes the issue.
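
A minimal sketch of field escaping in the style of RFC 4180, assuming the malformed output comes from commas, quotes or line breaks inside column values. `CsvEscaper` is a hypothetical helper, not the exporter used by IPED.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: quotes CSV fields that contain special characters.
final class CsvEscaper {
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Double every embedded quote, then wrap the whole field in quotes.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    static String row(List<String> fields) {
        return fields.stream().map(CsvEscaper::escape).collect(Collectors.joining(","));
    }
}
```

For example, `CsvEscaper.row(List.of("a", "b,c"))` yields `a,"b,c"`, which a CSV reader parses back into two fields.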

Index (and store?) files into ElasticSearch cluster

An ElasticSearch cluster has many advantages over local Lucene indexes: a well-defined and documented remote access API, scalability, replication, load balancing... It would also allow developing a web analysis UI that is fully decoupled from, and independent of, the processing engine.

Drawback is those cases will not be portable. Probably local indexes will need to continue being supported, at least for reports.

Refactor internationalization

Currently there are profiles for each language and index fields change with language. This is bad and must be changed. Will break back compatibility, so will be done for 4.0 version.

Store small extracted files into container format

Currently iped generates a lot of small files (thumbnails, container subitems and ocr text) into case folder. That makes copying or deleting case very slow. Storing small files into a container would be better. Big files (>50MB?) will remain extracted into case folder because some modules need an actual file to process, otherwise processing or UI will be slower waiting for big temp files to be extracted from the new container format.

SQLite is a storage option, but performance needs to be evaluated because only one thread can write at a time into the database.

Organize evidence folders

Is there a way to organize the evidence into folders? For example, let's say I have data from two custodians, and each has a forensic image of a laptop in E01 format and a Cellebrite forensic report in XML format.

Can I arrange this evidence by custodian folder, like the example below?

Evidences
  Custodian1_Folder
    Laptop.e01
    SmartphoneCellebrite
  Custodian2_Folder
    Laptop.e01
    SmartphoneCellebrite

Make IPED Viewer DPI-aware

When IPED search app is used in a high-resolution monitor (e.g. 4K), fonts and most of the controls look too small.
@lfcnassif , has anyone ever complained about this?

There are a few workarounds (at least in Windows 10), but all of them will need to scale up the window content, losing quality.
Currently there is the "CTRL+" / "CTRL-" to increase font sizes, but it doesn't work very well (and is "hidden").
The ideal solution, in my opinion, would be to review all user interface related code (iped-viewer?) to be "resolution aware", i.e. to know which is the current resolution of the monitor.
In practice, a single class would take care of the calculations to scale, but existing code would need to be reviewed to use that class.
Once the application is converted to be resolution aware, it would be possible to add a local parameter to allow scaling (e.g. 120%, if the user wants larger fonts/controls).
This is how it looks now on a 4K monitor: image (4K)
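
A minimal sketch of the single scaling class described above. It assumes a 96 DPI baseline and takes the screen DPI as a constructor argument; in a desktop application that value could come from `java.awt.Toolkit.getDefaultToolkit().getScreenResolution()`. The class name `UiScale` and the user scale parameter are hypothetical.

```java
// Hypothetical helper: one place that converts base sizes to scaled sizes.
final class UiScale {
    private static final int BASE_DPI = 96; // assumed baseline resolution
    private final double factor;

    // userScale is the optional local parameter, e.g. 1.2 for 120%.
    UiScale(int screenDpi, double userScale) {
        this.factor = (screenDpi / (double) BASE_DPI) * userScale;
    }

    // Scales a pixel dimension (icon size, row height, inset...).
    int px(int basePixels) {
        return (int) Math.round(basePixels * factor);
    }

    // Scales a font size in points.
    float font(float basePoints) {
        return (float) (basePoints * factor);
    }
}
```

Existing UI code would then call `scale.px(16)` instead of hard-coding `16`, which is the review effort the issue anticipates.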

Filter Manager

Create a central filter manager dialog or tab that lists all applied or recent filters, so users can enable or disable specific filters, the last one, or all of them. Currently the user needs to go to each tab to disable its filter, which is annoying.
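
A minimal sketch of the bookkeeping such a manager would need, assuming each filter can be identified by a name. `FilterRegistry` is a hypothetical class for illustration; it tracks the order in which filters were applied and whether each one is enabled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model behind a filter manager dialog.
final class FilterRegistry {
    // Filter name -> enabled flag, kept in application order.
    private final Map<String, Boolean> filters = new LinkedHashMap<>();

    void apply(String name) {
        filters.remove(name); // re-applying moves the filter to the end
        filters.put(name, Boolean.TRUE);
    }

    void setEnabled(String name, boolean enabled) {
        if (filters.containsKey(name)) {
            filters.put(name, enabled);
        }
    }

    // Disables the most recently applied filter that is still enabled.
    void disableLast() {
        String last = null;
        for (Map.Entry<String, Boolean> e : filters.entrySet()) {
            if (e.getValue()) {
                last = e.getKey();
            }
        }
        if (last != null) {
            filters.put(last, Boolean.FALSE);
        }
    }

    void disableAll() {
        filters.replaceAll((name, enabled) -> Boolean.FALSE);
    }

    List<String> enabledFilters() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : filters.entrySet()) {
            if (e.getValue()) {
                out.add(e.getKey());
            }
        }
        return out;
    }
}
```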

Problem with -ocr option

I am trying to use -ocr but cannot get it to work. Could you show me the syntax? I am running it as below, but the result is not being indexed:
java -jar iped.jar -d S:\teste2\IPED\marcados\03-01.iped -ocr Documentos -o S:\teste2\IPED\teste-03-01.

Another question: is there a way to apply -ocr only to some bookmarked items? Something like ... [-ocr marcador1] ?

Thank you.

Graph/link analysis

Show graphs of communications (calls, emails, messages) or relations between entities (emails, phones, accounts, contacts, person ids). Initial implementation done by @filipesimoes. Will be pushed after dependency problems are solved.

Currently both entities and items (files, emails, messages) are represented as nodes. Probably the model will be refactored, so nodes will represent only entities and edges will represent case items (communications and files).

Continue aborted processing

I got an idea while working on #26. I think this does not need #24 to be implemented first. The current obstacle to recovering a processing run is that item IDs change between runs because of multithreading.

But we can create a persistent ID that is stable between runs, like (path + sleuthkitID + subitemID), and do partial commits into the index. If processing is aborted or crashes, we can load those persistent IDs from the index and skip already committed items. There are some details to work out, but I think it will work.
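
A minimal sketch of the idea, assuming the three components are available for every item. Hashing them with SHA-256 gives a fixed-length key that does not depend on thread scheduling; the class names, the separator character and the in-memory set standing in for the index are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: derives an ID that is identical across processing runs.
final class PersistentItemId {
    static String of(String path, long sleuthkitId, int subitemId) {
        // A NUL separator avoids collisions such as ("a1", 2) vs ("a", 12).
        String key = path + "\u0000" + sleuthkitId + "\u0000" + subitemId;
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is mandatory on every Java platform
        }
    }
}

// Hypothetical helper: decides which items can be skipped when resuming.
final class ResumeTracker {
    private final Set<String> committed = new HashSet<>();

    // idsLoadedFromIndex stands in for the IDs read back from the partial commits.
    ResumeTracker(Set<String> idsLoadedFromIndex) {
        committed.addAll(idsLoadedFromIndex);
    }

    boolean shouldProcess(String persistentId) {
        return !committed.contains(persistentId);
    }

    void markCommitted(String persistentId) {
        committed.add(persistentId);
    }
}
```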

How to get IPED?

Dear Nassif, I am a forensic expert at TJRJ and would like to obtain the software. Forgive my ignorance, but I could not find it here on the page. How do I download it?

Problem with 3gp exporting

Good afternoon.
I successfully processed a case in IPED. I configured it to export all categories except:
#Peer-to-peer
#Arquivos OLE (OLE files)
#Registro do Windows (Windows Registry)
#Programas e Bibliotecas (Programs and Libraries)
#Tamanho Zero (Zero Size)
In IPED, I selected the 3gp files, all larger than 0 bytes and not deleted, and exported them to a folder on my computer. All of them were copied with size 0, even though, as I verified in IPED's own listing and on the original media, they were not deleted and do not have size 0, since I viewed them with EnCase. Could there have been a problem when IPED imported the files?

Optimize PST decoding

While working on #62, I noticed that java-libpst was wasting a lot of time in PSTNodeInputStream.seek(long), iterating over a list of skipPoints to find the correct position. Using a Set should be much faster.
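
A minimal sketch of the lookup using a sorted structure, where finding the closest skip point at or before a position takes logarithmic time instead of a linear scan. It assumes each skip point maps a logical stream offset to a file offset; `SkipPointIndex` is a hypothetical class, not the java-libpst code, and a `TreeMap` is used here in place of the Set mentioned in the issue.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical index: logical stream offset -> file offset of a skip point.
final class SkipPointIndex {
    private final TreeMap<Long, Long> points = new TreeMap<>();

    void add(long logicalOffset, long fileOffset) {
        points.put(logicalOffset, fileOffset);
    }

    // Finds the nearest skip point at or before the target in O(log n).
    long fileOffsetFor(long logicalOffset) {
        Map.Entry<Long, Long> entry = points.floorEntry(logicalOffset);
        if (entry == null) {
            throw new IllegalArgumentException("offset before first skip point: " + logicalOffset);
        }
        return entry.getValue() + (logicalOffset - entry.getKey());
    }
}
```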

Records from sqlites in wal mode may be missed

If SQLite is in WAL mode, the latest writes go to a separate WAL file. This file is usually merged and deleted when the last connection to the database is closed. But if the application terminates suddenly (computer power cable unplugged or phone battery removed), the WAL log will retain the last records written. Currently WAL logs are neither checked nor processed by the SQLite parsers.
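
A minimal sketch of a first step: detecting that a database has a WAL file worth examining. It relies on the documented SQLite layout, where the WAL file is named after the database with a `-wal` suffix and starts with a 32-byte header whose first four bytes are the big-endian magic number 0x377f0682 or 0x377f0683. `WalDetector` is a hypothetical helper and does not parse any frames.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: flags databases whose WAL file may hold unmerged records.
final class WalDetector {
    // The two valid magic values; the low bit selects the checksum byte order.
    private static final int WAL_MAGIC_A = 0x377f0682;
    private static final int WAL_MAGIC_B = 0x377f0683;
    private static final int WAL_HEADER_SIZE = 32;

    static boolean hasPendingWal(Path db) throws IOException {
        Path wal = db.resolveSibling(db.getFileName() + "-wal");
        // A WAL with only a header (or less) contains no frames to recover.
        if (!Files.isRegularFile(wal) || Files.size(wal) <= WAL_HEADER_SIZE) {
            return false;
        }
        try (DataInputStream in = new DataInputStream(Files.newInputStream(wal))) {
            int magic = in.readInt(); // DataInputStream reads big-endian
            return magic == WAL_MAGIC_A || magic == WAL_MAGIC_B;
        }
    }
}
```

A parser could use this check to decide when to process the WAL as well, or at least to flag the item so the examiner knows records may be missing.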