certtools / intelmq

IntelMQ is a solution for IT security teams for collecting and processing security feeds using a message queuing protocol.

Home Page: https://docs.intelmq.org/latest/

License: GNU Affero General Public License v3.0

Python 98.57% Shell 0.27% HTML 0.61% Makefile 0.04% Sieve 0.45% PLpgSQL 0.04% Jinja 0.02%
cybersecurity threat ioc malware phishing cert csirt intelligence incident-response alerts

intelmq's People

Contributors

aaronkaplan, bernhardreiter, cncs-pt, creideiki, dargen3, e3rd, elsif2, gethvi, gsiv, jgedeon120, kamil-certat, mauroasilva, monoidic, navtej, pedromreis, phantasus, pharook, rafiot, robcza, sebix, sebkuf, sinus-x, stone-z, swilde, synchroack, th-certbund, tomas321, tux78, wagner-intevation, waldbauer-certat

intelmq's Issues

Pipeline RabbitMQ Improvements

Aaron: we need to have a proper producer -> exchange -> queues system here
Tomás Lima: explain a little bit more
Tomás Lima: code is here -> https://github.com/certtools/intelmq/blob/master/intelmq/lib/pipeline.py#L29
Tomás Lima: connect method
Aaron: line 37
Aaron: you generate an exchange for every pipeline
Aaron: and destination queue
Aaron: instead:
Aaron: do *one* pipeline for all but use keys 
Aaron: like in the first demo tutorial example
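
The idea in the chat above (one exchange for all bots, with routing keys selecting the destination queue) can be illustrated with a small in-memory model. This is not pika code and does not talk to RabbitMQ; `DirectExchange`, `bind`, `publish`, and the queue names are hypothetical and only show how routing by key would work.

```python
from collections import defaultdict, deque


class DirectExchange:
    """In-memory model of a direct exchange: one exchange, many queues."""

    def __init__(self):
        self.queues = {}                  # queue name -> deque of messages
        self.bindings = defaultdict(set)  # routing key -> bound queue names

    def bind(self, queue_name, routing_key):
        # Create the queue on first use and bind it to the routing key.
        self.queues.setdefault(queue_name, deque())
        self.bindings[routing_key].add(queue_name)

    def publish(self, routing_key, body):
        # Deliver the message to every queue bound with this exact key.
        for queue_name in self.bindings.get(routing_key, ()):
            self.queues[queue_name].append(body)


exchange = DirectExchange()
exchange.bind("cymru-expert-queue", "cymru-expert")
exchange.bind("geoip-expert-queue", "geoip-expert")
exchange.publish("cymru-expert", "event-1")
```

Only the queue bound with the key `cymru-expert` receives the message; the other queue stays empty.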

Input Bots (feed, parser, harmonizer) - Changing architecture

To ease administration, maybe we should merge the 'parser bot' with the 'harmonizer bot'.

EDITED: this will be re-evaluated in release 2, because we need more input bots to see whether merging makes sense. The shadowserver bot may be a good example to show that we really need 'feed', 'parser' and 'harmonizer'.

intelmqhelper

A tool that will help the sysadmin run and configure bots. It will use the INFO file from the bots folder.

Encoding Errors

Some data sources have encoding problems, for example the DragonResearchGroup SSH feed (http://dragonresearchgroup.org/insight/sshpwauth.txt).

Check the last line (incomplete encoding):

262517       |  NIQTURBO PIMENTEL E MOREIRA LT  |      177.67.99.9  |  2014-07-31 08:36:08  |  sshpwauth
262594       |  4INET Comercio e Serviços Ltd  |    177.84.241.66  |  2014-07-31 08:39:00  |  sshpwauth
262621       |  LIMA & VICENTE Diversões Elet  |    177.86.32.206  |  2014-07-31 08:43:50  |  sshpwauth
262631       |  ILIG TELECOM LTDA.,BR           |   177.86.145.182  |  2014-07-31 08:37:53  |  sshpwauth
262770       |  Heliodora Online Ltda,BR        |     186.232.71.3  |  2014-07-31 08:42:05  |  sshpwauth
262865       |  (DELTA TELECOM INTERNET) Anjos  |    177.36.81.204  |  2014-07-31 08:37:47  |  sshpwauth
262869       |  G1Telecom Provedor de Internet  |     177.11.19.17  |  2014-07-27 01:27:27  |  sshpwauth
263039       |  Interminas - Provedor de Servi  |     177.23.74.32  |  2014-07-31 08:36:14  |  sshpwauth
263426       |  F.L Networks ltda,BR            |    177.91.42.153  |  2014-07-31 08:36:22  |  sshpwauth
263548       |  portal provedor de comunicaçà |     191.6.73.216  |  2014-07-31 08:38:47  |  sshpwauth

I need some feedback from you.
I'm thinking of adding a new parameter to the decode util that forces decoding as UTF-8 and ignores these problems:

Code:

def decode(text, encodings=("utf-8", "ascii"), force=False):
    # First pass: strict decoding with each encoding in turn.
    for encoding in encodings:
        try:
            return text.decode(encoding)
        except ValueError:
            pass

    # Second pass (only if forced): drop the bytes that cannot be decoded.
    if force:
        for encoding in encodings:
            try:
                return text.decode(encoding, 'ignore')
            except ValueError:
                pass

    raise Exception("Found a problem when decoding.")
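
To make the failure concrete, here is a small sketch of what happens with a line that is cut in the middle of a multi-byte character, as in the last feed line above. The byte string is an assumed reconstruction of the end of that line, not data taken from the feed.

```python
# "comunicaç" followed by the first byte of a two-byte UTF-8 character
# (the second byte was cut off by the fixed-width column).
raw = b"comunica\xc3\xa7\xc3"

try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

ignored = raw.decode("utf-8", "ignore")    # silently drops the dangling byte
replaced = raw.decode("utf-8", "replace")  # marks it with U+FFFD instead
```

Using 'replace' instead of 'ignore' keeps a visible marker that something was lost, which may be preferable when debugging feeds.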

CymruLib Problem

if item == "-" or item == "":
    result.append(None)
else:
    result.append(item)
2014-07-03 15:27:32,807 - cymru-expert - ERROR - Traceback (most recent call last):
  File "lib/bot.py", line 41, in start
    self.process()
  File "/opt/intelmq/src/bots/experts/cymru/cymru.py", line 50, in process
    query_result = Cymru.query(ip, ip_version)
  File "bots/experts/cymru/cymrulib.py", line 20, in query
    return " | ".join([ asn, bgp, cc, registry, allocated, as_name ])
TypeError: sequence item 4: expected string, NoneType found
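
The traceback shows `" | ".join(...)` failing because one of the fields became None. A minimal sketch of one way around it, assuming a placeholder string is acceptable in the joined result; `format_fields` is a hypothetical helper, not part of cymrulib.

```python
def format_fields(fields, placeholder="-"):
    """Join fields with ' | ', substituting a placeholder for None values."""
    return " | ".join(placeholder if item is None else str(item) for item in fields)


line = format_fields(["5617", "83.0.0.0/13", "PL", "ripencc", None, "TPNET"])
```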

Feature - Domain 2 IP

Since some feeds report only a domain, the cymru expert and others can't add any information because the event doesn't have an IP. This feature/method should implement a domain lookup that returns an IP.

File: lib/utils.py
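
A possible shape for such a helper, as a sketch only. The name `domain_to_ip` and the injectable `resolver` argument are assumptions; the default uses the standard library's `socket.gethostbyname`, which returns a single IPv4 address.

```python
import socket


def domain_to_ip(domain, resolver=socket.gethostbyname):
    """Return an IPv4 address for the domain, or None if resolution fails."""
    try:
        return resolver(domain)
    except (socket.gaierror, UnicodeError):
        # gaierror: the name did not resolve; UnicodeError: invalid host label.
        return None
```

Returning None instead of raising lets an expert skip events whose domain does not resolve.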

RabbitMQ Connection Timeout ?

After 1 hour (processing_interval is 3600 seconds), it seems that RabbitMQ closed the connection and the bot crashed. Attempting to reconnect before sending should solve the problem.

2014-07-04 18:33:35,934 - malwaredomainlist-feed - INFO - Bot is starting
2014-07-04 18:33:35,941 - malwaredomainlist-feed - DEBUG - Loading configuration in default section from 'conf/bots.conf' file
2014-07-04 18:33:35,942 - malwaredomainlist-feed - DEBUG - Parameter 'processing_interval' loaded with the value '0'
2014-07-04 18:33:35,942 - malwaredomainlist-feed - DEBUG - Parameter 'cache_host' loaded with the value '127.0.0.1'
2014-07-04 18:33:35,942 - malwaredomainlist-feed - DEBUG - Parameter 'cache_port' loaded with the value '6379'
2014-07-04 18:33:35,943 - malwaredomainlist-feed - DEBUG - Parameter 'cache_id' loaded with the value '10'
2014-07-04 18:33:35,943 - malwaredomainlist-feed - DEBUG - Parameter 'cache_ttl' loaded with the value '86400'
2014-07-04 18:33:35,944 - malwaredomainlist-feed - DEBUG - Loading configuration in malwaredomainlist-feed section from 'conf/bots.conf' file
2014-07-04 18:33:35,944 - malwaredomainlist-feed - DEBUG - Parameter 'processing_interval' loaded with the value '3600'
2014-07-04 18:33:35,945 - malwaredomainlist-feed - DEBUG - Loading pipeline queues from 'conf/pipeline.conf' file
2014-07-04 18:33:35,946 - malwaredomainlist-feed - INFO - Source queue 'None'
2014-07-04 18:33:35,946 - malwaredomainlist-feed - INFO - Destination queue(s) '['malwaredomainlist-parser-queue']'
2014-07-04 18:33:35,948 - malwaredomainlist-feed - DEBUG - Connecting to pipeline queues
2014-07-04 18:33:36,015 - malwaredomainlist-feed - INFO - Bot start processing


2014-07-04 19:33:37,789 - malwaredomainlist-feed - ERROR - Traceback (most recent call last):
  File "lib/bot.py", line 41, in start
    self.process()
  File "/opt/intelmq/src/bots/inputs/malwaredomainlist/feed.py", line 13, in process
    self.send_message(report)
  File "lib/bot.py", line 123, in send_message
    self.pipeline.send(message)
  File "lib/pipeline.py", line 33, in send
    self.destination_channel.basic_publish(exchange=self.destination_exchange, routing_key='', body=unicode(message))
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pika/adapters/blocking_connection.py", line 540, in basic_publish
    (properties, body), False)
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pika/adapters/blocking_connection.py", line 1121, in _send_method
    self.connection.send_method(self.channel_number, method_frame, content)
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pika/adapters/blocking_connection.py", line 249, in send_method
    self._send_method(channel_number, method_frame, content)
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pika/connection.py", line 1489, in _send_method
    self._send_frame(frame.Method(channel_number, method_frame))
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pika/adapters/blocking_connection.py", line 388, in _send_frame
    super(BlockingConnection, self)._send_frame(frame_value)
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pika/connection.py", line 1476, in _send_frame
    self._flush_outbound()
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pika/adapters/blocking_connection.py", line 348, in _flush_outbound
    if self._handle_write():
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 338, in _handle_write
    return self._handle_error(error)
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 279, in _handle_error
    self.socket.fileno(), error_code)
  File "/opt/rh/python27/root/usr/lib64/python2.7/logging/__init__.py", line 1175, in error
    self._log(ERROR, msg, args, **kwargs)
  File "/opt/rh/python27/root/usr/lib64/python2.7/logging/__init__.py", line 1268, in _log
    self.handle(record)
  File "/opt/rh/python27/root/usr/lib64/python2.7/logging/__init__.py", line 1278, in handle
    self.callHandlers(record)
  File "/opt/rh/python27/root/usr/lib64/python2.7/logging/__init__.py", line 1325, in callHandlers
    " \"%s\"\n" % self.name)
IOError: [Errno 5] Input/output error

2014-07-04 19:33:37,789 - malwaredomainlist-feed - ERROR - Bot found an error. Exiting
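
A sketch of the reconnect idea, assuming the pipeline object exposes `connect()` and `send()` as in lib/pipeline.py. `send_with_reconnect` is a hypothetical helper, and a real implementation would probably also need to catch pika's own connection exceptions.

```python
import time


def send_with_reconnect(pipeline, message, retries=3, delay=0.0):
    """Send a message; on an I/O error, reconnect and retry.

    Returns the number of reconnects that were needed.
    """
    for attempt in range(retries + 1):
        try:
            pipeline.send(message)
            return attempt
        except (IOError, OSError):
            if attempt == retries:
                raise  # give up after the last retry
            time.sleep(delay)
            pipeline.connect()
```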

Observation Time

'observation time' field:

  • should be a mandatory field
  • should be defined as soon as possible (parser bot may do this)

We need to think about how to do this without adding extra detail throughout the system, which would create more complexity for the user.

Accepting suggestions :)

Pipeline Layer - Improvements

Problem: it is important for the system to be independent of the message queue system. It should be possible to run the system with ZeroMQ or RabbitMQ just by changing pipeline.py. With this in mind, the system should also support different types of messages.

Short Example:

import pika
from message import from_string, to_string

class RabbitMQPipeline():
    '''
        This class does not support object messages.
        It always converts messages to strings.
    '''

    def send(self, message_obj):
        message_str = to_string(message_obj)
        # Placeholder call: stands in for the real pika publish logic.
        pika.sendpubish(message_str)

    def receive(self):
        # Placeholder call: stands in for the real pika consume logic.
        message_str = pika.recvpubish()
        return from_string(message_str)

import zeromq
from message import from_string, to_string

class ZeroMQPipeline():
    '''
        This class supports object messages.
        It always uses object messages.
    '''

    def send(self, message_obj):
        # Placeholder call: stands in for the real ZeroMQ send logic.
        zeromq.sendpubish(message_obj)

    def receive(self):
        # Placeholder call: stands in for the real ZeroMQ receive logic.
        return zeromq.recvpubish()
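
The same idea as a runnable sketch, with in-memory queues standing in for the real brokers. All class names here are hypothetical, and JSON is only an assumed stand-in for the `to_string` / `from_string` helpers.

```python
import json
from collections import deque


class Pipeline:
    """Interface that every message queue backend would implement."""

    def send(self, message_obj):
        raise NotImplementedError

    def receive(self):
        raise NotImplementedError


class StringPipeline(Pipeline):
    """Backend that can only carry strings, so it serializes on both ends."""

    def __init__(self):
        self._queue = deque()

    def send(self, message_obj):
        self._queue.append(json.dumps(message_obj))

    def receive(self):
        return json.loads(self._queue.popleft())


class ObjectPipeline(Pipeline):
    """Backend that carries objects as they are."""

    def __init__(self):
        self._queue = deque()

    def send(self, message_obj):
        self._queue.append(message_obj)

    def receive(self):
        return self._queue.popleft()


def relay(source, destination):
    """Bot-side code: works the same way with any backend."""
    destination.send(source.receive())
```

Because bots only call `send()` and `receive()`, swapping the backend would not require changes in bot code.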

Log all fetch url

Always log every download, check whether something went wrong, and raise an exception if it did.
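
A minimal sketch of what this could look like with the standard library only. `fetch_url` and its `opener` parameter are assumptions for illustration; the existing bots may fetch URLs differently.

```python
import logging
import urllib.request


def fetch_url(url, logger, opener=urllib.request.urlopen):
    """Download a URL, logging the attempt and the outcome; re-raise on failure."""
    logger.info("Fetching %s", url)
    try:
        with opener(url) as response:
            body = response.read()
    except Exception:
        # Log the full traceback, then let the caller handle the error.
        logger.exception("Failed to fetch %s", url)
        raise
    logger.info("Fetched %s (%d bytes)", url, len(body))
    return body
```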

Cymru Expert Configurations

Updated:

The sysadmin should be able to configure an expert to query only specific events that do not contain confidential information (private & confidential feeds).

Package geoip2 - speedups not enabled

Steps:

# cat /etc/*release*
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=12.04
DISTRIB_CODENAME=precise
DISTRIB_DESCRIPTION="Ubuntu 12.04.4 LTS"
NAME="Ubuntu"
VERSION="12.04.4 LTS, Precise Pangolin"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu precise (12.04.4 LTS)"
VERSION_ID="12.04"
# git clone https://github.com/certtools/intelmq.git /tmp/intelmq
# cd /tmp/intelmq
# python setup.py install

Error:

Searching for maxminddb>=0.3.2
Reading http://pypi.python.org/simple/maxminddb/
Best match: maxminddb 0.3.3
Downloading https://pypi.python.org/packages/source/m/maxminddb/maxminddb-0.3.3.tar.gz#md5=36e1ca4e46e3220aa6a2aee1b58d9d88
Processing maxminddb-0.3.3.tar.gz
Running maxminddb-0.3.3/setup.py -q bdist_egg --dist-dir /tmp/easy_install-3Oz029/maxminddb-0.3.3/egg-dist-tmp-UWjNuu
warning: no files found matching 'requirements.txt'
unable to execute gcc: No such file or directory
***************************************************************************
command 'gcc' failed with exit status 1
WARNING: The C extension could not be compiled, speedups are not enabled.
Failure information, if any, is above.
Retrying the build without the C extension now.
***************************************************************************
warning: no files found matching 'requirements.txt'
zip_safe flag not set; analyzing archive contents...
***************************************************************************
WARNING: The C extension could not be compiled, speedups are not enabled.
Plain-Python build succeeded.
***************************************************************************
Adding maxminddb 0.3.3 to easy-install.pth file

Installed /usr/local/lib/python2.7/dist-packages/maxminddb-0.3.3-py2.7.egg

PostgreSQL Encoding Error

2014-08-01 19:45:52,379 - postgresql - ERROR - Current Message(event): 'feed=dragonresearchgroup, city=Gda\xc5\x84sk, reported_ip=83.3.130.101, feed_url=http://dragonresearchgroup.org/insight/vncprobe.txt, source_time=2014-07-28T02:48:02, cc=PL, ip=83.3.130.101, observation_time=2014-08-01T15:25:10.499358, source_ip=83.3.130.101, reported_asn=5617, as_name="TPNET Orange Polska Spolka Akcyjna,PL", bgp_prefix=83.0.0.0/13, longitude=18.6583, reported_as_name="TPNET Orange Polska Spolka Akc", taxonomy="Intrusion Attempts", latitude=54.3608, allocated=2003-12-03, cymru_cc=PL, type=brute-force, asn=5617, registry=ripencc'

2014-08-01 19:45:52,379 - postgresql - ERROR - Check the following exception:
Traceback (most recent call last):
  File "intelmq/lib/bot.py", line 51, in start
    self.process()
  File "/root/intelmq/intelmq/bots/outputs/postgresql/postgresql.py", line 35, in process
    self.cur.execute(query, values)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0144' in position 3: ordinal not in range(128)

2014-08-01 19:45:52,379 - postgresql - ERROR - Pipeline connection failed (UnicodeEncodeError('ascii', u'Gda\u0144sk', 3, 4, 'ordinal not in range(128)'))

2014-08-01 19:45:52,379 - postgresql - INFO - Pipeline will reconnect in 30 seconds
postgres@intelmq:~$ psql -l
                             List of databases
   Name    |  Owner   | Encoding  | Collate | Ctype |   Access privileges   
-----------+----------+-----------+---------+-------+-----------------------
 intelmq   | intelmq  | SQL_ASCII | C       | C     | 
 postgres  | postgres | SQL_ASCII | C       | C     | 

FIX

postgres@intelmq:~$ psql 
psql (9.1.13)
Type "help" for help.

postgres=# update pg_database set encoding = pg_char_to_encoding('UTF8') where datname = 'intelmq';
postgres@intelmq:~$ psql -l
                             List of databases
   Name    |  Owner   | Encoding  | Collate | Ctype |   Access privileges   
-----------+----------+-----------+---------+-------+-----------------------
 intelmq   | intelmq  | UTF8      | C       | C     | 
 postgres  | postgres | SQL_ASCII | C       | C     | 
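
The error in the log can be reproduced in isolation. This small sketch shows why the 'ascii' codec fails at position 3 and that UTF-8 encoding yields the bytes seen in the logged event.

```python
city = u"Gda\u0144sk"

try:
    city.encode("ascii")
    error_position = None
except UnicodeEncodeError as exc:
    error_position = exc.start  # index of the first character that cannot be encoded

utf8_bytes = city.encode("utf-8")
```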

Defaults config

{
    "module": "intelmq.bots.collectors.url.collector",
    "description": "Arbor Collector is the bot responsible to get the report from source of information.",
    "parameters": {
        "url": {
            "required": True,
            "default": None
        },
        "processing_interval": {
            "required": True,
            "default": "0"
        }
    }
}

{
    "module": "intelmq.bots.collectors.url.collector",
    "description": "Arbor Collector is the bot responsible to get the report from source of information.",
    "parameters": {
        "url": {
            "required": True,
            "default": None
        }
    }
}
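
One way such a definition could be used, as a sketch: merge the configured values over the defaults and fail early when a required parameter has no value. `resolve_parameters` and the example URL are hypothetical.

```python
def resolve_parameters(spec, configured):
    """Merge configured values over defaults; fail if a required value is missing."""
    resolved = {}
    for name, rules in spec.items():
        value = configured.get(name, rules.get("default"))
        if rules.get("required") and value is None:
            raise ValueError("Missing required parameter: %s" % name)
        resolved[name] = value
    return resolved


spec = {
    "url": {"required": True, "default": None},
    "processing_interval": {"required": True, "default": "0"},
}
params = resolve_parameters(spec, {"url": "http://example.com/feed.txt"})
```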

Remove unnecessary imports

lib/bot.py example:

from lib.pipeline import *
from lib.utils import *
from lib.cache import *
from lib.event import *

Sanitize Arch

I think /lib/ should have a sanitize library (utils), and each input bot should use it to sanitize all information. Each bot still needs its own specific sanitizer because of the details of its input data.

Parameters Values logged

Add a note to the User Guide that debug log mode writes all passwords from the bots.conf configuration to the log file.

2014-07-04 19:01:25,275 - abusehelperbot - DEBUG - Parameter 'password' loaded with the value xxxxxxxxxxxxx
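
Besides the documentation note, the value could be hidden before it reaches the log. This is only a sketch of that option; `loggable_value` and the list of sensitive name fragments are assumptions.

```python
SENSITIVE_NAMES = ("password", "secret", "token", "api_key")


def loggable_value(name, value):
    """Return the value to write to the log, hiding it for sensitive parameters."""
    if any(marker in name.lower() for marker in SENSITIVE_NAMES):
        return "<hidden>"
    return value
```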

Messages counter - bot

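def events_counter(self):
    num = 50  # log once every 50 messages

    # Use a single leading underscore: inside a class body, a name like
    # self.__count is mangled, so hasattr(self, '__count') would never
    # find it and the counter would be reset on every call.
    if not hasattr(self, '_count'):
        self._count = 0

    self._count += 1
    if (self._count % num) == 0:
        self.logger.info("Processed %s messages." % self._count)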

Controller Tool

Commands List

Run Bots (one by one)

# python controller.py --start --bot=experts.cymru.cymru --id=cymru-expert
# python controller.py --stop --id=cymru-expert
# python controller.py --restart --id=cymru-expert
# python controller.py --reload --id=cymru-expert
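
A sketch of how these options could be parsed with `argparse`. The function name `build_parser` and the destination names are assumptions; only the flags themselves come from the command list above.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="controller.py")
    # Exactly one of --start / --stop / --restart / --reload must be given.
    action = parser.add_mutually_exclusive_group(required=True)
    for name in ("start", "stop", "restart", "reload"):
        action.add_argument("--" + name, dest="action",
                            action="store_const", const=name)
    parser.add_argument("--bot", help="bot module, e.g. experts.cymru.cymru")
    parser.add_argument("--id", dest="bot_id", required=True, help="bot id")
    return parser


args = build_parser().parse_args(
    ["--start", "--bot=experts.cymru.cymru", "--id=cymru-expert"])
```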

List and Modify Pipeline

# python controller.py --list-pipeline
    [1] < bot_id > = < source queue > | < destination queues >
    [2] < bot_id > = < source queue > | < destination queues >
    [3] < bot_id > = < source queue > | < destination queues >

# python controller.py --edit-pipeline=2
    Bot: experts.geoip.geoip
    Bot ID: geoip-expert

    Source Queue [Current: taxonomy-queue]: 
    Destination Queue [Current: cymru-queue]:

Run Botnet

# python controller.py --start-botnet

    [1] ID: geoip-expert
    [2] ID: arbor-feed
    [3] ID: arbor-parser
    [4] ID: cymru-expert
    [5] ID: archive

    Choose bots to run: 1-2,4-5

        > Starting geoip-expert...
        > Starting arbor-feed...
        > Starting cymru-expert...
        > Starting archive...
        > Leave arbor-parser...

# python controller.py --stop-botnet
# python controller.py --restart-botnet
# python controller.py --reload-botnet
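
The "Choose bots to run: 1-2,4-5" prompt needs the answer expanded into individual entries. A minimal sketch, with `parse_selection` as a hypothetical helper name:

```python
def parse_selection(text):
    """Expand a selection such as '1-2,4-5' into a sorted list of indexes."""
    chosen = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            chosen.update(range(int(start), int(end) + 1))  # inclusive range
        else:
            chosen.add(int(part))
    return sorted(chosen)


bots = {1: "geoip-expert", 2: "arbor-feed", 3: "arbor-parser",
        4: "cymru-expert", 5: "archive"}
selected = [bots[i] for i in parse_selection("1-2,4-5")]
```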

Detailed Information

  • Logger configuration from SYSTEM.conf must be removed because LOGGER-MODE should be an independent option for each bot.
  • Pipeline.conf must be removed because Source Queue and Destination Queues must be stored in each bot configuration file located at botsconf/ folder.
  • To list the pipeline like the old pipeline.conf file, you should use:
# python controller.py --list-pipeline
    [1] < bot_id > = < source queue > | < destination queues >
    [2] < bot_id > = < source queue > | < destination queues >
    [3] < bot_id > = < source queue > | < destination queues >
  • To edit the pipeline directly, use the following command. Keep in mind that these changes modify the queue configuration in the specific bot id's configuration file and not in a supposed pipeline.conf file.
# python controller.py --edit-pipeline=2
    Bot: experts.geoip.geoip
    Bot ID: geoip-expert

    Source Queue [Current: taxonomy-queue]: 
    Destination Queue [Current: cymru-queue]: 
  • Bot library (bot.py) should receive by init the following information:
    • bot id
    • bot-conf-path (use bot id to select the file with the same name in this folder)
  • A DEFAULT-PARAMETERS configuration file should exist in the intelmq/src/bots directory.
processing_interval = 0
cache_host = 127.0.0.1
cache_port = 6379
cache_id = 10
cache_ttl = 86400
rabbitmq_host = 127.0.0.1
rabbitmq_port = ...
rabbitmq_ssl = True
etc...
  • To start a bot, use the following command, which follows the algorithm below:
# python controller.py --start --bot=experts.cymru.cymru --id=cymru-expert

ALGORITHM:

            1. Check if bot exists (check if folder and file exists etc)
            2. Check if id already exists in botconfs/ folder.
                    - IF IS RUNNING -> Do nothing. Send message that is running.
                    - IF NOT RUNNING
                    1. Ask if user wants to load configuration or create a new one
                            if load, show the config and ask if he is sure about the load
                            go to 2.2
            2.2 ID does not exist or user wants to create a new config
            - load PARAMETERS from bot.experts.geoip.PARAMETERS to present the default ones and ask one by one

Other commands:

python controller.py --stop --id=cymru-expert
python controller.py --restart --id=cymru-expert
python controller.py --reload --id=cymru-expert
  • To start the botnet, use the following command.

ALGORITHM: for each bot already configured, ask whether to start it, then start it or leave it stopped.

# python controller.py --start-botnet

    [1] ID: geoip-expert
    [2] ID: arbor-feed
    [3] ID: arbor-parser
    [4] ID: cymru-expert
    [5] ID: archive

    Choose bots to run: 1-2,4-5

        > Starting geoip-expert...
        > Starting arbor-feed...
        > Starting cymru-expert...
        > Starting archive...
        > Leave arbor-parser...

Other commands:

python controller.py --stop-botnet
python controller.py --restart-botnet
python controller.py --reload-botnet

MQ Configurations

Add a section to system.conf for changing the default options of the MQ system.

Ex:
Host
Port

Output to Database by default

All events should be stored in a database by default (MongoDB or PostgreSQL). This feedback was received from multiple people.

csv / tsv parsing

We need to create a generic method for parsing csv, tsv and/or other types of unformatted data.
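
A sketch of what a generic parser might look like, built on the standard `csv` module. `parse_delimited`, its parameters, and the sample field names are assumptions; the sample row imitates the pipe-separated feed lines quoted earlier.

```python
import csv


def parse_delimited(text, fieldnames, delimiter=",", comment="#"):
    """Parse delimited text into dicts, skipping blank and comment lines."""
    lines = [line for line in text.splitlines()
             if line.strip() and not line.lstrip().startswith(comment)]
    reader = csv.reader(lines, delimiter=delimiter)
    # Pair each cell with its field name and trim surrounding whitespace.
    return [dict(zip(fieldnames, (cell.strip() for cell in row)))
            for row in reader]


sample = "# asn | name | ip\n262631 | ILIG TELECOM LTDA.,BR | 177.86.145.182\n"
rows = parse_delimited(sample, ["asn", "as_name", "ip"], delimiter="|")
```

For tsv input the same function would be called with `delimiter="\t"`.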

PostgreSQL Init DB

The file 'initdb.sql' should be generated automatically from 'docs/DataHarmonization-table.md'.

Reason: if we need to add, change or remove a specific field in Data Harmonization, that is fine, because the procedure will automatically parse DataHarmonization with the new field, and we don't need to change all the bots that depend on it.
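
A rough sketch of the generation step. It assumes the markdown file contains a pipe-delimited table whose first column holds the field name, and it types every column as text; the real layout of DataHarmonization-table.md and the proper column types are not known here. Both function names are hypothetical.

```python
def fields_from_markdown_table(markdown):
    """Return the first-column values of a pipe-delimited markdown table."""
    fields = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header separator row (e.g. |---|---|).
        if all(set(cell) <= set("-: ") for cell in cells):
            continue
        fields.append(cells[0])
    return fields[1:]  # drop the header row


def create_table_sql(table, fields, column_type="text"):
    """Build a CREATE TABLE statement with one column per field."""
    columns = ",\n".join('    "%s" %s' % (name, column_type) for name in fields)
    return "CREATE TABLE %s (\n%s\n);" % (table, columns)
```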

Advanced Dedup Expert

Evaluate how to create a dedup expert for every data source, because each data source needs a different way to remove duplicates.
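
One possible approach, sketched under assumptions: each feed declares which fields identify a duplicate, and a hash of those fields is checked against a set of seen hashes. The per-feed key choices below are invented for illustration, and the in-memory set stands in for whatever cache the expert would really use.

```python
import hashlib

# Hypothetical per-feed choice of fields that identify a duplicate.
DEDUP_KEYS = {
    "dragonresearchgroup": ("source_ip", "type"),
    "default": ("feed", "source_ip", "source_time"),
}


def dedup_hash(event):
    """Hash only the fields that matter for this event's feed."""
    keys = DEDUP_KEYS.get(event.get("feed"), DEDUP_KEYS["default"])
    material = "|".join("%s=%s" % (key, event.get(key, "")) for key in keys)
    return hashlib.sha1(material.encode("utf-8")).hexdigest()


def is_duplicate(event, seen):
    """Return True if an equivalent event was seen before; remember it otherwise."""
    digest = dedup_hash(event)
    if digest in seen:
        return True
    seen.add(digest)
    return False
```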

AbuseHelper Bot (input)

AbuseHelper bot is not following the pattern:

def init(self):
    code...

def process(self):
    code...

Note: be careful with encoding.

Reported Fields

Change the field keys in input bots to 'reported_***' and copy them to the normal fields, such as source_*, in the harmonization bot.
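
A sketch of the copy step in the harmonization bot. `harmonize_reported_fields` and the mapping are assumptions; the key names follow the reported_ip / source_ip and reported_asn / asn pairs visible in the PostgreSQL log above.

```python
def harmonize_reported_fields(event, mapping):
    """Copy reported_* values to harmonized keys without overwriting existing ones."""
    result = dict(event)
    for reported_key, target_key in mapping.items():
        if reported_key in result and target_key not in result:
            result[target_key] = result[reported_key]
    return result


event = {"reported_ip": "83.3.130.101", "reported_asn": "5617"}
harmonized = harmonize_reported_fields(
    event, {"reported_ip": "source_ip", "reported_asn": "asn"})
```

The original reported_* keys are kept, so the raw feed values stay available next to the harmonized ones.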
