facebook / threatexchange

Trust & Safety tools for working together to fight digital harms.

Home Page: https://developers.facebook.com/docs/threat-exchange

License: Other

Python 13.70% PHP 0.82% JavaScript 0.47% Makefile 0.39% Ruby 0.95% Batchfile 0.09% Jupyter Notebook 0.35% Go 0.14% C++ 77.68% Shell 0.29% Java 4.05% M4 0.01% C 0.58% Starlark 0.03% Dockerfile 0.04% HTML 0.07% CSS 0.05% CMake 0.09% C# 0.13% Cython 0.07%

hashing image image-hashing image-similarity ncmec perceptual-hashing threatexchange video video-hashing stopncii

threatexchange's Issues

pytx fetch error in fetching malware_connection_families

I was able to pull the families about two weeks ago, but now the same code gives the following error. Please advise.

Traceback (most recent call last):
File "pull_malware_test3.py", line 93, in
for details in Malware.details(id=result['id'], connection="families", dict_generator=True):
File "/usr/local/lib/python2.7/site-packages/pytx/request.py", line 272, in get_generator
results = cls.get(url, params)
File "/usr/local/lib/python2.7/site-packages/pytx/request.py", line 202, in get
return cls.handle_results(resp)
File "/usr/local/lib/python2.7/site-packages/pytx/request.py", line 113, in handle_results
resp.text))
pytx.errors.pytxFetchError: Response code: 500: {"error":{"message":"An unknown error has occurred.","type":"OAuthException","code":1}}

RFC: Python Library for ThreatExchange

I wanted to build a Python Library that would allow developers to quickly integrate with ThreatExchange. I've started the work here:

https://github.com/mgoffin/ThreatExchange/tree/pytx
(comparison: https://github.com/facebook/ThreatExchange/compare/master...mgoffin:pytx)

The goals I have for this are:

- Easy to install (pip install pytx).
- Easy to work with for quick prototyping and production-quality development.
  - Not just a wrapper around the API. Provide easy-to-use methods for tasks.
  - Rich vocabulary list to make code more resilient against name-changes.
  - Thorough documentation (both in code and elsewhere) so getting up-to-speed is quick and painless.
- Useful results that are easy to loop over, parse, and ingest.
- Classful interfaces to the different object types and results.
- Flexible for future API enhancements.
- Test cases to keep code in working order and automate builds for validation.
- Example scripts to get people started.

The code is still very much a WIP but I wanted to get it out there for visibility, comments, and direction. I think it is a decent foundation to work from, but a lot can be done to make this better. Some of the goals haven't been started because I think the foundation needs to mature a bit before they can be worked on (like Classful objects and results, example scripts, test cases, more documentation, etc.).

Ultimately I'd like to get this in a state where it's stable enough to submit a PR, get some development interest from other community members, then get it up on PyPi for quick installation.

Here's an example of using this library in its current form:

import pytx
import pytx.vocabulary as v

p = pytx.pytx('<app-id>', '<app-secret>')

# Find malware using approximate matching for "www.facebook.com"
i = p.malware_analyses(text="www.facebook.com")

# Find Indicators using strict matching for "www.facebook.com"
i = p.threat_indicators(text="www.facebook.com", strict_text=True)

# Get a list of ThreatExchange members.
i = p.threat_exchange_members()

# Get a list of malware objects associated with a specific object.
i = p.objects('<object-id>', connection=v.Connection.MALWARE_ANALYSES)

# Quickly inspect and loop over results
print i
for x in i:
    print x

Thanks!

Convert sample Python scripts to Python 3

Python 2 is legacy; please convert the existing sample scripts to Python 3 (3.5 and up?)

API call to identify if "x was added to ThreatExchange" already

In order to identify if some indicator/descriptor has already been added to ThreatExchange for my app, I need to run a complicated query and then parse the results in an effort to surface the potential item. If I don't do this, I run into the potential of overwriting my existing input.

As an example use case, take the following:

I want to identify if I currently have a threat descriptor for evil.com. My query would need to include the following to potentially surface any existing TDs: owner as me, indicator as evil.com, potentially type and strict_text to avoid fuzzy matching. Even with a detailed query like that, it's entirely possible that I would get more than just my evil.com threat descriptor as evil.com could lie in the description of other descriptors I have sent in. So, I would need to parse the results, looking to ensure the "indicator" field was an exact match.

The way to combat this for now is to store FBIDs locally inside of some localized database, but assuming I have already sent data in, the above process is what I need to go through just to identify if I have already pushed data for a specific indicator. There's a lot of friction there.
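
Until such a call exists, the client-side check described above could be sketched roughly as follows. This is only an illustration: `find_own_descriptors` is a hypothetical helper (not part of pytx), and the dictionaries are assumed to have the shape pytx returns for descriptors, with an `owner` dict carrying an `id` and a nested `indicator` dict.

```python
def find_own_descriptors(descriptors, indicator, owner_id):
    """Return descriptors owned by owner_id whose indicator matches exactly."""
    matches = []
    for d in descriptors:
        owner = d.get("owner") or {}
        nested = d.get("indicator") or {}
        if owner.get("id") == owner_id and nested.get("indicator") == indicator:
            matches.append(d)
    return matches


# Made-up search results: the second one only mentions evil.com in its
# description, so it must not count as an existing descriptor for evil.com.
results = [
    {"id": "1", "owner": {"id": "42"}, "indicator": {"indicator": "evil.com"}},
    {"id": "2", "owner": {"id": "42"}, "indicator": {"indicator": "other.com"},
     "description": "redirects to evil.com"},
]
existing = find_own_descriptors(results, "evil.com", "42")
print([d["id"] for d in existing])
```

The filter still requires fetching and scanning every candidate result, which is exactly the friction the issue describes.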

Export ThreatExchange data in STIX/TAXII formats

Much of the community uses exchange formats like STIX and TAXII. We should create sample code which pulls data from ThreatExchange and converts it to these formats.
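
As a starting point for discussion, a conversion might look something like the sketch below. It assumes a descriptor dict with `type`, `raw_indicator` and `description` keys and emits a dict shaped like a STIX 2.1 Indicator; the issue does not say which STIX version is wanted, the type mapping is an assumption, and the output has not been validated against the specification. The transport side is not covered.

```python
import json
import uuid
from datetime import datetime, timezone

# Assumed mapping from ThreatExchange indicator types to STIX object paths.
STIX_PATHS = {
    "DOMAIN": "domain-name:value",
    "IP_ADDRESS": "ipv4-addr:value",
    "URI": "url:value",
}


def descriptor_to_stix(descriptor, now=None):
    """Convert one descriptor dict into a STIX 2.1-style indicator dict."""
    path = STIX_PATHS[descriptor["type"]]
    # Escape backslashes and single quotes inside the pattern string literal.
    value = descriptor["raw_indicator"].replace("\\", "\\\\").replace("'", "\\'")
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": "indicator--" + str(uuid.uuid4()),
        "created": stamp,
        "modified": stamp,
        "valid_from": stamp,
        "pattern_type": "stix",
        "pattern": "[%s = '%s']" % (path, value),
        "description": descriptor.get("description", ""),
    }


stix = descriptor_to_stix(
    {"type": "DOMAIN", "raw_indicator": "evil.com", "description": "example"}
)
print(json.dumps(stix, indent=2))
```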

Add version somewhere.

It would be sweet if I could do:

import pytx
print pytx.__version__

And have it print the current pytx version number.
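
The common Python convention for this is a `__version__` string defined in the package's `__init__.py`. A minimal sketch of how a caller could read it defensively is shown below; `get_version` is a hypothetical helper and the module object is a stand-in, since this example does not import pytx itself.

```python
import types


def get_version(module, default="unknown"):
    """Return module.__version__ if the package defines it, else a default."""
    return getattr(module, "__version__", default)


# Stand-in for the pytx package; the version string is made up for the demo.
fake_pytx = types.ModuleType("pytx")
fake_pytx.__version__ = "0.0.0"

print(get_version(fake_pytx))
print(get_version(types.ModuleType("no_version")))
```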

Odd response count inconsistencies

I wrote the following script:

#!/usr/bin/env python

from pytx import ThreatDescriptor
import time

def foo():
    start = time.time()
    c = 0
    for t in ThreatDescriptor.objects(text="facebook.com", dict_generator=True):
        c += 1
    end = time.time()
    print("time: ", end - start)
    print("count: ", c)

foo()

Several executions resulted in the following:

('time: ', 10.145052909851074)
('count: ', 200)

('time: ', 6.017927169799805)
('count: ', 175)

('time: ', 120.53107500076294)
('count: ', 6141)

('time: ', 61.75989103317261)
('count: ', 3275)

('time: ', 36.055991888046265)
('count: ', 2575)

('time: ', 40.71333408355713)
('count: ', 1600)

('time: ', 18.10604500770569)
('count: ', 825)

It seems very odd that the times and counts are so far off. Can anyone else reproduce the wide array of times and counts that I am seeing?

Documentation for threat_descriptors submission should specify URL path correctly

https://developers.facebook.com/docs/threat-exchange/reference/submitting/v2.5

Where it says:

You may submit data to the graph via an HTTP POST request the following URL:
https://graph.facebook.com/threat_descriptors

Turns out nothing will work unless you specify the platform version you want to use:
https://graph.facebook.com/v2.4/threat_descriptors

Took me several hours figuring that one out, proxying the request, looking at the postdata... :( I finally found a URL in some code that had the platform version in the URL path and decided to try it out.
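
One way to avoid this pitfall in client code is to build every URL through a helper that always inserts the platform version. The sketch below is illustrative only: `versioned_url` is a hypothetical name, and `v2.4` is used as the default simply because it is the version mentioned above.

```python
from urllib.parse import urlencode

GRAPH_HOST = "https://graph.facebook.com"


def versioned_url(endpoint, version="v2.4", **params):
    """Build a Graph API URL that always includes the platform version."""
    url = "%s/%s/%s" % (GRAPH_HOST, version, endpoint.strip("/"))
    if params:
        # Sort the parameters so the resulting URL is deterministic.
        url += "?" + urlencode(sorted(params.items()))
    return url


print(versioned_url("threat_descriptors"))
# https://graph.facebook.com/v2.4/threat_descriptors
```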

[pytx] Added privacy_type to Malware objects

Privacy_type is missing on pytx.Malware objects and in pytx.vocabulary.Malware, despite being a field in TX for Malware Analyses

Dateutil version

Hi guys,

I ran:

python2.7 setup.py install

Which ended successfully.

Running your example as is yields the following error in my system:

> python ./get_all_data/get_threat_indicators.py --text="facebook" --days_back=10
Traceback (most recent call last):
  File "./get_all_data/get_threat_indicators.py", line 65, in <module>
    main()
  File "./get_all_data/get_threat_indicators.py", line 33, in main
    utils.get_time_params(s.end_date, day_counter, format_)
  File "build/bdist.macosx-10.11-x86_64/egg/pytx/utils.py", line 62, in get_time_params
AttributeError: 'module' object has no attribute 'parser'

To fix it, I changed utils.py as follows:

diff --git a/pytx/pytx/utils.py b/pytx/pytx/utils.py
index a42dc50..5538f3d 100644
--- a/pytx/pytx/utils.py
+++ b/pytx/pytx/utils.py
@@ -1,4 +1,4 @@
-import dateutil
+import dateutil.parser as dateutil_parser
 import datetime


@@ -59,7 +59,7 @@ def get_time_params(end_date, day_counter, format_):
     """
     # We use dateutil.parser.parse for its robustness in accepting different
     # datetime formats
-    until_param = dateutil.parser.parse(end_date) - \
+    until_param = dateutil_parser.parse(end_date) - \
         datetime.timedelta(days=day_counter)
     until_param_string = until_param.strftime(format_)

And then I ran python2.7 setup.py install again, after which the get_threat_indicators.py ran successfully.

I noticed that, in setup.py, and also in requirements.txt (why the duplication?) you are specifying the required version of dateutil as 2.5.2, but that seems to be ignored by the system. I'm not an expert in python builds so maybe all of this is not that useful for you, but I thought I'd give you guys a heads up. Please close if irrelevant. Thanks for the very interesting work 😄

ThreatExchange Privacy Controls Doc v2.5 says "feature that was removed after Graph API v2.3"

https://developers.facebook.com/docs/threat-exchange/reference/privacy/v2.5

When I visit the page, I see: "(!) This document refers to a feature that was removed after Graph API v2.3."

I assume that is incorrect?

Problem with filter threat_type

Good Morning,

I'm trying to get all descriptors for a given date. To do this, I am sending a GET request to this URL:

https://graph.facebook.com/threat_descriptors?access_token=xxx|YYY&since=2015-08-03T07:00:00&until=2015-08-03T07:10:00&limit=100

Can I filter by threat_type? It is not working.

Add field to control which field is searched with the 'text' parameter

By default, we search all of the available fields using what's supplied in the 'text' parameter. Currently, the 'strict_text' parameter limits the search to the primary field (indicator or name). We should create a new field, 'text_field', which specifies which field should be searched using the 'text' parameter. For example,

/threat_descriptors?text="Jesse's best C2"&text_field=description&strict_text=1

should search the description field for exact matches to "Jesse's best C2".

See the discussion in #84 for background.
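
To make the proposed semantics concrete, here is a small local emulation over a list of dicts. It is only a sketch of the behaviour being requested, not of how the ThreatExchange backend works: `search` is a hypothetical function, the records are flattened and made up, and treating `strict_text` as exact equality on whichever fields are searched is an assumption.

```python
def search(descriptors, text, text_field=None, strict_text=False):
    """Emulate the proposed 'text_field' semantics on local descriptor dicts."""
    def hit(value):
        if not isinstance(value, str):
            return False
        return value == text if strict_text else text in value

    results = []
    for d in descriptors:
        # With no text_field, every field of the record is searched.
        fields = [text_field] if text_field else list(d)
        if any(hit(d.get(f)) for f in fields):
            results.append(d)
    return results


data = [
    {"id": "1", "indicator": "c2.example.com", "description": "Jesse's best C2"},
    {"id": "2", "indicator": "Jesse's best C2", "description": "odd indicator"},
]
only_description = search(
    data, "Jesse's best C2", text_field="description", strict_text=True
)
print([d["id"] for d in only_description])
```

Without `text_field`, the same query would return both records, which is the ambiguity the new parameter is meant to remove.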

Allow pytx to read creds from a file

Rather than having to modify scripts, pytx could read the necessary creds from a file or environment variable, like boto does.
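
A rough sketch of what such a lookup could do is shown below. Everything here is hypothetical: the function name, the `TX_APP_ID` and `TX_APP_SECRET` variable names, and the JSON file layout are placeholders chosen for illustration, not an existing pytx interface.

```python
import json
import os


def load_credentials(path, environ=os.environ):
    """Return (app_id, app_secret) from the environment, else from a JSON file."""
    app_id = environ.get("TX_APP_ID")
    app_secret = environ.get("TX_APP_SECRET")
    if app_id and app_secret:
        return app_id, app_secret
    # Fall back to a file such as {"app_id": "...", "app_secret": "..."}.
    with open(path) as handle:
        data = json.load(handle)
    return data["app_id"], data["app_secret"]
```

A script would then call something like `load_credentials(os.path.expanduser("~/.pytx.json"))` once at startup instead of hard-coding the values.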

pytx ThreatTag.objects() call uses ThreatDescriptor parameters

@mgoffin

In pytx 0.5.3, I noticed that ThreatTag.objects() actually uses the same parameters as ThreatDescriptor.objects(), e.g. strict_text=false&include_expired=false, rather than the correct ones for ThreatTag. This causes an HTTP 500 error.

Allow custom headers for pytx

Similar to how we allow people to set a proxy or set up logging, we should allow them to set up custom headers and then use those headers for all requests going forward.

Example code:

headers = {
    'User-Agent': 'Foo'
}

response = requests.get(url, headers=headers)

Failed malware analysis example on documentation

Attempting to run the example in the documentation (https://developers.facebook.com/docs/threat-exchange/reference/apis/threat-indicator/v2.5) fails inside the graph explorer.

https://graph.facebook.com/v2.4/768629009848617/malware_analyses/?access_token=555|aSdF123GhK

Provide a Go example as well?

Is it possible to provide Go examples too?

Edits to the example code (in bold)

from pytx.access_token import access_token
from pytx import ThreatDescriptor
from pytx.vocabulary import ThreatDescriptor as td
from pytx.vocabulary import ThreatIndicator as ti
access_token('', '')

results = ThreatDescriptor.objects(text='www.facebook.com')
for result in results:
    print(result.get(td.THREAT_TYPE))

results = ThreatDescriptor.objects(type_='IP_ADDRESS',
                                   text='127.0.0.1')
for result in results:
    print(result.get(ti.INDICATOR))

PYTX: Some objects don't support all query parameters

ThreatDescriptor.objects should support queries by REVIEW_STATUS and SHARE_LEVEL
Malware.objects should support queries by SAMPLE_TYPE

Control where "strict_text" is applied when searching

The current "strict_text" parameter is great for reducing fuzzy matches, but I've noticed that while it uses my exact query, it seems to apply across all fields. It would be helpful to specify the exact field I would like my "strict_text" query to run against.

Example use case:

from pytx import init
from pytx import ThreatDescriptor
from pytx.vocabulary import Types

init(app_id='<app-id>', app_secret='<app-secret>')
results = ThreatDescriptor.objects(
    text='haberko.com',
    strict_text=True,
    type_=Types.DOMAIN
)

When executing the above example, two different descriptors come back. One is an exact match (based on indicator), while the other was returned only because the query text appears in its comment. My use case is to get information about the exact indicator I queried for and nothing else, so having to loop over the results to check where the match occurred is an extra step, and potentially a long one if the item being queried is popular. Maybe allow the user to specify the field to match against, and default to _all as it does now.

HTTP 500 Error - Internal Error for huge samples

When trying to request "huge" samples it will cause a HTTP 500 - Internal Error.

Here are some example requests:

ThreatExchange IOC

Is the system ready to export alerts in IOC format?

Expose the descriptor creation time using the added_on field

Under v2.3, we had a field 'added_on', which was the creation time for an indicator. With descriptors in v2.4, we have no notion of when a descriptor was created. We should expose this creation time with the field 'added_on'.

duplicated python scripts are confusing

[ivanlei@some_machine:~/source/ThreatExchange] $ find . -name 'get_*.py'
./malware/get_samples.py
./members/get_members.py
./pytx/scripts/get_compromised_credentials.py
./pytx/scripts/get_indicators.py
./pytx/scripts/get_members.py
./threat_indicators/get_compromised_credentials.py
./threat_indicators/get_indicators.py

So there's:

2 copies of get_compromised_credentials.py
2 copies of get_indicators.py

and there's basically 4 directories with scripts in them:

malware/
members/
pytx/scripts/
threat_indicators/

I suggest all the scripts be moved into pytx/scripts/ so they can be installed as part of the pytx python package.

Don't remove old versions of pytx from pypi

pytx is awesome and we've started using it to talk to ThreatExchange at Bitly. We had pinned our pytx version to 0.1.0, which is no longer available in pypi.

I propose keeping older versions in pypi unless there's a security hole or other glaring need to remove. Of course, the ThreatExchange API might change to a degree such that older versions of pytx aren't useful. Not sure what to do in that case.

Thanks!

Updates to Threat Descriptors Fail

I am attempting to update the status of a threat descriptor and can't seem to get it to stick despite being returned a success. I have tried this through PyTX, and have used requests below as an example. Note, I am including the params as both data (POST) and params (GET) just in case, but have tried each independently with no luck.

import logging
import requests
from pytx.vocabulary import ThreatExchange as te

logging.basicConfig(level=logging.DEBUG)

auth = {
    'app_token': 'APP_TOKEN',
    'app_secret': 'APP_SECRET'
}
params = {'status': 'MALICIOUS'}
access_token = auth['app_token'] + '|' + auth['app_secret']
url = te.URL + str('907174782669741')
url = "%s?access_token=%s" % (url, access_token)
response = requests.post(
    url,
    params=params,
    data=params
)
print response.content

Here's the debugging output I am seeing:

INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): graph.facebook.com
DEBUG:requests.packages.urllib3.connectionpool:"POST /907174782669741?access_token=<APP_TOKEN>%7C<APP_SECRET>&status=MALICIOUS HTTP/1.1" 200 16
Response: {"success":true}

InsecurePlatformWarning from requests on OSX with default python

$ python scripts/get_compromised_credentials.py --since=2015-05-01 --until=2015-06-01
READING https://graph.facebook.com/threat_indicators/
/Users/ivanlei/virtual_envs/ThreatExchange/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning

This is easily fixed. PR incoming.

Invalid field "type" when querying ThreatDescriptors

When searching for ThreatDescriptors using the "objects" interface in pytx, it does not seem to match the documentation displayed on the Facebook developers site - https://developers.facebook.com/docs/threat-exchange/reference/apis/threat-descriptors.

from pytx import init
from pytx import ThreatDescriptor
from pytx.vocabulary import Types

init(app_id='<app-id>', app_secret='<app-secret>')
results = ThreatDescriptor.objects(
    text='37.59.224.217',
    strict_text=True,
    type=Types.IP_ADDRESS
)

When running the code above, the following comes back:

Traceback (most recent call last):
  File "app/tools/threat_exchange.py", line 35, in <module>
    type=Types.IP_ADDRESS
TypeError: objects() got an unexpected keyword argument 'type'

pytx support for batch queries

The Graph API allows for this and right now the only way to do so is using the raw argument to .objects() (I'm not sure if that works without also having to set full_response or dict_generator to True).

pytx should have a simple interface for making these types of requests.
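
For reference, a Graph API batch request is a POST whose `batch` form field holds a JSON array of operations, each with a `method` and a `relative_url`. The sketch below only builds those payloads and sends nothing; `build_batches` is a hypothetical helper, the object IDs are placeholders, and the default chunk size of 50 is an assumption that should be checked against the current Graph API limits.

```python
import json


def build_batches(relative_urls, access_token, batch_size=50):
    """Yield form payloads for Graph API batch requests (nothing is sent)."""
    for start in range(0, len(relative_urls), batch_size):
        chunk = relative_urls[start:start + batch_size]
        operations = [{"method": "GET", "relative_url": u} for u in chunk]
        yield {"access_token": access_token, "batch": json.dumps(operations)}


urls = ["v2.4/%d" % n for n in range(120)]  # placeholder object IDs
payloads = list(build_batches(urls, "<app-id>|<app-secret>"))
print(len(payloads))
```

Each payload could then be POSTed to the Graph API root, and the response would need to be split back into one result per operation, which is the part a pytx interface would have to wrap.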

MalwareFamilies in pytx.vocabulary is plural, should be singular for consistency

see title

Fields for Threat Descriptors do not match documentation

Several of the fields listed in the documentation for v2.4 threat descriptors don't work. To wit:

added_on
submitter_count
threat_types

pytx: Searching for a ThreatDescriptor shows 'privacy_type': None

I added a ThreatDescriptor with a privacy_type of "HAS_PRIVACY_GROUP", but searching for it in pytx is returning "None" as the privacy_type. Just to note, the indicator is not sensitive. I was just testing. Details below.

Using pytx, I added the following ThreatDescriptor:

# Let's try to submit a new Threat Descriptor
from pytx import ThreatDescriptor
from pytx.vocabulary import ThreatDescriptor as tdv
from pytx.vocabulary import Types, Precision, PrivacyType, ReviewStatus, Severity, ShareLevel, Status

params = {
    tdv.INDICATOR : 'http://212.154.211.81/giz.exe',
    tdv.TYPE : Types.URI,
    tdv.CONFIDENCE : 75,
    tdv.DESCRIPTION : 'Ransomware download URL',
    tdv.PRECISION : Precision.MEDIUM,
    tdv.PRIVACY_MEMBERS : '1125937020771155', # CatFanciers ID
    tdv.PRIVACY_TYPE : PrivacyType.HAS_PRIVACY_GROUP,
    tdv.REVIEW_STATUS : ReviewStatus.REVIEWED_AUTOMATICALLY,
    tdv.SEVERITY : Severity.SUSPICIOUS,
    tdv.SHARE_LEVEL : ShareLevel.AMBER,
    tdv.STATUS : Status.MALICIOUS,
    tdv.TAGS : 'sage,ransomware,http_request,malware',
}

result = ThreatDescriptor.new(params=params)
print(result)

The following response returned: {'id': '1447134161986113', 'success': True}

Then I took a look at the indicator directly in my browser with: https://graph.facebook.com/v2.8/1447134161986113/?access_token=[REDACTED]

It showed what I would expect:

{
   "added_on": "2017-02-23T16:38:27+0000",
   "id": "1447134161986113",
   "indicator": {
      "indicator": "http://212.154.211.81/giz.exe",
      "type": "URI",
      "id": "1447134155319447"
   },
   "owner": {
      "id": "1678314142420566",
      "email": "nlhausrath\u0040ashland.com",
      "name": "Ashland CIRT"
   },
   "type": "URI",
   "raw_indicator": "http://212.154.211.81/giz.exe",
   "description": "Ransomware download URL",
   "status": "MALICIOUS",
   "privacy_type": "HAS_PRIVACY_GROUP",
   "share_level": "AMBER"
}

However, when I did the following, privacy_type was set to 'None':

from pytx import ThreatDescriptor

results = ThreatDescriptor.objects(
    text='giz.exe',
    owner='1678314142420566', # me
)

for result in results:
    print(result.to_dict())

The following was printed:

{'privacy_members': None, 'severity': 'SUSPICIOUS', 'owner': {'id': '1678314142420566', 'name': 'Ashland CIRT', 'email': 'nlhausrath@ashland.com'}, 'privacy_type': None, 'source_uri': '', 'id': '1447134161986113', 'share_level': 'AMBER', 'expired_on': None, 'precision': 'MEDIUM', 'review_status': 'REVIEWED_AUTOMATICALLY', 'metadata': None, 'indicator': {'type': 'URI', 'id': '1447134155319447', 'indicator': 'http://212.154.211.81/giz.exe'}, 'status': 'MALICIOUS', 'my_reactions': None, 'raw_indicator': 'http://212.154.211.81/giz.exe', 'type': 'URI', 'description': 'Ransomware download URL', 'added_on': '2017-02-23T16:38:27+0000', 'last_updated': '2017-02-23T16:38:28+0000', 'tags': {'data': [{'id': '1382721905133632', 'text': 'http_request'}, {'id': '1375757795798370', 'text': 'ransomware'}, {'id': '1318516441499594', 'text': 'malware'}, {'id': '595090370615714', 'text': 'sage'}]}, 'confidence': 75}

Am I doing something wrong or is this the wrong expectation? Thanks!

Search does not recognize '@' symbol

The ThreatExchange search does not appear to recognize the '@' symbol in searches. Searching for @evilevil, for example, returns some results which contain "evilevil" but not the '@' symbol.

Github hook for ReadTheDocs

pytx is set up on readthedocs (pytx.readthedocs.org), but in order for it to kick off a new build it needs to be notified (they don't seem to rebuild on their own).

There are ways to hook Github into readthedocs: https://docs.readthedocs.org/en/latest/webhooks.html

readthedocs gives me the following post-commit hook:

curl -X POST http://readthedocs.org/build/pytx

I am not sure how the ReadTheDocs Github app works with the above directions, but it would make life a lot easier knowing when a commit makes it to master the docs are rebuilt automagically :)

Unexpected keyword "status" when searching ThreatDescriptors

from pytx import init
from pytx import ThreatDescriptor
from pytx.vocabulary import Status

init(app_id='<app-id>', app_secret='<app-secret>')
results = ThreatDescriptor.objects(
    text='37.59.224.217',
    strict_text=True,
    status=Status.MALICIOUS
)

When running the code above, the following comes back:

Traceback (most recent call last):
  File "app/tools/threat_exchange.py", line 35, in <module>
    status=Status.MALICIOUS
TypeError: objects() got an unexpected keyword argument 'status'

Name change for ruby lib.

I'd like to propose changing the Ruby lib name from ThreatExchange to threat_exchange everywhere.

Snakecase is the most common convention in Ruby (e.g. list of gems on Rubygems).

It's mostly a cosmetic change, but could help adoption of the library.

common.py/add_connection - not working

Please let us know once add_connection method works and share an example on how to use it.

pytx: why does a day return more ThreatDescriptors than a month?

Hi everyone,

I'm just starting to explore ThreatExchange with pytx, and I'm getting odd results.

ThreatDescriptor.objects(since="yesterday",
until="now")

Returns 500+ results

ThreatDescriptor.objects(since="-1 month",
until="now")

Returns 4 results

What am I missing? Shouldn't that return up to 1000 results?

Python 3 support?

I'm just wondering if supporting Python 3 is even on the map.

Thanks!

Nuke UI code?

The UI code originally written by @mgoffin was put together on a whim as a PoC and has not been touched in over a year. It is outdated and I have privately seen one bug report in it, though I am unable to replicate the bug. I think it may be time to nuke that code as it isn't being maintained. If anyone agrees I'll toss up a PR and kill it.

pytx support for previous versions

When the 2.4 Graph API came out there were some significant changes to the ThreatExchange API. pytx accurately reflects those changes - but doesn't look to make an effort to enable use of the previous version.

I would like to use pytx to query the old /threat_indicators endpoint. But I don't see a parameter or method to override the API version used by Common.

Recent changes also blew away the fields that were associated with the ThreatIndicators class as well.

I understand the value of staying current. But for me to use the deprecated but still available v2.3 API Call /threat_indicators I would need to downgrade my pytx module.

Support could be maintained for previous versions, or perhaps the line should be drawn that the module only supports the latest version.

pytx Read the Docs link not working

https://readthedocs.org/projects/pytx/ referenced in the README.rst isn't working. Getting a "page doesn't exist yet" error.

object "type" field only included in metadata

I had this conversation with @jessek, @mrichard91, and @hammem, but I don't think we got anywhere.

The API allows you to query a specific object by identifier. The URL looks something like this:

https://graph.facebook.com/<object-id>

With the above context it is unknown to the person querying the API what type of object is going to be in the response. One could add ?metadata=1 to the query which will bloat the response with a ton of extra information about each field and what it means, as well as the object's type.

From a development standpoint this becomes a bit cumbersome. Specifically in pytx it causes a bit of a problem when querying for details of an object as can be found here:

https://github.com/facebook/ThreatExchange/blob/master/pytx/pytx/common.py#L209

Users have run into situations where they felt that pytx was "stripping" fields out of the response. This looked like the case because they were filling one class with the fields from another so only the common fields were being applied.

There is the issue that a ThreatIndicator has a type field already, so adding the object's type field would clobber the name, but I still recommend adding a field somewhere which denotes the type of object a developer is working with. In many cases context will prevail (ex: I queried /threat_indicators so I'm getting ThreatIndicator objects back) but there are many cases where it won't and an automated system needs to determine what it's working with.
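
With the current API, an automated client would have to do something like the sketch below: request the object with ?metadata=1 and read the type out of the metadata block before choosing a class. The key names (`metadata`, `type`) and the type strings are assumptions made for illustration, and `object_type` and `CLASS_FOR_TYPE` are hypothetical names rather than part of pytx.

```python
def object_type(response):
    """Return the object type from a ?metadata=1 style response, or None."""
    metadata = response.get("metadata") or {}
    return metadata.get("type")


# Hypothetical mapping from the reported type to the class to instantiate.
CLASS_FOR_TYPE = {
    "threat_indicator": "ThreatIndicator",
    "malware": "Malware",
}

# Made-up responses: one fetched with metadata, one without.
with_metadata = {"id": "123", "metadata": {"type": "malware", "fields": []}}
without_metadata = {"id": "123"}

print(CLASS_FOR_TYPE.get(object_type(with_metadata)))
print(CLASS_FOR_TYPE.get(object_type(without_metadata)))
```

A dedicated top-level field for the object type would remove the need for the extra metadata payload and for the lookup fallback.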

[RFC] Removing classful POST capability from pytx

With 2.4 there is a very large disparity between how things are uploaded and how things are downloaded. Prior to 2.4 we were able to both GET and POST with the same class attributes and for the most part it worked. Now the differences are so large that representing a descriptor in a class doesn't in any way look like what is necessary for adding a new one to the system.

I'm thinking that we change the following:

- GET requests can still be classful and use the generator to return instantiated objects.
- classes no longer have a .save() method.
- classes get a .new() method (where applicable) which takes either a dict of values or arguments which are acceptable POST parameters for adding a new object.
- new ThreatDescriptor class
  - once POSTing to /threat_indicators/ is fully deprecated, we can remove the .new() from ThreatIndicators and people will move their code to use ThreatDescriptor instead.
- all of the above still leverages the vocabulary.

Any c&c as to whether or not this would be a good evolution to pytx?

pytx should set sane defaults for fields returned and use them by default

Each main class has an _default_fields attribute which is a list of fields that are supposed to be "default fields returned by ThreatExchange when you perform a query."

We should change this so that _default_fields is a list of "useful" fields determined by those that use the data. Once those lists are fixed up, we should change .objects() to use what is in cls._default_fields if the fields argument is still None. This will allow people to override and specify the fields they want to use, otherwise they will get what pytx considers the default ones.
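
The fallback described above could be sketched like this. The classes are toy stand-ins rather than the real pytx classes, `build_params` is a hypothetical method name, and the field list is an illustrative guess at what "useful" might mean.

```python
class Common(object):
    _default_fields = []

    @classmethod
    def build_params(cls, fields=None, **kwargs):
        """Return query params, falling back to the class defaults for fields."""
        if fields is None:
            fields = cls._default_fields
        params = dict(kwargs)
        if fields:
            # The Graph API takes requested fields as a comma-separated list.
            params["fields"] = ",".join(fields)
        return params


class ThreatDescriptor(Common):
    # An illustrative choice of "useful" fields, not pytx's actual list.
    _default_fields = ["id", "indicator", "status", "added_on"]


print(ThreatDescriptor.build_params(text="evil.com"))
print(ThreatDescriptor.build_params(fields=["id"], text="evil.com"))
```

Passing `fields` explicitly overrides the defaults, so existing callers that already choose their own fields would see no change.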

Copy/Paste code examples in API documentation

The value of a copy/paste example when looking at API documentation cannot be overstated. In the current API documents, endpoints and GET/POST content is shown, but it would be nice to have an example I could just copy/paste (think how Google does it) in the top 3-4 languages (Python, Java, PHP).

While it's not terribly difficult to convert the outlined requests into code, it's yet another set of steps that stop me from instantly plugging ThreatExchange into my project. Additionally, it could also be useful to include a hosted version of the JSON code, so I could load it directly from the FB developer site for testing or download it locally.

Add method to search for exact text

There should be a way to search for an exact match. For example, when searching for evilevillabs.com, the query should match for exactly evilevillabs.com, not test.evilevillabs.com or i-know-somebody-at-evilevillabs.com.

Long duration in pulling Malware Connections

Reopening the issue reported in bug #46.

I ran another experiment today. Below are the stats for pulling dropped, dropped_by and malware over a period of only 3 minutes: each of the following data sets was pulled with since=1433167932 and until=1433168114 (i.e. a 3-minute window). Note that malware took less than a minute, but dropped_by and dropped each took more than 5 minutes to pull 3 minutes of data. Is this expected, or something we are aware of and working on? Thanks!

$ ./malware.py
Execution Started: 2015/08/28 20:26:43
Execution Completed: 2015/08/28 20:27:23

$ ./dropped_by.py
Execution Started: 2015/08/28 20:29:32
Execution Completed: 2015/08/28 20:34:39

$ ./dropped.py
Execution Started: 2015/08/28 20:38:52
Execution Completed: 2015/08/28 20:44:18
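
For anyone comparing numbers, the elapsed times can be worked out from the timestamps above with a few lines of Python. The values are copied from this report; only the variable names are new.

```python
from datetime import datetime

FORMAT = "%Y/%m/%d %H:%M:%S"

runs = {
    "malware": ("2015/08/28 20:26:43", "2015/08/28 20:27:23"),
    "dropped_by": ("2015/08/28 20:29:32", "2015/08/28 20:34:39"),
    "dropped": ("2015/08/28 20:38:52", "2015/08/28 20:44:18"),
}

window = 1433168114 - 1433167932  # seconds covered by since/until

durations = {}
for name, (start, end) in runs.items():
    elapsed = datetime.strptime(end, FORMAT) - datetime.strptime(start, FORMAT)
    durations[name] = int(elapsed.total_seconds())

print(window)     # 182
print(durations)  # {'malware': 40, 'dropped_by': 307, 'dropped': 326}
```

So the query window is about 3 minutes of data, while the two connection pulls each ran for a little over 5 minutes.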

npm integration for node

It'd be useful to have the node app added to npm. I have an account set up and can push it to npm, but I didn't know if the ThreatExchange owners wanted to own this one.
