
bgt-m / spartan2


A collection of data mining algorithms on big graphs and time series

License: BSD 3-Clause "New" or "Revised" License

Shell 0.02% Python 99.86% Cython 0.12%
big-graphs data-mining time-series anomaly-detection sparse-tensors

spartan2's Introduction

Welcome to spartan2

Introduction

spartan2 is a collection of data mining algorithms on big graphs and time series that supports three basic tasks: anomaly detection, forecasting, and summarization (see the readthedocs documentation and the tutorials).

Graphs and time series are fundamental representations in many key applications across a wide range of domains:

  • online user behaviors, e.g. following in social media, shopping, and downloading apps,
  • finance, e.g. stock trading and bank transfers,
  • sensor networks, e.g. sensor readings and smart power grids, and
  • health, e.g. electrocardiogram, photoplethysmogram, and respiratory inductance plethysmography.

In practice, we find that treating graphs and time series as matrices or tensors enables efficient (near-linear), interpretable, yet accurate solutions in many applications. Therefore, our goal is to develop a collection of algorithms on graphs and time series based on tensors (a matrix is a 2-mode tensor).

In the real world, those tensors are sparse, and we need to exploit that sparsity to develop efficient algorithms. That is why we named the package spartan: sparse tensor analytics.

The package, named spartan, can be imported and run independently like any ordinary Python package. Everything in spartan is viewed as a (sparse) tensor.
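To make the tensor view concrete, here is a minimal pure-Python sketch of a sparse 3-mode tensor stored in coordinate form. It does not use spartan's own API; the records, the helper names `build_sparse_tensor` and `sum_over_modes`, and the (user, item, day) layout are assumptions chosen for illustration only.

```python
from collections import defaultdict

# Hypothetical (user, item, day) interaction records; all values are made up.
records = [
    ("alice", "app1", 0),
    ("alice", "app2", 0),
    ("bob", "app1", 1),
    ("alice", "app1", 0),  # repeated interaction, accumulates to a count of 2
]

def build_sparse_tensor(records):
    """Store only the nonzero cells as {(user, item, day): count}."""
    tensor = defaultdict(int)
    for key in records:
        tensor[key] += 1
    return dict(tensor)

def sum_over_modes(tensor, keep):
    """Collapse a sparse tensor onto the modes listed in `keep` (by position)."""
    out = defaultdict(int)
    for key, value in tensor.items():
        out[tuple(key[m] for m in keep)] += value
    return dict(out)

tensor = build_sparse_tensor(records)
graph = sum_over_modes(tensor, keep=(0, 1))  # user-item matrix (2-mode tensor)
series = sum_over_modes(tensor, keep=(2,))   # activity count per day (1-mode tensor)
print(len(tensor), graph, series)
```

Both the graph and the time series come from the same sparse object, and the cost of each aggregation grows with the number of stored nonzero cells rather than with the full size of the tensor.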

Install requirements

This project requires Python 3.7 or later. We suggest recreating the experimental environment with Anaconda through the following steps.

  1. Install the appropriate version of Anaconda from https://www.anaconda.com/distribution/

  2. Create a new conda environment named "spartan"

        conda create -n spartan python=3.7
        conda activate spartan
  3. If you are a regular user, install spartan2 with pip:

    # install spartan using pip
    pip install spartan2
  4. If you want to contribute, or prefer to run directly from the source code, do the following setup:

    • 4.1 Clone the project from GitHub

      git clone git@github.com:BGT-M/spartan2.git
    • 4.2 Install requirements.

      # [not recommended]# pip install --user --requirement requirements
      # using conda tool
      conda install --force-reinstall -y --name spartan -c conda-forge --file requirements

      or, alternatively, install with setup.py

      # this may not work in ubuntu 18.04
      python setup.py install
    • 4.3 Install code in development mode

      # in parent directory of spartan2
      pip install -e spartan2
    • 4.4 Because the package is installed in a location other than the user site-packages directory, you need to add it to the PYTHONPATH environment variable in ~/.bashrc

      export PYTHONPATH=/<dir to spartan2>/spartan2:$PYTHONPATH

      or append the path to that directory to sys.path at runtime in Python

      import sys
      sys.path.append("/<dir to spartan2>/spartan2")

      or register the directory through a .pth file

      # find the site-packages directory
      python -c 'import site; print(site.getsitepackages())'
      
      # add a <name>.pth file to that site-packages directory containing the line '/<dir to spartan2>/spartan2'
      
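The .pth approach in step 4.4 can be sketched in Python as follows. The helper name `write_pth`, the file name `spartan2_dev.pth`, and the path `/path/to/spartan2` are placeholders for illustration; the example writes into a temporary directory so that no real site-packages directory is modified.

```python
import tempfile
from pathlib import Path

def write_pth(site_packages, project_dir, name="spartan2_dev"):
    """Write <name>.pth holding project_dir; Python adds that directory
    to sys.path at startup when the file sits in a site-packages directory."""
    pth_file = Path(site_packages) / f"{name}.pth"
    pth_file.write_text(str(project_dir) + "\n")
    return pth_file

# Dry run against a temporary directory (placeholder project path).
with tempfile.TemporaryDirectory() as tmp:
    created = write_pth(tmp, "/path/to/spartan2")
    pth_name = created.name
    pth_content = created.read_text()

print(pth_name, repr(pth_content))
# For a real setup, pass one of the directories printed by
# `python -c 'import site; print(site.getsitepackages())'` instead of tmp.
```

A .pth file persists across shells and does not depend on ~/.bashrc, which can be convenient when the environment is launched from an IDE or a notebook server.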

Table of Modules

| Type | Abbr | Paper | Year | Tutorials |
|---|---|---|---|---|
| Graph | spartan.HoloScope | [1] HoloScope: Topology-and-Spike Aware Fraud Detection; [2] A Contrast Metric for Fraud Detection in Rich Graphs | 2017; 2019 | HoloScope |
| Graph | spartan.Eigenspokes | [3] Eigenspokes: Surprising patterns and scalable community chipping in large graphs | 2010 | Eigenspokes |
| Graph | spartan.EagleMine | [4] EagleMine: Vision-guided Micro-clusters recognition and collective anomaly detection; Beyond outliers and on to micro-clusters: Vision-guided anomaly detection | 2021; 2019 | EagleMine |
| Graph | spartan.Fraudar | [5] Fraudar: Bounding graph fraud in the face of camouflage | 2016 | Fraudar |
| Graph | spartan.DPGS | [6] DPGS: Degree-Preserving Graph Summarization | 2021 | DPGS |
| Graph | spartan.EigenPulse | [7] EigenPulse: Detecting Surges in Large Streaming Graphs with Row Augmentation | 2019 | EigenPulse |
| Graph | spartan.FlowScope | [8] FlowScope: Spotting Money Laundering Based on Graphs | 2020 | FlowScope |
| Graph | spartan.kGrass | [9] GraSS: Graph structure summarization | 2010 | kGrass |
| Graph | spartan.IAT | [10] RSC: Mining and modeling temporal activity in social media | 2015 | IAT |
| Graph | spartan.CubeFlow | [11] CubeFlow: Money Laundering Detection with Coupled Tensors | 2021 | CubeFlow |
| Graph | spartan.SpecGreedy | [12] Specgreedy: unified dense subgraph detection | 2020 | SpecGreedy |
| Graph | spartan.MonLAD | [13] MonLAD: Money Laundering Agents Detection in Transaction Streams | 2022 | MonLAD |
| Time Series | spartan.BeatLex | [14] BEATLEX: Summarizing and Forecasting Time Series with Patterns | 2017 | Beatlex |
| Time Series | spartan.BeatGAN | [15] BeatGAN: Anomalous Rhythm Detection using Adversarially Generated Time Series; [16] Time Series Anomaly Detection with Adversarial Reconstruction Networks | 2019; 2022 | BeatGAN |

References

  1. Shenghua Liu, Bryan Hooi, and Christos Faloutsos, "HoloScope: Topology-and-Spike Aware Fraud Detection," In Proc. of ACM International Conference on Information and Knowledge Management (CIKM), Singapore, 2017, pp. 1539-1548.

  2. Shenghua Liu, Bryan Hooi, and Christos Faloutsos, "A Contrast Metric for Fraud Detection in Rich Graphs," IEEE Transactions on Knowledge and Data Engineering (TKDE), Vol. 31, Issue 12, Dec. 2019, pp. 2235-2248.

  3. Prakash, B. Aditya, Ashwin Sridharan, Mukund Seshadri, Sridhar Machiraju, and Christos Faloutsos. "Eigenspokes: Surprising patterns and scalable community chipping in large graphs." In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 435-448. Springer, Berlin, Heidelberg, 2010.

  4. Wenjie Feng, Shenghua Liu, Christos Faloutsos, Bryan Hooi, Huawei Shen, and Xueqi Cheng, "EagleMine: Vision-guided Micro-clusters recognition and collective anomaly detection," Future Generation Computer Systems, Vol. 115, Feb. 2021, pp. 236-250.

    Wenjie Feng, Shenghua Liu, Christos Faloutsos, Bryan Hooi, Huawei Shen, and Xueqi Cheng, "Beyond outliers and on to micro-clusters: Vision-guided anomaly detection," In Proc. of the 23rd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2019), Macau, China, 2019, pp. 541-554.

  5. Hooi, Bryan, Hyun Ah Song, Alex Beutel, Neil Shah, Kijung Shin, and Christos Faloutsos. "Fraudar: Bounding graph fraud in the face of camouflage." In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 895-904. 2016.

  6. Houquan Zhou, Shenghua Liu, Kyuhan Lee, Kijung Shin, Huawei Shen and Xueqi Cheng. "DPGS: Degree-Preserving Graph Summarization." In SDM, 2021.

  7. Zhang, Jiabao, Shenghua Liu, Wenjian Yu, Wenjie Feng, and Xueqi Cheng. "Eigenpulse: Detecting surges in large streaming graphs with row augmentation." In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 501-513. Springer, Cham, 2019.

  8. Li, Xiangfeng, Shenghua Liu, Zifeng Li, Xiaotian Han, Chuan Shi, Bryan Hooi, He Huang, and Xueqi Cheng. "Flowscope: Spotting money laundering based on graphs." In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 04, pp. 4731-4738. 2020.

  9. LeFevre, Kristen, and Evimaria Terzi. "GraSS: Graph structure summarization." In Proceedings of the 2010 SIAM International Conference on Data Mining, pp. 454-465. Society for Industrial and Applied Mathematics, 2010.

  10. Ferraz Costa, Alceu, Yuto Yamaguchi, Agma Juci Machado Traina, Caetano Traina Jr, and Christos Faloutsos. "Rsc: Mining and modeling temporal activity in social media." In Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, pp. 269-278. 2015.

  11. Sun, Xiaobing, Jiabao Zhang, Qiming Zhao, Shenghua Liu, Jinglei Chen, Ruoyu Zhuang, Huawei Shen, and Xueqi Cheng. "CubeFlow: Money Laundering Detection with Coupled Tensors." In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 78-90. Springer, Cham, 2021.

  12. Feng, Wenjie, Shenghua Liu, Danai Koutra, Huawei Shen, and Xueqi Cheng. "Specgreedy: unified dense subgraph detection." In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 181-197. Springer, Cham, 2020.

  13. Sun, Xiaobing, Wenjie Feng, Shenghua Liu, Yuyang Xie, Siddharth Bhatia, Bryan Hooi, Wenhan Wang, and Xueqi Cheng. "MonLAD: Money Laundering Agents Detection in Transaction Streams." arXiv preprint arXiv:2201.10051 (2022).

  14. Bryan Hooi, Shenghua Liu, Asim Smailagic, and Christos Faloutsos, “BEATLEX: Summarizing and Forecasting Time Series with Patterns,” The European Conference on Machine Learning & Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), Skopje, Macedonia, 2017.

  15. Zhou, Bin, Shenghua Liu, Bryan Hooi, Xueqi Cheng, and Jing Ye. "BeatGAN: Anomalous Rhythm Detection using Adversarially Generated Time Series." In IJCAI, pp. 4433-4439. 2019.

  16. Liu, Shenghua, Bin Zhou, Quan Ding, Bryan Hooi, Zheng bo Zhang, Huawei Shen, and Xueqi Cheng. "Time Series Anomaly Detection with Adversarial Reconstruction Networks." IEEE Transactions on Knowledge and Data Engineering (2022).

spartan2's People

Contributors

cmlfexponential, endlesslethe, fresh-meet, hi-bingo, hqjo, kabochueng, shenghua-liu, tdingquan, viki623, wenchieh, xbingsun


spartan2's Issues

Add applications for Cubeflow.

This issue was created to trigger the actions that publish spartan to TestPyPI.

Please link an issue when merging into master.

No definition of some variables

In model\specgreedy\Specgreedy.py, sps is not defined. Does it refer to scipy.sparse?
In the function fast_greedy_decreasing_monosym in model\fraudar\greedy.py, copy is not defined.
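Names that are used without ever being imported, as reported above, can be spotted with a rough static check. The sketch below is a heuristic built on the standard `ast` module, not a tool from spartan2; the `sample` source is a made-up stand-in rather than the repository's actual code, and the check ignores scoping and some binding forms (for example `except ... as e`).

```python
import ast
import builtins

def undefined_names(source):
    """Return names that are read somewhere but never imported, assigned,
    defined, or passed as an argument anywhere in `source` (scope-blind)."""
    tree = ast.parse(source)
    bound = set(dir(builtins))
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                used.add(node.id)
            else:
                bound.add(node.id)
    return sorted(used - bound)

# Hypothetical module text that mimics the reported symptom.
sample = '''
import numpy as np

def f(m):
    data = copy.copy(m)
    return sps.csr_matrix(np.asarray(data))
'''
print(undefined_names(sample))
```

If `sps` is indeed meant as the common alias for scipy.sparse, the missing lines would be `import scipy.sparse as sps` and `import copy`; that reading is an assumption until the maintainers confirm it.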

pip install spartan2 error

Collecting spartan2
Using cached spartan2-0.1.3.post4.tar.gz (183 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [6 lines of output]
Traceback (most recent call last):
File "<string>", line 36, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "C:\Users\11381\AppData\Local\Temp\pip-install-5t_ej234\spartan2_82ce834d43d44be19f5f75529d918a3e\setup.py", line 7, in <module>
from Cython.Build import cythonize, build_ext
ModuleNotFoundError: No module named 'Cython'
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

What is causing this problem?
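The traceback above shows setup.py importing Cython on line 7, so the build fails before pip can read the package metadata when Cython is absent from the environment. A likely workaround, not confirmed by the maintainers, is to install Cython first and then retry. The sketch below checks for the missing module; the helper name `missing_build_deps` is illustrative.

```python
import importlib.util

def missing_build_deps(required=("Cython",)):
    """Return the names in `required` that cannot be imported in this environment."""
    return [name for name in required if importlib.util.find_spec(name) is None]

missing = missing_build_deps()
if missing:
    print("Install first: pip install " + " ".join(missing))
else:
    print("Build dependencies found; retry: pip install spartan2")
```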

Datasets

Hello, could you share the datasets that were used?
