yzhao062's Introduction

😄 I am on the market, with expected graduation in Summer 2023. I am broadly interested in positions in machine learning, data mining and data science, and information science and systems. I can work in the U.S., Canada, and China without sponsorship. If you have an open opportunity in academia or industry, please reach out by email (zhaoy [AT] cmu.edu).

🌱 Short Bio: My name is Yue ZHAO (赵越 in Chinese). I am a rising fourth-year Ph.D. student at Carnegie Mellon University (CMU). Before joining CMU, I earned my master's degree from the University of Toronto (2016) and my bachelor's degree from the University of Cincinnati (2015), and worked as a senior consultant at PwC Canada (2016-19). I am an expert on anomaly detection (a.k.a. outlier detection) algorithms, systems, and their applications in security, healthcare, and finance, with more than 7 years of professional experience and 20+ papers (in JMLR, TKDE, NeurIPS, etc.). I appreciate the support from the Norton Labs Graduate Fellowship. See my homepage and CV for more information.

Contributions to outlier detection systems, benchmarks, and applications: I build automated, scalable, and accelerated machine learning systems (MLSys) that support large-scale, real-world outlier detection applications in security, finance, and healthcare; these systems have been downloaded millions of times. I designed CPU-based (PyOD), GPU-based (TOD), and distributed (SUOD) detection systems, covering tabular (PyOD), time-series (TODS), and graph data (PyGOD). To understand the characteristics of outlier detection (OD) algorithms, I co-authored large-scale benchmarks for tabular data (ADBench), time-series data (paper), and graph data (UNOD). My work has been used by thousands of projects and applications, including at leading firms such as IBM, Morgan Stanley, and Tesla. See more applications.

🔭 Research outcomes (related to outlier detection if not specified):

| Primary field | Secondary | Method | Year | Venue | Lead author |
|---|---|---|---|---|---|
| large-scale Benchmark | tabular anomaly detection | ADBench | 2022 | Preprint | Y |
| large-scale Benchmark | graph anomaly detection | UNOD | 2022 | Preprint | Y |
| large-scale Benchmark | sequence anomaly detection | TODS | 2021 | NeurIPS | |
| automated machine learning | outlier model selection | MetaOD | 2021 | NeurIPS | Y |
| automated machine learning | outlier model selection | ELECT | 2022 | ICDM | Y |
| automated machine learning | outlier HP optimization | HPOD | 2022 | Preprint | Y |
| automated machine learning | outlier evaluation | IPM | 2021 | Preprint | Y |
| machine learning systems | | PyOD | 2019 | JMLR | Y |
| machine learning systems | time series | TODS | 2020 | AAAI | |
| machine learning systems | | SUOD | 2021 | MLSys | Y |
| machine learning systems | distributed systems | TOD | 2022 | Preprint | Y |
| machine learning systems | graph neural networks | PyGOD | 2022 | Preprint | Y |
| ensemble learning | semi-supervised | XGBOD | 2018 | IJCNN | Y |
| ensemble learning | | LSCP | 2019 | SDM | Y |
| ensemble learning | machine learning systems | combo | 2020 | AAAI | Y |
| ensemble learning | interpretable ML | COPOD | 2020 | ICDM | Y |
| ensemble learning | interpretable ML | ECOD | 2022 | TKDE | Y |
| graph mining | finance | AutoAudit | 2020 | BigData | |
| graph neural networks | contrastive learning | CONAD | 2022 | PAKDD | |
| Diffusion Models | | survey | 2022 | Preprint | |
| AI x Science | synthetic data | SynC | 2020 | ICDMW | |
| AI x Science | healthcare AI | PyHealth | 2020 | Preprint | Y |
| AI x Science | Datasets & Benchmark | TDC | 2021 | NeurIPS | |
| AI x Science | Datasets & Benchmark | TDC V2 | 2022 | NCHEMB | |

At CMU, I work with Prof. Leman Akoglu, Prof. Zhihao Jia, and Prof. George H. Chen. Externally, I collaborate with Prof. Jure Leskovec at Stanford, Prof. Xia "Ben" Hu at Rice University, and Prof. Philip S. Yu at UIC.


Open-source Contribution: I have led or contributed as a core member to more than 10 ML open-source initiatives, receiving 14,000 GitHub stars (top 0.002%: ranked 800 out of 40M GitHub users) and >10,000,000 total downloads. Popular ones:

  • PyOD: A Python Toolbox for Scalable Outlier Detection (Anomaly Detection).
  • ADBench: The most comprehensive tabular anomaly detection benchmark (30 anomaly detection algorithms on 55 benchmark datasets).
  • TOD: Tensor-based outlier detection, the first large-scale GPU-based system for accelerating outlier detection.
  • SUOD: An Acceleration System for Large-scale Heterogeneous Outlier Detection.
  • anomaly-detection-resources: The most starred resources (books, courses, etc.)!
  • Python Graph Outlier Detection (PyGOD): A Python Library for Graph Outlier Detection.
  • Therapeutics Data Commons (TDC): Machine learning for drug discovery.
  • PyTorch Geometric (PyG): Graph Neural Network Library for PyTorch. Contributed to profiler & benchmarking, and heterogeneous data transformation.
  • combo: A Python Toolbox for ML Model Combination (Ensemble Learning).
  • TODS: Time-series Outlier Detection. Contributed to core detection models.
  • MetaOD: Automatic Unsupervised Outlier Model Selection (AutoML).
