
Social Media IE

Social Media information extraction tool. It supports the following tasks:

  • Sequence Tagging: Named Entity Recognition, Part of Speech, Chunking, CCG Supersense Tagging (list of datasets at: https://socialmediaie.github.io/datasets.html)
  • Classification: Sentiment classification, Abusive Speech Classification, Uncertainty indicator classification
  • Active Learning: Classification tasks using active learning

A tutorial on using SocialMediaIE can be found at our IC2S2 2020 tutorial website.

Please cite the following if using the tool:

  • Shubhanshu Mishra. 2019. Multi-dataset-multi-task Neural Sequence Tagging for Information Extraction from Tweets. In Proceedings of the 30th ACM Conference on Hypertext and Social Media (HT '19). ACM, New York, NY, USA, 283-284. DOI: https://doi.org/10.1145/3342220.3344929
  • Shubhanshu Mishra. 2019. Information extraction from digital social trace data with applications to social media and scholarly communication data. PhD Dissertation, University of Illinois at Urbana-Champaign. https://shubhanshu.com/phd_thesis/

Pretrained multi-task models and experimental models

  • Mishra, Shubhanshu (2019): Trained models for multi-task multi-dataset learning for text classification as well as sequence tagging in tweets. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1094364_V1
  • Mishra, Shubhanshu (2019): Trained models for multi-task multi-dataset learning for text classification in tweets. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1917934_V1
  • Mishra, Shubhanshu (2019): Trained models for multi-task multi-dataset learning for sequence prediction in tweets. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-0934773_V1

SocialMediaIE

Main library for doing the analysis

Notebooks

Example applications of the library and additional experiments

Experiments

Run experiments based on dataset

Usage

For developers

Install in editable mode:

pip install -e .

For users

Install as pip package:

pip install .
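
After either install, a quick check like the following can confirm that the package is visible to the current Python environment. This is a minimal sketch: it assumes the importable package name is `SocialMediaIE` (inferred from the `../SocialMediaIE` directory used when building the docs), and `is_installed` is a hypothetical helper, not part of the library.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the import system can locate `package_name`."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "SocialMediaIE" is an assumed import name based on the repository layout;
    # change it if the installed package uses a different name.
    name = "SocialMediaIE"
    status = "is" if is_installed(name) else "is NOT"
    print(f"{name} {status} importable from this environment")
```

If the check fails right after `pip install`, the usual cause is that `pip` and `python` point at different environments.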

Create documentation

https://samnicholls.net/2016/06/15/how-to-sphinx-readthedocs/

cd docs/
sphinx-apidoc -o source/ ../SocialMediaIE

Installing python kernel to jupyter

python -m ipykernel install --user --name ${CONDA_DEFAULT_ENV} --display-name "Python (${CONDA_DEFAULT_ENV})"
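
The shell command above relies on `${CONDA_DEFAULT_ENV}` being expanded by the shell, which does not happen in every terminal (for example, `cmd.exe` on Windows). As a rough sketch, the same arguments can be assembled in Python. `kernel_install_command` is a hypothetical helper introduced here for illustration, and the `"base"` fallback is an assumption for the case where no conda environment is active.

```python
import os
import shlex
import sys
from typing import List


def kernel_install_command(env_name: str) -> List[str]:
    """Build the argument list equivalent to the ipykernel shell command."""
    return [
        sys.executable, "-m", "ipykernel", "install",
        "--user",
        "--name", env_name,
        "--display-name", f"Python ({env_name})",
    ]


if __name__ == "__main__":
    # conda sets CONDA_DEFAULT_ENV when an environment is active;
    # falling back to "base" is an assumption made for this sketch.
    env = os.environ.get("CONDA_DEFAULT_ENV", "base")
    cmd = kernel_install_command(env)
    # Print the command with shell quoting so it can be inspected or copied.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually register the kernel, the list can be passed to `subprocess.run(cmd, check=True)` in an environment where `ipykernel` is installed.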

Acknowledgements

This library builds upon AllenNLP and PyTorch. Some of the multi-task learning code is based on the multi-task learning examples in AllenNLP.

SocialMediaIE's Projects

ednil2020

Submission titled Non-neural Structured Prediction for Event Detection from News in Indian Languages for EDNIL 2020 - Event Detection from News in Indian Languages

image-crop-analysis

Code for reproducing our analysis in the paper titled: Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency

meta

Data and code related to managing this organization

pytail

PyTAIL - Interactive and Incremental Learning of NLP Models with Human in the Loop for Online Data

socialmediaie

A toolkit for social media information extraction using multi-task learning and active learning

trac2020

Multilingual Joint Fine-tuning of Transformer models for identifying Trolling, Aggression and Cyberbullying at TRAC 2020

tutorials

Hands-on advanced machine learning for information extraction from tweets: tasks, data, and open source tools

twitterner

Twitter named entity extraction for WNUT 2016 http://noisy-text.github.io/2016/ner-shared-task.html
